Automating the exploitation process on Linux x86
Stavros Lekkas
Article date: 2006-04-24 18:00:31
We describe some methods for automating the identification of buffer overflow bugs and compare some techniques. We present a tool which can identify such bugs and produce exploit code.
Inspection of precompiled binaries for flaws is a very painful responsibility for penetration testers. A tool which could identify buffer overflow bugs and produce exploit code would definitely ease the burden.
About the author
Stavros Lekkas, originally from Greece, is a 3rd year student of The University of Manchester (formerly known as UMIST). His research interests include cryptography, information security, data mining, higher mathematics (logic and number theory) and computational complexity. He is currently working on a dissertation, which concerns a compiler-related topic.
Imagine coming across a piece of compiled code without the luck to possess its source code. What is more, it exhibits the typical characteristics of having a buffer overflow vulnerability. Since disassembly analysis is an extremely time-consuming process, a tool which could automate the process of exploiting this potential vulnerability would be very useful. Let's have a look at a possible implementation of such a tool.
What you will learn...
What you should know...
Claiming that a program contains a stack-based buffer overflow bug implies that there is a location, the so-called buffer, into which data is copied. Such buffers live on the stack and are referenced by address. When data gets copied into them, the bounds are not checked, so there is a risk of an overflow. After a buffer overflows, memory outside its bounds gets overwritten as well. Carefully overwriting that memory with valid data, in particular with valid addresses, leads to control of the execution flow of the program.
The data placed into the buffer sometimes comes from user input. A program can accept user input in many ways: program arguments (or parameters, if you like), environment variables, switches, and even run-time input received through libc functions such as gets() and scanf(). Since each of these ways of supplying data has its own story, we will focus on program arguments as our attack vector.
It is crucial to mention that the automation concept has nothing to do with fuzzy logic, and the resulting tool does not rely on fuzzing techniques. Trying to locate specific vulnerabilities by inspecting the data generated from deliberate inputs is not fuzzing (see Inset Fuzzing).
In our quest to find paths to control the %eip (see Figure 1 for explanation) via arguments, we have to reason about what we are facing. A given binary executable is either vulnerable or not. Per argument, this can be restated as follows: either the nth argument is not vulnerable, or it is vulnerable, and in that case there is a finite distance which must be filled with characters in order to reach the %eip. Restricting these quantities to predefined ranges of values gives a finite search space on which a coherent model can be built.
Fuzzing
The term fuzzing is sometimes associated with fuzzy logic, but the two are unrelated. Fuzzy logic theory deals with ambiguity and tries to categorise uncertainty and classify it using mathematics. Fuzzing, as a testing technique, feeds a program large numbers of random or malformed inputs and watches for failures. The set of all integers, in mathematics, has infinite cardinality, and so does the set of all real numbers. On a computer, though, everything is finite and calculations with really large operands may fail.
Program arguments
Many ELF executables receive arguments before starting their execution. A typical example is the rm command, where we have to supply as a parameter what we want to delete. Let's imagine an ELF executable, a.out, that just prints the stream of characters supplied as argument one.
$ ./a.out hakin9
You typed: hakin9
It is possible that, instead of just calling printf() with argv[1] as parameter, the program declares an intermediate buffer, an array of characters. In that case argv[1] is copied into the buffer and printf() uses that buffer as parameter, hopefully with the appropriate format string. It is also possible that argv[1] gets copied into that buffer in an unsafe manner. What if we just keep feeding it larger inputs?
$ ./a.out `perl -e 'print "A" x 50'`
You typed: AAAAAAA … AAA
Segmentation fault (core dumped)
It crashed and it produced a core. Many Linux distributions do not produce core files by default, but we can enable them by typing:
$ ulimit -c unlimited
This allows the production of core files of unlimited size. Back to our example: the crash strongly suggests that an intermediate buffer had indeed been used, into which argv[1] had been copied unsafely. By using gdb, the GNU debugger, we can see the address at which the crash occurred.
$ gdb -c core ./a.out | grep \#0
#0 0x41414141 in ?? ()
This makes sense because 0x41 is the hexadecimal ASCII code of the character A. Figure 1 gives a more detailed conceptual overview.
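The manual steps above (build a run of A's, pass it as argument one, check for a segmentation fault, read frame #0 from gdb) can be sketched in a few lines of Python. This is only an illustrative sketch, not the tool presented in the article: the function names make_payload, crashed_with_segv and eip_from_gdb are hypothetical, and the regular expression assumes that gdb prints frame #0 in the form shown above.

```python
import re
import signal
import subprocess

def make_payload(length, fill="A"):
    """Return `length` copies of the fill character, like perl -e 'print "A" x N'."""
    return fill * length

def crashed_with_segv(binary, payload):
    """Run `binary` with `payload` as argument one; True if SIGSEGV killed it.

    On POSIX systems subprocess reports death by signal N as return code -N.
    """
    proc = subprocess.run([binary, payload],
                          stdout=subprocess.DEVNULL,
                          stderr=subprocess.DEVNULL)
    return proc.returncode == -signal.SIGSEGV

# Assumed shape of the gdb line: "#0  0x41414141 in ?? ()"
FRAME0 = re.compile(r"^#0\s+0x([0-9a-fA-F]+)\s+in", re.MULTILINE)

def eip_from_gdb(output):
    """Extract the crash address of frame #0 from gdb output, or None."""
    match = FRAME0.search(output)
    return int(match.group(1), 16) if match else None
```

With a vulnerable a.out in the current directory, crashed_with_segv("./a.out", make_payload(50)) would be expected to return True, and eip_from_gdb applied to the saved gdb output would return the overwritten address as an integer.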
Figure 1. Conceptual overview of an unsafe copy operation
The instruction pointer was overwritten with an invalid address, leading to a crash (see also the article Overflowing the stack on Linux x86, which is available on the hakin9.org website). Instead of supplying fifty A's, we could have found the exact distance to the end of the saved %ebp, filled that distance with A's and then supplied a valid address. This way we can control the flow of the program so that it executes code we supply. What is more, this can be done automatically.
Information gathering
The information that concerns us about a given executable is the argument number which gives us a pathway to manipulate the %eip, and the distance to the %eip. In the previous example of a.out, we could start gdb for every possible length of the argument, each time with a buffer payload that is one step longer. Then we have to inspect the value of the instruction pointer and decide to what degree it has been affected by our inputs. If the executable is indeed vulnerable, we will see three different states during our examination. The following three states will occur in sequence:
Note that a successful partial overwrite corresponds to altering three out of the four bytes of the %eip. The address 0xbfff4141 cannot be treated as a suspected partial overwrite, since it is a valid stack address. The address 0xbf414141, though, is much more suspect, because the stack very rarely grows that large. Although the final implementation takes this issue into account, it would not be a bad idea to assign constant weights that indicate how critical and how deliberate a potential overwrite is.
Payload creation algorithm one
The subsystem responsible for creating the payloads does nothing more than create buffers filled with A's when it is asked to do so. An easy-to-understand policy for producing such payloads is the well-known brute force technique. We create buffers of all possible lengths and test them one by one, until a sign of alteration appears or until we reach the maximum buffer length of the testing range. If the argument is vulnerable and the required length lies within our test range, the deliberate alteration will definitely be spotted.
Figure 2. Flowchart of payload creation algorithm one
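The brute force policy and the overwrite states discussed above can be sketched together in Python. This is a minimal sketch under stated assumptions and does not reproduce the flowchart of Figure 2: the names classify, brute_force and fake_probe are hypothetical, the rule "three matching bytes is partial, four is full" follows the text, and fake_probe only simulates a target whose saved %eip sits 24 bytes past the start of the buffer. A real probe would run the binary and read the %eip from gdb.

```python
FILL = 0x41  # the byte value of 'A', used to fill the payloads

def overwritten_bytes(eip, fill=FILL):
    """Count how many of the four bytes of %eip equal the fill byte."""
    return sum(1 for shift in (0, 8, 16, 24) if (eip >> shift) & 0xFF == fill)

def classify(eip):
    """Map an %eip value to 'none', 'partial' (3 of 4 bytes) or 'full'."""
    count = overwritten_bytes(eip)
    if count == 4:
        return "full"
    if count == 3:
        return "partial"
    return "none"  # e.g. 0xbfff4141 is still a plausible stack address

def brute_force(probe, max_len):
    """Test payload lengths 1..max_len and stop at the first alteration sign.

    `probe(length)` must return the observed %eip as an int, or None when the
    program did not crash. Returns (length, state) or None if nothing is found.
    """
    for length in range(1, max_len + 1):
        eip = probe(length)
        if eip is None:
            continue
        state = classify(eip)
        if state != "none":
            return length, state
    return None

def fake_probe(length, distance=24, original_eip=0x08048456):
    """Hypothetical stand-in for a real target (all values are made up).

    Bytes beyond `distance` replace the saved %eip starting at its lowest
    byte, as happens on little-endian x86.
    """
    spilled = min(max(length - distance, 0), 4)
    if spilled == 0:
        return None
    eip = original_eip
    for i in range(spilled):
        eip = (eip & ~(0xFF << (8 * i))) | (FILL << (8 * i))
    return eip

if __name__ == "__main__":
    print(brute_force(fake_probe, 64))
```

Because the search stops at the first sign of alteration, it reports the partial state first. Continuing until the state becomes full and subtracting four from that length would give the distance to the %eip.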
Copyright C 2006 by Software Developer's Journal. All rights reserved.

















