Rewriting libc functions in malware

BlackYenii · March 24, 2018, 1:07pm

Hello everyone;

I wonder why some malware developers rewrite libc functions such as strcat, strcpy and malloc in their malware.

Thanks !!

oaktree · March 24, 2018, 1:20pm

Where have you seen this? Mentioning some examples would be great.

BlackYenii · March 24, 2018, 2:00pm

Carberp :

github.com

nyx0/Carberp/blob/master/Source/Strings.cpp

#include <windows.h>

#include "Memory.h"
#include "GetApi.h"
#include "Strings.h"
#include "BotClasses.h"


#include "StrWildCmp.cpp"

DWORD WINAPI m_lstrncmp( const char *szstr1, const char *szstr2, int nlen )
{
	if ( !szstr1 || !szstr2 )
		return -1;

	DWORD dwReturn;

	__asm
	{
		pushad

This file has been truncated.

github.com

nyx0/Carberp/blob/master/Source/Memory.cpp

#include <windows.h>

#include "Memory.h"
#include "GetApi.h"
#include "Utils.h"
#include "ntdll.h"

//#include "BotDebug.h"

void *m_memset( void *szBuffer, DWORD dwSym, DWORD dwLen )
{
	if ( !szBuffer )
		return NULL;

	__asm
	{
		pushad
		mov		edi,[szBuffer]
		mov		ecx,[dwLen]
		mov		eax,[dwSym]

This file has been truncated.

A virus from Github

github.com

elfmaster/skeksi_virus/blob/master/virus.c

/*
 * Skeksi Virus v0.1 - infects files that are ELF_X86_64 Linux ET_EXEC's
 * Written by ElfMaster - [email protected]
 *
 * Compile:
 * gcc -g -O0 -DANTIDEBUG -DINFECT_PLTGOT  -fno-stack-protector -c virus.c -fpic -o virus.o
 * gcc -N -fno-stack-protector -nostdlib virus.o -o virus
 *
 * Using -DDEBUG will allow Virus to print debug output
 *
 * Usage:
 * ./virus
 *
 */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>

This file has been truncated.

Malware_Info · March 24, 2018, 2:30pm

Maybe because malware analysts rely heavily on FLIRT signatures, a feature of IDA that recognizes libc functions in the disassembly and renames them with their library names.
https://www.hex-rays.com/products/ida/tech/flirt/in_depth.shtml
IDA relies on pattern matching to find these libc functions, so to defeat that technique the authors may have written the libc functions by hand.
If I'm wrong, please correct me!
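
To make the signature idea a bit more concrete, here is a minimal sketch of matching the first bytes of a function against a known pattern, with wildcards for bytes that vary between builds. This is only a toy illustration and not IDA's actual algorithm or file format; `toy_sig` and `toy_sig_matches` are hypothetical names made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy signature: the expected leading bytes of a library
 * function plus a mask (1 = byte must match, 0 = wildcard, e.g. a
 * relocated address that differs from build to build). */
struct toy_sig {
    const char    *name;     /* name to assign on a match */
    const uint8_t *pattern;  /* expected bytes */
    const uint8_t *mask;     /* which bytes are significant */
    size_t         len;      /* length of pattern and mask */
};

/* Returns 1 if the first sig->len bytes of code match the signature
 * (ignoring wildcard positions), 0 otherwise. */
int toy_sig_matches(const struct toy_sig *sig,
                    const uint8_t *code, size_t code_len)
{
    if (code_len < sig->len)
        return 0;
    for (size_t i = 0; i < sig->len; i++) {
        if (sig->mask[i] && code[i] != sig->pattern[i])
            return 0;
    }
    return 1;
}
```

A hand-written replacement for a library function compiles to different bytes than the library's own build, so a pattern like this would simply not match it and the function would stay unnamed in the disassembly.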

BlackYenii · March 24, 2018, 9:09pm

First, thanks for replying, but what is the problem if IDA detects libc functions? They are not malicious …

BlackYenii · March 24, 2018, 9:10pm

Hope to have some clarifications from @dtm

dtm · March 25, 2018, 1:30am

The only reasons I can think of for implementing your own functions would be the following:

1. Evading signature-based or heuristic detection.

Let's say I have a keylogger and I want it to upload logs to my FTP server. If I link the executable directly, so that functions like FtpOpenFile and FtpPutFile appear in plain text in the imports and are visibly called from the code section, it would look very suspicious. Replacing them with my own networking code that speaks FTP could make the binary look less suspicious. This is probably not why the samples above implement their own libc functions, though, because libc functions by themselves would not get a binary flagged.

2. Reducing dependencies in the binary.

Sometimes dependencies are troublesome for malware that infects other objects. For viruses, having as few dependencies as possible is ideal because the host's environment has to be treated as unpredictable. What I mean is that if you select a host at random, there is no guarantee that its execution environment contains what the virus needs to do its task, so eliminating dependencies means becoming entirely self-sufficient.

A classic example of this is obtaining WinAPI functions dynamically by walking the list of loaded modules in the process's PEB, under the assumption that every process has ntdll.dll and kernel32.dll loaded. Because of this, any malcode living in a host's process space can reach whatever it needs to interact with the OS without relying on what already exists in the host's environment.
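
As a small illustration of the self-sufficiency point, this is roughly what hand-written replacements for two libc routines look like in portable C. It is a sketch written for this thread, not the code from the samples linked above (those use inline assembly), and the names `my_memset` and `my_strncmp` are made up.

```c
#include <stddef.h>

/* Fill n bytes starting at dst with the byte value c and return dst,
 * following the same contract as memset. */
void *my_memset(void *dst, int c, size_t n)
{
    unsigned char *p = dst;
    while (n--)
        *p++ = (unsigned char)c;
    return dst;
}

/* Compare at most n characters of a and b, following the same contract
 * as strncmp: negative, zero or positive depending on ordering. */
int my_strncmp(const char *a, const char *b, size_t n)
{
    for (; n > 0; a++, b++, n--) {
        unsigned char ca = (unsigned char)*a;
        unsigned char cb = (unsigned char)*b;
        if (ca != cb)
            return ca < cb ? -1 : 1;
        if (ca == '\0')
            return 0;   /* both strings ended at the same position */
    }
    return 0;           /* first n characters are equal */
}
```

Neither function includes anything beyond `<stddef.h>` (for `size_t`), so code using them needs no entry for the C runtime in its import table. One caveat: an optimizing compiler may recognize a loop like the one in `my_memset` and emit a call to the real memset anyway, which is part of why such code is often built with options like `-nostdlib` (as in the Skeksi compile line above) or `-ffreestanding`.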

Other than that, your guess is as good as mine. ¯\_(ツ)_/¯

0x00pf · March 25, 2018, 7:43am

On top of @dtm's points I'd add:

Your binary will be much smaller, which is generally a desirable property for malware.

I think point 2 is a very good reason

BlackYenii · March 25, 2018, 10:06pm

Thanks @dtm and @0x00pf … does this have anything to do with making the code position-independent in memory?

dtm · March 25, 2018, 11:46pm

In the Windows environment, using libc functions (e.g. memcpy, strcpy) requires msvcrt.dll to be present in the process space, which means the binary must list it in its import table if the source calls those functions directly (as opposed to resolving them dynamically as mentioned earlier, or through the LoadLibrary/GetProcAddress pair, although those two functions also have to be imported). The import table by itself is not position-independent, but it can be made pseudo-independent by relying on another structure known as the relocation table.

Relying on these two structures requires extra information in order to locate them. In a standard PE binary, the PE headers describe the structure of the program, including the import and relocation tables. Calculating these offsets programmatically requires those headers, which adds size, and as @0x00pf mentioned, smaller sizes are ideal for viruses. Fixing up the offsets of the import table programmatically so that it is "position-independent" requires more code, which in turn increases the size further.

So as you can see, it's just much simpler and more effective to implement these functions yourself. The trade-off clearly isn't worth it with libc. Shellcode is the way to go.

0x00pf · March 26, 2018, 5:50am

It does make things easier. If your program is going to move around in its own address space (or in another process's address space) instead of just sitting at the entry point specified in its ELF header (on GNU/Linux systems), it is very handy not to depend on anything external.

To better understand why, we need to know roughly how a libc function is invoked:

1. The call goes to an entry in a memory table, the PLT (Procedure Linkage Table).
2. On the first invocation, the dynamic linker is fired to resolve the address of the function you want to call in the library (and even load the library if needed).
3. The table is then patched with the resolved address (strictly, the GOT slot that the PLT stub jumps through) so that, next time, the dynamic linker does not need to be called (you already know where the function is located).
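
The three steps above can be mimicked in plain C with a function pointer that initially points at a resolver. This is only a toy model of lazy binding, not what the dynamic linker really does; `slot_strlen`, `resolve_and_call`, `call_strlen` and `resolver_calls` are made-up names, and the "resolution" is simply taking the address of strlen.

```c
#include <stddef.h>
#include <string.h>

/* Type of the function reachable through the toy table slot. */
typedef size_t (*strlen_fn)(const char *);

static size_t resolve_and_call(const char *s);

/* Step 1: one-slot stand-in for a table entry. Every call goes through
 * it, and at startup it points at the resolver, not the real function. */
static strlen_fn slot_strlen = resolve_and_call;

/* Counts how many times the simulated "dynamic linker" has run. */
int resolver_calls = 0;

/* Step 2: on the first call, look up the real function.
 * Step 3: patch the slot so later calls skip the resolver. */
static size_t resolve_and_call(const char *s)
{
    resolver_calls++;
    slot_strlen = strlen;      /* stand-in for the symbol lookup */
    return slot_strlen(s);     /* finish the call that triggered us */
}

/* What the caller's code does: an indirect call through the slot. */
size_t call_strlen(const char *s)
{
    return slot_strlen(s);
}
```

Calling `call_strlen("hello")` and then `call_strlen("hi")` returns 5 and 2, and `resolver_calls` is 1 afterwards: only the first call pays for resolution, but both still go through the slot.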

So, even in the steady state (once all the symbols your program uses have been resolved), every call to a libc function goes through an indirection. For the gory details, check these two great posts at 0x00sec.

So, coming back to your question: once the PLT has been resolved and updated, you can call any libc function from any memory position through that table. To make your program position-independent you need to tell the compiler, regardless of the libraries you use. So I do not see a big advantage on that side.

However, if you want your code to be loaded using non-standard techniques (as may happen with a virus or other malware), having your whole program self-contained makes things easier, as you do not have to care about linking or resolving symbols. On the other hand, whatever functionality you need, you will have to implement yourself and add to your program manually, which, depending on what the program has to do, may be quite some work.

BlackYenii · March 26, 2018, 10:21am

@dtm @0x00pf Thank you very much
I understand ^^
