In this article, I’ll describe how to hunt for rootkits on Linux. Rootkits are extremely advanced pieces of code, and not just anyone can write one. However, there is a lot of proof-of-concept code demonstrating rootkit techniques and how to build one from scratch. I will present a technique, based on measuring the execution of certain system calls, that can be used to detect rootkits.
I’ll explain not only how to detect rootkits but also how to conduct forensics and hunt down malware. I’ll cover only the basics at first, to understand the nature of the malicious behavior. For those interested in learning about kernel rootkits, I have separate posts titled “Writing a Simple Rootkit for Linux” and “The linux kernel modules programming”, which I recommend reading before delving into this topic.
The kernel is responsible for handling much of the system’s functionality, whether that is browsing local files or using a web browser to browse the Internet. This is done through system calls: low-level functions that run in a kernel context.
Rootkits can live in either user-land or kernel-land. User-land refers to privilege ring 3, while kernel-land refers to privilege ring 0. In simple terms: “In order to stay invisible backdoors modify kernel structures and code, causing that nobody can trust the kernel. Nobody”
Privilege levels: ring 0 (the most powerful privilege level), ring 1, ring 2, ring 3 (the least powerful privilege level).
Loadable kernel modules give administrators the ability to load drivers and other code at runtime, removing the need to recompile the kernel and reboot. Kernel rootkits typically leverage this to run code directly in kernel space.
Usually the rootkit code is designed to be loaded as a kernel module. This enables the rootkit to exploit the dynamic kernel module loading feature, which allows it to inject its malicious code directly into the running kernel without recompiling the kernel or rebooting the system. Let’s take a look at a Linux Kernel Module (LKM) rootkit.
Many rootkits modify the syscall table in order to redirect useful system calls such as sys_read(), sys_write(), and sys_getdents(). Let’s say we have rootkit.ko. To grant the rootkit access to the kernel’s functions and data structures, we must first locate the sys_call_table:
/* Code snippet from hijack_execve() function */
syscall_table = (void *)kallsyms_lookup_name("sys_call_table");
real_execve = (void *)syscall_table[__NR_execve];
syscall_table[__NR_execve] = &new_execve;
In this function we use kallsyms_lookup_name() to locate the address of the sys_call_table. Next, after locating the table, the rootkit replaces the address of the original execve() system call with the address of its own new_execve() function. This effectively hooks the execve() system call: whenever any process calls execve(), the rootkit’s new_execve() function is executed instead.
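To make the hooking idea concrete, here is a minimal user-space sketch in Python. It is only an analogy, not kernel code: the dictionary syscall_table stands in for sys_call_table, and install_hook, make_logging_hook and calls_seen are hypothetical names I made up for illustration. I also assume the hook forwards to the saved original handler, which the snippet above suggests by saving real_execve but does not show.

```python
def real_execve(path):
    """Stand-in for the original execve() handler."""
    return f"exec {path}"


# Dispatch table keyed by syscall name; a toy stand-in for sys_call_table.
syscall_table = {"execve": real_execve}

# Paths observed by the hook (hypothetical attacker-side bookkeeping).
calls_seen = []


def make_logging_hook(original):
    """Build a replacement handler that runs extra logic, then forwards."""
    def new_execve(path):
        calls_seen.append(path)   # the injected logic runs first
        return original(path)     # then the saved original handler is called
    return new_execve


def install_hook(table, name, make_hook):
    """Save the original entry and overwrite it with the hook."""
    original = table[name]
    table[name] = make_hook(original)
    return original


saved = install_hook(syscall_table, "execve", make_logging_hook)

# Every caller that goes through the table now reaches the hook.
result = syscall_table["execve"]("/bin/ls")

# A defender holding a trusted copy of the original pointer can spot the swap.
hooked = syscall_table["execve"] is not real_execve
```

The last line hints at why table hooks are the easiest kind to detect: the entry no longer points at the expected handler, so a simple comparison against a trusted reference exposes it.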
Once the rootkit is compiled, it can be loaded into the kernel. Once the rootkit module is loaded, its code executes directly in kernel space with full privileges.
In this example, the rootkit sets up a kernel thread in the start_cmd_thread() function, which repeatedly executes the shell script (/tmp/rootkit.sh) every 10 seconds.
/* Code snippet from start_cmd_thread() function */
my_kthread = kthread_create(threadfn, &cpu, "rootkit");
kthread_bind(my_kthread, cpu);
wake_up_process(my_kthread);
The newly created kernel thread is then bound to a specific CPU core using kthread_bind() to ensure that it runs on that processor.
call_usermodehelper(CMD, NULL, NULL, UMH_NO_WAIT);
msleep(10000);
Since the kernel thread runs in kernel space and is started during module initialization, it keeps executing for as long as the rootkit module stays loaded. Note that this alone does not survive a reboot: for the rootkit to persist across system reboots, the module has to be loaded again at boot time.
Now that we have some understanding of how kernel rootkits work, let’s move on to the next phase.
So, how do we detect kernel-based rootkits? The main problem is that we cannot trust the kernel. However, we can still get some reliable information from it, meaning we want to use the kernel to help us detect malicious code that has full control of that same kernel. The question is: which kernel functions should we check?
Let’s back up a little bit. What is the main task of every rootkit? Its job is to hide the presence of the attacker’s processes, files, and connections in the system. These things should be hidden from tools such as netstat. These programs collect system information through some well-known system calls.
Even if the malware does not touch the syscall table directly, it modifies some kernel functions that are invoked by one of the system calls. The problem is that these modified functions do not have to be executed during every system call. For example, if we modify only a pointer to the read functions in procfs, then the attacker’s code will be executed only when
read() is called in order to read some specific file, like
This is what makes rootkits challenging to detect. One older method (not really old; it is just that modern rootkits are more advanced and use evasion techniques, which I’ll get into later) is a heuristic approach: we measure the execution time of a particular system call with different arguments. For example, we test sys_read() by reading several kinds of files, including "/proc/net/" (i.e. reading a regular file, a device, and a pseudo proc-file).
These measurements are intended to identify anomalies that may indicate the presence of a rootkit. Instead of testing all 230 system calls, the method focuses on a subset of system calls that rootkits commonly rely on to perform specific tasks. For example, the sys_read() call is tested with different file paths.
This method is based on the assumption that typical rootkit tasks, such as hiding processes or files, involve only a limited set of system calls. Rootkits may manipulate or hide information in procfs to conceal their activities. The "/proc" directory is a pseudo-file system that exposes various kernel data structures and system information.
How does this approach work?
First we define the subset of system calls; next we measure the execution time of each system call with its chosen arguments. On a normal, non-compromised system, the execution time for the same system call with different arguments should not vary significantly. However, if a rootkit is present and actively manipulating the system, there may be inconsistencies or delays in the execution time of certain system calls.
Let’s take a look at this snippet:
/* Measure sys_read() execution time */
time1 = measure_sys_read(FILE_1); // etc/passwd
time2 = measure_sys_read(FILE_2); // proc/net/*
We open the file, read some data, and then calculate the execution time. After measuring the execution times for the two files, we compare them to detect any significant differences. However, there is one problem with this method: false positives. The Linux kernel is a complex program, and most system calls have many if-then clauses, which means different paths are executed depending on many factors. The cause of a difference can be anything, so this approach often requires good knowledge of the system and the experience to perform further analysis. The same goes for the tools out there, like chkrootkit. While they implement more advanced techniques than simple execution time measurement, they are not immune to false positives or false negatives. The effectiveness of these tools depends on various factors, including the accuracy of the heuristics used, the timeliness of their signature databases, and the expertise of the user.
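The measurement step described above can be sketched in user space with Python. This is a rough illustration under my own assumptions, not the implementation of measure_sys_read() from the snippet: measure_read_ns and is_anomalous are hypothetical names, the 3.0 tolerance is an arbitrary placeholder, and the timings include Python interpreter overhead on top of the read() system call itself. Taking the median over many runs is one simple way to dampen the noise that causes false positives.

```python
import os
import statistics
import time


def measure_read_ns(path, size=4096, repeats=200):
    """Return the median duration, in nanoseconds, of one read() on `path`."""
    samples = []
    for _ in range(repeats):
        fd = os.open(path, os.O_RDONLY)
        try:
            start = time.perf_counter_ns()
            os.read(fd, size)  # the call whose duration we sample
            samples.append(time.perf_counter_ns() - start)
        finally:
            os.close(fd)
    # The median is less sensitive to scheduling spikes than the mean.
    return statistics.median(samples)


def is_anomalous(reference_ns, observed_ns, tolerance=3.0):
    """Flag a measurement that exceeds the reference by `tolerance` times.

    The tolerance is an assumed, untuned threshold; a real check would
    calibrate it per machine and per system call.
    """
    return observed_ns > reference_ns * tolerance
```

A result flagged by is_anomalous() is only a hint that a different code path was executed, so it should trigger a closer look rather than be treated as proof of a rootkit.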
Next, I will show a forensic analysis case that uses the Volatility memory analysis framework to investigate a Linux rootkit simulating malicious activity. After a system has been compromised, it becomes crucial to extract forensically relevant information. RAM, being volatile, loses its contents each time the computer is restarted, meaning that if a hacked computer is restarted, a significant amount of information that could reveal how the system was initially compromised will be lost. To address this issue, Volatility comes into play as a valuable tool capable of analyzing a system’s volatile memory once it has been captured in a memory image.
Typically, one would commence by extracting system information and gathering data about the operating system (OS) and its base configuration. However, it is important to note that this is not a Volatility tutorial, and the rootkit does not leave any traces for identifying the infection in this memory image. Therefore, we must conduct a more extensive investigation, beginning with scrutinizing process listings and conducting in-depth analysis.
Start by checking which processes were running on the system when the memory dump was taken, using the linux_psaux plugin. You can list the available Linux plugins with:
$ python3 vol.py --info | grep linux_
This plugin provides a full process listing of the system. Its output is approximately the same as what you would get by running the ps -aux command in a terminal. In the resulting output, everything in the list of processes appeared normal, with the exception of one process: the very last one in the list. Its name, F00, is not that of a known standard Linux process.
Next we use linux_pslist. This plugin prints the list of active processes, starting from the init_task symbol and walking the task_struct->tasks linked list. It does not display the swapper process. If the DTB column is blank, the item is likely a kernel thread. The result: the same number of processes was found with this plugin as with the previous one. Again, the only process that did not appear to belong was F00.

Next, to find files for further analysis, we use the linux_lsof plugin, which mimics the lsof command on a live system. It prints the list of open file descriptors and their paths for each running process:
Pid      FD       Path
-------- -------- ----
1        0        /dev/null
1        1        /dev/null
1        2        /dev/null
After listing the files and tracking down the suspicious process, it is time to print the details of its memory, including heaps, stacks, and shared libraries, with linux_proc_maps. This very powerful plugin can be used to learn important information about the underlying system as a whole:
0x8050000-0x8051000 r-x 2777 /usr/F_00/F_00
0x8051000-0x8052000 rw- 4096 2777 /usr/F_00/F_00
0xb75d7000-0xb75d8000 rw- 0 0
What this plugin reveals is the actual location of the files associated with the suspicious process F_00, under /usr/, along with the r-x permission of its mapping. Interestingly, this process relies on only two libraries, whereas most system processes rely on many more.

Finally, it is time to dump this process directly from the memory image. To do that, we call linux_find_file. This plugin can not only dump pre-identified files from the memory image (using information obtained from other plugins), it can also list all filesystem objects with an open handle in memory. We can locate a target file with the -F argument and then dump it by its inode address with -i and -O, for example:
$ python3 vol.py ... linux_find_file -F "/usr/F_00/F_00"
Inode Number         Inode
-------------------- ---------------
0161170              0xf
$ python3 vol.py ... linux_find_file -i 0xf -O mal.elf
The “Inode” column gives the location in memory where this specific inode is stored. With this information, the file “mal.elf” was dumped from the memory image. The next step is to compute the hash of the file and check whether it matches an entry in any malware database, or to perform further analysis on the binary.
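Hashing the dumped file takes only a few lines of Python with the standard hashlib module. This is a minimal sketch: sha256_of_file and matches_known_hash are hypothetical helper names, and the set of known hashes is something you would have to obtain yourself from a malware database of your choice. I assume SHA-256 here; use whichever digest your database indexes by.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large dumps do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_known_hash(digest, known_hashes):
    """Check a digest against a caller-supplied collection of hex digests."""
    normalized = {entry.lower() for entry in known_hashes}
    return digest.lower() in normalized
```

For the dump above, you would call sha256_of_file("mal.elf") and look the resulting digest up in your database. Keep in mind that a file carved from memory may differ from the on-disk original, so a missing match does not clear the binary.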
Of course, in a real-world situation it won’t be this easy, but hopefully you learned something about Volatility’s memory plugins, the tool itself, and the possibility of not only identifying a suspect file but also dumping it. Additionally, it’s important to note that rootkits may create concealed network connections, a topic we can explore in a separate post focused on analyzing network traffic.
Forensics is a fascinating field that delves into malware analysis and operating system internals. In this explanation, I aimed to simplify complex concepts and keep the post brief. However, certain aspects may still be unclear, so I provided reference links to various articles covering the fundamentals and tutorials on the tools used in this context. Remember, this is just the 101 stuff; there’s more to learn, and I encourage you to do more research.