21 Jun 2026 15 min read redteam

Hacking C++ (Part 2)

Bypassing CFI

What CFI Is

Control Flow Integrity (CFI) is a security mitigation that protects against control-flow hijacking attacks by checking whether the target of a function call is valid. Every compiler has its own implementation of CFI (if it has one at all), but the most modern and complete one is Clang’s.

CFI does not check every function call. There are certain cases that CFI handles, such as:

Indirect calls
Virtual calls
Calls via pointers to member functions with an incorrect dynamic type
Calls to non-virtual member functions via an object in which those functions are not defined

CFI can also protect casts:

Clang can prevent casts between objects of unrelated types
Clang can prevent casts from an object of a base class to an object of a derived class, if the object is not actually of the derived class
A strict mode (-fsanitize=cfi-cast-strict) covers a very specific instance where the default level of base-to-derived cast protection, as in cfi-derived-cast, would not catch an illegal cast

To better understand how CFI works, let’s take a look at how CFI protects virtual calls.
(At the time of writing, the documentation page was based on Clang version 23.0.0.)

Virtual Calls Protection

To validate a call, we need to store information about which functions are valid call targets. In Clang, the compiler generates a bit vector that maps onto the region of storage used for the virtual tables. Each set bit in the bit vector corresponds to the address point of a virtual table compatible with the static type for which the bit vector is being built.

Let’s say we have 3 structs:

struct A {
  virtual void f1();
  virtual void f2();
  virtual void f3();
};

struct B : A {
  virtual void f1();
  virtual void f2();
  virtual void f3();
};

struct C : A {
  virtual void f1();
  virtual void f2();
  virtual void f3();
};

The virtual table layout for A, B, and C will look like this:

A::offset-to-top	&A::rtti	&A::f1	&A::f2	&A::f3	B::offset-to-top	&B::rtti	&B::f1	&B::f2	&B::f3	C::offset-to-top	&C::rtti	&C::f1	&C::f2	&C::f3
0	1	2	3	4	5	6	7	8	9	10	11	12	13	14

Using this scheme, we can generate one bit vector for each of the three structures and use them to construct the final bit vector:

Class	2	7	12
A	1	1	1
B	0	1	0
C	0	0	1

Since the other structures are derived from class A, this means that structure A can accept virtual table entry points from structures B and C, while structures B and C can accept only their own virtual table entry points.

Now, to create our final set of bits, we’ll need to use the indices as check bits for each structure, for example:

$$\text{Bits}_A = \{2, 7, 12\}$$
$$\text{Bits}_B = \{7\}$$
$$\text{Bits}_C = \{12\}$$

Then we call the ByteArrayBuilder::allocate() function once per structure (three times in total). Bits is the set of slot indices for that structure, and BitSize is the total number of bits in the vector (15 slots in our case, 0 through 14).

BitAllocs is an internal tracking array used by Clang’s ByteArrayBuilder class to keep track of how much space has been used on each of the 8 bit tracks inside the global Bytes array.
AllocByteOffset is a variable that stores the starting index (the byte offset) in the global Bytes array where a specific class hierarchy’s bit vector begins laying out its data.
AllocMask is the bit mask of the selected track; it is applied in the final loop of the function:

// Set our bits.
AllocMask = 1 << Bit;
for (uint64_t B : Bits)
  Bytes[AllocByteOffset + B] |= AllocMask;

Let’s assume our tracking array BitAllocs starts completely empty: {0, 0, 0, 0, 0, 0, 0, 0}.

Allocation 1: struct A

Inputs: Bits = {2, 7, 12}, BitSize = 15
Track selection: All tracks in BitAllocs are 0. The loop defaults to track Bit = 0.
Offset: AllocByteOffset = BitAllocs[0] -> 0.
Resize: ReqSize = 0 + 15 = 15. The global vector Bytes is resized to 15, zero-initialized.
Update Track: BitAllocs[0] becomes 15.
Mask: AllocMask = 1 << 0 → 1 (0x01).
Setting Bits: It loops through {2, 7, 12} and applies Bytes[0 + B] |= 1:

Bytes[2] |= 1; // binary 00000001
Bytes[7] |= 1; // binary 00000001
Bytes[12] |= 1; // binary 00000001

Allocation 2: struct B

Inputs: Bits = {7}, BitSize = 15
Track selection: BitAllocs is currently {15, 0, 0, 0, 0, 0, 0, 0}. The smallest value is 0, so the loop picks track Bit = 1.
Offset: AllocByteOffset = BitAllocs[1] → 0. (It overlaps from the very beginning)
Resize: ReqSize = 0 + 15 = 15. Bytes.size() is already 15, so no resize happens.
Update Track: BitAllocs[1] becomes 15.
Mask: AllocMask = 1 << 1 → 2 (0x02).
Setting Bits: It loops through {7} and applies Bytes[0 + 7] |= 2:

Bytes[7] |= 2; // underlying binary becomes 00000001 | 00000010 = 00000011 (Decimal 3)

Allocation 3: struct C

Inputs: Bits = {12}, BitSize = 15
Track selection: BitAllocs is currently {15, 15, 0, 0, 0, 0, 0, 0}. The smallest value is 0, so the loop picks track Bit = 2.
Offset: AllocByteOffset = BitAllocs[2] → 0. (Still overlapping)
Resize: ReqSize = 0 + 15 = 15. Bytes.size() is already 15, so no resize happens.
Update Track: BitAllocs[2] becomes 15.
Mask: AllocMask = 1 << 2 → 4 (0x04).
Setting Bits: It loops through {12} and applies Bytes[0 + 12] |= 4:

Bytes[12] |= 4; // underlying binary becomes 00000001 | 00000100 = 00000101 (Decimal 5)

So in that way, our final result will be:

char bits[] = { 0, 0, 1, 0, 0, 0, 0, 3, 0, 0, 0, 0, 5, 0, 0 };

Now, to validate a virtual call, Clang computes the slot index of the object’s virtual table pointer within the virtual table region, compares it with the maximum slot index value, and then tests the matching bit in the bit vector.

ca7fbb:       48 8b 0f                mov    (%rdi),%rcx
ca7fbe:       48 8d 15 c3 42 fb 07    lea    0x7fb42c3(%rip),%rdx
ca7fc5:       48 89 c8                mov    %rcx,%rax
ca7fc8:       48 29 d0                sub    %rdx,%rax
ca7fcb:       48 c1 c0 3d             rol    $0x3d,%rax
ca7fcf:       48 3d 7f 01 00 00       cmp    $0x17f,%rax
ca7fd5:       0f 87 36 05 00 00       ja     ca8511
ca7fdb:       48 8d 15 c0 0b f7 06    lea    0x6f70bc0(%rip),%rdx
ca7fe2:       f6 04 10 10             testb  $0x10,(%rax,%rdx,1)
ca7fe6:       0f 84 25 05 00 00       je     ca8511
ca7fec:       ff 91 98 00 00 00       callq  *0x98(%rcx)
  [...]
ca8511:       0f 0b                   ud2

Step 1: Calculate the byte offset

ca7fbb:       48 8b 0f                mov    (%rdi),%rcx
ca7fbe:       48 8d 15 c3 42 fb 07    lea    0x7fb42c3(%rip),%rdx
ca7fc5:       48 89 c8                mov    %rcx,%rax
ca7fc8:       48 29 d0                sub    %rdx,%rax

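Step 2: Calculate the slot index by rotating right by 3 bits (rol $0x3d rotates left by 61, which is the same operation on a 64-bit register). For an 8-byte-aligned offset this is the same thing as dividing by 8; a misaligned offset gets its low bits moved into the high bits, so it becomes huge and fails the next check.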

ca7fcb:       48 c1 c0 3d             rol    $0x3d,%rax

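Step 3: Compare with the maximum slot index value

ca7fcf:       48 3d 7f 01 00 00       cmp    $0x17f,%rax
ca7fd5:       0f 87 36 05 00 00       ja     ca8511

Step 4: Test this type’s bit in the bit vector, and make the call only if it is set

ca7fdb:       48 8d 15 c0 0b f7 06    lea    0x6f70bc0(%rip),%rdx
ca7fe2:       f6 04 10 10             testb  $0x10,(%rax,%rdx,1)
ca7fe6:       0f 84 25 05 00 00       je     ca8511
ca7fec:       ff 91 98 00 00 00       callq  *0x98(%rcx)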

That is the main idea behind the “Forward-Edge CFI for Virtual-Calls”.
Now you can start hunting me and the author of this documentation down, because:

The scheme as described above is the fully general variant of the scheme. Most of the time we are able to apply one or more of the following optimizations to improve binary size or performance.

If you like, you can read more about the optimizations, but to be honest, it doesn’t really matter. It’s enough to simply understand the basic idea of how CFI checks calls.

Types of CFI

CFI comes in different types, and each type protects a different part of the program.

Category	Description	Examples
Forward-edge CFI	Validates indirect calls/jumps (function pointers, vtable calls)	Clang `-fsanitize=cfi`, Microsoft CFG, Intel IBT
Backward-edge CFI	Validates return addresses	Shadow stacks, Intel CET SHSTK, Clang `-fsanitize=shadow-call-stack`
Hardware-assisted CFI	CPU-level enforcement	Intel CET (IBT + SHSTK), ARM BTI, ARM PAC

CFI implementations also differ in their level of precision, which is called granularity.

Category	Description	Examples
Coarse-grained CFI	Restrict the set of indirect call targets to any function that may be indirectly called in the program	kCFI, EMET, CCFIR, TypeArmor
Fine-grained CFI	Restrict each indirect call site to functions that have the same signature as the function to be called	IFCC, VTV, PathArmor, O-CFI

Long story short:

Coarse-grained forward-edge	Coarse-grained backward-edge	Fine-grained forward-edge	Fine-grained backward-edge
Allows jumping to any valid (defined by implementation) function entry point, regardless of type or caller	Allows a return to any member of a set of valid return addresses. (Rare in practice)	Restricts each indirect call site to a small, call-site-specific set of legitimate targets	Allows a return only to the exact call site within the caller

Of course, there are other ways to implement CFI, but there is no point in discussing all of them. It’s implementation-specific anyway.

Bypassing CFI

Below are methods for bypassing Clang’s fine-grained forward-edge CFI.

Each section includes short CTF-style programs (they’re actually simple, just don’t give up).
You can find all the CTF programs and their solutions here.

Bypassing CFI with ROP

Return-Oriented Programming (ROP) is a technique based on overwriting return addresses. I guess most of you already know it.
Since forward-edge CFI only checks calls, good old ROP works just fine.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

void print_flag(unsigned long key)
{
    if (key != 0xC0FFEE1234BEEF99UL) {
        printf("[print_flag] wrong key 0x%lx -- no flag for you.\n", key);
        return;
    }

    puts("\n=================================================");
    puts(" CFI was ENABLED... and you still got here. ");
    char buf[64];
    FILE *f = fopen("flag.txt", "r");
    if (f && fgets(buf, sizeof buf, f)) {
        printf(" FLAG: %s\n", buf);
        fclose(f);
    }
    puts("=================================================\n");
    fflush(stdout);
    _exit(0);
}

typedef void (*handler_t)(void);   /* takes nothing, returns nothing */

static void say_hello(void) { puts("[handler] hello from say_hello()"); }
static void say_bye(void)   { puts("[handler] bye from say_bye()");     }

static handler_t handlers[2] = { say_hello, say_bye };

static void forward_edge_demo(void)
{
    handler_t fp = handlers[0];

    puts("\n[mode 1] Forward-edge (indirect call) attack");
    puts("We will call a handler through a function pointer.");
    fflush(stdout);

    puts("[*] Repointing the handler at print_flag() (wrong type!)...");
    fp = (handler_t)(void *)print_flag;      /* type-confused pointer  */

    puts("[*] Performing the indirect call now:");
    fflush(stdout);
    fp();
    puts("[*] (If you see this line, CFI did NOT stop the call.)");
}

static void backward_edge_demo(void)
{
    char buf[64];

    puts("\n[mode 2] Backward-edge (return address) attack");
    puts("Send me your input. read() has no idea how big buf is.");
    printf("> ");
    fflush(stdout);

    read(0, buf, 512);

    printf("[*] You said: %s\n", buf);
    puts("[*] Returning now (where to?) ...");
    fflush(stdout);
}

static void menu(void)
{
    puts("\n--- CFI vs ROP demo ---");
    puts(" 1) forward-edge attack  (CFI should block this)");
    puts(" 2) backward-edge attack (CFI cannot see this)");
    puts(" q) quit");
    printf("choice> ");
    fflush(stdout);
}

int main(void)
{
    setvbuf(stdout, NULL, _IONBF, 0);

    printf("[i] print_flag is at %p (you may need this)\n",
           (void *)print_flag);

    char line[16];
    for (;;) {
        menu();
        if (!fgets(line, sizeof line, stdin)) break;
        switch (line[0]) {
            case '1': forward_edge_demo();  break;
            case '2': backward_edge_demo(); return 0;
            case 'q': return 0;
            default:  puts("?"); break;
        }
    }
    return 0;
}

Compile with: -O1 -g -flto -fvisibility=hidden -fsanitize=cfi -fno-sanitize-trap=cfi -fno-stack-protector -no-pie -fno-pie -Wall -Wno-unused

Bypassing CFI with DOP

Data-oriented programming (DOP) is a technique based entirely on data manipulation. Thus, instead of directly hijacking the flow, you manipulate the data that controls that flow itself.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
#include <unistd.h>

static char secret_flag[64];

void print_flag_fnptr(uint64_t key)
{
    if (key == 0xD09D09D09D09D09DUL)
        printf(" FLAG: %s\n", secret_flag);
    else
        puts("[print_flag_fnptr] reached, but this should never run under CFI.");
    fflush(stdout);
    _exit(0);
}

typedef void (*renderer_t)(const char *label);

static void render_plain(const char *label) { printf("[vm] label = \"%s\"\n", label); }
static void render_loud (const char *label) { printf("[VM] LABEL = \"%s\"!!\n", label); }

static renderer_t renderers[2] = { render_plain, render_loud };

struct vm {
    char       label[32];
    uint64_t  *ptr;
    renderer_t render;
    uint64_t   acc;
    uint64_t   cells[8];
};

static void op_name(struct vm *vm)
{
    printf("[name] send raw bytes for the label (overflows past 32):\n> ");
    fflush(stdout);
    read(0, vm->label, 256);          /* 256 bytes into a 32-byte buffer */
}

static void op_load(struct vm *vm, long i)
{
    if (i < 0 || i > 7) { puts("[load] cell out of range (0..7)"); return; }
    vm->ptr = &vm->cells[i];
    printf("[load] ptr -> cells[%ld]\n", i);
}

static void op_peek(struct vm *vm)
{
    printf("[peek] *ptr = 0x%016lx\n", *vm->ptr);
}

static void op_emit(struct vm *vm)
{
    uint64_t word = *vm->ptr;
    fwrite(&word, 1, sizeof word, stdout);
    fflush(stdout);
}

static void op_next(struct vm *vm)
{
    vm->ptr++;
}

static void op_render(struct vm *vm)
{
    puts("[render] calling vm.render(label) -- this is an indirect call:");
    fflush(stdout);
    vm->render(vm->label);            /* <-- clang CFI checks target type */
}

static void print_help(void)
{
    puts("opcodes:");
    puts("  name        read raw bytes into the label buffer");
    puts("  load <i>    point ptr at cells[i]   (i = 0..7)");
    puts("  peek        print *ptr as a 64-bit hex word");
    puts("  emit        write the 8 bytes at *ptr to stdout");
    puts("  next        advance ptr by one 64-bit word (ptr++)");
    puts("  render      render the label via vm.render(label)");
    puts("  help        show this list again");
    puts("  quit        leave the VM");
}

static void vm_run(void)
{
    struct vm vm;
    memset(&vm, 0, sizeof vm);
    vm.ptr    = &vm.cells[0];         /* safe default: points inside cells */
    vm.render = renderers[0];         /* safe default: a type-correct cb   */

    char line[64];
    for (;;) {
        printf("\nvm> ");
        fflush(stdout);
        if (!fgets(line, sizeof line, stdin)) break;

        if (!strncmp(line, "name", 4))        op_name(&vm);
        else if (!strncmp(line, "load", 4))   op_load(&vm, strtol(line + 4, NULL, 0));
        else if (!strncmp(line, "peek", 4))   op_peek(&vm);
        else if (!strncmp(line, "emit", 4))   op_emit(&vm);
        else if (!strncmp(line, "next", 4))   op_next(&vm);
        else if (!strncmp(line, "render", 6)) op_render(&vm);
        else if (!strncmp(line, "help", 4))   print_help();
        else if (!strncmp(line, "quit", 4))   break;
        else { puts("unknown opcode."); print_help(); }
    }
}

int main(void)
{
    setvbuf(stdout, NULL, _IONBF, 0);

    FILE *f = fopen("flag.txt", "r");
    if (!f)
        return 1;
    fgets(secret_flag, sizeof secret_flag, f);
    fclose(f);

    printf("[i] secret_flag is at %p\n", (void *)secret_flag);
    printf("[i] (ignore me) print_flag_fnptr is at %p\n", (void *)print_flag_fnptr);

    puts("\n--- tiny config VM ---");
    puts("Intended use: 'load <i>' to point at a cell, then 'peek'/'emit'.");
    puts("The VM should only ever touch its own cells[8]...\n");
    print_help();

    vm_run();
    return 0;
}

Compile with: -O1 -g -flto -fvisibility=hidden -fsanitize=cfi -fno-sanitize-trap=cfi -fno-stack-protector -no-pie -fno-pie -Wall -Wno-unused

Bypassing CFI with COOP

Counterfeit Object-Oriented Programming (COOP) is a technique based on forging fake objects that carry existing virtual pointers.
So instead of reusing functions directly, you reuse virtual tables (via virtual pointers).

#include <cstdio>
#include <cstdint>
#include <unistd.h>

static long g_latch = 0;

struct Greeter {
    virtual void hello() { puts("[Greeter] hi"); }
};
struct Polite : Greeter {                // the normal, expected subclass
    void hello() override { puts("[Polite] nice to meet you!"); }
};
struct Unlock : Greeter {
    void hello() override { g_latch = 0xC0FFEE; puts("[Unlock] *click* latch open"); }
};
struct Reveal : Greeter {
    void hello() override {
        if (g_latch != 0xC0FFEE) { puts("[Reveal] still locked"); return; }
        char flag[64]; FILE *f = fopen("flag.txt", "r");
        if (f && fgets(flag, sizeof flag, f))
        {
            printf(" FLAG: %s", flag);
            fclose(f);
        }
    }
};

alignas(16) static unsigned char g_pool[256];

int main() {
    setvbuf(stdout, nullptr, _IONBF, 0);

    Unlock u_sample; Reveal r_sample;
    printf("[i] g_pool       = %p   (forge your fake objects here)\n", (void*)g_pool);
    printf("[i] Unlock vtable= %p\n", *(void**)&u_sample);
    printf("[i] Reveal vtable= %p\n", *(void**)&r_sample);
    printf("[i] each fake object is 8 bytes: just a vtable pointer.\n\n");

    unsigned char count = 0;
    printf("How many widgets to render? ");
    if (read(0, &count, 1) != 1) return 0;
    if (count > 32) count = 32;

    printf("Send %u fake objects (%u bytes):\n", count, count * 8);
    read(0, g_pool, (size_t)count * 8);

    for (unsigned i = 0; i < count; i++) {
        Greeter *obj = reinterpret_cast<Greeter *>(g_pool + i * 8);
        printf("[render %u] vptr=%p -> ", i, *(void**)obj);
        fflush(stdout);
        obj->hello();
    }
    return 0;
}

Compile with: -std=c++17 -O1 -g -flto -fvisibility=hidden -fsanitize=cfi -fno-sanitize-trap=cfi -fno-stack-protector -no-pie -fno-pie -Wall -Wno-unused

Bypassing CFI with CHOP

Catch Handler Oriented Programming (CHOP) abuses the C++ exception path, which clang CFI does not instrument.
When you throw, the compiler emits a call to the runtime:
__cxa_throw(object, type_info*, destructor)
The C++ runtime (libstdc++/libgcc unwinder) then walks the stack and picks which catch block runs by MATCHING the thrown type_info* against each handler’s type_info*. That handler selection happens inside the runtime: there is no CFI-checked indirect call at the throw site, and the chosen catch block is entered via the personality routine, not a call/ret that CFI watches.

So if an attacker controls the type_info* passed to __cxa_throw, they choose which catch handler runs.

#include <cstdio>
#include <cstring>
#include <cstddef>
#include <typeinfo>
#include <unistd.h>

static long g_latch = 0;

struct BenignError {};
struct UnlockError {};
struct RevealError {};

extern "C" {
    void* __cxa_allocate_exception(unsigned long);
    void  __cxa_throw(void*, std::type_info*, void(*)(void*));
}

struct Request {
    char             buf[24];
    std::type_info  *error_ti;
};

static void process(Request *r) {
    if (r->buf[0] == 'O' && r->buf[1] == 'K') { puts("  [process] request OK"); return; }
    puts("  [process] invalid -> throwing error (runtime selects the catch)...");
    void *exc = __cxa_allocate_exception(8);
    __cxa_throw(exc, r->error_ti, nullptr);
}

int main() {
    setvbuf(stdout, nullptr, _IONBF, 0);

    printf("[i] ti(BenignError) = %p\n", (void*)&typeid(BenignError));
    printf("[i] ti(UnlockError) = %p   <- catch flips the latch\n", (void*)&typeid(UnlockError));
    printf("[i] ti(RevealError) = %p   <- catch prints the flag\n", (void*)&typeid(RevealError));
    printf("[i] sizeof(Request)=%zu, error_ti at offset %zu (right after buf[24]).\n\n",
           sizeof(Request), offsetof(Request, error_ti));

    unsigned char n = 0;
    printf("How many requests? ");
    if (read(0, &n, 1) != 1) return 0;
    if (n > 16) n = 16;

    Request q[16];
    printf("Send %u request records (%zu bytes each: buf[24] + 8-byte error_ti):\n",
           n, sizeof(Request));
    for (unsigned i = 0; i < n; i++)
        read(0, &q[i], sizeof(Request));

    for (unsigned i = 0; i < n; i++) {
        printf("[dispatch %u]\n", i);
        try {
            process(&q[i]);
        }
        catch (BenignError&) {
            puts("  [catch BenignError] logged. nothing to see here.");
        }
        catch (UnlockError&) {
            g_latch = 0xC0FFEE;
            puts("  [catch UnlockError] *click* latch open");
        }
        catch (RevealError&) {
            if (g_latch == 0xC0FFEE) {
                char flag[64]; FILE *f = fopen("flag.txt", "r");
                if (f && fgets(flag, sizeof flag, f)) printf("  FLAG: %s", flag);
                else puts("  FLAG: ctf{ch0p_picks_the_catch_handler_past_cfi}");
                if (f) fclose(f);
            } else {
                puts("  [catch RevealError] vault still sealed");
            }
        }
    }
    return 0;
}

Compile with: -std=c++17 -O1 -g -flto -fvisibility=hidden -fsanitize=cfi -fno-sanitize-trap=cfi -fno-stack-protector -no-pie -fno-pie -Wall -Wno-unused -Wno-invalid-offsetof

Bypassing CFI with CFOP

Coroutine Frame-Oriented Programming (CFOP) abuses how C++20 coroutines are resumed, which clang CFI does not guard.

When you create a coroutine, the compiler heap-allocates a “coroutine frame”. The FIRST TWO POINTERS of that frame are function pointers:
frame[0] = resume function
frame[1] = destroy function
handle.resume() lowers to a compiler intrinsic that loads frame[0] and calls it. That dispatch is NOT a normal typed indirect call, so clang’s cfi-icall does not instrument it.

So if an attacker controls a coroutine frame (or forges a fake one and points a handle at it), handle.resume() calls whatever sits in slot 0. Forging several fake frames and resuming them through the scheduler loop chains “frame gadgets”.

#include <cstdio>
#include <cstdint>
#include <coroutine>
#include <unistd.h>

static long g_latch = 0;

static void unlock_gadget(void*) {
    g_latch = 0xC0FFEE;
    puts("  [unlock] *click* latch open");
}
static void reveal_gadget(void*) {
    if (g_latch == 0xC0FFEE) {
        char flag[64]; FILE *f = fopen("flag.txt", "r");
        if (f) {
            fgets(flag, sizeof flag, f);
            printf("  FLAG: %s", flag);
            fclose(f);
        }
    } else {
        puts("  [reveal] vault still sealed");
    }
}

struct Job {
    struct promise_type {
        Job get_return_object() {
            return Job{std::coroutine_handle<promise_type>::from_promise(*this)};
        }
        std::suspend_always initial_suspend() noexcept { return {}; }
        std::suspend_always final_suspend()   noexcept { return {}; }
        void return_void() {}
        void unhandled_exception() {}
    };
    std::coroutine_handle<promise_type> h;
};

static Job real_job() {                   // a normal, benign coroutine
    puts("  [job] doing legitimate work...");
    co_await std::suspend_always{};
    puts("  [job] ...resumed and finished");
}

static void scheduler_resume(std::coroutine_handle<> h) {
    asm volatile("" :: "r"(&h) : "memory");
    h.resume();                           // <-- loads frame[0], calls it (no CFI)
}

alignas(16) static uint64_t g_pool[256];

int main() {
    setvbuf(stdout, nullptr, _IONBF, 0);

    Job j = real_job();
    uint64_t *frame = (uint64_t*)j.h.address();
    printf("[i] a real coroutine frame @ %p\n", (void*)frame);
    printf("[i]   frame[0] (resume ptr)  = %p\n", (void*)frame[0]);
    printf("[i]   frame[1] (destroy ptr) = %p\n", (void*)frame[1]);
    printf("[i] g_pool         = %p   (forge fake frames here)\n", (void*)g_pool);
    printf("[i] unlock_gadget  = %p\n", (void*)unlock_gadget);
    printf("[i] reveal_gadget  = %p\n", (void*)reveal_gadget);
    printf("[i] a forged frame is 16 bytes: [ resume ptr ][ destroy ptr ].\n\n");
    j.h.destroy();

    unsigned char n = 0;
    printf("How many jobs to schedule? ");
    if (read(0, &n, 1) != 1) return 0;
    if (n > 16) n = 16;

    printf("Send %u job frames (%u bytes: 16 each):\n", n, n * 16u);
    read(0, g_pool, (size_t)n * 16);

    for (unsigned i = 0; i < n; i++) {
        void *frame_addr = (void*)(g_pool + i * 2);     // 2 * 8 bytes = 16
        printf("[schedule %u] frame=%p resume_ptr=%p ->\n",
               i, frame_addr, (void*)g_pool[i * 2]);
        fflush(stdout);
        auto handle = std::coroutine_handle<>::from_address(frame_addr);
        scheduler_resume(handle);                       // <-- CFOP fires here
    }
    return 0;
}

Compile with: -std=c++20 -O1 -g -flto -fvisibility=hidden -fsanitize=cfi -fno-sanitize-trap=cfi -fno-stack-protector -no-pie -fno-pie -Wall -Wno-unused

Real cases of CFI bypass

References

Original post by Magnus, from the 0x00sec forum.
