Binary Security

Binary security studies vulnerabilities arising from low-level memory management in compiled programs, along with exploitation techniques and modern hardware/software mitigations. These vulnerabilities are primarily found in C and C++ programs; safe Rust rules out most of them through compile-time ownership checks and runtime bounds checks.

Buffer Overflows

Stack Buffer Overflow

Writing past the end of a stack-allocated buffer overwrites adjacent data, including the saved return address.

Memory layout (stack grows down):

High addresses
┌─────────────────┐
│ Return address   │ ← Overwritten to redirect control flow
├─────────────────┤
│ Saved EBP/RBP   │ ← Frame pointer
├─────────────────┤
│ Local variables  │
├─────────────────┤
│ char buf[64]     │ ← Buffer starts here, overflow goes UP
└─────────────────┘
Low addresses

// Classic vulnerable function
void vulnerable(char *input) {
    char buffer[64];
    strcpy(buffer, input);  // No bounds checking: overflows if strlen(input) >= 64 (the NUL terminator needs a byte too)
}

Exploitation: Attacker supplies input longer than the buffer, overwriting the return address with the address of injected shellcode or a ROP gadget chain.
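
For contrast with the vulnerable function above, here is a minimal sketch of a bounds-checked copy. The names copy_bounded and not_vulnerable are hypothetical, and rejecting over-long input (rather than truncating it) is an assumed policy, not the only reasonable one.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into dst only if it fits, including the
 * terminating NUL. Returns 0 on success, -1 if src is too long. */
int copy_bounded(char *dst, size_t dst_size, const char *src) {
    size_t len = strlen(src);
    if (dst_size == 0 || len >= dst_size) {
        return -1;              /* reject instead of truncating or overflowing */
    }
    memcpy(dst, src, len + 1);  /* len + 1 copies the NUL terminator too */
    return 0;
}

/* Bounds-checked counterpart of vulnerable(): same 64-byte buffer,
 * but the copy is refused when the input does not fit. */
int not_vulnerable(const char *input) {
    char buffer[64];
    return copy_bounded(buffer, sizeof buffer, input);
}
```

With this check, a 63-character input is accepted and a 64-character input is rejected, because the terminator would be the 65th byte.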

Heap Buffer Overflow

Overflowing a heap-allocated buffer corrupts adjacent heap metadata or other heap objects.

Heap layout:
┌──────────┬──────────┬──────────┬──────────┐
│ Chunk A  │ Metadata │ Chunk B  │ Metadata │
│ (data)   │ (size,   │ (data)   │ (size,   │
│          │  fd, bk) │          │  fd, bk) │
└──────────┴──────────┴──────────┴──────────┘
           ↑ Corrupting metadata enables write-what-where

Exploitation techniques: Corrupt heap metadata to gain arbitrary write primitives during free() or malloc() operations. Overwrite function pointers, vtable pointers, or GOT entries.

Format String Vulnerabilities

Passing user-controlled input as a format string to printf-family functions allows reading and writing arbitrary memory.

// Vulnerable
printf(user_input);         // If user_input = "%x %x %x" → leaks stack values
                            // If user_input = "%n" → writes to memory

// Safe
printf("%s", user_input);   // Format string is constant

%n writes the number of bytes printed so far through the pointer taken from the next argument slot (an int *). If the attacker can place an address of their choosing in that slot, this becomes an arbitrary memory write.
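
The safe pattern can be checked directly: with a constant format string, conversion specifiers inside the untrusted text are copied as plain characters. A small sketch, where format_untrusted and the "user said: " prefix are illustrative assumptions:

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: formats untrusted text into out using a constant
 * format string, so any '%' in user_input is treated as data.
 * Returns what snprintf returns: the length the full output would have. */
int format_untrusted(char *out, size_t out_size, const char *user_input) {
    return snprintf(out, out_size, "user said: %s", user_input);
}
```

Passing the string "%x %n" as user_input produces the literal text "user said: %x %n"; nothing is read from or written to the stack on the input's behalf.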

Use-After-Free

Accessing memory after it has been freed. If the freed memory is reallocated to a different object, the dangling pointer now references attacker-controlled data.

struct User {
    void (*handler)(void);  // Function pointer
    char name[32];
};

struct User *user = malloc(sizeof(struct User));
user->handler = safe_handler;
free(user);                 // user is freed

// Attacker causes allocation of same-size object
struct Evil *evil = malloc(sizeof(struct Evil));  // May reuse user's memory
evil->fake_handler = malicious_code;

user->handler();            // Calls attacker's function through dangling pointer

Rust prevents this at compile time: the borrow checker ensures references cannot outlive the data they point to.

// This does not compile: the borrow checker rejects the dangling reference
fn dangling<'a>() -> &'a String {
    let s = String::from("hello");
    &s  // s is dropped at the end of the function, so the reference would dangle
}
// error[E0515]: cannot return reference to local variable `s`
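
C has no borrow checker, so a common defensive convention is to clear a pointer at the moment it is freed and to check it before use. The sketch below assumes hypothetical helpers user_destroy and user_invoke, plus a counting safe_handler for demonstration. It only protects the one pointer that is passed in; any other copies of the pointer still dangle, which is why this is a convention and not a guarantee.

```c
#include <stdlib.h>

struct User {
    void (*handler)(void);  /* function pointer, as in the example above */
    char name[32];
};

/* Demonstration handler: counts how often it has been called. */
int handler_calls = 0;
void safe_handler(void) { handler_calls++; }

/* Hypothetical helper: frees *user_ptr and clears the caller's pointer, so a
 * later use is caught as NULL instead of silently touching reused memory. */
void user_destroy(struct User **user_ptr) {
    if (user_ptr != NULL && *user_ptr != NULL) {
        free(*user_ptr);
        *user_ptr = NULL;
    }
}

/* Calls the handler only if the object is still live. Returns 0 on success,
 * -1 if the pointer has been cleared or no handler is set. */
int user_invoke(const struct User *user) {
    if (user == NULL || user->handler == NULL) {
        return -1;
    }
    user->handler();
    return 0;
}
```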

Integer Overflow

Arithmetic operations that exceed the type's range wrap around or produce unexpected values.

// Vulnerable allocation
size_t count = user_controlled;     // e.g., 0x40000001
size_t size = count * sizeof(int);  // Wraps to 0x4 when size_t is 32 bits
int *buf = malloc(size);            // Allocates 4 bytes
for (size_t i = 0; i < count; i++)
    buf[i] = data[i];               // Massive heap overflow
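
In C, the multiplication can be guarded with a division check before allocating. A minimal sketch, assuming a hypothetical helper named alloc_int_array:

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocates room for count ints, or returns NULL if
 * count * sizeof(int) would not fit in size_t. */
int *alloc_int_array(size_t count) {
    if (count > SIZE_MAX / sizeof(int)) {
        return NULL;                     /* the multiplication would wrap */
    }
    return malloc(count * sizeof(int));  /* safe: the product fits in size_t */
}
```

The check uses division so that the test itself cannot overflow. Calling calloc(count, sizeof(int)) is another option, since conforming implementations are expected to fail rather than return an undersized block when the product does not fit.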

By default, Rust panics on integer overflow in debug builds and wraps in release builds. Use explicit wrapping/checked/saturating operations to make the intended behavior independent of build mode:

let a: u32 = 4_000_000_000;
let b: u32 = 1_000_000_000;

// These are explicit about overflow behavior
let wrapped = a.wrapping_add(b);      // Wraps: 705_032_704
let checked = a.checked_add(b);       // None (overflow detected)
let saturated = a.saturating_add(b);  // u32::MAX

Return-Oriented Programming (ROP)

When the stack is non-executable (DEP/NX), attackers chain short instruction sequences ("gadgets") already present in the binary or libraries.

Gadget: A short instruction sequence ending in ret

gadget1: pop rdi; ret        ← Load argument into register
gadget2: pop rsi; ret        ← Load second argument
gadget3: call execve          ← Final step: enters libc's execve, which issues the system call

Stack layout (crafted by attacker):
┌───────────────────┐
│ addr of gadget1   │ → pop rdi; ret
│ "/bin/sh" address │ → loaded into rdi
│ addr of gadget2   │ → pop rsi; ret
│ 0x0               │ → loaded into rsi (argv = NULL)
│ addr of gadget3   │ → execve("/bin/sh", NULL, ...)
└───────────────────┘

ROP chain: Each ret pops the next address from the stack, so control passes from one gadget to the next. Given a sufficiently rich set of gadgets, the technique is Turing-complete.
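
The chaining mechanism can be illustrated with a toy model that executes nothing real. In the sketch below, gadget addresses are replaced by small integer identifiers and the crafted stack is an ordinary array; the struct Machine, the identifiers, and run_chain are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical gadget identifiers standing in for gadget addresses. */
enum { GADGET_POP_RDI = 1, GADGET_POP_RSI = 2, GADGET_CALL = 3 };

/* Toy machine state; field names mirror the registers in the diagram. */
struct Machine {
    uint64_t rdi;
    uint64_t rsi;
    int called;   /* set to 1 when the final gadget runs */
};

/* Walks the simulated stack. Each loop iteration plays the role of one ret:
 * it pops the next gadget id, and pop gadgets consume the following word.
 * Returns 0 if the chain reaches GADGET_CALL, -1 if it is malformed. */
int run_chain(struct Machine *m, const uint64_t *stack, size_t words) {
    size_t sp = 0;  /* simulated stack pointer, as an index */
    while (sp < words) {
        uint64_t gadget = stack[sp++];
        switch (gadget) {
        case GADGET_POP_RDI:
            if (sp == words) return -1;
            m->rdi = stack[sp++];
            break;
        case GADGET_POP_RSI:
            if (sp == words) return -1;
            m->rsi = stack[sp++];
            break;
        case GADGET_CALL:
            m->called = 1;
            return 0;
        default:
            return -1;
        }
    }
    return -1;
}
```

Feeding it the five-word layout from the diagram (pop rdi, a string address, pop rsi, 0, call) leaves rdi holding the address and rsi holding 0 when the final gadget runs, which is the register state the attacker wanted.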

JOP (Jump-Oriented Programming): Uses gadgets that end in indirect jumps instead of returns, so it sidesteps defenses that only protect return addresses.

Mitigations

ASLR (Address Space Layout Randomization)

Randomizes the base addresses of the stack, heap, libraries, and executable at each program start.

Effectiveness: Attackers cannot predict addresses for ROP gadgets or shellcode. Requires an information leak to bypass — reading a pointer reveals the randomization offset, un-randomizing all addresses in that region.

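Entropy: Varies by operating system and memory region. 64-bit systems typically provide around 28 bits or more for mapped regions. 32-bit systems provide far less, roughly 8 to 16 bits depending on configuration, which makes brute force practical.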
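
The information-leak bypass is plain arithmetic: offsets inside a module are fixed at link time, so one leaked pointer determines every other address in that module. A sketch with hypothetical function names; the addresses and offsets in the usage note are made up.

```c
#include <stdint.h>

/* Recovers a module's randomized base from one leaked pointer, given that
 * pointer's fixed offset inside the module (known from the file on disk). */
uint64_t module_base(uint64_t leaked_addr, uint64_t leaked_offset) {
    return leaked_addr - leaked_offset;
}

/* Computes the runtime address of any other symbol in the same module. */
uint64_t resolve(uint64_t base, uint64_t target_offset) {
    return base + target_offset;
}
```

For example, if a leaked pointer 0x7f000024f550 is known to sit at offset 0x4f550, the base is 0x7f0000200000, and a gadget at offset 0x1b3e9a is at 0x7f00003b3e9a.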

DEP/NX (Data Execution Prevention / No-eXecute)

Marks memory pages as either writable or executable, never both (W^X policy).

Stack: Marked non-executable. Injected shellcode on the stack cannot run.

Bypass: ROP — reuses existing executable code rather than injecting new code.

Stack Canaries

A random value placed between local variables and the saved return address. Checked before the function returns.

Stack with canary:
┌─────────────────┐
│ Return address   │
├─────────────────┤
│ Saved EBP/RBP   │
├─────────────────┤
│ CANARY VALUE     │ ← Random; checked before ret
├─────────────────┤
│ Local variables  │
├─────────────────┤
│ char buf[64]     │ ← Overflow must overwrite canary to reach return addr
└─────────────────┘

Bypass: Information leak to read the canary value, or overwrite a function pointer that is called before the canary check.
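
The check itself can be modeled in ordinary C. The sketch below simulates a frame as one byte array so the overflow stays within defined behavior; it is a simplification (the saved frame pointer is omitted, and real canaries are inserted by the compiler), and all names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { BUF_SIZE = 64, CANARY_SIZE = 8, RET_SIZE = 8,
       FRAME_SIZE = BUF_SIZE + CANARY_SIZE + RET_SIZE };

/* Simulated frame: [ buf (64) | canary (8) | return address (8) ]. */
struct Frame {
    unsigned char bytes[FRAME_SIZE];
};

/* Prologue: zero the buffer and place the canary below the return address. */
void frame_init(struct Frame *f, uint64_t canary, uint64_t ret_addr) {
    memset(f->bytes, 0, BUF_SIZE);
    memcpy(f->bytes + BUF_SIZE, &canary, CANARY_SIZE);
    memcpy(f->bytes + BUF_SIZE + CANARY_SIZE, &ret_addr, RET_SIZE);
}

/* Copy into buf with no regard for BUF_SIZE, like strcpy would. len is
 * clamped to the frame so the simulation itself stays in bounds. */
void frame_write(struct Frame *f, const unsigned char *src, size_t len) {
    if (len > FRAME_SIZE) {
        len = FRAME_SIZE;
    }
    memcpy(f->bytes, src, len);
}

/* Epilogue check: returns 1 if the canary is intact, 0 if it was clobbered. */
int frame_canary_ok(const struct Frame *f, uint64_t canary) {
    return memcmp(f->bytes + BUF_SIZE, &canary, CANARY_SIZE) == 0;
}
```

A 32-byte write leaves the canary intact; a write that reaches the return address necessarily passes through the canary first, so the epilogue check fails before the corrupted address is used.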

Control-Flow Integrity (CFI)

Restricts indirect branches (calls, jumps, returns) to a set of valid targets determined at compile time.

Forward-edge CFI: Validates indirect call/jump targets. A virtual function call can only reach functions with a matching signature.

Backward-edge CFI: Validates return addresses (see Shadow Stacks below).

Implementations: Clang CFI, Microsoft Control Flow Guard (CFG), Intel CET (hardware).
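
The idea behind forward-edge checks can be sketched by hand as an allow-list consulted before an indirect call. This is only a model: the implementations listed above insert their checks automatically and decide the valid set differently (for example by function type). The handler names and checked_call are hypothetical.

```c
#include <stddef.h>

typedef int (*handler_fn)(int);

int handler_double(int x) { return x * 2; }
int handler_negate(int x) { return -x; }
int not_a_handler(int x) { return x + 1000; }  /* same type, not registered */

/* The set of valid indirect-call targets, fixed at build time. */
static const handler_fn valid_targets[] = { handler_double, handler_negate };

/* Calls target only if it is in the valid set. Returns 0 and stores the
 * result on success; returns -1 without calling anything otherwise. */
int checked_call(handler_fn target, int arg, int *result) {
    size_t count = sizeof valid_targets / sizeof valid_targets[0];
    for (size_t i = 0; i < count; i++) {
        if (target == valid_targets[i]) {
            *result = target(arg);
            return 0;
        }
    }
    return -1;
}
```

An attacker who corrupts the function pointer can still choose between handler_double and handler_negate, which is the "targets within valid set" limitation of CFI.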

Shadow Stacks

A separate, protected stack that stores only return addresses. On function return, the return address from the main stack is compared against the shadow stack.

Main Stack          Shadow Stack
┌──────────┐       ┌──────────┐
│ ret addr │ ←──── │ ret addr │  Must match
│ locals   │       └──────────┘
│ buf[64]  │
└──────────┘

Hardware support: Intel CET (Control-flow Enforcement Technology) implements shadow stacks in hardware. ARM has PAC (Pointer Authentication Codes) as an alternative — return addresses are cryptographically signed.
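
The compare-on-return logic can be modeled in software as follows. This is a sketch of the concept only; hardware implementations keep the shadow stack in memory that ordinary stores cannot modify, which a plain C struct does not provide. The names and the fixed depth are assumptions.

```c
#include <stddef.h>
#include <stdint.h>

enum { SHADOW_DEPTH = 64 };

struct ShadowStack {
    uint64_t entries[SHADOW_DEPTH];
    size_t top;   /* number of return addresses currently stored */
};

/* On call: record the return address. Returns -1 if the shadow stack is full. */
int shadow_push(struct ShadowStack *s, uint64_t ret_addr) {
    if (s->top == SHADOW_DEPTH) {
        return -1;
    }
    s->entries[s->top++] = ret_addr;
    return 0;
}

/* On return: pop the recorded address and compare it with the one found on
 * the main stack. Returns 0 on match, -1 on mismatch or underflow. */
int shadow_check_return(struct ShadowStack *s, uint64_t main_stack_ret) {
    if (s->top == 0) {
        return -1;
    }
    uint64_t expected = s->entries[--s->top];
    return expected == main_stack_ret ? 0 : -1;
}
```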

Mitigation Summary

Mitigation	Prevents	Bypassed By
DEP/NX	Code injection	ROP/JOP
ASLR	Predictable addresses	Information leaks
Stack canaries	Sequential stack overflow	Info leak, non-sequential overwrite
CFI	Invalid control-flow transfers	Targets within valid set
Shadow stacks	Return address overwrite	Shadow stack corruption, forward-edge attacks
W^X + CFI + ASLR	Most single-bug exploits	Chains of several bugs (significantly harder)

Reverse Engineering Basics

Reverse engineering analyzes compiled binaries to understand their behavior without source code.

Tools

Disassemblers: Convert machine code to assembly. IDA Pro, Ghidra (free, developed by the NSA), Binary Ninja.
Decompilers: Reconstruct higher-level code from assembly. Ghidra, Hex-Rays (IDA plugin).
Debuggers: Step through execution. GDB (+ GEF/pwndbg), WinDbg, LLDB.
Dynamic analysis: ltrace (library calls), strace (system calls), Frida (instrumentation).

Common Analysis Workflow

Static analysis: Load binary in Ghidra/IDA. Identify main, interesting functions, strings.
Identify protections: checksec reports NX, stack canaries, PIE, and RELRO for the binary. ASLR is a system-wide setting and is checked separately.
Dynamic analysis: Run under a debugger. Set breakpoints at key functions. Observe behavior.
Vulnerability identification: Look for unsafe functions (strcpy, sprintf, gets), unchecked lengths, integer arithmetic on sizes.

# Check binary protections
checksec --file=./target_binary
# RELRO:    Full RELRO
# Stack:    Canary found
# NX:       NX enabled
# PIE:      PIE enabled

# ASLR is a kernel setting, not a property of the binary (Linux)
cat /proc/sys/kernel/randomize_va_space
# 0 = off, 1 = stack/mmap/VDSO randomized, 2 = heap (brk) randomized as well

Binary Formats

ELF (Executable and Linkable Format): Linux/Unix. Headers: ELF header, program headers (segments for loading), section headers (.text, .data, .bss, .plt, .got).

PE (Portable Executable): Windows. Headers: DOS header, PE header, section table (.text, .rdata, .data, .rsrc).

Key sections:

.text — Executable code
.data — Initialized global variables
.bss — Uninitialized global variables
.plt/.got — Procedure linkage / global offset table (dynamic linking)
.rodata — Read-only data (strings, constants)
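
Both formats can be recognized from their first bytes, which is usually the first thing an analysis tool checks. The sketch below is a minimal identifier, not a parser: it looks only at the ELF magic and class byte, and at the DOS "MZ" header plus the "PE\0\0" signature it points to. The enum and function names are hypothetical.

```c
#include <stddef.h>

enum BinaryFormat { FORMAT_UNKNOWN, FORMAT_ELF32, FORMAT_ELF64, FORMAT_PE };

enum BinaryFormat identify_format(const unsigned char *data, size_t len) {
    /* ELF: magic 0x7f 'E' 'L' 'F', then the class byte at offset 4
     * (1 = 32-bit, 2 = 64-bit). */
    if (len >= 5 && data[0] == 0x7f && data[1] == 'E' &&
        data[2] == 'L' && data[3] == 'F') {
        if (data[4] == 1) return FORMAT_ELF32;
        if (data[4] == 2) return FORMAT_ELF64;
        return FORMAT_UNKNOWN;
    }
    /* PE: the DOS header starts with 'M' 'Z'; the little-endian 32-bit field
     * at offset 0x3c holds the file offset of the "PE\0\0" signature. */
    if (len >= 0x40 && data[0] == 'M' && data[1] == 'Z') {
        size_t pe_off = (size_t)data[0x3c] |
                        ((size_t)data[0x3d] << 8) |
                        ((size_t)data[0x3e] << 16) |
                        ((size_t)data[0x3f] << 24);
        if (pe_off <= len - 4 &&
            data[pe_off] == 'P' && data[pe_off + 1] == 'E' &&
            data[pe_off + 2] == 0 && data[pe_off + 3] == 0) {
            return FORMAT_PE;
        }
    }
    return FORMAT_UNKNOWN;
}
```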

Modern systems deploy mitigations in layers. No single defense is sufficient, but the combination of ASLR, DEP, stack canaries, CFI, and shadow stacks raises the cost of exploitation substantially, particularly on 64-bit systems with full RELRO and PIE.