Threads

A thread is a lightweight unit of execution within a process. Threads within the same process share the same address space but have independent execution stacks and register state.

Thread vs Process

Aspect	Process	Thread
Address space	Own (isolated)	Shared with other threads
Creation cost	Heavy (fork, COW pages)	Light (just stack + registers)
Context switch	Expensive (TLB flush)	Cheap (no address space switch)
Communication	IPC (pipes, sockets, shm)	Shared memory (direct)
Failure isolation	Crash doesn't affect others	Crash kills entire process
Memory overhead	Full address space	Just stack (~2-8 MB default)

Code (text segment)
Global data, heap
Open file descriptors
Signal handlers
Current working directory
User/group ID

What Threads Have Privately

Thread ID
Program counter and registers
Stack (each thread has its own)
Signal mask
errno (on many systems)
Thread-local storage (TLS)

Threading Models

User-Level Threads (M:1)

Many user threads mapped to one kernel thread. The thread library manages threads entirely in user space.

User space:  [T1] [T2] [T3] [T4]
                    |
Kernel:          [KT1]

Advantages: Fast creation and switching (no syscalls). Customizable scheduling. Disadvantages: One blocking syscall blocks ALL threads. Can't use multiple cores (kernel sees one thread).

Examples: Early green threads, cooperative multitasking libraries.

Kernel-Level Threads (1:1)

Each user thread maps to one kernel thread. The kernel manages all threads.

User space:  [T1] [T2] [T3] [T4]
              |    |    |    |
Kernel:     [KT1][KT2][KT3][KT4]

Advantages: True parallelism on multi-core. Blocking one thread doesn't block others. Disadvantages: Thread creation requires syscall. Context switch goes through kernel.

Examples: Linux (NPTL), Windows, macOS. This is the standard model today.

Linux implements threads as lightweight processes using clone() with shared address space.

Hybrid (M:N)

M user threads mapped to N kernel threads (M > N typically).

User space:  [T1] [T2] [T3] [T4] [T5] [T6]
              |    |    |    |    |    |
Kernel:     [KT1]   [KT2]   [KT3]

Advantages: Best of both worlds. More user threads than kernel threads. Flexible scheduling. Disadvantages: Complex implementation. Tricky synchronization between user and kernel schedulers.

Examples: Go goroutines (M goroutines on N OS threads), Erlang (M processes on N schedulers), early Solaris, Windows UMS.

POSIX Threads (pthreads)

The standard threading API on UNIX/Linux.

Thread Lifecycle

#include <pthread.h>

void *worker(void *arg) {
    int id = *(int *)arg;
    printf("Thread %d running\n", id);
    return NULL;
}

int main() {
    pthread_t threads[4];
    int ids[4] = {0, 1, 2, 3};

    for (int i = 0; i < 4; i++) {
        pthread_create(&threads[i], NULL, worker, &ids[i]);
    }
    for (int i = 0; i < 4; i++) {
        pthread_join(threads[i], NULL);  // wait for completion
    }
}

Rust Threads

PROCEDURE MAIN()
    handles ← empty list
    FOR id ← 0 TO 3
        h ← SPAWN_THREAD(PROCEDURE()
            PRINT "Thread ", id, " running"
        )
        APPEND h TO handles

    FOR EACH handle IN handles
        JOIN(handle)

Rust's ownership system prevents data races at compile time.

Thread Pools

Pre-create a fixed number of threads. Queue tasks for execution by pool threads.

// Using a thread pool for parallel iteration
results ← PARALLEL_MAP(0..1000, x -> EXPENSIVE_COMPUTATION(x))

Advantages: Avoid thread creation overhead. Limit resource usage. Automatic load balancing.

Sizing: Typically:

CPU-bound tasks: num_threads ≈ num_cores
I/O-bound tasks: num_threads ≈ num_cores × (1 + wait_time / compute_time)

Thread-Local Storage (TLS)

Variables that are private to each thread despite being "global" in code.

// Thread-local variable: each thread has its own copy
THREAD_LOCAL CACHE ← empty list

// Access the thread-local cache
APPEND "item" TO CACHE

Use cases: Per-thread caches, errno (on some systems), random number generators, memory allocators (per-thread arenas).

Green Threads

User-space threads managed by a runtime, not the OS kernel.

Go Goroutines

go func() {
    fmt.Println("Hello from goroutine")
}()

Initial stack: ~2 KB (grows dynamically, up to 1 GB)
M:N scheduling: goroutines multiplexed onto OS threads
Millions of goroutines on a few OS threads
Cooperative scheduling (yield at function calls, channel operations, syscalls)
Work-stealing scheduler: Idle threads steal goroutines from busy threads' run queues

Fibers

Cooperative user-space threads. Must explicitly yield. Used in game engines, coroutine libraries.

Fiber A: compute → yield → compute → yield
Fiber B:           compute → yield → compute

Coroutines

Generalizations of functions that can suspend and resume. Stackful (like fibers/green threads) or stackless (like Rust async).

Rust async/await: Stackless coroutines compiled to state machines. Very lightweight — each future is typically 100-200 bytes.

Threading Challenges

Data Races

Two threads access the same memory, at least one writes, without synchronization.

// Data race example — prevented at compile time by ownership rules
data ← [1, 2, 3]
SPAWN_THREAD(PROCEDURE()
    APPEND 4 TO data      // ERROR: can't modify data from another thread
)
PRINT data                 // while it's accessed here

Rust's type system (Send + Sync traits) prevents data races at compile time. Other languages rely on programmer discipline + runtime tools (ThreadSanitizer).

Technology	Creation Cost	Memory	Parallelism	Scheduling
OS process	~1 ms	~MB+	Yes (multi-core)	Preemptive (kernel)
OS thread	~10-100 μs	~2-8 MB stack	Yes (multi-core)	Preemptive (kernel)
Green thread (Go)	~1 μs	~2-8 KB stack	Yes (M:N)	Cooperative + preemptive
Async task (Rust)	~100 ns	~100-200 B	Yes (on thread pool)	Cooperative
Fiber	~1 μs	~KB	No (single thread)	Cooperative

Applications in CS

Web servers: Thread per connection (Apache), thread pool (Tomcat), async (nginx, Tokio).
Databases: Thread per connection (PostgreSQL backend processes), thread pool (MySQL).
GUI applications: Main thread for UI, worker threads for computation. UI must be updated from the main thread.
Game engines: Main thread (game logic), render thread, audio thread, worker pool (tasks).
Parallel algorithms: Parallel sort, parallel search, MapReduce workers.
Mobile apps: Main thread (UI) + background threads (network, computation). Android: strict main-thread rules.