Threads
A thread is a lightweight unit of execution within a process. Threads within the same process share the same address space but have independent execution stacks and register state.
Thread vs Process
| Aspect | Process | Thread |
|---|---|---|
| Address space | Own (isolated) | Shared with other threads |
| Creation cost | Heavy (fork, COW pages) | Light (just stack + registers) |
| Context switch | Expensive (TLB flush) | Cheap (no address space switch) |
| Communication | IPC (pipes, sockets, shm) | Shared memory (direct) |
| Failure isolation | Crash doesn't affect others | Crash kills entire process |
| Memory overhead | Full address space | Just stack (~2-8 MB default) |
What Threads Share
- Code (text segment)
- Global data, heap
- Open file descriptors
- Signal handlers
- Current working directory
- User/group ID
What Threads Have Privately
- Thread ID
- Program counter and registers
- Stack (each thread has its own)
- Signal mask
- errno (on many systems)
- Thread-local storage (TLS)
Threading Models
User-Level Threads (M:1)
Many user threads mapped to one kernel thread. The thread library manages threads entirely in user space.
User space: [T1] [T2] [T3] [T4]
|
Kernel: [KT1]
Advantages: Fast creation and switching (no syscalls). Customizable scheduling. Disadvantages: One blocking syscall blocks ALL threads. Can't use multiple cores (kernel sees one thread).
Examples: Early green threads, cooperative multitasking libraries.
Kernel-Level Threads (1:1)
Each user thread maps to one kernel thread. The kernel manages all threads.
User space: [T1] [T2] [T3] [T4]
| | | |
Kernel: [KT1][KT2][KT3][KT4]
Advantages: True parallelism on multi-core. Blocking one thread doesn't block others. Disadvantages: Thread creation requires syscall. Context switch goes through kernel.
Examples: Linux (NPTL), Windows, macOS. This is the standard model today.
Linux implements threads as lightweight processes using clone() with shared address space.
Hybrid (M:N)
M user threads mapped to N kernel threads (M > N typically).
User space: [T1] [T2] [T3] [T4] [T5] [T6]
| | | | | |
Kernel: [KT1] [KT2] [KT3]
Advantages: Best of both worlds. More user threads than kernel threads. Flexible scheduling. Disadvantages: Complex implementation. Tricky synchronization between user and kernel schedulers.
Examples: Go goroutines (M goroutines on N OS threads), Erlang (M processes on N schedulers), early Solaris, Windows UMS.
POSIX Threads (pthreads)
The standard threading API on UNIX/Linux.
Thread Lifecycle
#include <pthread.h>
#include <stdio.h>

void *worker(void *arg) {
    int id = *(int *)arg;
    printf("Thread %d running\n", id);
    return NULL;
}

int main(void) {
    pthread_t threads[4];
    int ids[4] = {0, 1, 2, 3};
    for (int i = 0; i < 4; i++) {
        pthread_create(&threads[i], NULL, worker, &ids[i]);
    }
    for (int i = 0; i < 4; i++) {
        pthread_join(threads[i], NULL); // wait for completion
    }
    return 0;
}
Rust Threads
use std::thread;

fn main() {
    let mut handles = Vec::new();
    for id in 0..4 {
        handles.push(thread::spawn(move || {
            println!("Thread {} running", id);
        }));
    }
    for handle in handles {
        handle.join().unwrap(); // wait for completion
    }
}
Rust's ownership system prevents data races at compile time.
Thread Pools
Pre-create a fixed number of threads. Queue tasks for execution by pool threads.
// Using a thread pool for parallel iteration (Rust, with the rayon crate)
use rayon::prelude::*;
let results: Vec<_> = (0..1000).into_par_iter()
    .map(|x| expensive_computation(x))
    .collect();
Advantages: Avoid thread creation overhead. Limit resource usage. Automatic load balancing.
Sizing guidelines:
- CPU-bound tasks: num_threads ≈ num_cores
- I/O-bound tasks: num_threads ≈ num_cores × (1 + wait_time / compute_time)
For example, on 8 cores with tasks that wait 90 ms on I/O for every 10 ms of compute, 8 × (1 + 90/10) = 80 threads keeps all cores busy.
Thread-Local Storage (TLS)
Variables that are private to each thread despite being "global" in code.
// Thread-local variable: each thread has its own copy
use std::cell::RefCell;

thread_local! {
    static CACHE: RefCell<Vec<&'static str>> = RefCell::new(Vec::new());
}

// Access the thread-local cache (each thread sees only its own)
CACHE.with(|cache| cache.borrow_mut().push("item"));
Use cases: Per-thread caches, errno (on some systems), random number generators, memory allocators (per-thread arenas).
Green Threads
User-space threads managed by a runtime, not the OS kernel.
Go Goroutines
go func() {
fmt.Println("Hello from goroutine")
}()
- Initial stack: ~2 KB (grows dynamically, up to 1 GB)
- M:N scheduling: goroutines multiplexed onto OS threads
- Millions of goroutines on a few OS threads
- Mostly cooperative scheduling (goroutines yield at channel operations, blocking syscalls, and function-call preemption points); since Go 1.14 the runtime can also preempt long-running goroutines asynchronously
- Work-stealing scheduler: Idle threads steal goroutines from busy threads' run queues
Fibers
Cooperative user-space threads. Must explicitly yield. Used in game engines, coroutine libraries.
Fiber A: compute → yield → compute → yield
Fiber B: compute → yield → compute
Coroutines
Generalizations of functions that can suspend and resume. Stackful (like fibers/green threads) or stackless (like Rust async).
Rust async/await: Stackless coroutines compiled to state machines. Very lightweight — each future is typically 100-200 bytes.
Threading Challenges
Data Races
Two threads access the same memory, at least one writes, without synchronization.
// Data race example: rejected at compile time by ownership rules
let mut data = vec![1, 2, 3];
std::thread::spawn(|| {
    data.push(4); // ERROR: the spawned thread may outlive the borrow of `data`
});
println!("{:?}", data); // while it's still accessed here
Rust's type system (Send + Sync traits) prevents data races at compile time. Other languages rely on programmer discipline + runtime tools (ThreadSanitizer).
Deadlock
Thread A holds lock X, waits for lock Y. Thread B holds lock Y, waits for lock X. Both wait forever.
Priority Inversion
A low-priority thread holds a lock needed by a high-priority thread. The high-priority thread is blocked by the low-priority thread.
Solutions: Priority inheritance (boost the lock-holding thread's priority), priority ceiling.
Thread Safety
A function/data structure is thread-safe if it can be safely called from multiple threads simultaneously.
Strategies: Locks, lock-free algorithms, immutability, thread-local storage, message passing.
Comparison
| Technology | Creation Cost | Memory | Parallelism | Scheduling |
|---|---|---|---|---|
| OS process | ~1 ms | ~MB+ | Yes (multi-core) | Preemptive (kernel) |
| OS thread | ~10-100 μs | ~2-8 MB stack | Yes (multi-core) | Preemptive (kernel) |
| Green thread (Go) | ~1 μs | ~2-8 KB stack | Yes (M:N) | Cooperative + preemptive |
| Async task (Rust) | ~100 ns | ~100-200 B | Yes (on thread pool) | Cooperative |
| Fiber | ~1 μs | ~KB | No (single thread) | Cooperative |
Applications in CS
- Web servers: Thread per connection (Apache), thread pool (Tomcat), async (nginx, Tokio).
- Databases: Thread per connection (PostgreSQL backend processes), thread pool (MySQL).
- GUI applications: Main thread for UI, worker threads for computation. UI must be updated from the main thread.
- Game engines: Main thread (game logic), render thread, audio thread, worker pool (tasks).
- Parallel algorithms: Parallel sort, parallel search, MapReduce workers.
- Mobile apps: Main thread (UI) + background threads (network, computation). Android: strict main-thread rules.