Processes

A process is a program in execution — the fundamental unit of work in an operating system. Each process has its own address space, execution state, and system resources.

Process Concept

Process vs Program

A program is a static file on disk (executable binary). A process is a running instance of a program with:

  • Code (text segment)
  • Data (global/static variables)
  • Heap (dynamically allocated memory)
  • Stack (function calls, local variables)
  • Execution state (registers, program counter)
  • OS resources (open files, sockets, signal handlers)

Multiple processes can run the same program simultaneously (each with its own state).

Memory Layout

High addresses
┌──────────────┐
│    Stack     │ ↓ grows downward
│     ...      │
│              │
│     ...      │
│    Heap      │ ↑ grows upward
├──────────────┤
│    BSS       │ uninitialized global data (zeroed)
├──────────────┤
│    Data      │ initialized global data
├──────────────┤
│    Text      │ program code (read-only)
└──────────────┘
Low addresses

Process States

Process state transition diagram

        ┌──────────┐
        │   New    │
        └────┬─────┘
             │ admitted
             ▼
        ┌──────────┐   dispatch   ┌──────────┐   exit   ┌────────────┐
   ┌───→│  Ready   │─────────────→│ Running  │─────────→│ Terminated │
   │    │          │←─────────────│          │          └────────────┘
   │    └──────────┘  interrupt   └────┬─────┘
   │                                   │ I/O or wait
   │ I/O complete                      ▼
   │ or event                     ┌──────────┐
   └──────────────────────────────│ Waiting  │
                                  │(Blocked) │
                                  └──────────┘

| State | Description |
|---|---|
| New | Process being created |
| Ready | Waiting for CPU assignment |
| Running | Currently executing on CPU |
| Waiting/Blocked | Waiting for I/O or event |
| Terminated | Finished execution |
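The transitions in the diagram can be written down as a small lookup table. The sketch below is illustrative only: the `State` enum, the event names, and the `transition` helper are hypothetical names chosen for this example, not part of any OS API.

```python
from enum import Enum


class State(Enum):
    NEW = "new"
    READY = "ready"
    RUNNING = "running"
    WAITING = "waiting"
    TERMINATED = "terminated"


# Legal transitions, keyed by (current state, event), taken from the diagram.
TRANSITIONS = {
    (State.NEW, "admitted"): State.READY,
    (State.READY, "dispatch"): State.RUNNING,
    (State.RUNNING, "interrupt"): State.READY,
    (State.RUNNING, "io_wait"): State.WAITING,
    (State.RUNNING, "exit"): State.TERMINATED,
    (State.WAITING, "io_complete"): State.READY,
}


def transition(state, event):
    """Return the next state, or raise ValueError for an illegal move."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"illegal transition: {event!r} from {state.name}") from None


if __name__ == "__main__":
    s = State.NEW
    for event in ["admitted", "dispatch", "io_wait", "io_complete", "dispatch", "exit"]:
        s = transition(s, event)
        print(f"{event:12} -> {s.name}")
```

Note that a waiting process never moves straight to running: it must pass through ready and be dispatched again, which is why `(State.WAITING, "dispatch")` is absent from the table.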

Process Control Block (PCB)

The OS maintains a PCB for each process containing:

| Field | Content |
|---|---|
| PID | Process identifier |
| State | Current state (ready, running, etc.) |
| Program counter | Address of next instruction |
| CPU registers | Saved register values |
| Memory info | Page table pointer, memory limits |
| Scheduling info | Priority, scheduling queue pointers |
| I/O info | Open file descriptors, I/O status |
| Accounting | CPU time used, time limits |
| Parent/children | PID of parent, list of children |

Context Switching

Switching the CPU from one process to another:

  1. Save the current process's state (registers, PC) to its PCB.
  2. Update the current process's state (running → ready/blocked).
  3. Select the next process (scheduler decision).
  4. Load the next process's state from its PCB.
  5. Switch the memory context (load the next process's page table; flush the TLB unless its entries are tagged per address space).
  6. Jump to the next process's saved PC.

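Cost: ~1-10 μs of direct overhead on modern hardware, plus indirect costs afterward. Main contributors:

  • Saving/restoring registers
  • TLB flush (especially costly: after a flush, every address translation misses and must be refilled by a page-table walk)
  • Cache pollution (the new process's data evicts the old process's cached data)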
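The save/load steps above can be mimicked in a toy model. This is a sketch under heavy simplification, not how a kernel is written: `PCB`, `CPU`, and `context_switch` are hypothetical names, registers are a plain dictionary, and the memory-context step has no counterpart here.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PCB:
    """A toy process control block with a few of the PCB fields."""
    pid: int
    state: str = "ready"
    program_counter: int = 0
    registers: dict = field(default_factory=dict)


@dataclass
class CPU:
    """A toy CPU: a program counter, registers, and the PCB now running."""
    program_counter: int = 0
    registers: dict = field(default_factory=dict)
    current: Optional[PCB] = None


def context_switch(cpu, next_pcb, new_state="ready"):
    """Save the running process into its PCB, then load next_pcb."""
    old = cpu.current
    if old is not None:
        old.program_counter = cpu.program_counter   # step 1: save state
        old.registers = dict(cpu.registers)
        old.state = new_state                       # step 2: running -> ready/blocked
    # Step 3 (the scheduling decision) is the caller's job: it passes next_pcb.
    cpu.program_counter = next_pcb.program_counter  # step 4: load state
    cpu.registers = dict(next_pcb.registers)
    # Step 5 (page table / TLB) is not modeled in this sketch.
    next_pcb.state = "running"
    cpu.current = next_pcb                          # step 6: resume at the saved PC


if __name__ == "__main__":
    cpu = CPU()
    a, b = PCB(pid=1, program_counter=100), PCB(pid=2, program_counter=200)
    context_switch(cpu, a)
    cpu.program_counter += 4                        # pretend process 1 ran a little
    context_switch(cpu, b)
    print(a, b, sep="\n")
```

Copying the register dictionary in both directions is what lets a process resume later with exactly the values it had when it was switched out.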

Process Creation

fork() (UNIX/Linux)

Creates a copy of the current process.

pid ← FORK()
IF pid = 0
    // Child process
    PRINT "Child PID: ", GETPID()
ELSE IF pid > 0
    // Parent process
    PRINT "Parent created child ", pid
    WAITPID(pid)
ELSE
    ERROR "fork failed"

After fork():

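  • Child is a near-exact copy of the parent (code, data, heap, stack, open file descriptors). It differs in its PID, its parent PID, and a few other attributes (for example, pending signals are not inherited).
  • Copy-on-write (COW): Pages are shared until one process writes to them; only then is the written page copied.
  • Child gets a new PID. fork() returns 0 to the child, the child's PID to the parent, and -1 to the parent if no child could be created.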
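A runnable counterpart to the fork() pseudocode, sketched in Python, whose `os.fork` and `os.waitpid` are thin wrappers over the POSIX calls. The function name `fork_and_wait` and the exit code are made up for illustration; the example assumes a Unix-like system.

```python
import os


def fork_and_wait(child_exit_code=7):
    """Fork a child that exits with child_exit_code; return (child_pid, status)."""
    pid = os.fork()                      # raises OSError if the fork fails
    if pid == 0:
        # Child: fork() returned 0. Leave with os._exit so the child does not
        # go on to run the rest of the parent's program.
        os._exit(child_exit_code)
    # Parent: fork() returned the child's PID.
    _, raw_status = os.waitpid(pid, 0)   # blocks until the child terminates
    return pid, os.WEXITSTATUS(raw_status)


if __name__ == "__main__":
    child_pid, status = fork_and_wait()
    print(f"parent {os.getpid()} created child {child_pid}, exit status {status}")
```

Python reports a failed fork as an `OSError` exception rather than a negative return value, so the error branch of the pseudocode becomes exception handling here.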

exec() Family

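Replaces the current process's code and data with a new program. The PID stays the same, and open file descriptors are inherited unless they are marked close-on-exec.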

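char *args[] = {"ls", NULL};  // argv for the new program; must end with NULL
execvp("ls", args);           // Replace current process with "ls" (found via PATH)
// If exec succeeds, the code below never executes
perror("execvp");             // Reached only if exec failed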

fork() + exec(): The standard pattern for creating a new process running a different program.

Shell:
1. fork() → child process
2. Child: exec("program") → replaces child's code with program
3. Parent: wait() → waits for child to finish
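The three shell steps above, sketched in Python. `run_program` is a hypothetical helper name, and the exit status 127 for a failed exec follows the shell convention rather than anything the exec() calls themselves require.

```python
import os
import sys


def run_program(argv):
    """Run argv as a child process (fork + exec) and return its exit status."""
    pid = os.fork()                       # 1. fork() -> child process
    if pid == 0:
        try:
            os.execvp(argv[0], argv)      # 2. child: replace itself with the program
        finally:
            # Only reached if exec failed; a successful exec never returns.
            os._exit(127)
    _, raw_status = os.waitpid(pid, 0)    # 3. parent: wait for the child
    # Sketch only: assumes the child exited normally rather than by a signal.
    return os.WEXITSTATUS(raw_status)


if __name__ == "__main__":
    code = run_program([sys.executable, "-c", "print('hello from the new program')"])
    print("exit status:", code)
```

Splitting creation (fork) from program loading (exec) leaves a window in the child where a shell can set up redirections or pipes before the new program starts.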

posix_spawn()

Combines fork and exec in one call. It can be more efficient because the implementation does not have to duplicate the parent's address space only to replace it immediately.
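A minimal sketch of the single-call approach, using Python's `os.posix_spawnp` wrapper (available on Unix in Python 3.8 and later). The helper name `spawn_and_wait` is hypothetical.

```python
import os
import sys


def spawn_and_wait(argv):
    """Start argv with posix_spawnp (PATH lookup) and return its exit status."""
    # One call creates the child and loads the program; no explicit fork().
    pid = os.posix_spawnp(argv[0], argv, dict(os.environ))
    _, raw_status = os.waitpid(pid, 0)
    # Sketch only: assumes the child exited normally rather than by a signal.
    return os.WEXITSTATUS(raw_status)


if __name__ == "__main__":
    print("exit status:", spawn_and_wait([sys.executable, "-c", "print('spawned')"]))
```

The parent still has to wait for the child, exactly as with fork() + exec().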

Windows: CreateProcess()

Single API call that creates a process with a new program (no fork/exec split).

Process Termination

Normal exit: Process calls exit() or returns from main().

Error exit: Process encounters a fatal error and exits with non-zero status.

Killed: Another process sends a terminating signal (for example SIGTERM or SIGKILL, via kill()).

Parent notification: Parent calls wait() or waitpid() to collect the child's exit status.

Zombie Process

A terminated child whose parent hasn't yet called wait(). Its PCB remains in the process table, holding the exit status. The entry is freed when the parent calls wait(); if the parent terminates first, the zombie is reparented to init, which reaps it.
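The zombie state can be observed directly on Linux. The sketch below assumes a Linux system with `/proc` mounted; `proc_state` and `zombie_demo` are hypothetical helper names. It uses `os.waitid` with `WNOWAIT` to learn that the child has exited without reaping it.

```python
import os


def proc_state(pid):
    """Return the one-letter state from /proc/<pid>/stat (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        stat = f.read()
    # The command name is parenthesised and may contain spaces, so split
    # after the last ')': the next field is the state letter.
    return stat.rsplit(")", 1)[1].split()[0]


def zombie_demo():
    """Return (state before reaping, whether the entry is gone after reaping)."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)                        # child terminates immediately
    # Wait for the exit WITHOUT reaping: WNOWAIT leaves the child a zombie.
    os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)
    state_before = proc_state(pid)         # "Z" marks a zombie
    os.waitpid(pid, 0)                     # reap: the kernel frees the entry
    gone_after = not os.path.exists(f"/proc/{pid}")
    return state_before, gone_after


if __name__ == "__main__":
    print(zombie_demo())
```

A long-running parent that never calls wait() accumulates such entries, which is why servers that fork children usually reap them promptly.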

Orphan Process

A child whose parent has terminated. It is adopted by init (PID 1, which is systemd on many Linux systems), which will wait() for it.
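Adoption can be watched from inside the orphan by polling its parent PID. In this sketch `orphan_demo` is a hypothetical name; the adoptive parent is usually PID 1, but on Linux it may instead be a process that registered itself as a "subreaper", so the example only reports the new parent rather than assuming its value.

```python
import os
import time


def orphan_demo(timeout=5.0):
    """Orphan a grandchild; return (original_parent_pid, adoptive_parent_pid)."""
    read_fd, write_fd = os.pipe()
    child = os.fork()
    if child == 0:
        os.close(read_fd)
        my_pid = os.getpid()
        if os.fork() == 0:
            # Grandchild: poll until the kernel reports a different parent.
            deadline = time.monotonic() + timeout
            while os.getppid() == my_pid and time.monotonic() < deadline:
                time.sleep(0.01)
            os.write(write_fd, str(os.getppid()).encode())
            os._exit(0)
        os._exit(0)                        # child exits first, orphaning the grandchild
    os.close(write_fd)
    os.waitpid(child, 0)                   # reap the child so it is not left a zombie
    adopter = int(os.read(read_fd, 32))    # blocks until the grandchild reports
    os.close(read_fd)
    return child, adopter


if __name__ == "__main__":
    original, adopter = orphan_demo()
    print(f"grandchild's parent changed from {original} to {adopter}")
```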

Inter-Process Communication (IPC)

[Figure: IPC mechanisms comparison (pipes, shared memory, sockets, signals)]

Pipes

Anonymous pipe: Unidirectional byte stream between related processes (parent-child).

ls | grep ".txt" | wc -l

(read_fd, write_fd) ← PIPE()
IF FORK() = 0
    // Child: writes to pipe
    CLOSE(read_fd)
    WRITE(write_fd, data)
    CLOSE(write_fd)          // reader sees end-of-file once all write ends are closed
ELSE
    // Parent: reads from pipe
    CLOSE(write_fd)
    READ(read_fd, buffer)
    CLOSE(read_fd)
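A runnable version of the pipe pseudocode, sketched in Python with `os.pipe` and `os.fork`. The function name `pipe_roundtrip` is hypothetical, and the sketch assumes a small message.

```python
import os


def pipe_roundtrip(message):
    """Send message (bytes) from a forked child to the parent over a pipe."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: writes to the pipe.
        os.close(read_fd)
        try:
            os.write(write_fd, message)
        finally:
            os._exit(0)                  # exiting also closes write_fd
    # Parent: reads from the pipe.
    os.close(write_fd)                   # otherwise read() would never see EOF
    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:                    # empty read means every write end is closed
            break
        chunks.append(chunk)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return b"".join(chunks)


if __name__ == "__main__":
    print(pipe_roundtrip(b"hello from the child"))
```

Closing the unused ends matters: if the parent kept its copy of the write end open, its own read loop would block forever waiting for an end-of-file that cannot arrive.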

Named pipe (FIFO): Has a name in the file system. Unrelated processes can communicate.

mkfifo /tmp/myfifo
echo "hello" > /tmp/myfifo &   # writer
cat /tmp/myfifo                  # reader
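The same exchange from a program, as a sketch using `os.mkfifo`. The FIFO is created in a temporary directory under the made-up name `myfifo`, and `fifo_roundtrip` is a hypothetical helper. A fork is used only to keep the example in one file; the writer and reader could equally be unrelated programs that agree on the path.

```python
import os
import tempfile


def fifo_roundtrip(message):
    """Pass message (bytes) from a writer process to a reader via a named pipe."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "myfifo")      # illustrative name
        os.mkfifo(path)
        pid = os.fork()
        if pid == 0:
            # Writer: opening for write blocks until a reader opens the FIFO.
            try:
                with open(path, "wb") as fifo:
                    fifo.write(message)
            finally:
                os._exit(0)
        # Reader: opening for read blocks until a writer opens the FIFO.
        with open(path, "rb") as fifo:
            data = fifo.read()                     # returns once the writer closes
        os.waitpid(pid, 0)
        return data


if __name__ == "__main__":
    print(fifo_roundtrip(b"hello"))
```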

Message Queues

Processes send and receive discrete messages. Messages can be prioritized.

POSIX: mq_open, mq_send, mq_receive. System V: msgget, msgsnd, msgrcv.

Shared Memory

Fastest IPC — processes map the same physical memory into their address spaces.

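#include <fcntl.h>     // O_CREAT, O_RDWR
#include <sys/mman.h>  // shm_open, mmap
#include <unistd.h>    // ftruncate

int shm_fd = shm_open("/my_shm", O_CREAT | O_RDWR, 0666);  // create or open the named object
ftruncate(shm_fd, SIZE);                                   // a new object starts at 0 bytes
void *ptr = mmap(NULL, SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, shm_fd, 0);
// ptr now refers to memory shared with every process that maps "/my_shm"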

Requires synchronization (semaphores, mutexes) since multiple processes access the same memory.
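A small sketch of two processes touching the same memory. Note the swap: instead of a named object from shm_open, it uses an anonymous shared mapping created before fork(), which on Unix the child inherits as shared. `shared_counter_demo` is a hypothetical name.

```python
import mmap
import os
import struct


def shared_counter_demo(increments=1000):
    """Child increments a counter in shared memory; parent reads the result."""
    shm = mmap.mmap(-1, 8)               # anonymous mapping, shared across fork()
    shm[:8] = struct.pack("q", 0)        # 64-bit counter starts at zero
    pid = os.fork()
    if pid == 0:
        try:
            for _ in range(increments):
                (value,) = struct.unpack("q", shm[:8])
                shm[:8] = struct.pack("q", value + 1)
        finally:
            os._exit(0)
    # The only synchronization here is waiting for the child to finish.
    os.waitpid(pid, 0)
    (result,) = struct.unpack("q", shm[:8])
    shm.close()
    return result


if __name__ == "__main__":
    print(shared_counter_demo())
```

The read-modify-write in the loop is not atomic. It is safe here only because the parent does not touch the counter until the child has exited; two processes incrementing concurrently would need a semaphore or mutex around it.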

Memory-Mapped Files

Map a file into the address space. Changes to the mapping are written back to the file (for MAP_SHARED). Efficient for large file access and IPC.
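A sketch of the write-back behaviour using Python's `mmap` module, whose default mapping on Unix is shared. The file is a throwaway temporary file and `mmap_file_demo` is a hypothetical name.

```python
import mmap
import tempfile


def mmap_file_demo():
    """Modify a file through a shared mapping and read the change back."""
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"hello world")
        tmp.flush()
        # Length 0 maps the whole file.
        with mmap.mmap(tmp.fileno(), 0) as mapped:
            mapped[0:5] = b"HELLO"       # an ordinary memory write
            mapped.flush()               # ask the OS to write dirty pages back
        # Re-read through the normal file API to confirm the change landed.
        with open(tmp.name, "rb") as f:
            return f.read()


if __name__ == "__main__":
    print(mmap_file_demo())
```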

Sockets

Communication endpoint. Supports both local (Unix domain sockets) and network (TCP/UDP) communication.

Unix domain sockets: IPC on the same machine. Faster than TCP (no network stack overhead). Used by Docker, X11, systemd.
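Unlike a pipe, a socket is bidirectional. The sketch below uses `socket.socketpair`, which on Unix returns two connected Unix domain sockets, so no file-system path is needed. `socketpair_echo` is a hypothetical name and the messages are assumed to be small.

```python
import os
import socket


def socketpair_echo(message):
    """Child upper-cases whatever the parent sends over a Unix socket pair."""
    parent_sock, child_sock = socket.socketpair()   # connected AF_UNIX stream sockets
    pid = os.fork()
    if pid == 0:
        parent_sock.close()
        try:
            data = child_sock.recv(4096)
            child_sock.sendall(data.upper())        # reply on the same socket
        finally:
            os._exit(0)
    child_sock.close()
    parent_sock.sendall(message)
    reply = parent_sock.recv(4096)
    parent_sock.close()
    os.waitpid(pid, 0)
    return reply


if __name__ == "__main__":
    print(socketpair_echo(b"ping"))
```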

Signals

Asynchronous notifications sent to a process.

| Signal | Default Action | Meaning |
|---|---|---|
| SIGINT (2) | Terminate | Ctrl+C |
| SIGTERM (15) | Terminate | Polite kill request |
| SIGKILL (9) | Terminate | Unconditional kill (can't be caught) |
| SIGSEGV (11) | Terminate (core dump) | Segmentation fault |
| SIGCHLD (17) | Ignore | Child process terminated |
| SIGSTOP (19) | Stop | Pause process (can't be caught) |
| SIGCONT (18) | Continue | Resume stopped process |
| SIGUSR1/2 | Terminate | User-defined |

Signal numbers shown are the Linux values on x86 and ARM; SIGCHLD, SIGSTOP, and SIGCONT have different numbers on some other platforms.

Processes can catch most signals by installing a signal handler, or choose to ignore or block them. SIGKILL and SIGSTOP cannot be caught, blocked, or ignored.
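A sketch of installing a handler with Python's `signal` module, plus a check that SIGKILL refuses one. The helper names are hypothetical. Python only allows handlers to be installed from the main thread, and it runs them between bytecode instructions, so the example polls briefly instead of assuming the handler has already run.

```python
import os
import signal
import time


def catch_sigusr1(timeout=2.0):
    """Install a SIGUSR1 handler, signal ourselves, and return what was caught."""
    received = []

    def handler(signum, frame):
        received.append(signum)

    previous = signal.signal(signal.SIGUSR1, handler)
    try:
        os.kill(os.getpid(), signal.SIGUSR1)        # send the signal to ourselves
        deadline = time.monotonic() + timeout
        while not received and time.monotonic() < deadline:
            time.sleep(0.01)                        # give the handler a chance to run
    finally:
        # Restore the earlier disposition (None means it was set outside Python).
        signal.signal(signal.SIGUSR1,
                      previous if previous is not None else signal.SIG_DFL)
    return received


def can_catch_sigkill():
    """Return False: the kernel rejects any attempt to handle SIGKILL."""
    try:
        signal.signal(signal.SIGKILL, lambda signum, frame: None)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    print(catch_sigusr1(), can_catch_sigkill())
```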

Process Hierarchy

UNIX Process Tree

init (PID 1) or systemd
├── login
│   └── bash
│       ├── vim
│       └── gcc
├── sshd
│   └── bash
│       └── python
├── cron
└── httpd
    ├── httpd (worker)
    └── httpd (worker)

Every process (except init) has a parent. The process tree is rooted at init/systemd.

The pstree command shows the process tree; ps aux lists all processes.
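The path from a process up to the root of the tree can also be walked programmatically. This sketch is Linux-specific: it reads the parent PID from `/proc/<pid>/stat`, and `parent_of` and `ancestors` are hypothetical helper names.

```python
import os


def parent_of(pid):
    """Return the parent PID of pid by reading /proc/<pid>/stat (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        stat = f.read()
    # Fields after the parenthesised command name are: state, ppid, ...
    return int(stat.rsplit(")", 1)[1].split()[1])


def ancestors(pid=None):
    """Return [pid, parent, grandparent, ...] up to the root of the tree."""
    pid = os.getpid() if pid is None else pid
    chain = [pid]
    while pid > 1:
        pid = parent_of(pid)
        if pid == 0:        # no visible parent (for example, outside this PID namespace)
            break
        chain.append(pid)
    return chain


if __name__ == "__main__":
    print(" <- ".join(str(p) for p in ancestors()))
```

On a typical system the chain ends at PID 1; inside a container it ends at whatever process is the root of that container's PID namespace.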

Applications in CS

  • Shells: The shell is a process that fork+exec's commands. Pipes connect processes.
  • Web servers: Keep a pool of pre-forked worker processes, each handling one connection at a time (Apache prefork), or use a few event-driven worker processes with asynchronous I/O (nginx).
  • Databases: PostgreSQL forks a backend process per connection. MySQL uses threads.
  • Build systems: Make/Bazel spawn compiler processes in parallel.
  • Containers: Containers are processes with isolated namespaces (PID, network, mount, etc.).
  • Init systems: systemd manages the lifecycle of all system services as processes.