Network Programming

Network programming implements communication between processes across a network. The Berkeley socket API, available on virtually every operating system, is the standard interface for it.

Berkeley Sockets API

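A socket is an endpoint for communication. An endpoint is addressed by an (IP address, port) pair; an established TCP connection is identified by both endpoints together, that is, (source IP, source port, destination IP, destination port).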

Socket Lifecycle (TCP Server)

// 1. Create socket + bind + listen
listener ← TCP_LISTEN("0.0.0.0:8080")

// 2. Accept connections
FOR EACH stream IN INCOMING(listener)
    // 3. Read/write data
    buf ← READ(stream, max: 1024)
    WRITE(stream, "HTTP/1.1 200 OK\r\n\r\nHello")

    // 4. Close (automatic when stream goes out of scope)
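
The pseudocode above folds socket creation, bind, and listen into one call. A minimal Python sketch of the same lifecycle, spelling out each step, might look like this; the helper names `make_listener` and `serve_once` are illustrative, not part of any library, and the response adds Content-Length and Connection headers that the pseudocode omits.

```python
import socket

def make_listener(host: str = "127.0.0.1", port: int = 8080) -> socket.socket:
    """socket() + bind() + listen(): the three steps behind TCP_LISTEN."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Allow quick restarts while old connections linger in TIME_WAIT.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen()
    return sock

def serve_once(listener: socket.socket) -> None:
    """Accept one connection, read a request, send a fixed HTTP response."""
    conn, _addr = listener.accept()      # blocks until a client connects
    with conn:                           # closes the connection on exit
        conn.recv(1024)                  # read (and ignore) up to 1024 request bytes
        body = b"Hello"
        conn.sendall(
            b"HTTP/1.1 200 OK\r\n"
            b"Content-Length: " + str(len(body)).encode() + b"\r\n"
            b"Connection: close\r\n\r\n" + body
        )
```

A server would call `serve_once(listener)` in a loop; passing port 0 to `make_listener` lets the OS pick a free port, which is convenient in tests.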

Socket Lifecycle (TCP Client)

stream ← TCP_CONNECT("example.com:80")
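// "Connection: close" asks the server to end the stream after the response,
// so READ_ALL terminates at EOF instead of waiting on a keep-alive connection
WRITE(stream, "GET / HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n")

response ← READ_ALL(stream)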
PRINT response
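
As a rough Python equivalent of the client above, the sketch below sends a minimal GET and reads until the server closes the connection. The function name `http_get` is hypothetical, and the code does no parsing of the response; it only shows the connect, write, read-until-EOF sequence.

```python
import socket

def http_get(host: str, port: int = 80, path: str = "/") -> bytes:
    """Send a minimal HTTP/1.1 GET and return the raw response bytes."""
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n\r\n"   # ask the server to close so EOF ends the read
    ).encode("ascii")
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(request)
        chunks = []
        while True:
            chunk = sock.recv(4096)   # b"" means the peer closed the connection
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks)
```

For example, `http_get("example.com")` would return the status line, headers, and body as one byte string.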

UDP

// Server
socket ← UDP_BIND("0.0.0.0:8080")
(data, src_addr) ← RECV_FROM(socket, max: 1024)
SEND_TO(socket, "reply", src_addr)

// Client
socket ← UDP_BIND("0.0.0.0:0")   // ephemeral port
SEND_TO(socket, "hello", "server:8080")
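
The UDP exchange above can be tried on the loopback interface with a short Python sketch. `udp_echo_once` is an illustrative helper, and the two-second timeouts are an arbitrary choice made so the example cannot hang if a datagram is lost.

```python
import socket

def udp_echo_once(server: socket.socket) -> None:
    """Receive one datagram and send it back to whoever sent it."""
    data, src_addr = server.recvfrom(1024)   # one call returns one whole datagram
    server.sendto(data, src_addr)

server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))                # port 0: let the OS pick a free port
server.settimeout(2.0)

client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client.settimeout(2.0)                       # UDP may drop packets; never wait forever

# No connect() and no explicit bind: the OS assigns the client an
# ephemeral port on the first send.
client.sendto(b"hello", server.getsockname())
udp_echo_once(server)
reply, _server_addr = client.recvfrom(1024)

server.close()
client.close()
```

Unlike TCP, there is no accept step and no connection state: every datagram carries its own source address, which is all the server needs to reply.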

Socket Types

| Type        | Protocol | Description                       |
|-------------|----------|-----------------------------------|
| SOCK_STREAM | TCP      | Reliable, ordered byte stream     |
| SOCK_DGRAM  | UDP      | Unreliable datagrams              |
| SOCK_RAW    | IP       | Raw access (craft custom packets) |

Non-Blocking I/O

Blocking (Default)

read() blocks until data arrives. accept() blocks until a connection arrives. This is simple, but serving many clients at once then requires one thread (or process) per connection.

Non-Blocking

With the socket in non-blocking mode, read() returns immediately: either with data, or with the error EAGAIN/EWOULDBLOCK when no data is available.

SET_NONBLOCKING(stream, true)
result ← READ(stream, buf)
IF result is Ok(n)
    // got n bytes
ELSE IF result is WouldBlock
    // No data available right now
ELSE
    ERROR result

Problem: the program must call read() in a loop to find out when data arrives, which burns CPU while the socket is idle. I/O multiplexing avoids this.
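
In Python the would-block case surfaces as a `BlockingIOError` exception rather than an error code. The sketch below uses a connected socket pair to show the three possible outcomes of a non-blocking read; `try_read` is a hypothetical helper name.

```python
import socket

def try_read(sock: socket.socket, max_bytes: int = 1024):
    """Return bytes if data is ready, b"" on EOF, or None if the read would block."""
    try:
        return sock.recv(max_bytes)
    except BlockingIOError:          # Python's mapping of EAGAIN / EWOULDBLOCK
        return None

a, b = socket.socketpair()           # two connected sockets in one process
b.setblocking(False)

first = try_read(b)                  # nothing sent yet: would block
a.sendall(b"ping")
second = try_read(b)                 # data is waiting
a.close()
third = try_read(b)                  # peer closed: end of stream
b.close()
```

Note that an empty result and a would-block result are different things: the first means the peer has closed, the second only means "try again later".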

I/O Multiplexing

Monitor multiple file descriptors simultaneously. Block until any is ready.

select()

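#include <sys/select.h>

/* sockfd1 and sockfd2 are assumed to be already-open sockets */
fd_set readfds;
struct timeval timeout = { .tv_sec = 5, .tv_usec = 0 };  /* example value: give up after 5 s */
int maxfd = sockfd1 > sockfd2 ? sockfd1 : sockfd2;

FD_ZERO(&readfds);
FD_SET(sockfd1, &readfds);
FD_SET(sockfd2, &readfds);
int ready = select(maxfd + 1, &readfds, NULL, NULL, &timeout);
if (ready > 0 && FD_ISSET(sockfd1, &readfds)) { /* sockfd1 has data */ }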

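Limitations: the kernel scans every descriptor up to maxfd on each call, so the cost is O(n). Descriptors must be numerically below FD_SETSIZE (typically 1024). Because select() overwrites the sets with the result, they must be rebuilt before every call.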

poll()

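Like select(), but takes an array of pollfd structs, so there is no FD_SETSIZE limit, and results come back in a separate revents field, so the array does not have to be rebuilt between calls.

Still O(n) per call: both the kernel and the caller scan every registered fd.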
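
Python exposes the same system call as `select.poll()` on platforms that provide it (not Windows). A small sketch with two socket pairs, where only one has data pending, shows that the call reports just the readable descriptor; the labels "a" and "b" are illustrative.

```python
import select
import socket

a_rx, a_tx = socket.socketpair()
b_rx, b_tx = socket.socketpair()

poller = select.poll()
poller.register(a_rx, select.POLLIN)      # interested in readability only
poller.register(b_rx, select.POLLIN)
by_fd = {a_rx.fileno(): "a", b_rx.fileno(): "b"}

b_tx.sendall(b"x")                        # make only the "b" pair readable

# poll() takes its timeout in milliseconds and returns (fd, event mask) pairs.
ready = [by_fd[fd] for fd, events in poller.poll(1000) if events & select.POLLIN]

for s in (a_rx, a_tx, b_rx, b_tx):
    s.close()
```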

epoll (Linux)

Efficient for large numbers of fds. Event-driven — only returns ready fds.

// Using event polling (cross-platform epoll/kqueue wrapper)
poll ← NEW_POLL()
events ← NEW_EVENT_LIST(capacity: 1024)

server ← TCP_LISTEN("0.0.0.0:8080")
REGISTER(poll, server, token: 0, interest: READABLE)

LOOP
    POLL(poll, events, timeout: NONE)
    FOR EACH event IN events
        IF event.token = 0
            // new connection ready
        ELSE
            // data ready on connection

Key operations:

  • epoll_create: Create an epoll instance.
  • epoll_ctl: Add/modify/remove file descriptors.
  • epoll_wait: Wait for events. Returns only ready fds — O(number_of_ready_fds), not O(total_fds).
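
The event loop sketched in the pseudocode above can be written with Python's `selectors` module, which picks epoll on Linux and kqueue on BSD/macOS. This is a minimal single-threaded echo loop under simplifying assumptions: the function name `run_echo_loop` and the `max_events` stop condition are illustrative, and replies are assumed small enough to fit in the send buffer.

```python
import selectors
import socket

def run_echo_loop(listener: socket.socket, max_events: int) -> None:
    """Serve echo clients from one thread until max_events events are handled."""
    sel = selectors.DefaultSelector()     # epoll on Linux, kqueue on BSD/macOS
    listener.setblocking(False)
    sel.register(listener, selectors.EVENT_READ, data="listener")
    handled = 0
    while handled < max_events:
        # select() returns only the registered objects that are ready.
        for key, _mask in sel.select(timeout=1.0):
            handled += 1
            if key.data == "listener":
                conn, _addr = key.fileobj.accept()      # new connection ready
                conn.setblocking(False)
                sel.register(conn, selectors.EVENT_READ, data="conn")
            else:
                chunk = key.fileobj.recv(1024)          # data ready on connection
                if chunk:
                    key.fileobj.sendall(chunk)          # sketch: assumes buffer space
                else:
                    sel.unregister(key.fileobj)         # peer closed
                    key.fileobj.close()
    sel.close()
```

A production loop would also register for writability when a send cannot complete, and would run until shutdown rather than for a fixed number of events.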

Edge-triggered vs Level-triggered:

  • Level-triggered (default): Event reported as long as condition holds (like select/poll).
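  • Edge-triggered: Event reported only when state changes. More efficient but tricky: the socket must be non-blocking, and on each notification the program must keep reading until read() returns EAGAIN, because data left in the buffer will not trigger another event.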
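
The drain requirement for edge-triggered mode can be illustrated with a small helper that reads from a non-blocking socket until the kernel buffer is empty. This is a sketch of the read loop only, not of epoll registration; `drain` is a hypothetical name.

```python
import socket
from typing import Tuple

def drain(sock: socket.socket, chunk_size: int = 4096) -> Tuple[bytes, bool]:
    """Read from a non-blocking socket until it would block.

    Returns (data, eof), where eof is True if the peer closed the connection.
    """
    chunks = []
    while True:
        try:
            chunk = sock.recv(chunk_size)
        except BlockingIOError:
            # Buffer is empty: safe to go back and wait for the next edge.
            return b"".join(chunks), False
        if not chunk:
            return b"".join(chunks), True
        chunks.append(chunk)
```

Stopping after a single `recv` would be fine in level-triggered mode, since the event is reported again while data remains, but in edge-triggered mode the leftover bytes could sit unread until the peer sends more.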

kqueue (BSD/macOS)

Similar to epoll. Unified event framework — handles sockets, files, signals, timers, processes.

int kq = kqueue();
struct kevent ev;
EV_SET(&ev, sockfd, EVFILT_READ, EV_ADD, 0, 0, NULL);
kevent(kq, &ev, 1, NULL, 0, NULL);

io_uring (Linux 5.1+)

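The newest Linux I/O interface. Submission and completion queues are ring buffers shared between user space and the kernel, so many operations can be submitted with a single io_uring_enter() call, or with no system call at all when submission-queue polling (IORING_SETUP_SQPOLL) is enabled.

This can substantially reduce per-operation overhead compared with epoll for high-throughput I/O. Covered in OS topic.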

Asynchronous I/O

Rust Async Networking (Tokio)

ASYNC PROCEDURE MAIN()
    listener ← AWAIT TCP_LISTEN("0.0.0.0:8080")

    LOOP
        (socket, addr) ← AWAIT ACCEPT(listener)
        SPAWN_ASYNC(PROCEDURE()
            buf ← AWAIT READ(socket, max: 1024)
            AWAIT WRITE(socket, buf)
        )

Tokio handles thousands of concurrent connections on a small thread pool. Each spawned task is a lightweight future, not an OS thread.
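
The same accept-and-spawn shape exists in other async runtimes. As an analogue only (this is Python's asyncio, not Tokio's API), the sketch below runs one coroutine per connection on a single-threaded event loop; `handle` and `main` are illustrative names.

```python
import asyncio

async def handle(reader: asyncio.StreamReader, writer: asyncio.StreamWriter) -> None:
    """Echo one chunk back; asyncio runs this as a separate task per connection."""
    data = await reader.read(1024)     # yields to the event loop while waiting
    writer.write(data)
    await writer.drain()               # wait until the data has been flushed
    writer.close()
    await writer.wait_closed()

async def main(host: str = "127.0.0.1", port: int = 8080) -> None:
    server = await asyncio.start_server(handle, host, port)
    async with server:
        await server.serve_forever()
```

Running `asyncio.run(main())` starts the server. As with Tokio tasks, each waiting connection costs a small heap-allocated coroutine rather than an OS thread and its stack.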

Connection Pooling

Reuse connections instead of creating new ones for each request.

// Conceptual connection pool
CLASS Pool
    FIELDS: connections (list), max_size (integer)

    FUNCTION GET()
        IF connections is not empty
            RETURN POP(connections)
        ELSE
            RETURN CREATE_NEW_CONNECTION()

    PROCEDURE RETURN_CONN(conn)
        IF length(connections) < max_size
            APPEND conn TO connections
        ELSE
            CLOSE(conn)   // pool is full: close the extra connection instead of leaking it

Benefits: Avoid TCP handshake + TLS handshake overhead per request. Limit connections to prevent exhaustion.
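
To make the "limit connections" benefit concrete, here is a sketch of a thread-safe pool that caps the total number of connections, which the conceptual pool above does not do. The class name `ConnectionPool` and the `connect` factory argument are hypothetical, and the sketch omits health checks for stale connections.

```python
import queue
import threading
from contextlib import contextmanager

class ConnectionPool:
    """Bounded pool: at most max_size connections are checked out at once."""

    def __init__(self, connect, max_size: int):
        self._connect = connect                       # hypothetical factory callable
        self._idle = queue.LifoQueue()                # reuse the most recent connection first
        self._slots = threading.BoundedSemaphore(max_size)

    def get(self, timeout=None):
        # Wait for a free slot; this is what enforces the upper bound.
        if not self._slots.acquire(timeout=timeout):
            raise TimeoutError("connection pool exhausted")
        try:
            return self._idle.get_nowait()            # reuse an idle connection
        except queue.Empty:
            try:
                return self._connect()                # none idle: open a new one
            except BaseException:
                self._slots.release()                 # do not leak the slot on failure
                raise

    def put(self, conn) -> None:
        self._idle.put(conn)
        self._slots.release()

    @contextmanager
    def connection(self, timeout=None):
        conn = self.get(timeout)
        try:
            yield conn
        finally:
            self.put(conn)                            # always hand the connection back
```

Typical use would be `with pool.connection() as conn: ...`, so a connection is returned even if the body raises. Callers that arrive when all slots are taken wait (or time out) instead of opening more connections.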

Libraries and tools: r2d2 (Rust), HikariCP (Java), PgBouncer (a standalone pooling proxy for PostgreSQL).

Protocol Design Considerations

Framing

TCP is a byte stream — no message boundaries. Application must define framing:

  • Length-prefixed: First N bytes encode message length. Read length, then read that many bytes.
  • Delimiter-based: Messages separated by a special byte/sequence (e.g., newline, null, CRLF CRLF).
  • Fixed-length: Each message is exactly N bytes.
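  • Self-delimiting: The encoding itself marks where a value ends (a complete JSON object, a MessagePack or CBOR item). Protocol Buffers messages are not self-delimiting and need a length prefix or other framing around them.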
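
Length-prefixed framing, the first option in the list above, can be sketched in a few lines. The 4-byte big-endian header and the 1 MiB size cap are assumptions chosen for this example, and the function names are illustrative.

```python
import socket
import struct

HEADER = struct.Struct("!I")     # 4-byte unsigned length, network byte order

def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes; recv() alone may return fewer."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed mid-message")
        buf += chunk
    return bytes(buf)

def send_msg(sock: socket.socket, payload: bytes) -> None:
    """Write the length header followed by the payload."""
    sock.sendall(HEADER.pack(len(payload)) + payload)

def recv_msg(sock: socket.socket, max_len: int = 1 << 20) -> bytes:
    """Read one framed message: the length first, then that many bytes."""
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    if length > max_len:         # refuse absurd lengths from a buggy or hostile peer
        raise ValueError("frame too large")
    return recv_exact(sock, length)
```

The loop in `recv_exact` is the important part: because TCP has no message boundaries, one `recv` call may return part of a message or, when reading without a length, bytes from several messages.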

Serialization Formats

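| Format           | Type   | Size  | Speed                 | Schema            | Human Readable |
|------------------|--------|-------|-----------------------|-------------------|----------------|
| JSON             | Text   | Large | Moderate              | Optional          | Yes            |
| Protocol Buffers | Binary | Small | Fast                  | Required (.proto) | No             |
| MessagePack      | Binary | Small | Fast                  | No                | No             |
| FlatBuffers      | Binary | Small | Very fast (zero-copy) | Required          | No             |
| CBOR             | Binary | Small | Fast                  | No                | No             |
| Cap'n Proto      | Binary | Small | Very fast (zero-copy) | Required          | No             |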

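Protocol Buffers (Google): Widely used in microservices, notably as the default message format for gRPC. Supports schema evolution: fields can be added or removed safely as long as existing field numbers are never reused. Language-neutral.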

FlatBuffers (Google): Zero-copy deserialization (access data directly in the buffer without parsing). Used in games, mobile apps, ML (TFLite).
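
The size and schema trade-offs in the table can be seen on a toy record using only the standard library. The fixed binary layout below is a hypothetical stand-in for a schema-based format, not the wire format of any of the libraries listed.

```python
import json
import struct

record = {"id": 42, "x": 1.5, "y": -2.25}

# Text, schema-less: field names travel with every message.
as_json = json.dumps(record, separators=(",", ":")).encode("utf-8")

# Binary, schema required: both sides must agree on this layout in advance.
# Hypothetical layout: unsigned 32-bit id, two 64-bit floats, little-endian.
LAYOUT = struct.Struct("<Idd")
as_binary = LAYOUT.pack(record["id"], record["x"], record["y"])

decoded_json = json.loads(as_json)
decoded_binary = LAYOUT.unpack(as_binary)

print(len(as_json), len(as_binary))
```

The JSON form can be read without any prior knowledge, while the packed form is meaningless without the layout; that shared layout plays the role a .proto file or FlatBuffers schema plays in the real formats.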

Applications in CS

  • Web servers: nginx uses event-driven I/O (epoll/kqueue); Apache uses process- or thread-based MPMs, with an event MPM available. Tokio/Actix for Rust.
  • Databases: Custom wire protocols (PostgreSQL, MySQL). Connection pooling. Async replication.
  • Microservices: gRPC (protobuf over HTTP/2). Service mesh (Envoy proxy). Load balancing.
  • Game networking: UDP for game state. Custom reliability on top. Client-side prediction.
  • Chat/messaging: WebSocket for real-time. Long polling for compatibility.
  • Distributed systems: Custom protocols for consensus (Raft), replication, cluster management.