Network Programming
Network programming enables communication between processes across a network. The Berkeley sockets API is the near-universal interface.
Berkeley Sockets API
A socket is an endpoint for communication, identified by an (IP address, port) pair; a TCP connection is identified by the 4-tuple (source IP, source port, destination IP, destination port).
Socket Lifecycle (TCP Server)
// 1. Create socket + bind + listen
listener ← TCP_LISTEN("0.0.0.0:8080")
// 2. Accept connections
FOR EACH stream IN INCOMING(listener)
// 3. Read/write data
buf ← READ(stream, max: 1024)
WRITE(stream, "HTTP/1.1 200 OK\r\n\r\nHello")
// 4. Close (automatic when stream goes out of scope)
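The same lifecycle in Python's socket module, as a minimal sketch that serves a single connection (the port number is arbitrary):

```python
import socket

def serve_one(host="127.0.0.1", port=8080):
    # 1. Create socket + bind + listen
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        listener.bind((host, port))
        listener.listen()
        # 2. Accept a connection
        conn, addr = listener.accept()
        with conn:
            # 3. Read the request, write a fixed response
            conn.recv(1024)
            conn.sendall(b"HTTP/1.1 200 OK\r\n\r\nHello")
        # 4. Close is automatic when the with-blocks exit
```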
Socket Lifecycle (TCP Client)
stream ← TCP_CONNECT("example.com:80")
WRITE(stream, "GET / HTTP/1.1\r\nHost: example.com\r\n\r\n")
response ← READ_ALL(stream)
PRINT response
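A concrete client sketch in Python, assuming an HTTP/1.0-style exchange so that READ_ALL corresponds to "read until the peer closes the connection":

```python
import socket

def http_get(host, port=80):
    # Connect, send a minimal HTTP/1.0 request, read until EOF
    with socket.create_connection((host, port), timeout=5) as stream:
        stream.sendall(f"GET / HTTP/1.0\r\nHost: {host}\r\n\r\n".encode())
        chunks = []
        while True:
            buf = stream.recv(4096)
            if not buf:          # empty read means the peer closed
                break
            chunks.append(buf)
    return b"".join(chunks)
```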
UDP
// Server
socket ← UDP_BIND("0.0.0.0:8080")
(data, src_addr) ← RECV_FROM(socket, max: 1024)
SEND_TO(socket, "reply", src_addr)
// Client
socket ← UDP_BIND("0.0.0.0:0") // ephemeral port
SEND_TO(socket, "hello", "server:8080")
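The UDP pseudocode above maps onto SOCK_DGRAM sockets directly; a sketch of one request/reply round trip (port and message are arbitrary):

```python
import socket

def udp_echo_once(port):
    # Server: bind, receive one datagram, reply to the sender's address
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind(("127.0.0.1", port))
        data, src_addr = sock.recvfrom(1024)
        sock.sendto(b"reply:" + data, src_addr)

def udp_client(port, msg):
    # Client: the OS assigns an ephemeral port on first send
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(5)
        sock.sendto(msg, ("127.0.0.1", port))
        data, _ = sock.recvfrom(1024)
    return data
```

Note there is no connection: each datagram carries its destination, and the server learns the client's address from recvfrom.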
Socket Types
| Type | Protocol | Description |
|---|---|---|
| SOCK_STREAM | TCP | Reliable, ordered byte stream |
| SOCK_DGRAM | UDP | Unreliable datagrams |
| SOCK_RAW | IP | Raw access (craft custom packets) |
Non-Blocking I/O
Blocking (Default)
read() blocks until data arrives; accept() blocks until a connection arrives. Simple to program, but serving many clients requires one thread per connection.
Non-Blocking
Socket set to non-blocking mode. read() returns immediately (with data or EAGAIN/EWOULDBLOCK if no data).
SET_NONBLOCKING(stream, true)
result ← READ(stream, buf)
IF result is Ok(n)
// got n bytes
ELSE IF result is WouldBlock
// No data available right now
ELSE
ERROR result
Problem: Must poll repeatedly — wastes CPU. Use I/O multiplexing instead.
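In Python, a non-blocking read that distinguishes "data", "would block", and "error" looks like this sketch (BlockingIOError is Python's surface for EAGAIN/EWOULDBLOCK):

```python
import socket

def try_read(stream):
    # Non-blocking read: returns bytes, or None if no data is ready yet
    stream.setblocking(False)
    try:
        return stream.recv(1024)       # may be b"" if the peer closed
    except BlockingIOError:            # EAGAIN / EWOULDBLOCK
        return None
```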
I/O Multiplexing
Monitor multiple file descriptors simultaneously. Block until any is ready.
select()
fd_set readfds;
FD_ZERO(&readfds);
FD_SET(sockfd1, &readfds);
FD_SET(sockfd2, &readfds);
select(maxfd + 1, &readfds, NULL, NULL, &timeout);
if (FD_ISSET(sockfd1, &readfds)) { /* sockfd1 has data */ }
Limitations: O(n) scan of fd_set. Limited to FD_SETSIZE (typically 1024) fds. Must rebuild fd_set each call.
poll()
Like select but no fd limit. Array of pollfd structs.
Still O(n) per call — must scan all fds.
epoll (Linux)
Efficient for large numbers of fds. Event-driven — only returns ready fds.
// Using event polling (cross-platform epoll/kqueue wrapper)
poll ← NEW_POLL()
events ← NEW_EVENT_LIST(capacity: 1024)
server ← TCP_LISTEN("0.0.0.0:8080")
REGISTER(poll, server, token: 0, interest: READABLE)
LOOP
POLL(poll, events, timeout: NONE)
FOR EACH event IN events
IF event.token = 0
// new connection ready
ELSE
// data ready on connection
Key operations:
- epoll_create: Create an epoll instance.
- epoll_ctl: Add/modify/remove file descriptors.
- epoll_wait: Wait for events. Returns only ready fds — O(number_of_ready_fds), not O(total_fds).
Edge-triggered vs Level-triggered:
- Level-triggered (default): Event reported as long as condition holds (like select/poll).
- Edge-triggered: Event reported only when state changes. More efficient but tricky (must drain all data on notification).
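As a cross-platform illustration, Python's selectors module wraps epoll/kqueue/select and expresses the level-triggered event loop sketched above; this sketch echoes one connection and returns (the port is arbitrary):

```python
import selectors
import socket

def serve_echo_once(port):
    # One selector monitors the listener and every accepted connection
    sel = selectors.DefaultSelector()  # epoll on Linux, kqueue on BSD/macOS
    listener = socket.socket()
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("127.0.0.1", port))
    listener.listen()
    listener.setblocking(False)
    sel.register(listener, selectors.EVENT_READ, data="accept")
    done = False
    while not done:
        for key, _ in sel.select(timeout=5):
            if key.data == "accept":
                conn, _ = key.fileobj.accept()   # new connection ready
                conn.setblocking(False)
                sel.register(conn, selectors.EVENT_READ, data="conn")
            else:
                buf = key.fileobj.recv(1024)     # data ready on connection
                key.fileobj.sendall(buf)
                sel.unregister(key.fileobj)
                key.fileobj.close()
                done = True
    sel.unregister(listener)
    listener.close()
```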
kqueue (BSD/macOS)
Similar to epoll. Unified event framework — handles sockets, files, signals, timers, processes.
int kq = kqueue();
struct kevent ev;
EV_SET(&ev, sockfd, EVFILT_READ, EV_ADD, 0, 0, NULL);
kevent(kq, &ev, 1, NULL, 0, NULL);
io_uring (Linux 5.1+)
Next generation. Shared ring buffers between user space and kernel. Zero-copy, zero-syscall operation submission/completion.
Much faster than epoll for high-throughput I/O. Covered in OS topic.
Asynchronous I/O
Rust Async Networking (Tokio)
ASYNC PROCEDURE MAIN()
listener ← AWAIT TCP_LISTEN("0.0.0.0:8080")
LOOP
(socket, addr) ← AWAIT ACCEPT(listener)
SPAWN_ASYNC(PROCEDURE()
buf ← AWAIT READ(socket, max: 1024)
AWAIT WRITE(socket, buf)
)
Tokio handles thousands of concurrent connections on a small thread pool. Each spawned task is a lightweight future, not an OS thread.
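The same accept-and-spawn pattern in Python's asyncio, an analogous async runtime where each connection is handled by a lightweight task rather than an OS thread:

```python
import asyncio

async def handle(reader, writer):
    # Each connection runs as a task on the event loop
    buf = await reader.read(1024)
    writer.write(buf)          # echo the bytes back
    await writer.drain()
    writer.close()
    await writer.wait_closed()

async def serve(port):
    # start_server spawns a handle() task per accepted connection
    server = await asyncio.start_server(handle, "127.0.0.1", port)
    async with server:
        await server.serve_forever()
```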
Connection Pooling
Reuse connections instead of creating new ones for each request.
// Conceptual connection pool
CLASS Pool
FIELDS: connections (list), max_size (integer)
FUNCTION GET()
IF connections is not empty
RETURN POP(connections)
ELSE
RETURN CREATE_NEW_CONNECTION()
PROCEDURE RETURN_CONN(conn)
IF length(connections) < max_size
APPEND conn TO connections
Benefits: Avoid TCP handshake + TLS handshake overhead per request. Limit connections to prevent exhaustion.
Libraries: r2d2 (Rust), HikariCP (Java), pgBouncer (PostgreSQL).
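The conceptual pool above can be sketched in Python; `factory` here is an assumed zero-argument callable that creates a new connection, standing in for CREATE_NEW_CONNECTION:

```python
from collections import deque

class Pool:
    """Minimal, non-thread-safe connection pool sketch."""

    def __init__(self, factory, max_size=10):
        self.factory = factory          # creates a new connection on demand
        self.max_size = max_size
        self.connections = deque()      # idle connections available for reuse

    def get(self):
        # Reuse an idle connection if one exists, else create a new one
        return self.connections.pop() if self.connections else self.factory()

    def return_conn(self, conn):
        # Keep the connection for reuse unless the pool is already full
        if len(self.connections) < self.max_size:
            self.connections.append(conn)
```

Production pools add locking, health checks, and idle timeouts; this sketch shows only the reuse logic.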
Protocol Design Considerations
Framing
TCP is a byte stream — no message boundaries. Application must define framing:
- Length-prefixed: First N bytes encode message length. Read length, then read that many bytes.
- Delimiter-based: Messages separated by a special byte/sequence (e.g., newline, null, CRLF CRLF).
- Fixed-length: Each message is exactly N bytes.
- Self-describing: Message format encodes its own length (protobuf, JSON with content-length).
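Length-prefixed framing is the most common of these; a sketch using a 4-byte big-endian length header, with the read loop that TCP's stream semantics make necessary (recv may return fewer bytes than requested):

```python
import struct

def send_msg(stream, payload):
    # Length-prefixed framing: 4-byte big-endian length, then the payload
    stream.sendall(struct.pack(">I", len(payload)) + payload)

def recv_exact(stream, n):
    # TCP is a byte stream: loop until exactly n bytes have arrived
    buf = b""
    while len(buf) < n:
        chunk = stream.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed mid-message")
        buf += chunk
    return buf

def recv_msg(stream):
    # Read the length header, then exactly that many payload bytes
    (length,) = struct.unpack(">I", recv_exact(stream, 4))
    return recv_exact(stream, length)
```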
Serialization Formats
| Format | Type | Size | Speed | Schema | Human Readable |
|---|---|---|---|---|---|
| JSON | Text | Large | Moderate | Optional | Yes |
| Protocol Buffers | Binary | Small | Fast | Required (.proto) | No |
| MessagePack | Binary | Small | Fast | No | No |
| FlatBuffers | Binary | Small | Very fast (zero-copy) | Required | No |
| CBOR | Binary | Small | Fast | No | No |
| Cap'n Proto | Binary | Small | Very fast (zero-copy) | Required | No |
Protocol Buffers (Google): Dominant in microservices. Schema evolution (add/remove fields safely). Language-neutral.
FlatBuffers (Google): Zero-copy deserialization (access data directly in the buffer without parsing). Used in games, mobile apps, ML (TFLite).
Applications in CS
- Web servers: nginx, Apache — event-driven I/O (epoll/kqueue). Tokio/Actix for Rust.
- Databases: Custom wire protocols (PostgreSQL, MySQL). Connection pooling. Async replication.
- Microservices: gRPC (protobuf over HTTP/2). Service mesh (Envoy proxy). Load balancing.
- Game networking: UDP for game state. Custom reliability on top. Client-side prediction.
- Chat/messaging: WebSocket for real-time. Long polling for compatibility.
- Distributed systems: Custom protocols for consensus (Raft), replication, cluster management.