Network Information Theory

Overview

Network information theory extends Shannon's point-to-point framework to multi-terminal settings: multiple senders, multiple receivers, or both. Unlike the single-user case, many multi-user capacity problems remain open. The field reveals fundamental tensions between cooperation, interference, and distributed operation.

Multiple Access Channel (MAC)

Multiple senders transmit to a single receiver. For a two-user DM-MAC with inputs $X_1, X_2$ and output $Y$:

Capacity region (achievable rate pairs):

$$R_1 \leq I(X_1; Y \mid X_2), \quad R_2 \leq I(X_2; Y \mid X_1), \quad R_1 + R_2 \leq I(X_1, X_2; Y)$$

over all product distributions $p(x_1)p(x_2)$ (no cooperation between encoders). The region is a pentagon (or a degenerate case).

Key properties:

  • Corner points are achieved by successive decoding: decode one user treating the other as noise, subtract its codeword, then decode the second interference-free
  • Time-sharing between the corner points achieves the dominant (sum-rate) face
  • The capacity region is convex, since time-sharing is always achievable
  • Sum capacity: $C_{\text{sum}} = \max_{p(x_1)p(x_2)} I(X_1, X_2; Y)$

Gaussian MAC: $Y = X_1 + X_2 + Z$ with $Z \sim \mathcal{N}(0, N)$ and power constraints $P_1, P_2$:

$$R_1 \leq \frac{1}{2}\log\left(1 + \frac{P_1}{N}\right), \quad R_2 \leq \frac{1}{2}\log\left(1 + \frac{P_2}{N}\right), \quad R_1 + R_2 \leq \frac{1}{2}\log\left(1 + \frac{P_1 + P_2}{N}\right)$$

Successive interference cancellation (SIC) is a practical scheme that achieves the corner points: decode the strongest user first, subtract its signal from the received waveform, then decode the next. It is the foundation of NOMA in 5G.
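A quick numeric sketch of SIC at a Gaussian MAC corner point (the powers and noise variance are illustrative values, not from the text):

```python
import math

def c(snr):
    """Gaussian capacity 0.5 * log2(1 + SNR), in bits per channel use."""
    return 0.5 * math.log2(1 + snr)

P1, P2, N = 4.0, 2.0, 1.0  # illustrative powers and noise variance

# Corner point: decode user 2 first, treating user 1's signal as noise,
# subtract the decoded codeword, then decode user 1 interference-free.
R2 = c(P2 / (P1 + N))  # user 2 sees user 1 as extra noise
R1 = c(P1 / N)         # user 1 decoded on a clean channel after cancellation

# The corner point lies exactly on the sum-rate constraint:
assert abs((R1 + R2) - c((P1 + P2) / N)) < 1e-9
print(round(R1, 3), round(R2, 3))
```

The assertion is the telescoping identity behind SIC: the per-user SIC rates always add up to the sum capacity, regardless of the decoding order.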

Broadcast Channel (BC)

One sender transmits to multiple receivers, each seeing a different channel. For a two-user degraded BC (where $X \to Y_1 \to Y_2$ forms a Markov chain):

Superposition coding achieves capacity:

$$R_1 \leq I(X; Y_1 \mid U), \quad R_2 \leq I(U; Y_2)$$

where $U$ is an auxiliary random variable representing the "cloud center" and $X$ given $U$ represents the "satellite codeword."

Gaussian BC: $Y_i = X + Z_i$ with $Z_i \sim \mathcal{N}(0, N_i)$ and $N_1 < N_2$. The stronger user's higher-rate information is layered on top of the weaker user's message. The capacity region is characterized by a power-splitting parameter $\alpha \in [0, 1]$:

$$R_1 \leq \frac{1}{2}\log\left(1 + \frac{\alpha P}{N_1}\right), \quad R_2 \leq \frac{1}{2}\log\left(1 + \frac{(1-\alpha) P}{\alpha P + N_2}\right)$$
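Sweeping the power split $\alpha$ traces out the boundary of the degraded Gaussian BC region; the powers and noise variances below are illustrative:

```python
import math

def c(snr):
    """Gaussian capacity 0.5 * log2(1 + SNR)."""
    return 0.5 * math.log2(1 + snr)

P, N1, N2 = 10.0, 1.0, 4.0  # illustrative; user 1 has the better channel (N1 < N2)

def bc_boundary(alpha):
    """Rate pair on the degraded Gaussian BC boundary for power split alpha."""
    R1 = c(alpha * P / N1)                      # strong user: decodes and cancels the weak layer
    R2 = c((1 - alpha) * P / (alpha * P + N2))  # weak user: treats the strong layer as noise
    return R1, R2

points = [bc_boundary(a / 20) for a in range(21)]
# alpha = 0 gives all power to the weak user; alpha = 1 gives it all to the strong user.
assert points[0] == (0.0, c(P / N2))
assert points[-1] == (c(P / N1), 0.0)
```

As $\alpha$ grows, $R_1$ increases and $R_2$ decreases monotonically, so the sweep covers the whole boundary between the two single-user corner points.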

MAC-BC duality: the capacity region of the Gaussian BC equals the capacity region of the dual MAC with a sum power constraint. This greatly simplifies computation.

Non-degraded BC: general capacity region unknown. Marton's inner bound (superposition + binning) is the best known achievable region. Converse remains open except for specific cases.

Relay Channel

Three-terminal channel: sender → relay → receiver, with a direct link from sender to receiver.

Key strategies:

  • Decode-and-forward (DF): relay fully decodes, re-encodes, and forwards. Effective when the source-relay link is strong. Rate: $R \leq \max_{p(x, x_r)} \min\{I(X, X_r; Y), I(X; Y_r \mid X_r)\}$
  • Compress-and-forward (CF): relay quantizes its observation and forwards compressed version. Useful when relay channel is weak. Uses Wyner-Ziv compression
  • Amplify-and-forward (AF): relay simply scales and retransmits (Gaussian channels). Simplest but suboptimal

Cut-set bound (upper bound): $R \leq \max_{p(x, x_r)} \min\{I(X, X_r; Y), I(X; Y, Y_r \mid X_r)\}$. DF achieves this bound for degraded relay channels.

The capacity of the general relay channel remains open. The gap between DF/CF achievable rates and the cut-set bound can be significant.
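This gap can be checked numerically for the full-duplex Gaussian relay ($Y_r = X + Z_r$, $Y = X + X_r + Z$), using the standard jointly Gaussian DF and cut-set expressions parameterized by the source-relay correlation $\rho$; the powers and noise levels here are illustrative:

```python
import math

def c(x):
    """Gaussian capacity 0.5 * log2(1 + x)."""
    return 0.5 * math.log2(1 + x)

P, Pr, N, Nr = 5.0, 5.0, 1.0, 0.5  # illustrative source/relay powers and noise variances

def df_rate(rho):
    # min{ I(X; Y_r | X_r), I(X, X_r; Y) } for jointly Gaussian inputs with correlation rho
    return min(c((1 - rho**2) * P / Nr),
               c((P + Pr + 2 * rho * math.sqrt(P * Pr)) / N))

def cutset_bound(rho):
    # min{ broadcast cut I(X; Y, Y_r | X_r), MAC cut I(X, X_r; Y) }
    return min(c((1 - rho**2) * P * (1 / N + 1 / Nr)),
               c((P + Pr + 2 * rho * math.sqrt(P * Pr)) / N))

rhos = [i / 1000 for i in range(1001)]
R_df = max(df_rate(r) for r in rhos)
R_cut = max(cutset_bound(r) for r in rhos)
assert R_df <= R_cut + 1e-9  # DF can never exceed the cut-set bound
```

The only difference between the two expressions is the broadcast cut: the cut-set bound lets the destination see $Y_r$ as well, which is exactly where the DF penalty comes from.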

Multi-hop and diamond networks: extensions with multiple relays in series or parallel. Noisy network coding (Lim, Kim, El Gamal, Chung, 2011) unifies and improves upon CF for general relay networks.

Interference Channel

Two sender-receiver pairs, each causing interference to the other:

$$Y_1 = h_{11} X_1 + h_{12} X_2 + Z_1, \quad Y_2 = h_{21} X_1 + h_{22} X_2 + Z_2$$

Capacity: known only in special cases:

  • Strong interference ($|h_{21}|^2 \geq |h_{11}|^2$ and $|h_{12}|^2 \geq |h_{22}|^2$ for equal noise powers, i.e., each signal is received at least as strongly at the unintended receiver): each receiver can decode both messages. Capacity = intersection of two MAC regions
  • Very strong interference: interference so strong that decoding it is "free." Capacity equals interference-free capacity
  • General case: capacity is unknown (one of the major open problems)

Han-Kobayashi scheme: each transmitter splits its message into a common part, decoded by both receivers, and a private part, treated as noise by the unintended receiver. It yields the best known achievable region and is within 1 bit of capacity for all Gaussian ICs (Etkin, Tse, Wang, 2008).

Interference alignment (Cadambe-Jafar, 2008): in the $K$-user interference channel with time/frequency-varying coefficients, each user can achieve roughly half its interference-free rate, $\frac{1}{2}\log(1 + \text{SNR})$, at high SNR. The total of $K/2$ degrees of freedom far exceeds the single degree of freedom available to orthogonal schemes such as TDMA. Requires infinite symbol extensions or sufficiently varying channels.

Wiretap Channel

Sender communicates to a legitimate receiver while keeping the message secret from an eavesdropper. Model: $X \to (Y, Z)$, where $Y$ is the legitimate channel output and $Z$ is the eavesdropper's observation.

Secrecy capacity (Wyner, 1975; Csiszar-Korner, 1978):

$$C_s = \max_{p(u, x)} \left[ I(U; Y) - I(U; Z) \right]$$

For degraded channels ($X \to Y \to Z$): $C_s = \max_{p(x)} [I(X; Y) - I(X; Z)]$.

Gaussian wiretap: $C_s = \frac{1}{2}\log(1 + P/N_Y) - \frac{1}{2}\log(1 + P/N_Z)$ when $N_Y < N_Z$ (main channel better than the eavesdropper's). If $N_Y \geq N_Z$, the secrecy capacity is zero.
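A minimal sketch of the Gaussian formula (function and variable names are mine):

```python
import math

def secrecy_capacity(P, NY, NZ):
    """Gaussian wiretap secrecy capacity, in bits per channel use."""
    cs = 0.5 * math.log2(1 + P / NY) - 0.5 * math.log2(1 + P / NZ)
    return max(cs, 0.0)  # zero once the eavesdropper's channel is as good or better

# A better main channel (NY < NZ) gives a positive secrecy rate;
# flipping the advantage drives it to zero.
print(secrecy_capacity(10.0, 1.0, 5.0), secrecy_capacity(10.0, 5.0, 1.0))
```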

Information-theoretic security vs. computational security:

  • Provides unconditional security (no assumptions about eavesdropper's computational power)
  • Secret key agreement via public discussion (Maurer, Ahlswede-Csiszar)
  • Connection to quantum key distribution (BB84)

Network Coding

Classical routing: intermediate nodes only store-and-forward packets. Network coding allows intermediate nodes to combine (encode) packets.

Linear Network Coding

Nodes perform linear operations over a finite field $\mathrm{GF}(q)$. Each outgoing edge carries a linear combination of the incoming messages.

Max-flow min-cut theorem for networks (Ahlswede, Cai, Li, Yeung, 2000): for a single multicast session, the capacity equals the minimum of the max-flows to each receiver. Network coding achieves this; routing generally does not.

Butterfly network: canonical example where network coding provides throughput gain. Two multicast receivers each need two source symbols. Routing achieves rate 1.5; network coding achieves rate 2.
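The butterfly gain is easy to verify in code: over $\mathrm{GF}(2)$ the bottleneck edge carries the XOR of the two source bits, and each sink cancels the bit it already receives directly (a minimal sketch):

```python
def butterfly(a, b):
    """One use of the butterfly network with source bits a, b in {0, 1}."""
    coded = a ^ b           # the bottleneck node forwards a XOR b instead of picking one bit
    sink1 = (a, coded ^ a)  # sink 1 hears a directly and recovers b from the coded bit
    sink2 = (coded ^ b, b)  # sink 2 hears b directly and recovers a from the coded bit
    return sink1, sink2

# Both sinks recover both source bits in every case: rate 2, vs 1.5 for routing.
for a in (0, 1):
    for b in (0, 1):
        assert butterfly(a, b) == ((a, b), (a, b))
```

With pure routing the bottleneck edge must choose one of the two bits per use, which is exactly where the rate-1.5 limit comes from.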

Algebraic framework (Koetter-Medard, 2003): encoding is described by transfer matrices over $\mathrm{GF}(q)$. For multicast, a random linear code over a sufficiently large field achieves capacity with high probability.

Random Linear Network Coding

Each node selects random coefficients for its linear combinations over $\mathrm{GF}(q)$.

  • Achieves multicast capacity with probability $\geq (1 - K/q)^{|\mathcal{E}|}$, where $K$ is the number of receivers and $|\mathcal{E}|$ the number of edges
  • Decentralized: nodes need no global knowledge
  • Practical protocol: append the encoding vector as a header. The receiver collects $n$ innovative (linearly independent) packets and solves for the sources via Gaussian elimination
  • Overhead: header size grows logarithmically in field size
  • Applications: P2P content distribution, wireless broadcast, disruption-tolerant networks
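The protocol bullet above can be sketched end to end over $\mathrm{GF}(2)$ (a small field keeps the arithmetic simple at the cost of more non-innovative packets; all names are mine):

```python
import random

def gf2_eliminate(rows, n):
    """In-place reduced row echelon over GF(2) on the first n columns; returns the rank."""
    rank = 0
    for col in range(n):
        piv = next((r for r in range(rank, len(rows)) if rows[r][col]), None)
        if piv is None:
            continue
        rows[rank], rows[piv] = rows[piv], rows[rank]
        for r in range(len(rows)):
            if r != rank and rows[r][col]:
                rows[r] = [x ^ y for x, y in zip(rows[r], rows[rank])]
        rank += 1
    return rank

def rlnc_roundtrip(n=4, plen=8, seed=7):
    """Encode n source packets with random GF(2) combinations, then decode."""
    rng = random.Random(seed)
    src = [[rng.randint(0, 1) for _ in range(plen)] for _ in range(n)]
    received = []
    while len(received) < n:
        g = [rng.randint(0, 1) for _ in range(n)]  # random coefficient header
        payload = [0] * plen
        for i, gi in enumerate(g):                 # linear combination of the sources
            if gi:
                payload = [x ^ y for x, y in zip(payload, src[i])]
        trial = [row[:] for row in received] + [g + payload]
        if gf2_eliminate(trial, n) == len(received) + 1:  # keep only innovative packets
            received.append(g + payload)
    gf2_eliminate(received, n)   # headers reduce to the identity matrix
    decoded = [row[n:] for row in received]
    return decoded == src

assert rlnc_roundtrip()
```

The innovative-packet check is the same rank test the receiver performs in practice; once the header block is full-rank, elimination recovers the sources exactly.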

Practical Considerations

  • Batched network coding (BATS): Fountain-like codes adapted for multi-hop networks with recoding
  • Fulcrum codes: bridge between random network coding and Raptor codes
  • Coded caching (Maddah-Ali, Niesen): information-theoretic formulation of caching with coded multicast delivery. Achieves multiplicative gain in delivery rate

Distributed Source Coding

CEO Problem

$L$ agents each observe a noisy version $Y_i$ of a source $X$. A central estimator (the CEO) must reconstruct $X$ from the agents' compressed messages.

  • Quadratic Gaussian CEO: rate-distortion function known (Prabhakaran, Tse, Ramchandran). As $L \to \infty$, the centralized distortion is approached
  • General case: optimal rate-distortion region open beyond two terminals

Multi-Terminal Source Coding

$L$ correlated sources $X_1, \ldots, X_L$ are encoded separately and decoded jointly:

  • Lossless: Slepian-Wolf region $\sum_{i \in S} R_i \geq H(X_S \mid X_{S^c})$ for all subsets $S$
  • Lossy: Berger-Tung inner bound (random binning + joint typicality). Tight for some cases (Gaussian, specific distortion measures)

Information-Theoretic Limits for Wireless

MIMO Channel

$\mathbf{Y} = \mathbf{H}\mathbf{X} + \mathbf{Z}$ with $n_t$ transmit and $n_r$ receive antennas.

  • Capacity (known $\mathbf{H}$ at receiver): $C = \max_{\mathrm{tr}(\mathbf{K}_X) \leq P} \log\det\left(\mathbf{I} + \frac{1}{N}\mathbf{H}\mathbf{K}_X\mathbf{H}^H\right)$
  • Water-filling over the singular values of $\mathbf{H}$ when CSI is available at the transmitter
  • Degrees of freedom: $\min(n_t, n_r)$ spatial multiplexing gain
  • Massive MIMO: as $n_t \to \infty$, the channel hardens, interference averages out, and simple matched-filter processing becomes near-optimal
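Water-filling over the eigenmodes can be sketched with a short bisection on the water level (illustrative random channel; `numpy` assumed available):

```python
import numpy as np

def mimo_capacity_waterfill(H, P, N=1.0):
    """Capacity of y = Hx + z with transmitter CSI, via water-filling over eigenmodes."""
    s = np.linalg.svd(H, compute_uv=False)
    g = (s**2) / N                              # per-eigenmode channel gains
    g = g[g > 1e-12]                            # drop null modes
    lo, hi = 0.0, P + float(np.sum(1.0 / g))    # bracket the water level mu
    for _ in range(200):                        # bisect until the power budget is met
        mu = 0.5 * (lo + hi)
        if np.maximum(mu - 1.0 / g, 0.0).sum() > P:
            hi = mu
        else:
            lo = mu
    p = np.maximum(0.5 * (lo + hi) - 1.0 / g, 0.0)
    return float(np.sum(np.log2(1.0 + g * p)))

rng = np.random.default_rng(0)
H = rng.standard_normal((4, 4))
C_wf = mimo_capacity_waterfill(H, P=10.0)

# Equal power across modes is a valid but suboptimal choice of K_X:
s = np.linalg.svd(H, compute_uv=False)
C_eq = float(np.sum(np.log2(1.0 + (s**2) * (10.0 / 4.0))))
assert C_wf >= C_eq - 1e-6
```

The final assertion is the optimality of water-filling: any other power allocation under the same trace constraint, including the uniform one, can only do worse.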

Fading Channels

  • Ergodic capacity: $C = \mathbb{E}_H[\log(1 + |H|^2\,\text{SNR})]$ (average over fading realizations)
  • Outage capacity: $C_\epsilon = \sup\{R : P(\log(1 + |H|^2\,\text{SNR}) < R) \leq \epsilon\}$
  • Diversity-multiplexing tradeoff (Zheng-Tse, 2003): fundamental tradeoff between reliability (diversity gain $d$) and rate (multiplexing gain $r$) in the high-SNR regime
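Ergodic and outage capacity are both easy to estimate by Monte Carlo for Rayleigh fading (illustrative SNR; `numpy` assumed available):

```python
import numpy as np

rng = np.random.default_rng(42)
snr = 10.0
n = 200_000
# Rayleigh fading: complex Gaussian gain with E|h|^2 = 1
h = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)
inst = np.log2(1 + np.abs(h)**2 * snr)   # instantaneous capacity per realization

ergodic = float(inst.mean())             # average over fades
outage_10pct = float(np.quantile(inst, 0.10))  # rate supportable in 90% of fades

# Jensen's inequality: ergodic capacity sits below AWGN capacity at the same mean SNR.
assert ergodic <= np.log2(1 + snr)
assert 0 < outage_10pct < ergodic
```

The gap between `outage_10pct` and `ergodic` is the price of transmitting at a fixed rate without channel state at the transmitter.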

Open Problems

Several fundamental problems remain unsolved:

  1. Capacity of the general interference channel
  2. Capacity of the general broadcast channel (non-degraded)
  3. Capacity of the general relay channel
  4. Distributed lossy source coding (general rate-distortion region)
  5. Optimality of Gaussian signaling for vector Gaussian networks beyond known special cases
  6. Separation of source and channel coding in networks (does not hold in general, unlike point-to-point)

These open problems have driven decades of research and continue to motivate new techniques, including connections to theoretical computer science (hardness of approximation) and algebraic geometry.