Network Information Theory

Overview

Network information theory extends Shannon's point-to-point framework to multi-terminal settings: multiple senders, multiple receivers, or both. Unlike the single-user case, many multi-user capacity problems remain open. The field reveals fundamental tensions between cooperation, interference, and distributed operation.

Multiple Access Channel (MAC)

Multiple senders transmit to a single receiver. For a two-user DM-MAC with inputs $X_1, X_2$ and output $Y$:

Capacity region (achievable rate pairs):

$$R_1 \leq I(X_1; Y \mid X_2), \quad R_2 \leq I(X_2; Y \mid X_1), \quad R_1 + R_2 \leq I(X_1, X_2; Y)$$

over all product distributions $p(x_1)p(x_2)$ (no cooperation between encoders). The region is a pentagon (or a degenerate case).

Key properties:

  • Corner points are achieved by successive decoding: decode one user treating the other as noise, subtract its codeword, then decode the second interference-free
  • Time-sharing between the corner points achieves the dominant (sum-rate) face
  • The capacity region is convex, since time-sharing is always achievable
  • Sum capacity: $C_{\text{sum}} = \max_{p(x_1)p(x_2)} I(X_1, X_2; Y)$

Gaussian MAC: $Y = X_1 + X_2 + Z$ with $Z \sim \mathcal{N}(0, N)$ and power constraints $P_1, P_2$:

$$R_1 \leq \frac{1}{2}\log\left(1 + \frac{P_1}{N}\right), \quad R_2 \leq \frac{1}{2}\log\left(1 + \frac{P_2}{N}\right), \quad R_1 + R_2 \leq \frac{1}{2}\log\left(1 + \frac{P_1 + P_2}{N}\right)$$

Successive interference cancellation (SIC) is a practical scheme that achieves the corner points: decode the strongest user first, subtract its signal from the received waveform, then decode the next. It is the foundation of NOMA in 5G.
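A quick numeric sketch of SIC at a Gaussian MAC corner point (the powers and noise variance are illustrative values, not from the text):

```python
import math

def c(snr):
    """Gaussian capacity 0.5 * log2(1 + SNR), in bits per channel use."""
    return 0.5 * math.log2(1 + snr)

P1, P2, N = 4.0, 2.0, 1.0  # illustrative powers and noise variance

# Corner point: decode user 2 first, treating user 1's signal as noise,
# subtract the decoded codeword, then decode user 1 interference-free.
R2 = c(P2 / (P1 + N))  # user 2 sees user 1 as extra noise
R1 = c(P1 / N)         # user 1 decoded on a clean channel after cancellation

# The corner point lies exactly on the sum-rate constraint:
assert abs((R1 + R2) - c((P1 + P2) / N)) < 1e-9
print(round(R1, 3), round(R2, 3))
```

The assertion is the telescoping identity behind SIC: the per-user SIC rates always add up to the sum capacity, regardless of the decoding order.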

Broadcast Channel (BC)

One sender transmits to multiple receivers, each seeing a different channel. For a two-user degraded BC (where $X \to Y_1 \to Y_2$ forms a Markov chain):

Superposition coding achieves capacity:

$$R_1 \leq I(X; Y_1 \mid U), \quad R_2 \leq I(U; Y_2)$$

where $U$ is an auxiliary random variable representing the "cloud center" and $X$ given $U$ represents the "satellite codeword."

Gaussian BC: $Y_i = X + Z_i$ with $Z_i \sim \mathcal{N}(0, N_i)$ and $N_1 < N_2$. The stronger user's higher-rate information is layered on top of the weaker user's message. The capacity region is characterized by a power-splitting parameter $\alpha \in [0, 1]$:

$$R_1 \leq \frac{1}{2}\log\left(1 + \frac{\alpha P}{N_1}\right), \quad R_2 \leq \frac{1}{2}\log\left(1 + \frac{(1-\alpha) P}{\alpha P + N_2}\right)$$
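Sweeping the power split $\alpha$ traces out the boundary of the degraded Gaussian BC region; the powers and noise variances below are illustrative:

```python
import math

def c(snr):
    """Gaussian capacity 0.5 * log2(1 + SNR)."""
    return 0.5 * math.log2(1 + snr)

P, N1, N2 = 10.0, 1.0, 4.0  # illustrative; user 1 has the better channel (N1 < N2)

def bc_boundary(alpha):
    """Rate pair on the degraded Gaussian BC boundary for power split alpha."""
    R1 = c(alpha * P / N1)                      # strong user: decodes and cancels the weak layer
    R2 = c((1 - alpha) * P / (alpha * P + N2))  # weak user: treats the strong layer as noise
    return R1, R2

points = [bc_boundary(a / 20) for a in range(21)]
# alpha = 0 gives all power to the weak user; alpha = 1 gives it all to the strong user.
assert points[0] == (0.0, c(P / N2))
assert points[-1] == (c(P / N1), 0.0)
```

As $\alpha$ grows, $R_1$ increases and $R_2$ decreases monotonically, so the sweep covers the whole boundary between the two single-user corner points.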

MAC-BC duality: the capacity region of the Gaussian BC equals the capacity region of the dual MAC with a sum power constraint. This greatly simplifies computation.

Non-degraded BC: general capacity region unknown. Marton's inner bound (superposition + binning) is the best known achievable region. Converse remains open except for specific cases.

Relay Channel

Three-terminal channel: sender → relay → receiver, with a direct link from sender to receiver.

Key strategies:

  • Decode-and-forward (DF): relay fully decodes, re-encodes, and forwards. Effective when the source-relay link is strong. Rate: $R \leq \max_{p(x, x_r)} \min\{I(X, X_r; Y), I(X; Y_r \mid X_r)\}$
  • Compress-and-forward (CF): relay quantizes its observation and forwards compressed version. Useful when relay channel is weak. Uses Wyner-Ziv compression
  • Amplify-and-forward (AF): relay simply scales and retransmits (Gaussian channels). Simplest but suboptimal

Cut-set bound (upper bound): $R \leq \max_{p(x, x_r)} \min\{I(X, X_r; Y), I(X; Y, Y_r \mid X_r)\}$. DF achieves this bound for degraded relay channels.

The capacity of the general relay channel remains open. The gap between DF/CF achievable rates and the cut-set bound can be significant.
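This gap can be checked numerically for the full-duplex Gaussian relay ($Y_r = X + Z_r$, $Y = X + X_r + Z$), using the standard jointly Gaussian DF and cut-set expressions parameterized by the source-relay correlation $\rho$; the powers and noise levels here are illustrative:

```python
import math

def c(x):
    """Gaussian capacity 0.5 * log2(1 + x)."""
    return 0.5 * math.log2(1 + x)

P, Pr, N, Nr = 5.0, 5.0, 1.0, 0.5  # illustrative source/relay powers and noise variances

def df_rate(rho):
    # min{ I(X; Y_r | X_r), I(X, X_r; Y) } for jointly Gaussian inputs with correlation rho
    return min(c((1 - rho**2) * P / Nr),
               c((P + Pr + 2 * rho * math.sqrt(P * Pr)) / N))

def cutset_bound(rho):
    # min{ broadcast cut I(X; Y, Y_r | X_r), MAC cut I(X, X_r; Y) }
    return min(c((1 - rho**2) * P * (1 / N + 1 / Nr)),
               c((P + Pr + 2 * rho * math.sqrt(P * Pr)) / N))

rhos = [i / 1000 for i in range(1001)]
R_df = max(df_rate(r) for r in rhos)
R_cut = max(cutset_bound(r) for r in rhos)
assert R_df <= R_cut + 1e-9  # DF can never exceed the cut-set bound
```

The only difference between the two expressions is the broadcast cut: the cut-set bound lets the destination see $Y_r$ as well, which is exactly where the DF penalty comes from.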

Multi-hop and diamond networks: extensions with multiple relays in series or parallel. Noisy network coding (Lim, Kim, El Gamal, Chung, 2011) unifies and improves upon CF for general relay networks.

Interference Channel

Two sender-receiver pairs, each causing interference to the other:

$$Y_1 = h_{11} X_1 + h_{12} X_2 + Z_1, \quad Y_2 = h_{21} X_1 + h_{22} X_2 + Z_2$$

Capacity: known only in special cases:

  • Strong interference ($|h_{21}|^2 \geq |h_{11}|^2$ and $|h_{12}|^2 \geq |h_{22}|^2$ for equal noise powers, i.e., each signal is received at least as strongly at the unintended receiver): each receiver can decode both messages. Capacity = intersection of two MAC regions
  • Very strong interference: interference so strong that decoding it is "free." Capacity equals interference-free capacity
  • General case: capacity is unknown (one of the major open problems)

Han-Kobayashi scheme: each transmitter splits its message into a common part, decoded by both receivers, and a private part, treated as noise by the unintended receiver. It yields the best known achievable region and is within 1 bit of capacity for all Gaussian ICs (Etkin, Tse, Wang, 2008).

Interference alignment (Cadambe-Jafar, 2008): in the $K$-user interference channel with time/frequency-varying coefficients, each user can achieve roughly half its interference-free rate, $\frac{1}{2}\log(1 + \text{SNR})$, at high SNR. The total of $K/2$ degrees of freedom far exceeds the single degree of freedom available to orthogonal schemes such as TDMA. Requires infinite symbol extensions or sufficiently varying channels.

Wiretap Channel

Sender communicates to a legitimate receiver while keeping the message secret from an eavesdropper. Model: $X \to (Y, Z)$, where $Y$ is the legitimate channel output and $Z$ is the eavesdropper's observation.

Secrecy capacity (Wyner, 1975; Csiszar-Korner, 1978):

$$C_s = \max_{p(u, x)} \left[ I(U; Y) - I(U; Z) \right]$$

For degraded channels ($X \to Y \to Z$): $C_s = \max_{p(x)} [I(X; Y) - I(X; Z)]$.

Gaussian wiretap: $C_s = \frac{1}{2}\log(1 + P/N_Y) - \frac{1}{2}\log(1 + P/N_Z)$ when $N_Y < N_Z$ (main channel better than the eavesdropper's). If $N_Y \geq N_Z$, the secrecy capacity is zero.
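A minimal sketch of the Gaussian formula (function and variable names are mine):

```python
import math

def secrecy_capacity(P, NY, NZ):
    """Gaussian wiretap secrecy capacity, in bits per channel use."""
    cs = 0.5 * math.log2(1 + P / NY) - 0.5 * math.log2(1 + P / NZ)
    return max(cs, 0.0)  # zero once the eavesdropper's channel is as good or better

# A better main channel (NY < NZ) gives a positive secrecy rate;
# flipping the advantage drives it to zero.
print(secrecy_capacity(10.0, 1.0, 5.0), secrecy_capacity(10.0, 5.0, 1.0))
```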

Information-theoretic security vs. computational security:

  • Provides unconditional security (no assumptions about eavesdropper's computational power)
  • Secret key agreement via public discussion (Maurer, Ahlswede-Csiszar)
  • Connection to quantum key distribution (BB84)

Network Coding

Classical routing: intermediate nodes only store-and-forward packets. Network coding allows intermediate nodes to combine (encode) packets.

Linear Network Coding

Nodes perform linear operations over a finite field $\mathrm{GF}(q)$. Each outgoing edge carries a linear combination of the incoming messages.

Max-flow min-cut theorem for networks (Ahlswede, Cai, Li, Yeung, 2000): for a single multicast session, the capacity equals the minimum of the max-flows to each receiver. Network coding achieves this; routing generally does not.

Butterfly network: canonical example where network coding provides throughput gain. Two multicast receivers each need two source symbols. Routing achieves rate 1.5; network coding achieves rate 2.
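The butterfly gain is easy to verify in code: over $\mathrm{GF}(2)$ the bottleneck edge carries the XOR of the two source bits, and each sink cancels the bit it already receives directly (a minimal sketch):

```python
def butterfly(a, b):
    """One use of the butterfly network with source bits a, b in {0, 1}."""
    coded = a ^ b           # the bottleneck node forwards a XOR b instead of picking one bit
    sink1 = (a, coded ^ a)  # sink 1 hears a directly and recovers b from the coded bit
    sink2 = (coded ^ b, b)  # sink 2 hears b directly and recovers a from the coded bit
    return sink1, sink2

# Both sinks recover both source bits in every case: rate 2, vs 1.5 for routing.
for a in (0, 1):
    for b in (0, 1):
        assert butterfly(a, b) == ((a, b), (a, b))
```

With pure routing the bottleneck edge must choose one of the two bits per use, which is exactly where the rate-1.5 limit comes from.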

Algebraic framework (Koetter-Medard, 2003): encoding is described by transfer matrices over $\mathrm{GF}(q)$. For multicast, a random linear code over a sufficiently large field achieves capacity with high probability.

Random Linear Network Coding

Each node selects random coefficients for its linear combinations over $\mathrm{GF}(q)$.

  • Achieves multicast capacity with probability $\geq (1 - K/q)^{|\mathcal{E}|}$, where $K$ is the number of receivers and $|\mathcal{E}|$ the number of edges
  • Decentralized: nodes need no global knowledge
  • Practical protocol: append the encoding vector as a header. The receiver collects $n$ innovative (linearly independent) packets and solves for the sources via Gaussian elimination
  • Overhead: header size grows logarithmically in field size
  • Applications: P2P content distribution, wireless broadcast, disruption-tolerant networks
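The protocol bullet above can be sketched end to end over $\mathrm{GF}(2)$ (a small field keeps the arithmetic simple at the cost of more non-innovative packets; all names are mine):

```python
import random

def gf2_eliminate(rows, n):
    """In-place reduced row echelon over GF(2) on the first n columns; returns the rank."""
    rank = 0
    for col in range(n):
        piv = next((r for r in range(rank, len(rows)) if rows[r][col]), None)
        if piv is None:
            continue
        rows[rank], rows[piv] = rows[piv], rows[rank]
        for r in range(len(rows)):
            if r != rank and rows[r][col]:
                rows[r] = [x ^ y for x, y in zip(rows[r], rows[rank])]
        rank += 1
    return rank

def rlnc_roundtrip(n=4, plen=8, seed=7):
    """Encode n source packets with random GF(2) combinations, then decode."""
    rng = random.Random(seed)
    src = [[rng.randint(0, 1) for _ in range(plen)] for _ in range(n)]
    received = []
    while len(received) < n:
        g = [rng.randint(0, 1) for _ in range(n)]  # random coefficient header
        payload = [0] * plen
        for i, gi in enumerate(g):                 # linear combination of the sources
            if gi:
                payload = [x ^ y for x, y in zip(payload, src[i])]
        trial = [row[:] for row in received] + [g + payload]
        if gf2_eliminate(trial, n) == len(received) + 1:  # keep only innovative packets
            received.append(g + payload)
    gf2_eliminate(received, n)   # headers reduce to the identity matrix
    decoded = [row[n:] for row in received]
    return decoded == src

assert rlnc_roundtrip()
```

The innovative-packet check is the same rank test the receiver performs in practice; once the header block is full-rank, elimination recovers the sources exactly.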

Practical Considerations

  • Batched network coding (BATS): Fountain-like codes adapted for multi-hop networks with recoding
  • Fulcrum codes: bridge between random network coding and Raptor codes
  • Coded caching (Maddah-Ali, Niesen): information-theoretic formulation of caching with coded multicast delivery. Achieves multiplicative gain in delivery rate

Distributed Source Coding

CEO Problem

$L$ agents each observe a noisy version $Y_i$ of a source $X$. A central estimator (the CEO) must reconstruct $X$ from the agents' compressed messages.

  • Quadratic Gaussian CEO: rate-distortion function known (Prabhakaran, Tse, Ramchandran). As $L \to \infty$, the centralized distortion is approached
  • General case: optimal rate-distortion region open beyond two terminals

Multi-Terminal Source Coding

$L$ correlated sources $X_1, \ldots, X_L$ are encoded separately and decoded jointly:

  • Lossless: Slepian-Wolf region $\sum_{i \in S} R_i \geq H(X_S \mid X_{S^c})$ for all subsets $S$
  • Lossy: Berger-Tung inner bound (random binning + joint typicality). Tight for some cases (Gaussian, specific distortion measures)

Information-Theoretic Limits for Wireless

MIMO Channel

$\mathbf{Y} = \mathbf{H}\mathbf{X} + \mathbf{Z}$ with $n_t$ transmit and $n_r$ receive antennas.

  • Capacity (known $\mathbf{H}$ at receiver): $C = \max_{\mathrm{tr}(\mathbf{K}_X) \leq P} \log\det\left(\mathbf{I} + \frac{1}{N}\mathbf{H}\mathbf{K}_X\mathbf{H}^H\right)$
  • Water-filling over the singular values of $\mathbf{H}$ when CSI is available at the transmitter
  • Degrees of freedom: $\min(n_t, n_r)$ spatial multiplexing gain
  • Massive MIMO: as $n_t \to \infty$, the channel hardens, interference averages out, and simple matched-filter processing becomes near-optimal
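Water-filling over the eigenmodes can be sketched with a short bisection on the water level (illustrative random channel; `numpy` assumed available):

```python
import numpy as np

def mimo_capacity_waterfill(H, P, N=1.0):
    """Capacity of y = Hx + z with transmitter CSI, via water-filling over eigenmodes."""
    s = np.linalg.svd(H, compute_uv=False)
    g = (s**2) / N                              # per-eigenmode channel gains
    g = g[g > 1e-12]                            # drop null modes
    lo, hi = 0.0, P + float(np.sum(1.0 / g))    # bracket the water level mu
    for _ in range(200):                        # bisect until the power budget is met
        mu = 0.5 * (lo + hi)
        if np.maximum(mu - 1.0 / g, 0.0).sum() > P:
            hi = mu
        else:
            lo = mu
    p = np.maximum(0.5 * (lo + hi) - 1.0 / g, 0.0)
    return float(np.sum(np.log2(1.0 + g * p)))

rng = np.random.default_rng(0)
H = rng.standard_normal((4, 4))
C_wf = mimo_capacity_waterfill(H, P=10.0)

# Equal power across modes is a valid but suboptimal choice of K_X:
s = np.linalg.svd(H, compute_uv=False)
C_eq = float(np.sum(np.log2(1.0 + (s**2) * (10.0 / 4.0))))
assert C_wf >= C_eq - 1e-6
```

The final assertion is the optimality of water-filling: any other power allocation under the same trace constraint, including the uniform one, can only do worse.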

Fading Channels

  • Ergodic capacity: $C = \mathbb{E}_H[\log(1 + |H|^2\,\text{SNR})]$ (average over fading realizations)
  • Outage capacity: $C_\epsilon = \sup\{R : P(\log(1 + |H|^2\,\text{SNR}) < R) \leq \epsilon\}$
  • Diversity-multiplexing tradeoff (Zheng-Tse, 2003): fundamental tradeoff between reliability (diversity gain $d$) and rate (multiplexing gain $r$) in the high-SNR regime
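Ergodic and outage capacity are both easy to estimate by Monte Carlo for Rayleigh fading (illustrative SNR; `numpy` assumed available):

```python
import numpy as np

rng = np.random.default_rng(42)
snr = 10.0
n = 200_000
# Rayleigh fading: complex Gaussian gain with E|h|^2 = 1
h = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)
inst = np.log2(1 + np.abs(h)**2 * snr)   # instantaneous capacity per realization

ergodic = float(inst.mean())             # average over fades
outage_10pct = float(np.quantile(inst, 0.10))  # rate supportable in 90% of fades

# Jensen's inequality: ergodic capacity sits below AWGN capacity at the same mean SNR.
assert ergodic <= np.log2(1 + snr)
assert 0 < outage_10pct < ergodic
```

The gap between `outage_10pct` and `ergodic` is the price of transmitting at a fixed rate without channel state at the transmitter.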

Open Problems

Several fundamental problems remain unsolved:

  1. Capacity of the general interference channel
  2. Capacity of the general broadcast channel (non-degraded)
  3. Capacity of the general relay channel
  4. Distributed lossy source coding (general rate-distortion region)
  5. Optimality of Gaussian signaling for vector Gaussian networks beyond known special cases
  6. Separation of source and channel coding in networks (does not hold in general, unlike point-to-point)

These open problems have driven decades of research and continue to motivate new techniques, including connections to theoretical computer science (hardness of approximation) and algebraic geometry.