OTP Patterns Beyond GenServer
GenServer and Supervisor are the foundation, but production OTP code rarely stops there. Once you have hundreds of processes representing chat sessions, IoT devices, or background jobs, you need patterns for discovering them, spawning them on demand, and coordinating work between them. This chapter covers the patterns that show up in every real Elixir codebase: Agent, Registry, DynamicSupervisor, GenStage and Flow, and Phoenix.PubSub.
The thing to internalize: each of these solves a specific coordination problem. Reaching for a custom GenServer when a Registry would do is a sign you haven't seen the patterns yet. Reaching for GenStage when a simple Task would do is a sign you've seen too many.
Agent: When You Just Need State
Agent is GenServer with the boilerplate stripped off. You can implement everything Agent does with GenServer, and most teams eventually do. But for a key-value store, a config cache, a counter, or any "stateful blob" that doesn't have meaningful behavior of its own, Agent is faster to write and easier to read.
defmodule FeatureFlags do
use Agent
def start_link(_opts) do
Agent.start_link(fn -> %{} end, name: __MODULE__)
end
def enabled?(flag) do
Agent.get(__MODULE__, &Map.get(&1, flag, false))
end
def set(flag, value) do
Agent.update(__MODULE__, &Map.put(&1, flag, value))
end
end
Two things to know. First, Agent.get/2 and Agent.update/2 run their function inside the agent process — that means a slow function blocks every other client. For anything that touches the network or filesystem, fetch the state with get, do the slow work in the caller, then update with the result.
Second, Agent serializes all access. If you're seeing contention, you don't want a faster Agent — you want ETS.
Registry: Process Discovery Without Naming Hell
You can register a process with a name, but a single atom only refers to one process. When you have a process per user and a million users, you can't atomize a million names — you'd crash the VM. Registry is the answer.
defmodule MyApp.SessionRegistry do
def child_spec(_) do
Registry.child_spec(keys: :unique, name: __MODULE__)
end
end
# in your supervisor
children = [
MyApp.SessionRegistry,
# ...
]
# starting a session for user 42
GenServer.start_link(MyApp.Session, user_id, name: via(42))
defp via(user_id) do
{:via, Registry, {MyApp.SessionRegistry, user_id}}
end
# calling it later
GenServer.call(via(42), :get_state)
The {:via, ...} tuple works anywhere a name is expected. Registry handles the lookup, and because it uses ETS internally, lookups are fast and concurrent — Discord registers tens of millions of processes this way.
Registry also supports :duplicate keys, which makes it a lightweight pub-sub for in-process coordination:
Registry.start_link(keys: :duplicate, name: MyApp.Events)
# subscribers
Registry.register(MyApp.Events, "user:42", :nothing)
# publishers
Registry.dispatch(MyApp.Events, "user:42", fn entries ->
for {pid, _} <- entries, do: send(pid, {:event, "hello"})
end)
For pub-sub across nodes, use Phoenix.PubSub instead — Registry is single-node only.
DynamicSupervisor: Children You Don't Know About at Boot
A regular Supervisor takes a fixed list of children at init time. DynamicSupervisor takes none — you add children to it at runtime. This is what you want for "one process per user," "one process per game session," or "one worker per job."
defmodule MyApp.SessionSupervisor do
use DynamicSupervisor
def start_link(opts) do
DynamicSupervisor.start_link(__MODULE__, opts, name: __MODULE__)
end
def start_session(user_id) do
spec = {MyApp.Session, user_id}
DynamicSupervisor.start_child(__MODULE__, spec)
end
@impl true
def init(_opts) do
DynamicSupervisor.init(strategy: :one_for_one)
end
end
Combined with Registry, this is the standard pattern for "find or create a process for this entity":
defmodule MyApp.SessionManager do
alias MyApp.{Session, SessionRegistry, SessionSupervisor}
def get_or_start(user_id) do
case Registry.lookup(SessionRegistry, user_id) do
[{pid, _}] ->
{:ok, pid}
[] ->
case SessionSupervisor.start_session(user_id) do
{:ok, pid} -> {:ok, pid}
{:error, {:already_started, pid}} -> {:ok, pid}
error -> error
end
end
end
end
The :already_started clause handles the race where two processes try to start a session for the same user simultaneously. One wins, the other gets back the winner's pid. Registry's uniqueness guarantee makes this safe.
GenStage and Flow: Backpressured Pipelines
When you have a fast producer and a slower consumer, naive concurrency leads to memory blowing up — the producer hands work to the consumer faster than the consumer can finish, and the queue grows without bound. GenStage solves this with demand-driven flow: consumers ask for work, producers send only what's been asked for.
defmodule MyApp.NumberProducer do
use GenStage
def start_link(_), do: GenStage.start_link(__MODULE__, 0)
def init(state), do: {:producer, state}
def handle_demand(demand, counter) do
events = Enum.to_list(counter..(counter + demand - 1))
{:noreply, events, counter + demand}
end
end
defmodule MyApp.NumberConsumer do
use GenStage
def start_link(_), do: GenStage.start_link(__MODULE__, :ok)
def init(:ok) do
{:consumer, :ok, subscribe_to: [MyApp.NumberProducer]}
end
def handle_events(events, _from, state) do
Enum.each(events, &IO.inspect/1)
{:noreply, [], state}
end
end
You'll rarely write GenStage directly. Flow is the higher-level API built on top of it — parallel data processing with backpressure baked in:
File.stream!("huge_log.txt")
|> Flow.from_enumerable(stages: 8)
|> Flow.filter(&String.contains?(&1, "ERROR"))
|> Flow.map(&parse_log_line/1)
|> Flow.partition(key: {:key, :user_id})
|> Flow.reduce(fn -> %{} end, fn line, acc ->
Map.update(acc, line.user_id, 1, &(&1 + 1))
end)
|> Enum.to_list()
This processes the file across 8 stages in parallel, partitions by user ID so all events for a user end up in the same reducer, and counts errors per user. With backpressure, you don't have to worry about reading the file faster than you can process it.
Use Flow when you have CPU-bound or memory-bound batch work. For I/O-heavy fan-out where each item is independent, Task.async_stream/3 is simpler and usually enough:
ids
|> Task.async_stream(&fetch_user/1, max_concurrency: 50, timeout: 10_000)
|> Enum.to_list()
Broadway
Worth mentioning: Broadway is a layer above GenStage built specifically for consuming from message brokers (SQS, RabbitMQ, Kafka, Google PubSub) with retry, batching, and rate limiting. If you're building a job worker that pulls from a queue, you want Broadway, not bare GenStage. Pinterest uses it for Kafka-fed pipelines processing billions of events.
Phoenix.PubSub: Cross-Node Messaging
Phoenix.PubSub lets processes subscribe to topics and receive broadcasted messages. It works across multiple nodes in a cluster, which makes it the standard way to coordinate state changes in distributed Elixir apps.
# in your supervision tree
children = [
{Phoenix.PubSub, name: MyApp.PubSub}
]
# subscribing
Phoenix.PubSub.subscribe(MyApp.PubSub, "user:42")
# now this process will receive any message broadcast on that topic
def handle_info({:user_updated, user}, state) do
# do something
{:noreply, state}
end
# broadcasting from anywhere on the cluster
Phoenix.PubSub.broadcast(MyApp.PubSub, "user:42", {:user_updated, updated})
This is what makes Phoenix LiveView work — every connected user has a process subscribed to topics, and a single broadcast updates all of them. Bleacher Report uses this to push live game updates to millions of mobile clients with one broadcast call per event.
You don't need Phoenix to use PubSub — it's a standalone library, despite the name.
When to Use What
- Plain functions / modules — anything stateless. Don't reach for a process if you don't need one.
- Agent — simple shared state, no behavior beyond get/update. Cache, config, counter.
- GenServer — state plus behavior. Rate limiter, circuit breaker, connection pool, anything with internal logic.
- Task — fire-and-forget concurrent work. Spawning HTTP fetches, parallel computations.
- Task.async_stream — bounded concurrency over an enumerable, I/O bound.
- Registry — process discovery by key. One process per user, per game, per session.
- DynamicSupervisor — supervised processes added at runtime. Pairs with Registry constantly.
- Phoenix.PubSub — cross-process or cross-node fan-out of events.
- Flow — CPU-bound or memory-bound parallel data processing with backpressure.
- Broadway — consuming from a message broker with retry, batching, ack semantics.
- GenStage (raw) — almost never. Use Flow or Broadway instead unless you're building infrastructure.
Common Pitfalls
Reinventing Registry as a GenServer-with-a-Map. Every Elixir codebase eventually has someone write a process registry from scratch before discovering Registry exists. Don't be that person. Registry is faster, concurrent, and battle-tested.
Using Agent for everything. Agent is fine until you need to do anything that isn't a get-or-update. Adding a timer? A handle_info callback? At that point, Agent is a worse GenServer. Just use GenServer from the start if you suspect it'll grow.
DynamicSupervisor without :transient restarts. If a session process dies because the user disconnected, you usually don't want it restarted. Set restart: :transient on the child spec so only abnormal exits trigger restarts.
Confusing Registry's :unique and :duplicate modes. :unique is for "find the one process for this key." :duplicate is for pub-sub-style fan-out. Picking the wrong one usually means re-registering processes and getting {:error, {:already_registered, _}}.
Using GenStage directly when you don't need to. It's lower-level than most code should be. If your problem is "process this list in parallel," use Task.async_stream. If it's "consume from a queue," use Broadway. If it's "ETL pipeline," use Flow.
Key Takeaways
- OTP gives you composable patterns — pick the smallest one that solves your problem.
- Agent is a GenServer for plain state. Registry is process lookup by key. DynamicSupervisor spawns children at runtime.
- The Registry + DynamicSupervisor + GenServer combo is the standard "process per entity" pattern.
- Flow handles parallel data processing with backpressure. Broadway handles message broker consumption. Both sit on GenStage.
- Phoenix.PubSub is the standard cross-node fan-out, not just for Phoenix apps.
- Don't write infrastructure when a library exists. Most "I need to coordinate processes" problems already have a battle-tested solution.