Registry Patterns

Registry is the piece that makes "process per entity" actually work. Without it, you would have to track pids by hand in some Agent or GenServer, and the moment one of those processes died you would be holding a stale pid pointing nowhere. Registry handles naming, lookup, and lifecycle so the supervisor and the worker can stay simple. Phoenix.PubSub, which Phoenix Channels subscribe through, uses a Registry to track its local subscribers. Discord registers tens of millions of gateway processes through it. Once you understand Registry, the rest of OTP composes naturally.

The model is straightforward: a Registry is a small supervision tree whose partition processes own ETS tables. Processes register themselves under a key, look up other processes by key, and Registry removes the entries when a registered process dies. Everything else is variations on that theme.

Why You Can't Just Use Names

The naive answer to "how do I find a process" is Process.register(pid, :my_name) or GenServer.start_link(__MODULE__, arg, name: :something). That works fine for singletons. It falls apart the moment you have many of the same kind of process.

Two reasons. First, names are atoms, and the BEAM atom table is bounded — roughly a million entries before the VM crashes. Registering a process per user with :"user_#{id}" works for a hundred users and bricks your node at a million. Atoms are not garbage collected. Once you create one, it lives forever.

Second, even if atoms were free, you'd have no way to enumerate, partition, or pub-sub over them. The process registry the runtime gives you is a flat, node-wide namespace of atoms. Registry gives you a structured key space with proper lifecycle management.

# bad: creates one atom per user, and atoms are never reclaimed
GenServer.start_link(__MODULE__, user_id, name: :"user_#{user_id}")

# good: the key can be any term (integer, string, tuple), stored in ETS, no atom pressure
GenServer.start_link(__MODULE__, user_id,
  name: {:via, Registry, {MyApp.UserRegistry, user_id}})
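If you want to see how much headroom the atom table has on a running node, a quick check might look like the sketch below. The :erlang.system_info/1 items :atom_count and :atom_limit are standard BEAM calls; the variable names and the "user_" prefix are only illustrative.

```elixir
# Inspect atom table usage on the current node.
atom_count = :erlang.system_info(:atom_count)
atom_limit = :erlang.system_info(:atom_limit)

IO.puts("atoms in use: #{atom_count} of #{atom_limit}")

# Creating an atom from dynamic data adds an entry that is never reclaimed.
# String.to_atom/1 is used here only to illustrate the leak.
_leaked = String.to_atom("user_#{System.unique_integer([:positive])}")

IO.puts("atoms in use now: #{:erlang.system_info(:atom_count)}")
```

Graphing the count over time is usually enough to spot code that atomizes outside data.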

:unique vs :duplicate

Every Registry is one of two shapes, decided at start time. You can't change it later.

:unique — one pid per key. Registering a second process under the same key fails with {:error, {:already_registered, existing_pid}}. This is the shape for "find the process for this entity" — one chat session per room, one worker per user, one game state per game ID.

:duplicate — many pids per key. Registering doesn't conflict; all pids under a key get listed. This is the shape for pub-sub-style fan-out — every subscriber to "topic:42" registers under that key, and you dispatch by iterating the entries.

# unique — process discovery by entity
children = [
  {Registry, keys: :unique, name: MyApp.ChatRoomRegistry}
]

# duplicate — subscriber lists for events
children = [
  {Registry, keys: :duplicate, name: MyApp.EventBus}
]
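A small sketch of how the two shapes behave when the same key is registered twice. The registry names Demo.Unique and Demo.Duplicate are hypothetical, and the registries are started directly rather than under a supervisor to keep the example short.

```elixir
{:ok, _} = Registry.start_link(keys: :unique, name: Demo.Unique)
{:ok, _} = Registry.start_link(keys: :duplicate, name: Demo.Duplicate)

# :unique - the first registration wins, the second is refused.
{:ok, _} = Registry.register(Demo.Unique, "room:1", :first)
unique_result = Registry.register(Demo.Unique, "room:1", :second)

# :duplicate - the same key accepts any number of entries.
{:ok, _} = Registry.register(Demo.Duplicate, "topic:42", :first)
{:ok, _} = Registry.register(Demo.Duplicate, "topic:42", :second)
entries = Registry.lookup(Demo.Duplicate, "topic:42")

IO.inspect(unique_result, label: "unique, second register")
IO.inspect(length(entries), label: "duplicate, entries under key")
```

Both registrations here come from the calling process, which is enough to show the difference; in real code the competing registrations would normally come from different processes.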

Picking the wrong one is a top-five Registry mistake. If you find yourself wanting both behaviors, you actually have two Registries. They're cheap enough that running half a dozen in one app is normal.

via Tuples Are the Whole Trick

A "via tuple" is {:via, Module, term}. Anywhere OTP expects a name (GenServer.start_link, GenServer.call, GenServer.whereis, Agent.start_link, the start_link of a DynamicSupervisor child) you can pass a via tuple, and the runtime will call Module.register_name/2, Module.whereis_name/1, and friends to translate the term into a pid.

For Registry, the tuple is {:via, Registry, {RegistryName, key}}, or {:via, Registry, {RegistryName, key, value}} if you want to store a value alongside the registration.

# starting under a via name
GenServer.start_link(MyApp.Room, room_id,
  name: {:via, Registry, {MyApp.RoomRegistry, room_id}})

# calling through the via name
GenServer.call({:via, Registry, {MyApp.RoomRegistry, room_id}}, :get_state)

# starting an Agent the same way
Agent.start_link(fn -> %{} end,
  name: {:via, Registry, {MyApp.CacheRegistry, "user:42"}})

# DynamicSupervisor children get the same treatment via the child's start_link
DynamicSupervisor.start_child(MyApp.RoomSupervisor, {MyApp.Room, room_id})
# where MyApp.Room.start_link uses the via tuple internally
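To make the optional value concrete, here is a minimal sketch that registers an Agent with a value and then resolves it again. Demo.ViaRegistry and the %{role: :cache} value are made up for illustration.

```elixir
{:ok, _} = Registry.start_link(keys: :unique, name: Demo.ViaRegistry)

# The third tuple element is stored as the registration's value.
name = {:via, Registry, {Demo.ViaRegistry, "user:42", %{role: :cache}}}
{:ok, agent} = Agent.start_link(fn -> %{} end, name: name)

# Resolving only needs the registry and the key.
resolved = GenServer.whereis({:via, Registry, {Demo.ViaRegistry, "user:42"}})

# Registry.lookup/2 returns the pid together with the stored value.
entries = Registry.lookup(Demo.ViaRegistry, "user:42")

IO.inspect(resolved == agent, label: "resolved to the agent")
IO.inspect(entries, label: "entries")
```

GenServer.whereis/1 returns nil when nothing is registered under the key, which makes it a convenient non-crashing probe.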

The convention in production code is to wrap the tuple in a private function so callers don't repeat it:

defmodule MyApp.Room do
  use GenServer

  def start_link(room_id) do
    GenServer.start_link(__MODULE__, room_id, name: via(room_id))
  end

  def get_state(room_id), do: GenServer.call(via(room_id), :get_state)
  def post_message(room_id, msg), do: GenServer.cast(via(room_id), {:post, msg})

  defp via(room_id), do: {:via, Registry, {MyApp.RoomRegistry, room_id}}

  # ... callbacks
end

Now every call site just says MyApp.Room.get_state(room_id). The via tuple is an implementation detail the rest of the codebase never sees.

Pub-Sub on Top of Registry

A :duplicate Registry is a lightweight pub-sub. Subscribers register under a topic key, and publishers iterate the registered pids with Registry.dispatch/3.

defmodule MyApp.EventBus do
  def subscribe(topic) do
    Registry.register(MyApp.EventBus, topic, [])
  end

  def publish(topic, message) do
    Registry.dispatch(MyApp.EventBus, topic, fn entries ->
      for {pid, _value} <- entries, do: send(pid, {:event, topic, message})
    end)
  end
end

The function you pass to dispatch/3 runs in the publisher's process, not in a registry process. Registry only hands it the list of {pid, value} pairs, read straight from ETS. That matters: publishers never queue behind a central process, and the cost of sending the messages is the publisher's, not the registry's. If there are no entries for the key, the function is never invoked; on a partitioned registry it is invoked once per partition that has entries.

This pattern works great for in-process or in-node coordination. For cross-node pub-sub, you want Phoenix.PubSub, which has a distributed implementation. Registry is single-node — entries don't propagate across a cluster.

The other thing Registry pub-sub doesn't do: ordering across topics, persistence, retries. It's a fan-out primitive. Anything richer, you build on top.
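For a feel of the round trip, the sketch below subscribes the current process and publishes to it. Demo.Bus is a hypothetical registry name, and the message shape {:event, topic, payload} simply mirrors the module above.

```elixir
{:ok, _} = Registry.start_link(keys: :duplicate, name: Demo.Bus)

# Subscribe the current process; the value slot is unused here.
{:ok, _} = Registry.register(Demo.Bus, "topic:42", [])

# Publish: the callback runs in the caller and receives {pid, value} pairs.
Registry.dispatch(Demo.Bus, "topic:42", fn entries ->
  for {pid, _value} <- entries, do: send(pid, {:event, "topic:42", :hello})
end)

# The subscriber finds the event in its mailbox.
received =
  receive do
    {:event, topic, payload} -> {topic, payload}
  after
    1_000 -> :timeout
  end

IO.inspect(received, label: "received")
```

Delivery is a plain send/2, so a subscriber that has exited simply misses the message; there is no acknowledgement or retry unless you add one.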

Real Example: Chat Rooms Keyed by Room ID

The standard "process per entity" shape. One GenServer per chat room. Lookup by room ID. Spawned on demand under a DynamicSupervisor.

defmodule Chat.Room do
  use GenServer, restart: :transient

  def start_link(room_id) do
    GenServer.start_link(__MODULE__, room_id, name: via(room_id))
  end

  def post(room_id, msg), do: GenServer.cast(via(room_id), {:post, msg})
  def history(room_id), do: GenServer.call(via(room_id), :history)

  defp via(room_id), do: {:via, Registry, {Chat.RoomRegistry, room_id}}

  @impl true
  def init(room_id), do: {:ok, %{room_id: room_id, messages: []}}

  @impl true
  def handle_cast({:post, msg}, state) do
    # Chat.EventBus is assumed to be a module shaped like MyApp.EventBus above.
    Chat.EventBus.publish({:room, state.room_id}, {:new_message, msg})
    {:noreply, %{state | messages: [msg | state.messages]}}
  end

  @impl true
  def handle_call(:history, _from, state) do
    {:reply, Enum.reverse(state.messages), state}
  end
end

defmodule Chat.RoomManager do
  def get_or_start(room_id) do
    case DynamicSupervisor.start_child(
           Chat.RoomSupervisor,
           {Chat.Room, room_id}
         ) do
      {:ok, pid} -> {:ok, pid}
      {:error, {:already_started, pid}} -> {:ok, pid}
      error -> error
    end
  end
end

The application boots three things:

children = [
  {Registry, keys: :unique, name: Chat.RoomRegistry},
  {Registry, keys: :duplicate, name: Chat.EventBus},
  {DynamicSupervisor, name: Chat.RoomSupervisor, strategy: :one_for_one}
]

The flow when a user posts a message:

1. Caller invokes Chat.RoomManager.get_or_start(room_id) to make sure a room process exists.
2. DynamicSupervisor either starts a new Chat.Room (which registers itself in Chat.RoomRegistry via the via tuple) or returns the already-started pid.
3. Caller invokes Chat.Room.post(room_id, msg), which resolves the via tuple to a pid through Registry lookup and casts to it.
4. The room handler stores the message and publishes to Chat.EventBus.
5. Subscribed user processes receive the broadcast.

Registry is doing two jobs — naming the rooms (:unique) and routing fan-out events (:duplicate) — across two separate Registry instances. That's the pattern.

Operational Tips

Shard the registry under load. A Registry is internally one or more partitions, each with its own ETS tables. The default is a single partition. For most apps that's fine, but if registration is a hot path, with many processes registering and unregistering concurrently across many cores, raising :partitions spreads the contention. The Registry docs suggest System.schedulers_online() as a starting point.

{Registry, keys: :unique, name: MyApp.HotRegistry, partitions: 64}

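In a :unique registry, entries are hash-distributed on the key, so reads and writes for a given key always hit the same partition. You don't get more throughput on a single key; you get more total throughput across the key space. A :duplicate registry by default distributes entries by the registering pid instead, so a lookup or dispatch for one key has to visit every partition.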
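The idea of hashing keys onto a fixed number of partitions can be illustrated in a few lines. This is only a sketch of the concept: :erlang.phash2/2 is a standard BIF, but treating it as the exact function Registry uses internally is an assumption, and the "user:N" keys are invented.

```elixir
partitions = 8

# Count how many of 10_000 keys land on each partition index.
distribution =
  1..10_000
  |> Enum.map(fn id -> :erlang.phash2("user:#{id}", partitions) end)
  |> Enum.frequencies()

# The same key always maps to the same partition index.
first = :erlang.phash2("user:42", partitions)
second = :erlang.phash2("user:42", partitions)

IO.inspect(distribution, label: "keys per partition")
IO.inspect(first == second, label: "stable mapping")
```

A roughly even spread across indices is what lets concurrent writers on different keys avoid each other.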

Monitor the registry size. Registry removes entries when their process exits, so a leaked process, one that registered but never dies, sits in the table forever. In a system spawning millions of dynamic processes, even a small leak compounds. Track the entry count in production: a count that creeps up with no corresponding load growth is the smoking gun for a leak. You could read :ets.info(table, :size) on the underlying tables, but those are an internal detail. The supported way to read the count without poking ETS directly is:

Registry.count(MyApp.UserRegistry)
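When the count looks wrong, listing the entries is the next step. The sketch below uses Registry.count/1 and Registry.select/2 with a match spec that returns every {key, pid, value} triple; Demo.Users and its keys are hypothetical.

```elixir
{:ok, _} = Registry.start_link(keys: :unique, name: Demo.Users)
{:ok, _} = Registry.register(Demo.Users, "user:1", :web)
{:ok, _} = Registry.register(Demo.Users, "user:2", :mobile)

total = Registry.count(Demo.Users)

# Match spec: bind key, pid, and value, and return them as a tuple.
entries =
  Registry.select(Demo.Users, [
    {{:"$1", :"$2", :"$3"}, [], [{{:"$1", :"$2", :"$3"}}]}
  ])

keys = entries |> Enum.map(fn {key, _pid, _value} -> key end) |> Enum.sort()

IO.inspect(total, label: "total entries")
IO.inspect(keys, label: "registered keys")
```

A full select walks every entry, so on a large registry it belongs in a debugging session rather than on a hot path.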

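Know when registration happens. This one bites everyone once. When your GenServer starts with name: via(key), the new process registers itself before init/1 runs, not after it returns. A Registry.lookup(MyRegistry, key) inside init/1 therefore already returns your own pid, and other processes can resolve the key and send you messages while init/1 is still running. Their calls wait in your mailbox until init/1 returns, so a slow init/1 shows up as timeouts in callers. Worse, a helper that does GenServer.call(via(key), ...) from inside init/1 is calling its own process and exits with :calling_self. Keep init/1 short, use self() directly instead of looking yourself up, and never route a call back to yourself through the registry.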
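One way to follow that advice is to keep init/1 minimal: take the pid from self() and push slow setup into handle_continue/2. The sketch below assumes a hypothetical Demo.Worker module and Demo.Workers registry, and the "slow setup" is just a placeholder.

```elixir
defmodule Demo.Worker do
  use GenServer

  def start_link(key) do
    GenServer.start_link(__MODULE__, key, name: {:via, Registry, {Demo.Workers, key}})
  end

  @impl true
  def init(key) do
    # Use self() rather than resolving our own key, and defer slow setup.
    {:ok, %{key: key, pid: self(), ready?: false}, {:continue, :setup}}
  end

  @impl true
  def handle_continue(:setup, state) do
    # Placeholder for slow setup work (hypothetical).
    {:noreply, %{state | ready?: true}}
  end

  @impl true
  def handle_call(:ready?, _from, state), do: {:reply, state.ready?, state}
end

{:ok, _} = Registry.start_link(keys: :unique, name: Demo.Workers)
{:ok, pid} = Demo.Worker.start_link("job:1")

# handle_continue/2 runs before any queued message is handled.
ready = GenServer.call(pid, :ready?)
IO.inspect(ready, label: "ready?")
```

Because the continue callback runs before the mailbox is touched, callers never observe the half-initialized state, and start_link returns quickly.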

A more subtle version involves :listeners. Listeners are named processes, passed to Registry.start_link/1, that receive {:register, registry, key, pid, value} and {:unregister, registry, key, pid} messages. If a listener sends a message to the process that was just registered, that process is usually still in init/1. The message waits in its mailbox alongside anything else that arrived during startup, and once init/1 returns everything is handled in arrival order, which may not match the order you assumed.

Account for per-process bookkeeping at scale. In the current implementation, Registry links every registering process to one of its partition processes so it can clean up when the process exits. Each registration costs a link plus a few ETS rows, which is small, but at ten million processes it adds up. The BEAM handles this fine, but if you're watching memory closely, remember that the per-process accounting includes this bookkeeping.

Listeners calling back into the registered process. Don't write a listener that does a GenServer.call to the process that was just registered. That process is mid-startup, so the call blocks until init/1 returns. If init/1 is slow, or is itself waiting on the listener, the call times out and the listener exits. Reactions to registry events should always be send/2 or GenServer.cast/2, never a synchronous call to a process that might not be ready.
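For reference, a listener is just a named process that receives messages from the registry. In this sketch the current process plays the listener; Demo.Listener and Demo.Rooms are hypothetical names.

```elixir
# Listeners must be locally registered names, so name the current process.
Process.register(self(), Demo.Listener)

{:ok, _} =
  Registry.start_link(keys: :unique, name: Demo.Rooms, listeners: [Demo.Listener])

{:ok, agent} =
  Agent.start_link(fn -> %{} end, name: {:via, Registry, {Demo.Rooms, "room:1"}})

# The registry notifies the listener with a plain message.
event =
  receive do
    {:register, Demo.Rooms, key, pid, _value} -> {:registered, key, pid}
  after
    1_000 -> :timeout
  end

# React asynchronously from here (send/2 or GenServer.cast/2);
# this sketch only records the event.
IO.inspect(event, label: "listener saw")
```

Listeners are not told when a registered process crashes unless they monitor the pid themselves, so a real listener would typically call Process.monitor/1 on each pid it cares about.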

Looking Up vs Calling Through

You can resolve a Registry key two ways:

# explicit lookup
case Registry.lookup(MyApp.RoomRegistry, room_id) do
  [{pid, _value}] -> GenServer.call(pid, :get_state)
  [] -> {:error, :not_found}
end

# implicit via the via tuple
GenServer.call({:via, Registry, {MyApp.RoomRegistry, room_id}}, :get_state)

The via tuple is cleaner, but the caller exits with :noproc if no process is registered under the key. The explicit lookup gives you [] to pattern match on. For "find or start" patterns, the explicit form is better because you handle the missing case in user code. For "I know this exists, just call it," the via form is fine.

The cost is essentially the same. Both go through one ETS read.
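If you like the via form but want a tagged error instead of an exit, a thin wrapper can catch the :noproc exit. Demo.SafeCall and Demo.CallRegistry are hypothetical names; this is a sketch, not a pattern the Registry API prescribes.

```elixir
defmodule Demo.SafeCall do
  # Call through a via tuple, turning a missing process into a tagged error.
  def call(registry, key, request) do
    GenServer.call({:via, Registry, {registry, key}}, request)
  catch
    :exit, {:noproc, _} -> {:error, :not_found}
  end
end

{:ok, _} = Registry.start_link(keys: :unique, name: Demo.CallRegistry)

# Nothing is registered under "nope", so the call would normally exit.
missing = Demo.SafeCall.call(Demo.CallRegistry, "nope", :get_state)
IO.inspect(missing, label: "missing process")
```

The explicit lookup has a similar gap: the process can exit between the lookup and the call, so code that must never crash the caller needs to handle exits either way.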

Common Pitfalls

Mixing :unique and :duplicate semantics in one registry. You can't. Pick one shape per registry. If you need both, run two.

Registering before the registry is started. Registry needs to be in your supervision tree above the things that register with it. A child trying to start with a via tuple pointing at a Registry that hasn't booted yet will fail to start, because Registry raises an ArgumentError for a registry it doesn't know about. Order matters.

Forgetting that Registry is single-node. If you have a cluster, registering "user:42" on node A doesn't make it visible to node B. For cross-node lookup, you need a distributed mechanism such as :global or Horde; for cross-node fan-out, Phoenix.PubSub or :pg. Registry alone is local.

Using dynamically created atoms as keys. Defeats the whole point. The whole reason to use Registry is to avoid atomizing dynamic identifiers. Stick to strings, integers, or tuples for keys that come from outside data.

Holding pids in long-lived state. "I'll cache the pid so I don't have to look it up every time" sounds good until the process restarts and your cache holds a dead pid. Either re-lookup every time (cheap) or monitor the process so you know when to invalidate the cache.
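The monitor option can be sketched in a few lines: cache the pid together with its monitor reference and drop the entry when the :DOWN message arrives. Demo.CacheRegistry and the plain map used as a cache are illustrative assumptions.

```elixir
{:ok, _} = Registry.start_link(keys: :unique, name: Demo.CacheRegistry)

{:ok, agent} =
  Agent.start(fn -> 0 end, name: {:via, Registry, {Demo.CacheRegistry, "counter"}})

# Cache the pid together with a monitor reference.
[{pid, _value}] = Registry.lookup(Demo.CacheRegistry, "counter")
ref = Process.monitor(pid)
cache = %{"counter" => {pid, ref}}

# Simulate the process going away.
:ok = Agent.stop(agent)

# The :DOWN message is the signal to invalidate the cached pid.
cache =
  receive do
    {:DOWN, ^ref, :process, ^pid, _reason} -> Map.delete(cache, "counter")
  after
    1_000 -> cache
  end

IO.inspect(cache, label: "cache after :DOWN")
```

In a GenServer the same logic lives in handle_info/2, and the next request for the key goes back to the registry for a fresh pid.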

Calling back into the registering process from a Registry listener. A synchronous call from a listener to a still-initializing process blocks until init/1 finishes, and deadlocks if init/1 is waiting on the listener. Use send/2 or GenServer.cast/2, not GenServer.call.

Key Takeaways

Registry is the standard way to name and look up processes in Elixir. It uses ETS, so reads and writes are concurrent and fast.
:unique for one-pid-per-key (find a worker by ID). :duplicate for many-pids-per-key (pub-sub subscribers).
The {:via, Registry, {RegistryName, key}} tuple plugs in anywhere OTP accepts a name: GenServer, Agent, DynamicSupervisor children, all of it.
Wrap the via tuple in a private via/1 function so call sites stay clean.
For pub-sub, Registry.dispatch/3 iterates subscribers; the dispatch function runs in the publisher's process. For cross-node, use Phoenix.PubSub instead.
Run multiple Registries if you need both unique and duplicate semantics — they're cheap.
Operational hygiene: track the entry count, partition for hot paths, never make synchronous calls from registry listeners to the process being registered.