ETS vs GenServer State

Every Elixir developer hits the same fork: I need to share some data across processes. Do I put it in a GenServer or in ETS? The honest answer depends on what you are sharing, how often it is read versus written, and whether you need transactional consistency across multiple keys. There is no universally right choice, but there is a set of patterns that come up in real systems, and knowing which one fits your workload is the difference between a snappy cache and a process queue that pegs one core under load.

The framing that matters most: a GenServer serializes everything. ETS does not. Once you have internalized that, most of the decisions follow.

The GenServer Bottleneck

A GenServer processes one message at a time. That is the whole point — it is what makes its state safe. But it also means that no matter how many processes are calling into it, only one call is being handled at any instant. The rest are queued in the mailbox.

defmodule SlowCache do
  use GenServer

  def start_link(_), do: GenServer.start_link(__MODULE__, %{}, name: __MODULE__)
  def init(state), do: {:ok, state}

  def get(key), do: GenServer.call(__MODULE__, {:get, key})
  def put(key, val), do: GenServer.call(__MODULE__, {:put, key, val})

  def handle_call({:get, key}, _from, state), do: {:reply, Map.get(state, key), state}
  def handle_call({:put, key, val}, _from, state), do: {:reply, :ok, Map.put(state, key, val)}
end

This works. It is also a textbook scalability disaster as a cache. Every reader funnels through one process. The mailbox grows. Latency spikes the moment your concurrent reader count exceeds what one process can chew through. Throughput tops out at whatever one BEAM scheduler can do for a single process — often somewhere in the tens of thousands of operations per second for trivial state, far less if the call payload is large because the data is copied into and out of the GenServer's heap on every call.

For mutable state that needs serialization (a counter, a rate limiter, a connection pool checkout), this is exactly the behaviour you want. For a read-heavy cache, it is wrong.
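
To see the queueing directly, here is a minimal sketch. The module name SerialDemo and the 10 ms of simulated work per call are assumptions made for illustration. Ten callers fire at once, but each one waits for every call ahead of it, so the total wall time is roughly ten times the per-call cost.

```elixir
defmodule SerialDemo do
  use GenServer

  def start_link(_), do: GenServer.start_link(__MODULE__, nil)

  @impl true
  def init(state), do: {:ok, state}

  # Hypothetical workload: 10 ms per call, handled strictly one at a time.
  @impl true
  def handle_call(:work, _from, state) do
    Process.sleep(10)
    {:reply, :ok, state}
  end
end

{:ok, pid} = SerialDemo.start_link(nil)

# Ten callers start concurrently; the server still drains its mailbox in order.
{elapsed_us, replies} =
  :timer.tc(fn ->
    1..10
    |> Enum.map(fn _ -> Task.async(fn -> GenServer.call(pid, :work) end) end)
    |> Task.await_many()
  end)

IO.puts("10 concurrent calls took about #{div(elapsed_us, 1000)} ms")
```

Adding more callers does not add throughput here; it only lengthens the queue.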

The ETS Cache Pattern

Now the same cache backed by ETS:

defmodule FastCache do
  use GenServer

  @table :fast_cache

  def start_link(_), do: GenServer.start_link(__MODULE__, nil, name: __MODULE__)

  def get(key) do
    case :ets.lookup(@table, key) do
      [{^key, value}] -> {:ok, value}
      [] -> :error
    end
  end

  def put(key, val), do: GenServer.cast(__MODULE__, {:put, key, val})

  @impl true
  def init(_) do
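    # :protected lets any process read while only the owner writes, which
    # matches the single-writer design; write_concurrency would buy nothing here.
    :ets.new(@table, [:set, :protected, :named_table, read_concurrency: true])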
    {:ok, nil}
  end

  @impl true
  def handle_cast({:put, key, val}, state) do
    :ets.insert(@table, {key, val})
    {:noreply, state}
  end
end

Two things changed. First, get/1 no longer talks to the GenServer at all. It reads directly from ETS in the caller's process. A thousand processes can call get/1 in parallel and no one waits. Second, the GenServer still exists — it owns the table and handles writes. Writes funnel through one process for ordering and consistency, but reads do not. The GenServer's mailbox is only ever busy with writes, which are typically much rarer than reads.
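
A minimal sketch of the read side, assuming a throwaway table named :read_demo that the current process owns: a thousand short-lived processes each look up the same key directly, and none of them sends a message to the owner.

```elixir
# The current process plays the owner and is the only writer.
table = :ets.new(:read_demo, [:set, :protected, read_concurrency: true])
:ets.insert(table, {:answer, 42})

# Each task is a separate process that reads the table directly.
results =
  1..1_000
  |> Task.async_stream(fn _ -> :ets.lookup(table, :answer) end, max_concurrency: 100)
  |> Enum.map(fn {:ok, result} -> result end)

IO.inspect(Enum.uniq(results), label: "distinct results")
```

One consequence of the design above is worth keeping in mind: because put/2 is a cast, a get/1 issued right after it may not see the new value yet. If read-your-writes matters, making the write a GenServer.call is one option.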

This is the canonical ETS-with-GenServer pattern, and you will see it in production library after production library:

  • Phoenix.Endpoint stores its compiled configuration in ETS. The endpoint process owns the table; every request reads from it without involving the endpoint at all.
  • Plug.Session.ETS is a looser variant: the application creates a public, named table, and each request process both reads and writes session data directly in ETS, with no owner process in the data path.
  • Cachex and ConCache are full libraries built around this pattern, with eviction, TTLs, and lock-striping layered on top.
  • Phoenix.PubSub tracks local subscriptions through Registry, which is itself backed by ETS, so any process can broadcast without going through a central subscription server.

The reason these libraries exist is that the pattern, while simple, has enough sharp edges (TTL, eviction, hot keys, ownership inheritance) that you do not want to rebuild it every time.

When to Keep State in the GenServer

ETS is not always better. Several situations call for keeping state in the GenServer:

You need atomic multi-key updates. The classic example: move a value from one bucket to another. In a GenServer, this is one function call. In ETS, it is a read, a delete, and an insert, and another process can observe the intermediate state where the value is in neither bucket. ETS guarantees atomicity only within a single call (one insert, update_counter, or update_element), never across a sequence of calls.
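
As a sketch of the GenServer side of that comparison, here is a hypothetical Buckets module (the name, the bucket layout, and the :error return are assumptions). The whole move happens inside one handle_call, so no other process can observe the value in neither bucket.

```elixir
defmodule Buckets do
  use GenServer

  def start_link(initial), do: GenServer.start_link(__MODULE__, initial)
  def move(pid, key, from, to), do: GenServer.call(pid, {:move, key, from, to})
  def snapshot(pid), do: GenServer.call(pid, :snapshot)

  @impl true
  def init(initial), do: {:ok, initial}

  @impl true
  def handle_call({:move, key, from, to}, _from, state) do
    with {:ok, source} <- Map.fetch(state, from),
         {:ok, value} <- Map.fetch(source, key),
         true <- Map.has_key?(state, to) do
      # Remove from the source bucket and add to the target in one step.
      new_state =
        state
        |> Map.put(from, Map.delete(source, key))
        |> Map.update!(to, &Map.put(&1, key, value))

      {:reply, :ok, new_state}
    else
      _ -> {:reply, :error, state}
    end
  end

  def handle_call(:snapshot, _from, state), do: {:reply, state, state}
end

{:ok, buckets} = Buckets.start_link(%{pending: %{job_1: "payload"}, processing: %{}})
:ok = Buckets.move(buckets, :job_1, :pending, :processing)
IO.inspect(Buckets.snapshot(buckets))
```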

Writes far outnumber reads. A queue, an inbox, a buffer that gets flushed periodically. The serialization the GenServer gives you is the whole point — there is no concurrent-read benefit to chase.

The state is small and only one process needs it. If you have a tiny config map that lives inside a worker, putting it in ETS just adds an indirection. Use the worker's state.

The state has complex invariants that span the whole structure. A finite state machine with twelve states and transitions between them is much easier to reason about as GenServer state. The serialization gives you sequential reasoning. ETS hands you back the burden of thinking about concurrent observers.

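You want supervision-level reset on crash. GenServer state vanishes when the process restarts, which is sometimes exactly what you want: a clean slate. An ETS table is also destroyed when its owner exits, but if the table is owned by a different, longer-lived process or is handed to an heir, its contents survive the crash of the process that manages them, stale entries included.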

The Lost-Atomicity Problem

The single biggest trap in moving from GenServer state to ETS is losing the actor model's serialization. Two examples make the problem concrete.

Read-modify-write race:

# Process A
[{key, count}] = :ets.lookup(:counters, key)
:ets.insert(:counters, {key, count + 1})

# Process B (interleaved)
[{key, count}] = :ets.lookup(:counters, key)
:ets.insert(:counters, {key, count + 1})

If both processes read count = 5, both compute 6, both insert {key, 6}. One increment is lost. The fix is :ets.update_counter/4, which performs the read-modify-write atomically inside ETS and inserts the given default object first if the key is missing:

:ets.update_counter(:counters, key, 1, {key, 0})
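
A small check of that claim, assuming a scratch table named :counter_demo: fifty processes each increment the same key 200 times, and no increment is lost.

```elixir
table = :ets.new(:counter_demo, [:set, :public])

# 50 processes x 200 increments, all racing on the same key.
1..50
|> Enum.map(fn _ ->
  Task.async(fn ->
    for _ <- 1..200 do
      # The {:hits, 0} default is inserted if the key is missing, then incremented.
      :ets.update_counter(table, :hits, 1, {:hits, 0})
    end
  end)
end)
|> Task.await_many()

[{:hits, total}] = :ets.lookup(table, :hits)
IO.puts("total = #{total}")
```

Swapping the update_counter call for the lookup-then-insert pair shown earlier would typically leave the total short of 10,000 on a multi-core machine.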

Multi-key invariants:

# "Move 10 dollars from alice to bob"
[{:alice, alice_balance}] = :ets.lookup(:balances, :alice)
[{:bob, bob_balance}] = :ets.lookup(:balances, :bob)
:ets.insert(:balances, {:alice, alice_balance - 10})
:ets.insert(:balances, {:bob, bob_balance + 10})

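Between the two insert calls, another process can read :balances and see Alice's account debited but Bob's not yet credited. Money has briefly disappeared. Passing both objects to a single :ets.insert/2 call as a list closes that particular window, because one insert call is atomic and isolated even for a list. It does not make the lookups part of the same atomic step, so two concurrent transfers can still overwrite each other's result, and a reader that issues two separate lookups can still straddle the write. ETS has no transaction that spans several calls. The options: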

  1. Funnel the transfer through a single process that owns the operation. Reads can still go directly to ETS for everything that does not care about the intermediate state.
  2. Encode both balances in one tuple under one key. Then a single update_counter call with a list of operations adjusts both elements atomically. This works for fixed pairs but does not generalize.
  3. Use Mnesia or a real database. ETS does not do transactions.
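
Options 1 and 2 can be sketched as follows. The Ledger module, the table names, and the :insufficient_funds error are all hypothetical, and the sketch assumes both accounts already exist. Transfers are serialized through the owner, which writes both balances with a single list insert, while balance reads go straight to ETS.

```elixir
defmodule Ledger do
  use GenServer

  @table :ledger_balances

  def start_link(balances), do: GenServer.start_link(__MODULE__, balances, name: __MODULE__)

  # Reads bypass the owner and hit ETS from the caller's process.
  def balance(account), do: :ets.lookup_element(@table, account, 2)

  # Transfers go through the owner, so two transfers cannot interleave.
  def transfer(from, to, amount), do: GenServer.call(__MODULE__, {:transfer, from, to, amount})

  @impl true
  def init(balances) do
    :ets.new(@table, [:set, :protected, :named_table, read_concurrency: true])
    :ets.insert(@table, Map.to_list(balances))
    {:ok, nil}
  end

  @impl true
  def handle_call({:transfer, from, to, amount}, _from, state) do
    [{^from, from_balance}] = :ets.lookup(@table, from)
    [{^to, to_balance}] = :ets.lookup(@table, to)

    if from_balance >= amount do
      # One insert call with a list writes both objects atomically.
      :ets.insert(@table, [{from, from_balance - amount}, {to, to_balance + amount}])
      {:reply, :ok, state}
    else
      {:reply, {:error, :insufficient_funds}, state}
    end
  end
end

{:ok, _pid} = Ledger.start_link(%{alice: 100, bob: 50})
:ok = Ledger.transfer(:alice, :bob, 10)
IO.inspect({Ledger.balance(:alice), Ledger.balance(:bob)})

# Option 2: both balances live in one object and are adjusted by one atomic call.
pair = :ets.new(:pair_demo, [:set, :public])
:ets.insert(pair, {:alice_and_bob, 100, 50})
new_values = :ets.update_counter(pair, :alice_and_bob, [{2, -10}, {3, 10}])
IO.inspect(new_values)
```

A reader that needs both balances as a consistent pair still has to either ask the owner or read the single-tuple form, since two separate lookups are two separate operations.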

In a real payments system, you would never trust ETS with money anyway. But the pattern shows up in subtler forms: updating a session and its lookup index, incrementing a counter and recording the latest timestamp, moving a queued item from "pending" to "processing". Any time two writes need to be observed together, ETS alone is not enough.

Hybrid Patterns Worth Knowing

In practice, real systems mix GenServers and ETS in a few recurring shapes.

Owner-and-readers. One GenServer owns the table, handles writes, applies eviction. Every other process reads directly. This is what we built above. Best for caches with high read-to-write ratios.

Sharded owners. When write throughput exceeds what one GenServer can handle, you shard. N GenServers each own their own ETS table. Callers route by :erlang.phash2(key, n), which returns an integer from 0 to n - 1, to find the right owner. Reads still go straight to ETS (the key tells you which shard to look in). Discord's engineering team has written publicly about using ETS in its Elixir services, and partitioning across many tables is the standard way to keep any single owner from becoming a hot spot.
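
A minimal sketch of the routing, under stated assumptions: the ShardedCache module, the shard count of 4, and the table naming scheme are invented for illustration, and the shards are started with a plain loop where a real system would put them under a supervisor.

```elixir
defmodule ShardedCache do
  use GenServer

  @shards 4

  # Starts one owner process per shard (unsupervised, for brevity).
  def start_all do
    for index <- 0..(@shards - 1) do
      {:ok, _pid} = GenServer.start_link(__MODULE__, index, name: shard_name(index))
    end

    :ok
  end

  # Writes go to the owner of the key's shard.
  def put(key, value), do: GenServer.call(shard_for(key), {:put, key, value})

  # Reads go straight to that shard's table from the caller's process.
  def get(key) do
    case :ets.lookup(shard_for(key), key) do
      [{^key, value}] -> {:ok, value}
      [] -> :error
    end
  end

  # :erlang.phash2(key, n) returns an integer in 0..n-1.
  defp shard_for(key), do: shard_name(:erlang.phash2(key, @shards))

  # The same atom names both the owner process and its table.
  defp shard_name(index), do: :"sharded_cache_#{index}"

  @impl true
  def init(index) do
    table = :ets.new(shard_name(index), [:set, :protected, :named_table, read_concurrency: true])
    {:ok, table}
  end

  @impl true
  def handle_call({:put, key, value}, _from, table) do
    :ets.insert(table, {key, value})
    {:reply, :ok, table}
  end
end

:ok = ShardedCache.start_all()
for id <- 1..100, do: :ok = ShardedCache.put({:user, id}, id * 2)
IO.inspect(ShardedCache.get({:user, 7}))
```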

ETS as backing store for GenServer state. Sometimes a GenServer holds state that occasionally needs to be queried by many other processes. Instead of routing every query through the GenServer, mirror the state into ETS on every update. The GenServer remains the source of truth; ETS is a read-only projection.

def handle_call({:update, user_id, attrs}, _from, state) do
  new_state = update_user(state, user_id, attrs)
  :ets.insert(:user_cache, {user_id, get_in(new_state, [:users, user_id])})
  {:reply, :ok, new_state}
end

Direct-read with versioning. When you absolutely cannot tolerate stale reads, but you also cannot afford to go through a GenServer, store a version number alongside the data. Readers check the version, retry if it changed mid-read. This is a hand-rolled optimistic concurrency scheme. Most people do not need it — the simpler patterns above are usually enough.
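
One way to sketch the versioning idea is a seqlock-style counter. This is an illustrative assumption rather than a standard library feature: the VersionedPair module and its keys are hypothetical, and the scheme relies on there being exactly one writer. The writer makes the version odd while a write is in progress and even when it is done; a reader retries if the version was odd or changed during its read.

```elixir
defmodule VersionedPair do
  @table :versioned_pair_demo

  def setup do
    :ets.new(@table, [:set, :public, :named_table])
    :ets.insert(@table, [{:version, 0}, {:a, 1}, {:b, 1}])
    :ok
  end

  # Assumes a single writer process.
  def write(a, b) do
    :ets.update_counter(@table, :version, 1)  # odd: write in progress
    :ets.insert(@table, {:a, a})
    :ets.insert(@table, {:b, b})
    :ets.update_counter(@table, :version, 1)  # even: stable again
    :ok
  end

  # Retries until it sees the same even version before and after the read.
  def read do
    v1 = :ets.lookup_element(@table, :version, 2)
    a = :ets.lookup_element(@table, :a, 2)
    b = :ets.lookup_element(@table, :b, 2)
    v2 = :ets.lookup_element(@table, :version, 2)

    if v1 == v2 and rem(v1, 2) == 0, do: {a, b}, else: read()
  end
end

:ok = VersionedPair.setup()
:ok = VersionedPair.write(2, 2)
IO.inspect(VersionedPair.read())
```

The retry loop is unbounded here; a production version would want a retry limit or a fallback to asking the writer.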

A Realistic Comparison

To make the trade-off concrete, here is the same feature — a per-user feature flag store — written both ways.

GenServer version:

defmodule Flags.Server do
  use GenServer

  def start_link(_), do: GenServer.start_link(__MODULE__, %{}, name: __MODULE__)
  def init(state), do: {:ok, state}

  def enabled?(user_id, flag), do: GenServer.call(__MODULE__, {:check, user_id, flag})
  def enable(user_id, flag),   do: GenServer.cast(__MODULE__, {:enable, user_id, flag})

  def handle_call({:check, user_id, flag}, _from, state) do
    {:reply, Map.get(state, {user_id, flag}, false), state}
  end

  def handle_cast({:enable, user_id, flag}, state) do
    {:noreply, Map.put(state, {user_id, flag}, true)}
  end
end

ETS version:

defmodule Flags.Ets do
  use GenServer
  @table :flags

  def start_link(_), do: GenServer.start_link(__MODULE__, nil, name: __MODULE__)

  def enabled?(user_id, flag) do
    case :ets.lookup(@table, {user_id, flag}) do
      [{_, value}] -> value
      [] -> false
    end
  end

  def enable(user_id, flag), do: GenServer.cast(__MODULE__, {:enable, user_id, flag})

  def init(_) do
    # :protected: any process may read, only this owner may write.
    :ets.new(@table, [:set, :protected, :named_table, read_concurrency: true])
    {:ok, nil}
  end

  def handle_cast({:enable, user_id, flag}, state) do
    :ets.insert(@table, {{user_id, flag}, true})
    {:noreply, state}
  end
end

The code is almost identical. The difference shows up under load. Picture a Phoenix LiveView dashboard with 5,000 concurrent users where every WebSocket event calls enabled?/2: the GenServer version queues and the ETS version does not. For a feature-flag check that happens on every request, that can be the difference between a lookup that costs microseconds and one whose 99th-percentile latency stretches into milliseconds while it waits in a mailbox. The exact figures depend on hardware and load, so measure your own workload.

Common Pitfalls

Reaching for ETS by reflex. Not every shared state needs it. If you have a single GenServer that handles a few hundred requests per second and the state fits in its heap, leave it alone. ETS adds complexity — ownership, atomicity gotchas, eviction — that is not worth paying for sub-bottleneck workloads.

Forgetting that writes still go through one process. The cache pattern speeds up reads. Writes still funnel through the owner. If your workload is 80% writes and 20% reads, ETS does not help much, and you probably want sharding or a different design altogether.

Mixing direct ETS writes with GenServer-mediated writes. Pick one. If most writes go through the GenServer but a few "fast path" writes go directly, you have two writers racing, and now your "owner" is not really an owner. Consistency falls apart at the seams.
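
One way to make "pick one" enforceable is the :protected access mode, which lets any process read but only the owner write. A minimal sketch, assuming a scratch table named :protected_demo:

```elixir
parent = self()

owner =
  spawn(fn ->
    :ets.new(:protected_demo, [:set, :protected, :named_table])
    :ets.insert(:protected_demo, {:k, 1})
    send(parent, :ready)

    # Keep the owner, and therefore the table, alive until told to stop.
    receive do
      :stop -> :ok
    end
  end)

receive do
  :ready -> :ok
end

# Reading from a non-owner process works.
read_result = :ets.lookup(:protected_demo, :k)

# Writing from a non-owner process raises ArgumentError.
write_result =
  try do
    :ets.insert(:protected_demo, {:k, 2})
  rescue
    ArgumentError -> :rejected
  end

IO.inspect({read_result, write_result})
send(owner, :stop)
```

With a :public table the stray write would have succeeded silently, which is exactly the race this pitfall describes.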

Using GenServer.call to read from ETS. You see this a lot in code reviews: someone wraps the ETS read in a handle_call. The whole point of ETS is to skip the GenServer on reads. If you are calling through, you have the worst of both worlds — the GenServer bottleneck plus the ETS complexity.

Not handling the table's death. If the owner crashes, the table goes with it. Readers that hit a named table in the gap before the supervisor restarts the owner get an ArgumentError, and readers holding a table reference (tid) rather than a name keep failing even after the table is recreated. Either use the heir option to transfer ownership on death, or accept that crashes wipe the cache, which is often fine: the supervisor restarts the owner, the table is recreated, and the cache warms up again.
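
A minimal sketch of the heir option, assuming a scratch table named :heir_demo and using the current process as the heir. When the owner exits, the heir becomes the new owner and receives an "ETS-TRANSFER" message carrying the data given in the heir option.

```elixir
parent = self()

owner =
  spawn(fn ->
    # {:heir, pid, data}: when this owner exits, `pid` inherits the table and
    # receives {:"ETS-TRANSFER", table, old_owner, data}.
    :ets.new(:heir_demo, [:set, :protected, :named_table, {:heir, parent, :handoff}])
    :ets.insert(:heir_demo, {:k, 1})
    # The function returns here, so the owner exits, standing in for a crash.
  end)

transferred =
  receive do
    {:"ETS-TRANSFER", _table, ^owner, :handoff} -> :ets.lookup(:heir_demo, :k)
  after
    1_000 -> :timeout
  end

IO.inspect(transferred)
```

In a real system the heir is usually a small, long-lived process whose only job is to hold the table and hand it back to the restarted owner with :ets.give_away/3.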

Key Takeaways

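  • A GenServer serializes everything through one mailbox. ETS does not. For read-heavy shared data, ETS is dramatically faster because reads run in parallel in the callers' own processes with no mailbox in the way (each read still copies the result into the caller's heap).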
  • The canonical cache pattern: one GenServer owns the table and handles writes; everyone else reads directly. Phoenix's endpoint configuration follows it, and libraries such as Cachex and ConCache package ETS-backed caching with eviction and TTLs layered on top.
  • ETS guarantees atomicity only within a single call. Use update_counter or update_element for atomic single-object updates; funnel through a GenServer for any read-modify-write that spans keys.
  • Keep state in the GenServer when writes dominate, when invariants span the whole structure, or when you want supervisor-level reset on crash.
  • For very high write throughput, shard across N owners with :erlang.phash2(key, n) so that no single owner becomes a hot spot.
  • Never wrap an ETS read in GenServer.call — that defeats the entire point. Read directly from the caller's process.