ETS Basics

ETS (Erlang Term Storage) is the escape hatch from the process model. Everywhere else in Elixir, sharing data means sending messages, which means copying. ETS gives you an in-memory key/value table that any process on the node can read or write directly, with no message round-trip and no owner process on the read path. It is how Phoenix keeps its endpoint config, how Discord caches gateway state, and how libraries like Cachex and ConCache build local caches that do not depend on any single client process staying alive.

The mental model worth holding onto from the start: an ETS table is not owned by a process the way state is owned by a GenServer. The table lives in the BEAM, outside any process heap. A process creates it and is recorded as the owner — if that process dies, the table dies with it by default — but reads and writes do not go through the owner. They go straight to the table. That is the whole point.

Why ETS Exists

Process isolation is the BEAM's superpower and its tax. When you GenServer.call/2 to fetch a value, the value is copied from the GenServer's heap into the caller's heap. For a small map, this is invisible. For a 50 KB cached data structure (a large nested map, say) fetched 10,000 times per second, the copying becomes a real cost, and the GenServer itself becomes a single-threaded bottleneck because every call serializes through one mailbox. Large binaries are the exception: anything over 64 bytes is reference-counted and shared between processes instead of copied.

ETS removes the bottleneck and trims the copying. Reads can happen in parallel from any number of processes. With read_concurrency: true, the table's locking is optimized so that concurrent readers do not contend with one another. The data is read directly from the shared table memory: :ets.lookup/2 copies the matching tuples once into the calling process's heap, while operations such as :ets.member/2 or :ets.update_counter/3 hand back only a boolean or an integer. Either way, there is no message round-trip and no serialization point.

# Create a table named :my_cache
:ets.new(:my_cache, [:set, :public, :named_table, read_concurrency: true])

# Any process can write
:ets.insert(:my_cache, {"user:42", %{name: "Alice", age: 30}})

# Any process can read
[{"user:42", user}] = :ets.lookup(:my_cache, "user:42")

That is the whole API in three lines. There is no GenServer in front of it, no message passing, and no mailbox to queue behind. Both insert and lookup run in constant time on average on a :set table.

The Four Table Types

ETS supports four table types, and the choice has consequences for both semantics and performance.

:set is the default. One value per key. Inserting with an existing key overwrites. O(1) lookup and insert. This is what you want 90% of the time.

:ordered_set keeps keys in term order. You give up O(1) lookup (it becomes O(log n)) but gain ordered traversal and range queries. Use it when you actually need to iterate keys in order — a leaderboard, a time-sorted event log, a sorted index. If you do not need ordering, do not pay for it.

:bag allows multiple values per key, but each {key, value} tuple must be unique. Inserting the same tuple twice is a no-op. Lookup returns all values for that key.

:duplicate_bag is like :bag but allows fully duplicate tuples. Same key, same value, multiple entries. Useful for event logs where you really do want every insert recorded, even identical ones.

A concrete comparison: tracking which roles a user has.

# :set — one role per user (overwrites)
:ets.new(:user_role, [:set, :named_table])
:ets.insert(:user_role, {"alice", :admin})
:ets.insert(:user_role, {"alice", :editor})  # overwrites
:ets.lookup(:user_role, "alice")
# [{"alice", :editor}]

# :bag — multiple roles per user, no duplicates
:ets.new(:user_roles, [:bag, :named_table])
:ets.insert(:user_roles, {"alice", :admin})
:ets.insert(:user_roles, {"alice", :editor})
:ets.insert(:user_roles, {"alice", :editor})  # ignored
:ets.lookup(:user_roles, "alice")
# [{"alice", :admin}, {"alice", :editor}]

# :duplicate_bag — every insert kept
:ets.new(:login_log, [:duplicate_bag, :named_table])
:ets.insert(:login_log, {"alice", ~U[2026-01-01 10:00:00Z]})
:ets.insert(:login_log, {"alice", ~U[2026-01-01 10:00:00Z]})
:ets.lookup(:login_log, "alice")
# [{"alice", ~U[...]}, {"alice", ~U[...]}]

In practice, :set covers most cases, :ordered_set covers anything sorted, and the bag variants come up rarely outside of inverted indexes and event logging.
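The :ordered_set case is the one the comparison above leaves out, so here is a minimal sketch of ordered traversal. The table name and the {score, player} key layout are assumptions made for illustration, not something the article prescribes.

```elixir
# Hypothetical leaderboard: keying on {score, player} makes rows sort by score.
board = :ets.new(:leaderboard_sketch, [:ordered_set, :public])

:ets.insert(board, [
  {{120, "carol"}, :ok},
  {{300, "alice"}, :ok},
  {{210, "bob"}, :ok}
])

# On an :ordered_set, traversal follows key order (lowest key first).
ascending = for {{score, player}, _} <- :ets.tab2list(board), do: {player, score}

# :ets.last/1 and :ets.prev/2 walk downward from the highest key.
{top_score, top_player} = :ets.last(board)
{second_score, second_player} = :ets.prev(board, {top_score, top_player})

IO.inspect(ascending)
IO.inspect({top_player, second_player})
```

Putting the score first in the key is what makes "top N" a cheap walk from the end of the table instead of a full sort.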

Access Modes

The second knob is who can read and write the table.

:public — any process can read and write. The simplest and most common choice for caches and shared data.

:protected — the default. The owner can read and write. Everyone else can only read. Useful when one process should be the sole writer (a GenServer that owns the table) but readers are unrestricted.

:private — only the owner can touch the table at all. Rarely useful, since the same effect is achievable with regular process state.

The cache pattern that shows up everywhere in real Elixir apps is: a GenServer creates a :public table, handles writes itself for consistency, and lets every other process read directly. With :public, the single-writer rule is a convention; :protected enforces it. The next topic covers this in detail, but it is worth noting now because it is the dominant shape of ETS usage in production.

:ets.new(:shared_cache, [:set, :public, :named_table, read_concurrency: true])
:ets.new(:read_only,    [:set, :protected, :named_table])
:ets.new(:internal,     [:set, :private])
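To make the difference between the modes concrete, the sketch below creates a :protected table and has a second process try both a read and a write. The table name and row are placeholders.

```elixir
# The current process owns this :protected table.
table = :ets.new(:prices_sketch, [:set, :protected])
:ets.insert(table, {"widget", 10})

# A different process may read, but its write raises ArgumentError.
task =
  Task.async(fn ->
    read = :ets.lookup(table, "widget")

    write =
      try do
        :ets.insert(table, {"widget", 99})
      rescue
        ArgumentError -> :write_rejected
      end

    {read, write}
  end)

{read, write} = Task.await(task)
IO.inspect({read, write})
```

The row is unchanged after the rejected write, since only the owner can modify a :protected table.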

The Core API

You can build almost anything from five functions: new, insert, lookup, delete, and one of the update primitives.

:ets.new/2 creates a table and returns either a table identifier or the registered name if you pass :named_table. Named tables let you reference the table by atom name from any process, which is the typical pattern.

table = :ets.new(:cache, [:set, :public, :named_table, read_concurrency: true])
# table is :cache (the atom) because :named_table was passed

:ets.insert/2 writes a tuple. The first element of the tuple is the key by default; you can change which position is the key with the keypos option, though almost nobody does. Insert is destructive on :set — the existing value is replaced.

:ets.insert(:cache, {"key1", "value1"})
:ets.insert(:cache, [{"key2", "value2"}, {"key3", "value3"}])  # batch
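For the rare case where the key is not the first element, here is a small sketch of the keypos option. The :person tag in position 1 is an assumed layout, mimicking how Erlang records put a tag first.

```elixir
# keypos: 2 tells ETS to use the second tuple element as the key.
people = :ets.new(:people_sketch, [:set, :public, keypos: 2])

:ets.insert(people, {:person, "alice", 30})
:ets.insert(people, {:person, "bob", 41})

# Lookups use the value in position 2, not the :person tag.
found = :ets.lookup(people, "alice")
IO.inspect(found)
```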

:ets.lookup/2 returns a list of matching tuples. For :set and :ordered_set, the list has zero or one element. For :bag and :duplicate_bag, it can have many.

case :ets.lookup(:cache, "key1") do
  [{"key1", value}] -> {:ok, value}
  [] -> :error
end

:ets.delete/2 removes all tuples for a key. :ets.delete/1 destroys the whole table.

:ets.delete(:cache, "key1")
:ets.delete(:cache)  # destroys the table

For atomic in-place mutations, ETS provides two specialized primitives. :ets.update_counter/3 atomically increments an integer in a tuple position — useful for hit counters, rate limiters, and any monotonically increasing value. :ets.update_element/3 atomically replaces specific positions in an existing tuple without copying through application code.

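# Tables must exist before the first insert
:ets.new(:counters, [:set, :public, :named_table])
:ets.insert(:counters, {"hits", 0})
:ets.update_counter(:counters, "hits", 1)
# 1
:ets.update_counter(:counters, "hits", 1)
# 2

:ets.new(:users, [:set, :public, :named_table])
:ets.insert(:users, {"alice", "alice@example.com", :active})
# Replace element 3 in place (tuple positions are 1-based)
:ets.update_element(:users, "alice", {3, :suspended})
# true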

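These two are the simplest safe way to update a row in place from multiple processes. :ets.insert_new/2 (insert only if the key is absent) and :ets.select_replace/2 (replace only rows that still match) cover a few more cases atomically. The naive sequence of lookup, mutate in Elixir, insert is a race, because another process can write between your lookup and your insert. We will come back to this in the next topic.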
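When the new value cannot be expressed as a counter increment, a compare-and-swap loop is one option. The sketch below follows the pattern shown in the Erlang documentation for :ets.select_replace/2. The CasSketch module and its update/3 function are hypothetical names, and the sketch assumes stored values contain no match-spec atoms such as :_ or :"$1".

```elixir
defmodule CasSketch do
  # Hypothetical helper: applies `fun` to the value under `key`,
  # retrying if another process changed the row in the meantime.
  def update(table, key, fun) do
    case :ets.lookup(table, key) do
      [{^key, value} = old] ->
        new = {key, fun.(value)}

        # Replaces the row only if it still equals `old`; returns the
        # number of rows replaced (1 on success, 0 if we lost the race).
        case :ets.select_replace(table, [{old, [], [{:const, new}]}]) do
          1 -> {:ok, elem(new, 1)}
          0 -> update(table, key, fun)
        end

      [] ->
        :error
    end
  end
end

table = :ets.new(:cas_sketch, [:set, :public])
:ets.insert(table, {"stock", 5})
first = CasSketch.update(table, "stock", &(&1 + 1))
IO.inspect(first)
```

Under heavy contention the retries add up, which is where funnelling writes through a single process becomes the simpler choice.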

ETS Lives Outside the Process Model

This is the part that surprises people coming from a pure actor-model mindset. An ETS table is not a process. It does not have a mailbox. It is not scheduled. It is a chunk of memory inside the BEAM with a hashtable (or tree, for :ordered_set) and a set of locks.

That means:

  • There is no message round-trip. A lookup is a function call into the BEAM, not a send and a receive.
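  • The table outlives any single process only if you ask it to. By default, an ETS table is destroyed when its owner dies. Pass the {:heir, pid, data} option to :ets.new/2 so the table is handed to another process when the owner dies, or call :ets.give_away/3 to transfer ownership while the owner is still alive. Note that the option is a three-element tuple; the keyword form heir: {pid, data} is not accepted.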
  • ETS does not respect the actor model's serialization guarantees. Two processes can write to the same key concurrently, and only one write wins. If you need atomicity, you use the atomic primitives or you funnel writes through a single process.

This last point is the one that bites people. The mental shortcut "Elixir state is safe because processes serialize access" stops applying the moment you touch ETS. Concurrent writes need either atomic operations or a write-serializing owner.
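The ownership bullet above can be sketched with the heir option, which :ets.new/2 takes as a three-element tuple {:heir, pid, data}. The table name and the :from_creator marker are illustrative assumptions.

```elixir
parent = self()

# A short-lived creator names the parent as heir, then exits normally.
creator =
  spawn(fn ->
    :ets.new(:inherited_sketch, [:set, :public, :named_table, {:heir, parent, :from_creator}])
    :ets.insert(:inherited_sketch, {"k", 1})
  end)

# When the creator exits, the heir receives an 'ETS-TRANSFER' message
# carrying the previous owner's pid and the heir data.
transfer =
  receive do
    {:"ETS-TRANSFER", _table, ^creator, :from_creator} -> :inherited
  after
    1_000 -> :timeout
  end

rows = :ets.lookup(:inherited_sketch, "k")
owner = :ets.info(:inherited_sketch, :owner)
IO.inspect({transfer, rows, owner == parent})
```

The heir has to be alive when the owner dies. In practice it is usually a supervisor-managed process that can hand the table back to a restarted worker with :ets.give_away/3.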

A Small Worked Example

A page view counter, the kind every analytics service needs:

defmodule PageViews do
  @table :page_views

  def setup do
    :ets.new(@table, [:set, :public, :named_table,
                      read_concurrency: true, write_concurrency: true])
  end

  def record(path) do
    :ets.update_counter(@table, path, 1, {path, 0})
  end

  def count(path) do
    case :ets.lookup(@table, path) do
      [{^path, n}] -> n
      [] -> 0
    end
  end

  def top(n) do
    @table
    |> :ets.tab2list()
    |> Enum.sort_by(fn {_, count} -> count end, :desc)
    |> Enum.take(n)
  end
end

Every request handler in a Phoenix app can call PageViews.record(conn.request_path) concurrently. :ets.update_counter/4 is atomic — the fourth argument is the default tuple to insert if the key does not exist. No GenServer in the hot path, no serialization point, no copying on the increment.

tab2list/1 reads the whole table. For a page view counter that is fine; for anything with millions of rows, you would use :ets.foldl/3 or :ets.select/2 with a match specification instead. We will not go deep on match specs here — they are powerful but their syntax is a topic of its own.
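As a rough illustration of those two alternatives, the sketch below runs them against a small table of {path, count} rows. The table name, sample data and the threshold of 100 are made up.

```elixir
views = :ets.new(:views_sketch, [:set, :public])
:ets.insert(views, [{"/", 250}, {"/pricing", 90}, {"/docs", 140}])

# foldl/3 visits one row at a time instead of building a list of the whole table.
total = :ets.foldl(fn {_path, count}, acc -> acc + count end, 0, views)

# Match spec: the head binds path to $1 and count to $2, the guard keeps
# rows with count > 100, and the body returns only the path.
busy = :ets.select(views, [{{:"$1", :"$2"}, [{:>, :"$2", 100}], [:"$1"]}])

IO.inspect(total)
IO.inspect(Enum.sort(busy))
```

The filtering in select/2 happens inside ETS, so only the matching paths are copied into the calling process.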

Common Pitfalls

Treating ETS like a process and worrying about race conditions on reads. ETS reads are safe to do concurrently from any number of processes. There is no need for synchronization on the read path. Locks exist inside ETS, but they are fine-grained and invisible to you.

Doing read-modify-write without atomicity. :ets.lookup/2 followed by :ets.insert/2 is not atomic. Two processes can both lookup the same value, both compute a new value from it, and both insert — one of the writes is silently lost. Use :ets.update_counter/3 or :ets.update_element/3 when you can, and funnel through a single GenServer when you cannot.
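The lost update is easiest to see by replaying the bad interleaving by hand. In this sketch a single process plays both writers, so the outcome is deterministic. The table and key names are placeholders.

```elixir
t = :ets.new(:race_sketch, [:set, :public])
:ets.insert(t, {"hits", 0})

# Both "processes" read before either one writes.
[{"hits", seen_by_a}] = :ets.lookup(t, "hits")
[{"hits", seen_by_b}] = :ets.lookup(t, "hits")
:ets.insert(t, {"hits", seen_by_a + 1})
:ets.insert(t, {"hits", seen_by_b + 1})
[{"hits", lost}] = :ets.lookup(t, "hits")

# Same counter, this time incremented atomically from concurrent tasks.
:ets.insert(t, {"hits", 0})
1..100 |> Task.async_stream(fn _ -> :ets.update_counter(t, "hits", 1) end) |> Stream.run()
[{"hits", kept}] = :ets.lookup(t, "hits")

IO.inspect({lost, kept})
```

Two increments produce 1 in the first half because the second insert overwrites the first. The atomic version keeps all 100.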

Forgetting that the owner's death kills the table. A common bug: spawn a setup task that creates the table and dies. The table dies with it. Either create the table from a long-lived process (your GenServer, or another child of your application's supervisor) or pass the {:heir, pid, data} option so ownership transfers on death.
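The bug takes only a few lines to reproduce. This sketch uses spawn_monitor/1 to wait for the setup process to exit and then asks whether the named table still exists. The table name is a placeholder.

```elixir
# A throwaway process creates the table and exits immediately.
{pid, ref} =
  spawn_monitor(fn ->
    :ets.new(:short_lived_sketch, [:set, :public, :named_table])
  end)

# Wait until the creator is confirmed dead.
receive do
  {:DOWN, ^ref, :process, ^pid, _reason} -> :ok
after
  1_000 -> raise "creator did not exit in time"
end

# :ets.whereis/1 returns :undefined when no table is registered under the name.
after_death = :ets.whereis(:short_lived_sketch)
IO.inspect(after_death)
```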

Skipping read_concurrency: true on read-heavy tables. It is not the default. For a cache that gets read thousands of times per second across many processes, the difference is measurable. Set it. The cost is that switching between reads and writes becomes more expensive, which usually does not matter for read-dominated cache workloads.

Using :ets.tab2list/1 on large tables. It copies every tuple into the calling process. On a table with a million entries, this is a multi-megabyte heap allocation and a pause. Use :ets.foldl/3, :ets.select/3 with a limit, or :ets.first/1 + :ets.next/2 to traverse the table in pieces.
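One way to traverse in bounded chunks is :ets.select/3 with a limit, continued with :ets.select/1. The sketch wraps that in a lazy stream. EtsStreamSketch and rows/2 are hypothetical names, and it assumes the table is not modified during traversal.

```elixir
defmodule EtsStreamSketch do
  # Hypothetical helper: lazily yields every row, fetching roughly
  # `chunk_size` rows per call into ETS.
  def rows(table, chunk_size \\ 100) do
    # Match spec that matches any object and returns it whole.
    match_all = [{:"$1", [], [:"$1"]}]

    Stream.resource(
      fn -> :ets.select(table, match_all, chunk_size) end,
      fn
        :"$end_of_table" -> {:halt, :done}
        {rows, continuation} -> {rows, next_chunk(continuation)}
      end,
      fn _ -> :ok end
    )
  end

  defp next_chunk(:"$end_of_table"), do: :"$end_of_table"
  defp next_chunk(continuation), do: :ets.select(continuation)
end

big = :ets.new(:big_sketch, [:set, :public])
:ets.insert(big, for(i <- 1..250, do: {i, i * 2}))

row_count = big |> EtsStreamSketch.rows(100) |> Enum.count()
value_sum = big |> EtsStreamSketch.rows(100) |> Enum.reduce(0, fn {_k, v}, acc -> acc + v end)
IO.inspect({row_count, value_sum})
```

Only one chunk lives on the caller's heap at a time. If other processes may write during the walk, :ets.safe_fixtable/2 is the documented way to keep the traversal consistent.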

Key Takeaways

  • ETS is a shared, in-memory key/value table managed by the BEAM, outside the process model.
  • Four table types: :set (unique keys, O(1)), :ordered_set (sorted keys, O(log n)), :bag (multiple values per key, no duplicates), :duplicate_bag (multiple, duplicates allowed).
  • Access modes are :public, :protected (default), and :private. Cache patterns almost always want :public with read_concurrency: true.
  • The core API is :ets.new/2, :ets.insert/2, :ets.lookup/2, :ets.delete/1 and /2. Atomic mutations use :ets.update_counter/3 and :ets.update_element/3.
  • An ETS table is owned by a process and dies when the owner dies, unless you transfer ownership with the {:heir, pid, data} option or :ets.give_away/3.
  • Reads do not go through any process. Writes do not either — but two processes writing the same key race unless you use the atomic primitives.