Multiple Generators, into, and reduce

A comprehension with a single generator is essentially a fancy Enum.map. The features that make for worth its own keyword are the ones a single-generator example does not show: multiple generators that produce Cartesian products, the :into option that collects into any Collectable, and the :reduce option that turns the whole construct into a stateful fold. Each of these has a clear use case that is genuinely awkward to write any other way.

This is also where comprehensions start to feel less like list-processing and more like a small declarative DSL for "describe the collection I want."

Multiple Generators

Two or more generators in a single for produce a Cartesian product. Every combination of values is yielded to the body.

for x <- [1, 2, 3], y <- [:a, :b], do: {x, y}
# [{1, :a}, {1, :b}, {2, :a}, {2, :b}, {3, :a}, {3, :b}]

The rightmost generator advances fastest, the leftmost slowest — same nesting order as a nested for loop in C. If you mentally translate it as a loop-inside-a-loop, you will not be far off:

# Python-style sketch of the same nesting
def pairs():
    for x in [1, 2, 3]:
        for y in ["a", "b"]:
            yield (x, y)

The body runs once per combination, and the results are flattened into a single list.
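For readers who think in Enum pipelines, the same product can be sketched with Enum.flat_map/2 wrapping Enum.map/2. This only illustrates the nesting order; it is not a claim about the code the compiler actually generates.

```elixir
# Outer generator -> flat_map, inner generator -> map.
via_enum =
  Enum.flat_map([1, 2, 3], fn x ->
    Enum.map([:a, :b], fn y -> {x, y} end)
  end)

via_for = for x <- [1, 2, 3], y <- [:a, :b], do: {x, y}

# Both produce the pairs in the same order: x varies slowest, y fastest.
IO.inspect(via_enum == via_for)
# true
```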

Filters can be placed between generators, and they apply to whatever is bound at that point.

for x <- 1..10,
    rem(x, 2) == 0,
    y <- 1..10,
    rem(y, 3) == 0,
    do: {x, y}

The filter rem(x, 2) == 0 runs once per x — when it rejects an x, the inner generator never iterates for that x. This is the same short-circuiting that makes nested loops with early continue efficient. The compiler does this for you; you do not need to think about it.
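One way to observe the short-circuit is to make the inner generator announce itself. The sketch below uses a hypothetical FilterDemo module whose inner/1 helper sends a message to the current process each time it is called; the ranges are shrunk to keep the output small.

```elixir
defmodule FilterDemo do
  # Hypothetical helper: records each time the inner generator is started.
  def inner(x) do
    send(self(), {:inner_started, x})
    1..6
  end

  def pairs do
    for x <- 1..4,
        rem(x, 2) == 0,
        y <- inner(x),
        rem(y, 3) == 0,
        do: {x, y}
  end

  # Drains the mailbox and returns the x values inner/1 was called with.
  def started(acc \\ []) do
    receive do
      {:inner_started, x} -> started([x | acc])
    after
      0 -> Enum.reverse(acc)
    end
  end
end

IO.inspect(FilterDemo.pairs())
# [{2, 3}, {2, 6}, {4, 3}, {4, 6}]
IO.inspect(FilterDemo.started())
# [2, 4]
```

The inner generator was only started for x = 2 and x = 4; the odd values were rejected before it was reached.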

Building a Board State

A textbook use of multiple generators is generating a coordinate grid, which is useful for game boards, image grids, tile maps, or any other 2D space.

defmodule Board do
  def empty(width, height) do
    for x <- 0..(width - 1),
        y <- 0..(height - 1),
        into: %{} do
      {{x, y}, :empty}
    end
  end
end

Board.empty(3, 2)
# %{
#   {0, 0} => :empty, {0, 1} => :empty,
#   {1, 0} => :empty, {1, 1} => :empty,
#   {2, 0} => :empty, {2, 1} => :empty
# }

The into: %{} option (more on that in a moment) collects the {key, value} tuples directly into a map. The result is a coordinate-keyed grid you can update with Map.put/3 and look up with Map.get/2.
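A minimal sketch of working with such a grid follows. GridSketch and its place/3 helper are hypothetical names used for illustration. The explicit //1 range step is also an addition and assumes Elixir 1.12 or later: without it, a width or height of 0 would yield the range 0..-1, which counts down instead of being empty.

```elixir
defmodule GridSketch do
  # Same shape as Board.empty/2, with an explicit ascending step so that
  # a zero-sized dimension produces no cells.
  def empty(width, height) do
    for x <- 0..(width - 1)//1,
        y <- 0..(height - 1)//1,
        into: %{} do
      {{x, y}, :empty}
    end
  end

  # Hypothetical helper: only coordinates that exist on the board are writable.
  def place(board, coord, piece) when is_map_key(board, coord) do
    Map.put(board, coord, piece)
  end
end

board = GridSketch.empty(3, 2) |> GridSketch.place({1, 0}, :wall)

IO.inspect(Map.get(board, {1, 0}))
# :wall
IO.inspect(Map.get(board, {9, 9}))
# nil
IO.inspect(map_size(GridSketch.empty(0, 2)))
# 0
```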

A more realistic version where the initial state depends on coordinates — say, a chess board:

def chess_board do
  for file <- ~w(a b c d e f g h)a,
      rank <- 1..8,
      into: %{} do
    {{file, rank}, piece_for(file, rank)}
  end
end

Sixty-four entries, generated declaratively, with no helper recursion and no two-layer Enum.reduce.
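The piece_for/2 helper is not shown above. One possible shape for it, purely as an assumption, is a set of function clauses keyed on rank. The ChessSketch module, the {colour, piece} tuples, and the :empty marker below are all illustrative choices.

```elixir
defmodule ChessSketch do
  # Hypothetical back-rank layout keyed by file.
  @back_rank %{a: :rook, b: :knight, c: :bishop, d: :queen,
               e: :king, f: :bishop, g: :knight, h: :rook}

  def chess_board do
    for file <- ~w(a b c d e f g h)a,
        rank <- 1..8,
        into: %{} do
      {{file, rank}, piece_for(file, rank)}
    end
  end

  # One clause per occupied rank; every other square starts empty.
  defp piece_for(file, 1), do: {:white, Map.fetch!(@back_rank, file)}
  defp piece_for(_file, 2), do: {:white, :pawn}
  defp piece_for(_file, 7), do: {:black, :pawn}
  defp piece_for(file, 8), do: {:black, Map.fetch!(@back_rank, file)}
  defp piece_for(_file, _rank), do: :empty
end

board = ChessSketch.chess_board()
IO.inspect(map_size(board))
# 64
IO.inspect(board[{:e, 1}])
# {:white, :king}
```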

The :into Option

Every comprehension can take an :into option that says "instead of collecting into a list, collect into this." The right side must implement the Collectable protocol, which ships with implementations for a handful of built-in types and can be implemented for your own modules.

for word <- ~w(elixir erlang otp beam), into: MapSet.new(), do: String.upcase(word)
# MapSet.new(["BEAM", "ELIXIR", "ERLANG", "OTP"])

Common collectable targets:

  • %{} — collects {key, value} tuples into a map. Duplicate keys overwrite.
  • MapSet.new() — collects into a set.
  • An existing map, %{count: 0} — adds to the map, overwriting keys.
  • A file or IO stream — writes each value to the file/handle.
  • "" (a binary) — concatenates string values.
  • <<>> — accumulates a bitstring (used for the binary comprehensions covered next).

# File.open!/3 with a function closes the handle when the function returns.
File.open!("output.txt", [:write], fn handle ->
  for line <- lines, into: IO.stream(handle, :line) do
    String.trim(line) <> "\n"
  end
end)

You can build a CSV, write logs, or stream to any IO device with the same syntax you used for the in-memory case. Collectable is one of the protocols (covered in the topic on protocols and behaviours) and is the mechanism that lets for adapt to whatever container you want.
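To make the "your own container" case concrete, here is a minimal sketch of a custom Collectable. The Tally struct is hypothetical and counts how often each collected item appears. The example assumes it runs as a script or in a fresh IEx session; inside a compiled Mix project, protocol consolidation may require the implementation to be compiled with the project.

```elixir
defmodule Tally do
  # Hypothetical container: counts how often each collected item was seen.
  defstruct counts: %{}

  def new, do: %__MODULE__{}
end

defimpl Collectable, for: Tally do
  # into/1 returns the starting accumulator plus a collector function
  # that handles the three commands defined by the Collectable protocol.
  def into(%Tally{counts: counts}) do
    collector = fn
      acc, {:cont, item} -> Map.update(acc, item, 1, &(&1 + 1))
      acc, :done -> %Tally{counts: acc}
      _acc, :halt -> :ok
    end

    {counts, collector}
  end
end

tally = for word <- ~w(beam otp beam), into: Tally.new(), do: word
IO.inspect(tally.counts)
# %{"beam" => 2, "otp" => 1}
```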

For most code, into: %{} and into: MapSet.new() are the two you will reach for constantly.
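For completeness, two of the less common built-in targets from the list above can be sketched in a couple of lines each. The input values are made up for illustration.

```elixir
# Existing map: new pairs are added, matching keys are overwritten.
merged = for {k, v} <- [a: 10, b: 2], into: %{a: 1, c: 3}, do: {k, v}
IO.inspect(merged)
# %{a: 10, b: 2, c: 3}

# Binary: each body result must itself be a binary.
shout = for word <- ~w(elixir otp), into: "", do: String.upcase(word) <> "!"
IO.inspect(shout)
# "ELIXIR!OTP!"
```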

Building Lookup Maps From CSV Rows

A concrete and common use: you parse a CSV (or a JSON file, or any stream of records) and want to turn it into an id-keyed map.

defmodule Inventory do
  def by_sku(rows) do
    for %{"sku" => sku, "name" => name, "price_cents" => price} <- rows,
        price_int = String.to_integer(price),
        into: %{} do
      {sku, %{name: name, price_cents: price_int}}
    end
  end
end

A few things are happening at once:

  • The pattern in the generator filters out rows missing those keys.
  • The price_int = ... filter runs the parse and binds the result. (A match used as a filter evaluates to the bound value, so it passes unless that value is nil or false, and it makes price_int available below.)
  • The do: body builds a {key, value} tuple per row.
  • into: %{} collects directly into a map instead of producing a list of tuples.

The equivalent in Enum:

rows
|> Enum.filter(&match?(%{"sku" => _, "name" => _, "price_cents" => _}, &1))
|> Enum.map(fn %{"sku" => sku, "name" => name, "price_cents" => p} ->
  {sku, %{name: name, price_cents: String.to_integer(p)}}
end)
|> Enum.into(%{})

Three steps, with the destructure repeated. The comprehension is the version most reviewers will prefer.
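One caveat with the version above is that String.to_integer/1 raises on input such as "n/a". A possible variant, sketched here under the assumption that malformed prices should simply be skipped, wraps Integer.parse/1 in a one-element list so that a generator pattern does the filtering. InventorySketch and the sample rows are hypothetical.

```elixir
defmodule InventorySketch do
  # {price_int, ""} only matches when the whole string parsed as an integer;
  # :error or a partial parse fails the pattern and the row is skipped.
  def by_sku(rows) do
    for %{"sku" => sku, "name" => name, "price_cents" => price} <- rows,
        {price_int, ""} <- [Integer.parse(price)],
        into: %{} do
      {sku, %{name: name, price_cents: price_int}}
    end
  end
end

rows = [
  %{"sku" => "A1", "name" => "Widget", "price_cents" => "499"},
  # Missing "price_cents": dropped by the first generator pattern.
  %{"sku" => "B2", "name" => "Gadget"},
  # Unparseable price: dropped by the second generator pattern.
  %{"sku" => "C3", "name" => "Gizmo", "price_cents" => "n/a"}
]

IO.inspect(InventorySketch.by_sku(rows))
# %{"A1" => %{name: "Widget", price_cents: 499}}
```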

The :reduce Option

Sometimes the result of a comprehension is not a collection at all but a single accumulated value. The :reduce option turns a comprehension into a fold while keeping the generator/filter syntax intact.

for x <- 1..10, reduce: 0 do
  acc -> acc + x
end
# 55

The do block becomes a clause set where the left side is the current accumulator, and the right side is the new accumulator. The initial value is whatever you pass to :reduce.
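Because the left side of the clause is an ordinary pattern, the accumulator can be destructured there. A small sketch with a tuple accumulator that tracks a sum and a count in one pass:

```elixir
# Sum and count the even numbers from 1 to 10.
{sum, count} =
  for x <- 1..10, rem(x, 2) == 0, reduce: {0, 0} do
    {sum, count} -> {sum + x, count + 1}
  end

IO.inspect({sum, count})
# {30, 5}
```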

When does this beat Enum.reduce/3? When the generator side has filters or destructures that would be ugly to write inside an Enum.reduce callback.

Counting tag occurrences in a list of WhatsApp message structs:

for %{type: :tagged, tag: tag} <- messages,
    tag in [:urgent, :followup, :resolved],
    reduce: %{urgent: 0, followup: 0, resolved: 0} do
  acc -> Map.update!(acc, tag, &(&1 + 1))
end

The pattern in the generator drops anything that is not a tagged message. The filter restricts to the three tags we care about. The reduce body just bumps a counter. With Enum.reduce, the same logic would require a destructure inside the callback or a case:

Enum.reduce(messages, %{urgent: 0, followup: 0, resolved: 0}, fn
  %{type: :tagged, tag: tag}, acc when tag in [:urgent, :followup, :resolved] ->
    Map.update!(acc, tag, &(&1 + 1))
  _, acc ->
    acc
end)

Two clauses inside the callback, including an explicit fall-through. The for version pushes the filtering up to where it reads more naturally.
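Running the comprehension over some made-up data shows what gets dropped at each stage. The messages below are hypothetical, and plain maps stand in for the message structs.

```elixir
messages = [
  %{type: :tagged, tag: :urgent},
  # Not a tagged message: rejected by the generator pattern.
  %{type: :text, body: "hello"},
  # Tagged, but not a tag we track: rejected by the filter.
  %{type: :tagged, tag: :spam},
  %{type: :tagged, tag: :urgent},
  %{type: :tagged, tag: :resolved}
]

counts =
  for %{type: :tagged, tag: tag} <- messages,
      tag in [:urgent, :followup, :resolved],
      reduce: %{urgent: 0, followup: 0, resolved: 0} do
    acc -> Map.update!(acc, tag, &(&1 + 1))
  end

IO.inspect(counts)
# %{followup: 0, resolved: 1, urgent: 2}
```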

Another example — bucketing requests by status code from an HTTP load test on a Phoenix LiveView endpoint:

for {:ok, %{status: status}} <- results,
    reduce: %{} do
  acc -> Map.update(acc, status, 1, &(&1 + 1))
end

If you find yourself reaching for Enum.reduce immediately after Enum.filter or Enum.flat_map with a destructure, ask whether the whole pipeline is cleaner as a for ... reduce:.
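As a quick check of the status-code example, here is the same comprehension applied to a hypothetical results list. Anything that does not match {:ok, %{status: status}}, such as an error tuple, is skipped by the generator pattern.

```elixir
results = [
  {:ok, %{status: 200}},
  {:error, :timeout},
  {:ok, %{status: 500}},
  {:ok, %{status: 200}}
]

by_status =
  for {:ok, %{status: status}} <- results, reduce: %{} do
    acc -> Map.update(acc, status, 1, &(&1 + 1))
  end

IO.inspect(by_status)
# %{200 => 2, 500 => 1}
```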

reduce Versus into

These two options collect results differently and serve different needs.

  • :into builds up a collection by inserting each result. Every iteration adds something to the container.
  • :reduce runs a fold with explicit accumulator handling. Each iteration sees the previous accumulator and decides what to do with it.

If you want a map keyed by some derived value, :into is usually right:

for user <- users, into: %{} do
  {user.id, user}
end

If you want a map where each key's value is aggregated across multiple inputs (a count, a sum, a list of values), :reduce is usually right:

for order <- orders, reduce: %{} do
  acc -> Map.update(acc, order.user_id, order.amount, &(&1 + order.amount))
end

The first overwrites duplicates. The second merges them. Picking the wrong one is a subtle bug — it compiles, runs, and silently drops data.
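The difference is easiest to see side by side. The sketch below runs both options over the same hypothetical orders, where user 1 appears twice.

```elixir
orders = [
  %{user_id: 1, amount: 10},
  %{user_id: 2, amount: 5},
  %{user_id: 1, amount: 7}
]

# :into keeps only the last amount seen for each user.
last_amount = for order <- orders, into: %{}, do: {order.user_id, order.amount}
IO.inspect(last_amount)
# %{1 => 7, 2 => 5}

# :reduce merges the amounts per user.
totals =
  for order <- orders, reduce: %{} do
    acc -> Map.update(acc, order.user_id, order.amount, &(&1 + order.amount))
  end

IO.inspect(totals)
# %{1 => 17, 2 => 5}
```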

You cannot use :into and :reduce in the same comprehension. They are mutually exclusive.

Common Pitfalls

Treating multiple generators as parallel iteration. They are not zipped. for x <- [1, 2], y <- [3, 4], do: {x, y} produces four pairs, not two. For zip-style pairing, use Enum.zip/2 and a single generator.
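A short sketch contrasting the two forms:

```elixir
# Two generators: every combination.
product = for x <- [1, 2], y <- [3, 4], do: {x, y}
IO.inspect(product)
# [{1, 3}, {1, 4}, {2, 3}, {2, 4}]

# One generator over a zipped list: elements paired by position.
zipped = for {x, y} <- Enum.zip([1, 2], [3, 4]), do: {x, y}
IO.inspect(zipped)
# [{1, 3}, {2, 4}]
```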

Forgetting that :into overwrites map keys. If your generator produces two tuples with the same key, the later one wins. If that matters, use :reduce and explicitly merge, or use Enum.group_by/3.
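As an illustration with made-up data, compare what :into keeps against what Enum.group_by/3 keeps when a key repeats:

```elixir
pairs = [{:fruit, "apple"}, {:veg, "kale"}, {:fruit, "pear"}]

# :into keeps only the last value for :fruit.
overwritten = for {kind, name} <- pairs, into: %{}, do: {kind, name}
IO.inspect(overwritten)
# %{fruit: "pear", veg: "kale"}

# Enum.group_by/3 keeps every value, in input order.
grouped = Enum.group_by(pairs, fn {kind, _} -> kind end, fn {_, name} -> name end)
IO.inspect(grouped)
# %{fruit: ["apple", "pear"], veg: ["kale"]}
```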

Using :reduce when Enum.reduce is clearer. The :reduce option earns its keep when the generator side does filtering or destructuring. For a plain reduction over a single collection with no filtering, Enum.reduce/3 is more idiomatic.

Mixing up the clause syntax inside :reduce. The body is acc -> new_acc, not acc, x -> new_acc. The current element is whatever the generator bound (e.g. x in for x <- ..., reduce: 0). Only the accumulator appears on the left of the arrow.

Forgetting that into: <<>> requires every body result to be a bitstring. If you say into: <<>> and the body returns an integer, the comprehension fails (typically with an ArgumentError at runtime) rather than converting the value for you; wrap the value as <<x>> first. Binary comprehensions have their own dedicated mode, covered next.
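A minimal sketch of the working form, where each integer is wrapped in a one-byte binary before it is collected:

```elixir
# 72 and 105 are the byte values for "H" and "i".
bytes = for n <- [72, 105], into: <<>>, do: <<n>>
IO.inspect(bytes)
# "Hi"
```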

Building a giant Cartesian product unintentionally. for x <- big_list, y <- another_big_list, do: ... iterates length(big_list) * length(another_big_list) times. With two 10,000-element lists, that is 100 million iterations. If you really want every pair, accept the cost; if you wanted zip, you wrote the wrong code.

Key Takeaways

  • Multiple generators produce a Cartesian product. The rightmost advances fastest.
  • Filters between generators apply at the level they appear, short-circuiting the inner generators when they reject.
  • :into collects into any Collectable — most commonly %{}, MapSet.new(), a file stream, or <<>>.
  • :reduce turns the comprehension into a fold. The body is acc -> new_acc, with the current generator binding available implicitly.
  • Use :into when each iteration adds a fresh entry; use :reduce when each iteration needs to inspect or merge with the running accumulator.
  • :into and :reduce are mutually exclusive in a single comprehension.
  • Two-dimensional grids, lookup maps, and bucketed counts are the canonical wins for these options.