Enum and Stream

Enum is the workhorse of Elixir collection processing. Almost every nontrivial program reaches for Enum.map, Enum.filter, or Enum.reduce constantly. Stream is its lazy cousin: largely the same function names, but a different evaluation model. The choice between them is rarely about raw speed; it is about whether you can, or want to, materialize an entire collection in memory.

Enum: Eager Evaluation

Enum functions take a collection, do work, and return a new collection. They are eager. Every step produces a complete result.

[1, 2, 3, 4, 5]
|> Enum.map(&(&1 * 2))
|> Enum.filter(&(&1 > 4))
|> Enum.sum()
# 24

After map, you have [2, 4, 6, 8, 10], a real list in memory. After filter, you have [6, 8, 10]. After sum, you have 24. Two intermediate lists were allocated on the heap along the way, each fully built before the next step started.

For small or medium collections, this is fine. The garbage collector reclaims the intermediates almost immediately, and the code reads naturally.
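To see the eager steps one at a time, the same pipeline can be unrolled with a name for each intermediate value. This is only a sketch; the variable names are illustrative.

```elixir
# Each call returns a fully built value before the next call starts.
doubled = Enum.map([1, 2, 3, 4, 5], &(&1 * 2))
kept = Enum.filter(doubled, &(&1 > 4))
total = Enum.sum(kept)

IO.inspect(doubled, label: "after map")
IO.inspect(kept, label: "after filter")
IO.inspect(total, label: "after sum")
```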

The Functions You Will Use Constantly

A handful of Enum functions cover most of what you need.

map transforms each element:

Enum.map([1, 2, 3], &(&1 + 10))  # [11, 12, 13]
Enum.map(users, & &1.email)

filter keeps elements that satisfy a predicate:

Enum.filter([1, 2, 3, 4], &(&1 > 2))  # [3, 4]
Enum.filter(users, & &1.active)

reduce folds a collection into a single value:

Enum.reduce([1, 2, 3, 4], 0, fn x, acc -> acc + x end)  # 10

Enum.reduce(orders, %{}, fn order, acc ->
  Map.update(acc, order.user_id, order.amount, &(&1 + order.amount))
end)

reduce is the most general operation. Almost every other Enum function can be expressed as a reduce. When the function you want does not exist in Enum, the fallback is to write a reduce by hand.
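To make that claim concrete, here is a minimal sketch of map and filter written on top of Enum.reduce/3. The module name ReduceSketch is hypothetical and exists only for illustration; in real code you would call Enum.map/2 and Enum.filter/2 directly.

```elixir
defmodule ReduceSketch do
  # map as a reduce: prepend each transformed element, then reverse once.
  def map(enumerable, fun) do
    enumerable
    |> Enum.reduce([], fn x, acc -> [fun.(x) | acc] end)
    |> Enum.reverse()
  end

  # filter as a reduce: keep the element only when the predicate is truthy.
  def filter(enumerable, pred) do
    enumerable
    |> Enum.reduce([], fn x, acc -> if pred.(x), do: [x | acc], else: acc end)
    |> Enum.reverse()
  end
end

mapped = ReduceSketch.map([1, 2, 3], &(&1 + 10))
filtered = ReduceSketch.filter([1, 2, 3, 4], &(&1 > 2))
```

Prepending and reversing at the end keeps the work linear, since appending to a list with ++ inside the reduce would copy the accumulator on every step.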

group_by buckets elements by a key function:

Enum.group_by(users, & &1.role)
# %{admin: [...], user: [...], guest: [...]}

Enum.group_by(orders, & &1.user_id, & &1.amount)
# %{1 => [100, 200], 2 => [50]}

The third argument transforms each element into the value stored in its group; the key is still computed from the original element. Useful when you only need a subset of the original data.
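A small sketch with made-up order data shows the difference between the two-argument and three-argument forms. The orders list and its fields are assumptions for illustration.

```elixir
# Hypothetical sample data.
orders = [
  %{user_id: 1, amount: 100},
  %{user_id: 2, amount: 50},
  %{user_id: 1, amount: 200}
]

# Two arguments: each group holds the full order maps.
by_user = Enum.group_by(orders, & &1.user_id)

# Three arguments: each group holds only the amounts.
amounts_by_user = Enum.group_by(orders, & &1.user_id, & &1.amount)
```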

chunk_every splits into chunks:

Enum.chunk_every([1, 2, 3, 4, 5, 6, 7], 3)
# [[1, 2, 3], [4, 5, 6], [7]]

Enum.chunk_every(records, 1000, 1000, :discard)
# Drops the final chunk if it has fewer than 1000 elements

This is what you reach for when you need to batch insert into a database, send paginated requests, or process work in fixed-size groups.
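A batching loop might look roughly like the sketch below. The insert_batch function is a hypothetical stand-in for a real database call; here it only reports the batch size so the example can run on its own.

```elixir
records = Enum.to_list(1..7)

# Hypothetical stand-in for a bulk insert; returns how many rows it "wrote".
insert_batch = fn batch -> length(batch) end

# Default behavior keeps the short final chunk.
batches = Enum.chunk_every(records, 3)
written = Enum.map(batches, insert_batch)

# With :discard, the incomplete final chunk is dropped.
full_only = Enum.chunk_every(records, 3, 3, :discard)
```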

flat_map maps and concatenates:

Enum.flat_map([1, 2, 3], &[&1, &1 * 10])
# [1, 10, 2, 20, 3, 30]

Enum.flat_map(users, & &1.tags)
# All tags across all users, flattened

uniq and uniq_by remove duplicates:

Enum.uniq([1, 2, 2, 3, 1])  # [1, 2, 3]
Enum.uniq_by(users, & &1.email)

sort and sort_by:

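Enum.sort([3, 1, 2])  # [1, 2, 3]
Enum.sort_by(users, & &1.signed_up_at, :desc)
# For Date or DateTime values, pass {:desc, Date} or {:desc, DateTime} so the module's compare/2 is used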

count and find:

Enum.count(users, & &1.active)
Enum.find(users, & &1.email == "alice@example.com")

any? and all?:

Enum.any?(users, & &1.admin?)
Enum.all?(records, &valid?/1)

This list is far from complete — Enum has around 80 functions — but you will use these every day.
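The one-liners above assume a users list that is never shown. As a sketch, here they are against hypothetical sample data; the field names mirror the snippets, and the values are invented for illustration.

```elixir
# Hypothetical users; note the duplicate email on the first and third entries.
users = [
  %{email: "alice@example.com", active: true, admin?: true, signed_up_at: ~D[2024-01-10]},
  %{email: "bob@example.com", active: false, admin?: false, signed_up_at: ~D[2024-03-02]},
  %{email: "alice@example.com", active: true, admin?: false, signed_up_at: ~D[2024-02-15]}
]

# uniq_by keeps the first element seen for each key.
unique = Enum.uniq_by(users, & &1.email)

# {:desc, Date} sorts with Date.compare/2 instead of structural comparison.
newest_first = Enum.sort_by(users, & &1.signed_up_at, {:desc, Date})

active_count = Enum.count(users, & &1.active)
alice = Enum.find(users, &(&1.email == "alice@example.com"))
any_admin = Enum.any?(users, & &1.admin?)
all_active = Enum.all?(users, & &1.active)
```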

Stream: Lazy Evaluation

Stream is the lazy version of Enum. Stream functions do not do any work. They build up a description of work to do, and the work runs only when something forces it.

[1, 2, 3, 4, 5]
|> Stream.map(&(&1 * 2))
|> Stream.filter(&(&1 > 4))
|> Enum.sum()
# 24

This produces the same answer, but the intermediate lists never exist. Stream.map returns a Stream struct that says "when asked for an element, multiply by 2." Stream.filter wraps that and says "when asked for an element, ask the upstream and only yield ones greater than 4." Nothing actually happens until Enum.sum/1 starts pulling values through.
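One way to observe the difference is to record the order in which the callbacks run. The sketch below sends a message to the current process from each callback and then drains the mailbox; OrderSketch is a hypothetical helper, and it assumes the mailbox is otherwise empty.

```elixir
defmodule OrderSketch do
  # Runs the same map/filter pipeline with either Enum or Stream
  # and returns the order in which the callbacks were invoked.
  def trace(mod) do
    [1, 2, 3]
    |> mod.map(fn x ->
      send(self(), {:map, x})
      x * 2
    end)
    |> mod.filter(fn x ->
      send(self(), {:filter, x})
      x > 2
    end)
    |> Enum.to_list()

    drain([])
  end

  # Collects every message currently in the mailbox, oldest first.
  defp drain(acc) do
    receive do
      msg -> drain([msg | acc])
    after
      0 -> Enum.reverse(acc)
    end
  end
end

eager_order = OrderSketch.trace(Enum)
# [map: 1, map: 2, map: 3, filter: 2, filter: 4, filter: 6]

lazy_order = OrderSketch.trace(Stream)
# [map: 1, filter: 2, map: 2, filter: 4, map: 3, filter: 6]
```

With Enum, every map call finishes before the first filter call. With Stream, each element passes through both steps before the next element is touched.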

This matters in three situations:

Large collections that would not fit in memory at intermediate stages. Reading a 5 GB log file, mapping each line, filtering, and counting works fine with streams because no intermediate list ever exists.

Infinite collections. Stream.iterate/2, Stream.cycle/1, and Stream.repeatedly/1 produce streams without an end. You only consume what you need.

Pipelines that should be interleaved. When you want each element to flow through every step before the next element starts, rather than each step running to completion before the next starts.
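The infinite case is the easiest to try in a shell. A minimal sketch, using only the three generator functions named above:

```elixir
# Each generator is unbounded; Enum.take/2 and Enum.find/2 decide when to stop.
powers_of_two = Stream.iterate(1, &(&1 * 2)) |> Enum.take(5)
alternating = Stream.cycle([:a, :b]) |> Enum.take(5)
ticks = Stream.repeatedly(fn -> :tick end) |> Enum.take(3)

# Early exit on an endless source: the first square greater than 50.
first_big_square =
  Stream.iterate(1, &(&1 + 1))
  |> Stream.map(&(&1 * &1))
  |> Enum.find(&(&1 > 50))
```

Replacing Enum.take/2 with something that never halts, such as Enum.to_list/1, would run forever on any of these sources.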

A Realistic Stream Example

Reading a large log file and counting errors:

"large_log.txt"
|> File.stream!()
|> Stream.map(&String.trim/1)
|> Stream.filter(&String.contains?(&1, "ERROR"))
|> Stream.map(&parse_log_line/1)
|> Stream.reject(&is_nil/1)
|> Enum.reduce(%{}, fn entry, acc ->
  Map.update(acc, entry.service, 1, &(&1 + 1))
end)

File.stream!/1 returns a stream of lines. Even though the file might be 10 GB, only one line is in memory at a time as it flows through the pipeline. The final Enum.reduce/3 is what triggers the work — every prior Stream.* was just describing the pipeline.
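The pipeline above depends on parse_log_line/1, which is not shown. The sketch below fills it in under an assumed line format of "LEVEL service message"; that format, the LogSketch module, and the temporary file are all hypothetical, chosen only so the example runs end to end.

```elixir
defmodule LogSketch do
  # Assumed format: "<LEVEL> <service> <message>". Returns nil for anything else.
  def parse_log_line(line) do
    case String.split(line, " ", parts: 3) do
      [level, service, message] -> %{level: level, service: service, message: message}
      _ -> nil
    end
  end

  # Counts ERROR lines per service without loading the whole file.
  def count_errors(path) do
    path
    |> File.stream!()
    |> Stream.map(&String.trim/1)
    |> Stream.filter(&String.contains?(&1, "ERROR"))
    |> Stream.map(&parse_log_line/1)
    |> Stream.reject(&is_nil/1)
    |> Enum.reduce(%{}, fn entry, acc ->
      Map.update(acc, entry.service, 1, &(&1 + 1))
    end)
  end
end

# Write a tiny sample log, count, and clean up.
path = Path.join(System.tmp_dir!(), "log_sketch.txt")

File.write!(path, """
ERROR billing card declined
INFO auth login ok
ERROR billing timeout
ERROR auth bad token
ERROR
""")

counts = LogSketch.count_errors(path)
File.rm!(path)
```

The bare "ERROR" line passes the filter but fails to parse, so it becomes nil and is dropped by Stream.reject/2.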

Pinterest's notification system uses this pattern when running batch jobs against millions of users — stream from the database in chunks, transform lazily, push to a queue. The whole pipeline never holds more than one batch in memory.

When to Use Stream vs Enum

The honest answer: most of the time, use Enum. Streams have overhead (building the closure chain, calling through layers of laziness) that for small collections costs more than simply doing the eager work. A million-element list is well within Enum's comfort zone on modern hardware.

Switch to Stream when:

  • The collection is large enough that intermediate results would matter (think hundreds of thousands of elements with several map/filter steps).
  • The collection is infinite or unbounded.
  • The source itself is a stream (File.stream!, IO.stream, Ecto.Repo.stream).
  • Early termination matters — you only need the first 10 elements that match, but you would have to map over millions to find them.

For a typical Phoenix request handling 50 records from a database, Enum is the right call.

Composing Pipelines

Both Enum and Stream are designed for the pipe operator. The shape is always the same: collection on the left, transformation on the right.

orders
|> Enum.filter(&(&1.status == :paid))
# Group amounts (not whole orders) by user, then total each group
|> Enum.group_by(& &1.user_id, & &1.amount)
|> Map.new(fn {user_id, amounts} -> {user_id, Enum.sum(amounts)} end)
# top_user_ids is assumed to be a list of user ids defined elsewhere
|> Map.take(top_user_ids)
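To try a pipeline of this shape end to end, some data is needed. The sketch below uses hypothetical orders and a hypothetical top_user_ids list, and computes per-user totals of paid orders for the selected users.

```elixir
# Hypothetical sample data.
orders = [
  %{user_id: 1, amount: 100, status: :paid},
  %{user_id: 1, amount: 200, status: :paid},
  %{user_id: 2, amount: 50, status: :paid},
  %{user_id: 2, amount: 75, status: :refunded},
  %{user_id: 3, amount: 10, status: :paid}
]

top_user_ids = [1, 2]

totals =
  orders
  |> Enum.filter(&(&1.status == :paid))
  |> Enum.group_by(& &1.user_id, & &1.amount)
  |> Map.new(fn {user_id, amounts} -> {user_id, Enum.sum(amounts)} end)
  |> Map.take(top_user_ids)
```

The refunded order is filtered out before grouping, and user 3 is dropped by Map.take/2 at the end.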

You can mix Stream and Enum in the same pipeline. The convention is Stream for the parts where laziness helps and Enum to terminate.

1..1_000_000
|> Stream.map(&(&1 * &1))
|> Stream.filter(&(rem(&1, 7) == 0))
|> Enum.take(10)

Stream.map and Stream.filter build up the lazy chain. Enum.take(10) pulls until it has 10 results, then stops, which here means only the first 70 integers are ever squared. Without streams, you would compute a million squares and roughly 143,000 filtered values just to throw most of them away.
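The saving can be measured by counting how often the mapping function runs. This sketch uses Erlang's :counters module as a simple call counter; the counter and the square function are illustrative additions, not part of the original pipeline.

```elixir
counter = :counters.new(1, [])

# Squares n and records that it was called.
square = fn n ->
  :counters.add(counter, 1, 1)
  n * n
end

lazy =
  1..1_000_000
  |> Stream.map(square)
  |> Stream.filter(&(rem(&1, 7) == 0))
  |> Enum.take(10)

lazy_calls = :counters.get(counter, 1)

# Reset and run the eager version of the same pipeline.
:counters.put(counter, 1, 0)

eager =
  1..1_000_000
  |> Enum.map(square)
  |> Enum.filter(&(rem(&1, 7) == 0))
  |> Enum.take(10)

eager_calls = :counters.get(counter, 1)

IO.inspect({lazy_calls, eager_calls}, label: "square calls (lazy, eager)")
```

A square is divisible by 7 only when the number itself is, so the tenth match is 70 squared and the lazy version stops there.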

into and Collectable

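Enum.into/2 collects any enumerable into a structure that implements the Collectable protocol. Stream.into/2 does the same lazily: it returns a stream that pushes each element into the collectable as a side effect when the stream is run.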

Enum.into([{:a, 1}, {:b, 2}], %{})  # %{a: 1, b: 2}
Enum.into(["a", "b"], MapSet.new())  # MapSet with "a" and "b"

This is how you exit a pipeline into a non-list shape. Common targets: maps, MapSets, IO streams, file streams.
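A short sketch of a few targets, including a file stream. The temporary file path is an assumption made so the example can run anywhere; Stream.into/2 writes nothing until the stream is run.

```elixir
# Keyword list into a map, and a list into a MapSet (duplicates collapse).
map = Enum.into([a: 1, b: 2], %{})
set = Enum.into(["a", "b", "a"], MapSet.new())

# Lazy variant: builds a stream that writes each element to the file when run.
path = Path.join(System.tmp_dir!(), "into_sketch.txt")
lazy_write = Stream.into(["a\n", "b\n"], File.stream!(path))

# Stream.run/1 forces the stream purely for its side effects.
Stream.run(lazy_write)
written = File.read!(path)
File.rm!(path)
```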

Common Pitfalls

Reaching for Stream for performance reasons on small collections. For a 100-element list, Stream.map(...) |> Stream.filter(...) |> Enum.to_list() is typically slower than Enum.map(...) |> Enum.filter(...). The closure overhead outweighs the saved allocations.

Forgetting that streams need a terminator. A stream is just a description of work. Until something like Enum.to_list/1, Enum.reduce/3, Enum.into/2, or Stream.run/1 consumes it, no work happens. Returning a stream from a function and never consuming it is a common bug: you think you did something, but you only built a thunk.
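The bug is easy to reproduce. In this sketch a counter stands in for a real side effect, such as sending an email or writing a row; the counter is an illustrative device, not part of any API under discussion.

```elixir
counter = :counters.new(1, [])

# Describes the work but does not perform it.
stream = Stream.each([1, 2, 3], fn _ -> :counters.add(counter, 1, 1) end)
calls_before = :counters.get(counter, 1)

# Stream.run/1 consumes the stream for its side effects and returns :ok.
result = Stream.run(stream)
calls_after = :counters.get(counter, 1)

IO.inspect({calls_before, calls_after}, label: "side effects (before, after)")
```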

Using Enum.map followed by Enum.filter when you could use Enum.filter first. Order matters when one operation is much more expensive than the other. When the predicate can be evaluated on the original element, filter first and map second, so the expensive work only runs on elements that pass the filter. The compiler does not reorder these for you.
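Counting calls makes the cost difference visible. In the sketch below, squaring stands in for an expensive transformation, and the two orders give the same result only because a square is even exactly when its input is even; both are assumptions chosen to keep the example small.

```elixir
counter = :counters.new(1, [])

# Stand-in for an expensive transformation; records each call.
expensive = fn x ->
  :counters.add(counter, 1, 1)
  x * x
end

map_first = 1..100 |> Enum.map(expensive) |> Enum.filter(&(rem(&1, 2) == 0))
map_first_calls = :counters.get(counter, 1)

:counters.put(counter, 1, 0)

filter_first = 1..100 |> Enum.filter(&(rem(&1, 2) == 0)) |> Enum.map(expensive)
filter_first_calls = :counters.get(counter, 1)

IO.inspect({map_first_calls, filter_first_calls}, label: "expensive calls")
```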

Calling Enum.reduce when a more specific function exists. Enum.sum/1, Enum.min/1, Enum.max_by/2, Enum.group_by/3, and dozens of others express common reductions clearly. Reach for them before writing a custom reduce.

Treating Stream.map like Task.async. Streams are lazy but sequential. They run on the calling process. For parallel processing, you want Task.async_stream/3, which we will cover in the topic on tasks.
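For contrast, a minimal Task.async_stream/3 call is sketched below. The max_concurrency value of 2 is an arbitrary choice for illustration; results arrive wrapped in {:ok, value} tuples and, by default, in input order.

```elixir
# Each element is processed in its own task, at most two at a time.
results =
  1..4
  |> Task.async_stream(fn n -> n * n end, max_concurrency: 2)
  |> Enum.map(fn {:ok, value} -> value end)
```

Like any other stream, nothing runs until the final Enum call consumes it.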

Mapping and then calling Enum.into/2 when Enum.into/3 would do. Enum.into([{:a, 1}], %{}) is the common form when the source is already key-value pairs. When it is not, Enum.into(stream, %{}, fn x -> {x.id, x} end) takes a transform function as its third argument, which is often clearer than mapping to pairs and then calling into.
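The forms can be compared side by side. The users list here is hypothetical sample data; Map.new/2 is included as another standard way to build a map with a transform.

```elixir
# Hypothetical sample data.
users = [%{id: 1, name: "ada"}, %{id: 2, name: "joe"}]

# Two steps: build pairs, then collect.
two_steps = users |> Enum.map(fn u -> {u.id, u} end) |> Enum.into(%{})

# One step: the third argument is the transform.
one_step = Enum.into(users, %{}, fn u -> {u.id, u} end)

# Equivalent when the target is a plain map.
with_map_new = Map.new(users, fn u -> {u.id, u} end)
```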

Key Takeaways

  • Enum is eager. Every step produces a real intermediate collection. Use it for normal-sized work.
  • Stream is lazy. It builds a pipeline description that runs only when consumed. Use it for large or infinite collections.
  • The functions you will use constantly: map, filter, reduce, group_by, chunk_every, flat_map, uniq_by, sort_by.
  • A Stream pipeline must end in something eager — Enum.to_list, Enum.reduce, Enum.into, Enum.take — or no work happens.
  • Mix Stream and Enum freely. Common shape: stream the source, lazily transform, terminate with an Enum call.
  • For small collections, Enum is usually faster than Stream. Performance is not the reason to reach for streams — memory shape and laziness are.