The Pipe Operator

The pipe operator |> is a small piece of syntax that changes how you write Elixir more than any other feature. It takes the value on its left and inserts it as the first argument of the function call on its right. That is all it does. But that one rewrite turns nested function calls into top-down data pipelines, and most idiomatic Elixir code is structured around it.

What the Pipe Actually Does

"  hello world  " |> String.trim() |> String.upcase()
# is exactly equivalent to
String.upcase(String.trim("  hello world  "))

The pipe is not a function or a runtime construct. It is a macro, so the rewrite happens at compile time: a |> f(b, c) becomes f(a, b, c). Once you know that, nothing about pipes is mysterious.
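One way to see the rewrite for yourself is to expand the macro by hand. This is a minimal sketch, assuming a plain Elixir script or IEx session. The module name PipeExpansion and the names a, f, b, and c are placeholders for illustration, not part of any library.

```elixir
defmodule PipeExpansion do
  # Returns the source text of `a |> f(b, c)` after one macro expansion.
  # Nothing is executed: the quoted expression is only data (an AST).
  def expanded do
    quoted = quote do: a |> f(b, c)

    quoted
    |> Macro.expand_once(__ENV__)
    |> Macro.to_string()
  end
end

IO.puts(PipeExpansion.expanded())
# prints: f(a, b, c)
```

The piped form never reaches the runtime. By the time the code is compiled, only the ordinary call is left.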

This is why Elixir's standard library puts the data argument first in nearly every function: Enum.map(list, fun), String.replace(string, pattern, replacement), Map.put(map, key, value). The library is designed around the pipe, and going against the grain by putting the data second is enough of an antipattern that the community will push back on it during review.
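The same convention applies to your own modules. As a rough sketch, here is a hypothetical Cart module (the name, fields, and functions are all invented for this example) in which every function takes the cart as its first argument, so the calls chain without any adapters:

```elixir
defmodule Cart do
  # Hypothetical module: each function takes the cart first and returns
  # either an updated cart or a final value, so calls chain with |>.
  def new, do: %{items: [], discount: 0}

  def add_item(cart, name, price) do
    %{cart | items: [{name, price} | cart.items]}
  end

  def apply_discount(cart, percent), do: %{cart | discount: percent}

  def total(cart) do
    subtotal =
      cart.items
      |> Enum.map(fn {_name, price} -> price end)
      |> Enum.sum()

    subtotal * (100 - cart.discount) / 100
  end
end

cart = Cart.new()

total =
  cart
  |> Cart.add_item("book", 20)
  |> Cart.add_item("pen", 5)
  |> Cart.apply_discount(10)
  |> Cart.total()

# total is 22.5
```

Had add_item taken the name first and the cart second, every step would need a wrapper, which is the friction the data-first convention avoids.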

Why It Transforms Readability

Without pipes, function composition reads inside-out. Consider parsing a request, normalizing it, validating it, and persisting it:

# Without pipes — read this from the inside.
persist(validate(normalize(parse(request))))

Reading left to right, you meet persist first, yet it runs last. Execution starts at the innermost call, parse(request), and works outward through normalize, validate, and persist. The data flows right-to-left while you read left-to-right. Your eyes do gymnastics.

With pipes:

request
|> parse()
|> normalize()
|> validate()
|> persist()

Top to bottom. Data flows down. Each step is a transformation. This is what a lot of Elixir programmers mean when they say the language "thinks the way I think."

A Real Pipeline

Here is a snippet close to what you would write in a Phoenix app handling a CSV upload:

defmodule CsvImporter do
  def import(upload) do
    upload.path
    |> File.stream!()
    |> CSV.decode!(headers: true)
    |> Stream.map(&normalize_row/1)
    |> Stream.reject(&invalid?/1)
    |> Stream.chunk_every(500)
    |> Enum.reduce(0, fn batch, count ->
      Repo.insert_all(Customer, batch)
      count + length(batch)
    end)
  end

  defp normalize_row(row) do
    %{
      email: row["email"] |> String.trim() |> String.downcase(),
      name: row["name"] |> String.trim(),
      signed_up_at: parse_date(row["signed_up_at"])
    }
  end

  defp invalid?(%{email: ""}), do: true
  defp invalid?(_), do: false
end

You can read this top to bottom and follow what happens to each row. Open the file, decode CSV, normalize each row, drop the bad ones, batch in groups of 500, insert. No intermediate variables, no temporary names, no lost context.
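The importer above needs a CSV library and a database to run. To experiment with the same shape in isolation, here is a standard-library-only sketch. LineImporter, import_lines/2, and the "email,name" line format are assumptions made for this example: it splits lines by hand instead of decoding real CSV, and it counts rows instead of calling Repo.insert_all/2.

```elixir
defmodule LineImporter do
  # Hypothetical stand-in for CsvImporter that uses only the standard library.
  def import_lines(lines, batch_size \\ 2) do
    lines
    |> Stream.map(&normalize_row/1)
    |> Stream.reject(&invalid?/1)
    |> Stream.chunk_every(batch_size)
    |> Enum.reduce(0, fn batch, count ->
      # A real importer would write the batch to the database here.
      count + length(batch)
    end)
  end

  defp normalize_row(line) do
    [email, name] = String.split(line, ",", parts: 2)

    %{
      email: email |> String.trim() |> String.downcase(),
      name: String.trim(name)
    }
  end

  defp invalid?(%{email: ""}), do: true
  defp invalid?(_), do: false
end

lines = [
  " Ada@Example.com , Ada ",
  " , Nobody",
  "grace@example.com,Grace",
  "alan@example.com,Alan"
]

imported = LineImporter.import_lines(lines)
# imported is 3: the row with a blank email is rejected
```

Nothing runs until Enum.reduce/3 pulls values through, so only one batch of rows is held in memory at a time.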

Pinterest's notification pipeline and Bleacher Report's article ingestion pipeline are both built around this style — long sequences of small functions composed with pipes, often passing structured data through a series of pure transformations before hitting persistence at the end.

Common Enum and Stream Pipelines

Most pipelines you write will go through Enum, Stream, Map, or String. A handful of patterns show up repeatedly.

Cleaning a list of records:

users
|> Enum.filter(& &1.active)
|> Enum.map(& &1.email)
|> Enum.uniq()
|> Enum.sort()

Aggregating into a map:

events
|> Enum.group_by(& &1.user_id)
|> Map.new(fn {user_id, events} -> {user_id, length(events)} end)

String processing:

"  John   Doe  "
|> String.trim()
|> String.split(~r/\s+/)
|> Enum.map(&String.capitalize/1)
|> Enum.join(" ")

Building a query:

# Requires `import Ecto.Query` in the surrounding module.
User
|> where([u], u.active == true)
|> where([u], u.signed_up_at > ^one_week_ago)
|> order_by([u], desc: u.signed_up_at)
|> limit(50)
|> Repo.all()

This is Ecto's composable query syntax leaning hard on the pipe. Each function takes a queryable and returns a queryable, so you can keep stacking conditions.

When to Break a Pipeline

Pipes are great until they hide too much. A few situations where you should stop piping and assign to a variable:

When the steps are conceptually distinct. If your pipeline is doing two unrelated things, split it.

# Cramming everything into one pipe
upload.path
|> File.stream!()
|> CSV.decode!(headers: true)
|> Stream.map(&normalize_row/1)
|> Enum.to_list()
|> Repo.insert_all(Customer, &1)  # does not compile: &1 is only valid inside a & capture

# Better: read, then write
rows =
  upload.path
  |> File.stream!()
  |> CSV.decode!(headers: true)
  |> Stream.map(&normalize_row/1)
  |> Enum.to_list()

Repo.insert_all(Customer, rows)

When you need the intermediate value twice. A pipe only threads forward. If you want to log the parsed value and also pass it on, break the pipe.

parsed = parse(input)
Logger.debug("parsed: #{inspect(parsed)}")
process(parsed)
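When the only reason to break the pipe is a quick look at the value, there is a lighter option: IO.inspect/2 returns its argument unchanged, and tap/2 (in Kernel since Elixir 1.12) runs a function for its side effect and then passes the original value along. A small sketch with a made-up pipeline:

```elixir
result =
  "  3, 1, 2  "
  |> String.trim()
  |> String.split(", ")
  # Prints the intermediate list, then hands the same list to the next step.
  |> IO.inspect(label: "split")
  |> Enum.map(&String.to_integer/1)
  # tap/2 ignores the function's return value and passes the numbers on.
  |> tap(fn numbers -> IO.puts("parsed #{length(numbers)} numbers") end)
  |> Enum.sort()

# result is [1, 2, 3]
```

If the value is needed for more than a glance, for example in two separate computations, a named variable is still the clearer choice.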

When the next function does not take the piped value as the first argument. Wrapping in an anonymous function or using then/2 works, but at that point clarity has left the building.

# Awkward
value |> then(&Map.put(other_map, :key, &1))

# Clearer
Map.put(other_map, :key, value)

When you are not piping anything. Starting a pipe with a literal that is not transformed is a smell.

# Don't do this
1 |> Kernel.+(2)

# Just write this
1 + 2

Some style guides also recommend not starting a pipe with a function call — start with a value or an existing variable. So parse(input) |> validate() becomes:

input
|> parse()
|> validate()

This is opinionated. Credo can flag it with its Credo.Check.Refactor.PipeChainStart check. Some teams leave it off. The goal is consistency more than the exact rule.

Parens Are Not Optional in Pipes

A function call inside a pipe should always have parentheses, even if it takes no extra arguments.

# Good
"hello" |> String.upcase()

# Avoid
"hello" |> String.upcase

The compiler accepts both, but the second form occasionally creates ambiguity with macros and makes a call harder to tell apart from a plain name. The community settled on parens-required years ago, and mix format will add them for you.

A Preview of with

Pipes work well when each step succeeds. They fall apart when steps return {:ok, value} or {:error, reason} and you want to short-circuit on errors.

# This does not pipe cleanly
{:ok, parsed} = parse(input)
{:ok, validated} = validate(parsed)
{:ok, saved} = save(validated)

For these cases, Elixir has the with special form, which behaves like a pipe that knows how to abort on a non-matching pattern. We will cover it in detail in the next topic on control flow, but a peek:

with {:ok, parsed} <- parse(input),
     {:ok, validated} <- validate(parsed),
     {:ok, saved} <- save(validated) do
  {:ok, saved}
end

If any step returns something other than {:ok, _}, the whole expression returns that value. It is the natural successor to the pipe when error handling enters the picture.
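To watch the short-circuit happen, here is a runnable sketch. The Signup module and its parse/1, validate/1, and save/1 functions are hypothetical stand-ins, chosen only so that each step returns {:ok, value} or {:error, reason}.

```elixir
defmodule Signup do
  # Hypothetical steps: each returns {:ok, value} or {:error, reason}.
  def parse(input) do
    case Integer.parse(input) do
      {age, ""} -> {:ok, age}
      _ -> {:error, :not_a_number}
    end
  end

  def validate(age) when age >= 18, do: {:ok, age}
  def validate(_age), do: {:error, :too_young}

  # Stands in for a database write.
  def save(age), do: {:ok, %{age: age}}

  def run(input) do
    with {:ok, parsed} <- parse(input),
         {:ok, validated} <- validate(parsed),
         {:ok, saved} <- save(validated) do
      {:ok, saved}
    end
  end
end

Signup.run("42")   # => {:ok, %{age: 42}}
Signup.run("12")   # => {:error, :too_young}
Signup.run("abc")  # => {:error, :not_a_number}
```

In the second and third calls, the later steps never run: the first clause that fails to match becomes the result of the whole with expression.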

Common Pitfalls

Piping into a function that does not take the piped value first. This is the most common confusion. result |> Map.get(:key) works because Map.get/2 is (map, key). key |> Map.get(map) expands to Map.get(key, map), which treats the key as the map and fails at runtime unless the key happens to be a map itself.
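The argument-order pitfall can be reproduced in a few lines. In this sketch, Lookup.fetch/2 is a hypothetical helper that pipes its first argument into Map.get/2 and turns the resulting exception into a tuple so that both orders can be compared.

```elixir
defmodule Lookup do
  # Hypothetical helper: pipes `left` into Map.get/2, so the call made is
  # Map.get(left, right) whatever the caller intended.
  def fetch(left, right) do
    left |> Map.get(right)
  rescue
    error in BadMapError -> {:error, {:not_a_map, error.term}}
  end
end

map = %{key: "value"}

Lookup.fetch(map, :key)
# => "value"

Lookup.fetch(:key, map)
# => {:error, {:not_a_map, :key}}
```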

Starting a pipe with an expression that has side effects. The expression runs once at the top, but new readers sometimes read it as if it were threaded through every step. Move side-effecting calls to before the pipe and bind them to a variable.

Pipes longer than seven or eight steps. Past a certain length, a pipe stops being a pipeline and starts being a wall. Break it apart, give intermediate values names, or extract a function.

Using then/2 to force-pipe values that do not fit. then/2 is useful occasionally, but reaching for it constantly means you are forcing the pipe shape on code that does not want it. Step out of the pipe instead.

Mixing Enum and Stream carelessly. Stream.map(...) |> Enum.to_list() is fine. Enum.to_list(...) |> Stream.map(...) defeats the point of the stream. We will cover the Enum/Stream split in a later topic.
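A short sketch of that last pitfall, using an arbitrary range and arbitrary transformations chosen only for illustration. Both pipelines return the same result, but they do very different amounts of work.

```elixir
# Lazy from the start: the stream stages only process as many elements
# as Enum.take/2 needs to produce three results.
lazy =
  1..1_000_000
  |> Stream.map(&(&1 * 2))
  |> Stream.filter(&(rem(&1, 3) == 0))
  |> Enum.take(3)

# Eager first: Enum.to_list/1 builds the full million-element list before
# the stream stages see a single value.
eager =
  1..1_000_000
  |> Enum.to_list()
  |> Stream.map(&(&1 * 2))
  |> Stream.filter(&(rem(&1, 3) == 0))
  |> Enum.take(3)

# Both lazy and eager are [6, 12, 18].
```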

Key Takeaways

  • |> is a compile-time rewrite that injects the left side as the first argument of the right side.
  • Elixir's standard library is designed around pipes — data goes first, almost everywhere.
  • Pipes turn nested calls into top-down pipelines that read like prose.
  • Break the pipe when you need an intermediate value twice, when steps are conceptually distinct, or when the next function does not take the piped value first.
  • Always use parens on functions inside pipes. The formatter will enforce this.
  • For pipelines that need error handling, the with special form picks up where |> leaves off.