Queries and Migrations

Ecto's query DSL is one of those things that looks weird at first — from u in User, where: u.age > 18 — and then six months in you wonder why every other ORM hides SQL behind method chains. The DSL maps directly to SQL: a from is a FROM, a where is a WHERE, a join is a JOIN. There's no hidden behavior. What you write is what runs.

Migrations are the other half of the story. They're how your schema changes get from "works on my machine" to production without dropping rows or locking tables for an hour.

The from Macro

Most queries start with from. It takes a binding (a variable to refer to rows by) and a series of clauses.

import Ecto.Query

query = from u in User,
  where: u.age >= 18,
  where: u.confirmed == true,
  order_by: [desc: u.inserted_at],
  limit: 50,
  select: u.email

emails = Repo.all(query)

u in User binds u to a row of the users table. The clauses can reference u.field — Ecto checks at compile time that field exists on the schema. Misspell email and you get a clear error before the query ever runs.

You can chain multiple where clauses; they AND together. Use or_where to OR them, or build a single where with explicit operators.

from u in User, where: u.age >= 18 and (u.admin or u.confirmed)

select lets you pick what comes back. Without it, you get full structs. With it, you get whatever you specified — a single field, a tuple, a map, or a list.

# Just emails
from u in User, select: u.email

# Tuples of (id, email)
from u in User, select: {u.id, u.email}

# Maps with renamed keys
from u in User, select: %{user_id: u.id, contact: u.email}

This is faster than fetching full structs and stripping fields in Elixir, especially for large result sets.

Joins and Preloading

A join adds another table to the query. You bind a new variable for it.

from u in User,
  join: c in Company, on: c.id == u.company_id,
  where: c.name == "Acme",
  select: u

There are different join types: join (default inner), left_join, right_join, cross_join, full_join.

For loading associations, you usually don't write joins yourself — you use preload. Preload runs a separate query (or appends to the current one) to load the associated rows.

# Two queries: one for users, one for their posts
users = Repo.all(from u in User, preload: :posts)

# Single query with a join, posts populated from that join
users = Repo.all(
  from u in User,
    join: p in assoc(u, :posts),
    preload: [posts: p]
)

Two queries (the default preload) is usually fine and easier on Postgres than a giant join. Use the join-and-preload form when you want to filter the parent based on the association — for instance, "users who have a post tagged 'elixir'."

Nested preloads work too:

Repo.all(from u in User, preload: [posts: [:comments, :tags]])

That loads users, their posts, and each post's comments and tags. Without preload, accessing user.posts returns %Ecto.Association.NotLoaded{} — Ecto refuses to silently fire a query, which is what causes N+1 disasters in other ORMs.

Dynamic Queries

Real apps build queries from runtime conditions: search filters, sort options, pagination. The pattern is to start with a base query and pipe modifications onto it.

def list_users(filters) do
  User
  |> filter_by_age(filters[:min_age])
  |> filter_by_company(filters[:company_id])
  |> sort_by(filters[:sort] || :inserted_at)
  |> Repo.all()
end

defp filter_by_age(query, nil), do: query
defp filter_by_age(query, age), do: from u in query, where: u.age >= ^age

defp filter_by_company(query, nil), do: query
defp filter_by_company(query, id), do: from u in query, where: u.company_id == ^id

defp sort_by(query, :name), do: from u in query, order_by: u.name
defp sort_by(query, :inserted_at), do: from u in query, order_by: [desc: u.inserted_at]

The ^ operator (the "pin") interpolates an Elixir value into the query as a parameter. Ecto sends it as a SQL bind variable, so there's no SQL injection risk. Forgetting the pin causes a compile error — Ecto won't let you accidentally embed a literal.

For more dynamic conditions, dynamic/2 builds query fragments you can compose:

defp build_where(filters) do
  Enum.reduce(filters, dynamic(true), fn
    {:min_age, age}, acc -> dynamic([u], ^acc and u.age >= ^age)
    {:company_id, id}, acc -> dynamic([u], ^acc and u.company_id == ^id)
    _, acc -> acc
  end)
end

def list_users(filters) do
  where = build_where(filters)
  Repo.all(from u in User, where: ^where)
end

This is where the DSL earns its weight: composing query pieces from runtime data without resorting to string concatenation.

Aggregates and Group By

# Count
Repo.aggregate(User, :count)
Repo.aggregate(from(u in User, where: u.confirmed), :count)

# Sum, avg, min, max
Repo.aggregate(Order, :sum, :total)

# Group by, return counts per company
from(u in User, group_by: u.company_id, select: {u.company_id, count(u.id)})
|> Repo.all()

Aggregates are dramatically faster than Repo.all |> length() when you only need a count. Always use them when you can.

Migrations

Migrations are versioned schema changes. You generate one with a Mix task, edit it, and run it.

mix ecto.gen.migration add_users

That creates priv/repo/migrations/20260425120000_add_users.exs:

defmodule MyApp.Repo.Migrations.AddUsers do
  use Ecto.Migration

  def change do
    create table(:users) do
      add :email, :string, null: false
      add :name, :string
      add :age, :integer
      add :admin, :boolean, default: false, null: false
      add :company_id, references(:companies, on_delete: :nilify_all)

      timestamps()
    end

    create unique_index(:users, [:email])
    create index(:users, [:company_id])
  end
end

Run it with mix ecto.migrate. Roll back with mix ecto.rollback. Each migration runs in a transaction by default, so a failure rolls back automatically.

Common operations:

# Add a column
alter table(:users) do
  add :timezone, :string, default: "UTC"
end

# Remove a column
alter table(:users) do
  remove :legacy_field
end

# Rename a column
rename table(:users), :old_name, to: :new_name

# Add an index
create index(:posts, [:user_id, :inserted_at])

# Add a constraint
create constraint(:users, :age_positive, check: "age >= 0")

The change function is reversible — Ecto figures out the rollback. For non-reversible operations (data migrations, certain alters), use separate up and down functions.

Schema Changes Without Downtime

The naive way to add a NOT NULL column with a default — add :status, :string, null: false, default: "pending" — can lock your table for the duration of the rewrite. On a small table, that's fine. On a 100M-row table, that's an outage.

The pattern that scales is multi-step migration:

# Migration 1: add column nullable, no default
def change do
  alter table(:users) do
    add :status, :string
  end
end

# Deploy and update application code to write the new field

# Migration 2: backfill existing rows in batches
def up do
  flush()  # ensure the previous DDL is visible
  Repo.update_all(User, set: [status: "active"])
end

# Migration 3: add the NOT NULL constraint after backfill
def change do
  alter table(:users) do
    modify :status, :string, null: false
  end
end

For very large tables, do the backfill in batches outside of a migration — a script that processes 10,000 rows at a time, sleeping between batches. Migrations should be fast.

For indexes, add them with concurrently: true to avoid locking writes:

@disable_ddl_transaction true
@disable_migration_lock true

def change do
  create index(:posts, [:user_id], concurrently: true)
end

The two attributes disable Ecto's wrapping transaction and migration lock, both of which are required for CONCURRENTLY in Postgres.

Querying With Fragments

When the query DSL doesn't cover a Postgres feature, drop into fragment:

from p in Post,
  where: fragment("? @> ?", p.tags, ^["elixir"]),
  order_by: [fragment("? <-> ?", p.embedding, ^query_embedding)]

fragment is an escape hatch. Use it when you need JSONB operators, Postgres extensions (pg_trgm, pgvector), or anything else without a native Ecto syntax. The ? placeholders interpolate values safely.

Common Pitfalls

Forgetting the ^ pin and embedding a variable directly. Ecto catches this at compile time, but the error can be confusing if you don't know what to look for.

Building queries with string concatenation. Don't. Use dynamic/2 or compose pre-built queries.

Running a migration that adds a NOT NULL column with a default on a large table without concurrently patterns. This is the classic outage migration.

Adding indexes inside a normal migration on production data. Always use concurrently: true with the two @disable_* attributes for indexes on large tables.

Preloading inside a loop. Enum.map(users, fn u -> Repo.preload(u, :posts) end) is N queries. Repo.preload(users, :posts) is one (or two). Always preload the list, not the items.

Using Repo.all and then calling length/1 to count. Repo.aggregate(query, :count) runs a SQL COUNT(*) and fetches a single integer.

Key Takeaways

Ecto.Query is a typed DSL that compiles to SQL with no surprises — where, join, select, order_by mean what they say. Build dynamic queries by piping query modifications, using ^ to interpolate values and dynamic/2 for composable conditions. Preload associations explicitly; Ecto won't lazy-load. Migrations are versioned schema changes; for production tables, use the multi-step pattern (add nullable, backfill, constrain) and concurrently: true for indexes.