Ecto Basics

Ecto is Elixir's database toolkit. It's not an ORM in the Active Record sense — there's no magic save method on your structs, no implicit lazy loading, no callbacks firing during a save. Ecto is deliberate: you describe what you want, you call a function on a Repo, and you get a result. Once you stop fighting that and start using it, you'll appreciate how much trouble it saves.

The four pieces you'll work with constantly are Ecto.Repo (the gateway to your database), Ecto.Schema (the mapping from a table to a struct), Ecto.Changeset (validation and change tracking, covered separately), and Ecto.Query (a DSL for building queries: mostly SELECTs, but also bulk updates and deletes). This file covers Repo, Schema, and basic CRUD. Queries and changesets get their own treatment.

The Repo Is the Database

In Ecto, every database operation goes through a Repo module. You define one in your app:

defmodule MyApp.Repo do
  use Ecto.Repo,
    otp_app: :my_app,
    adapter: Ecto.Adapters.Postgres
end

And configure it in config/runtime.exs:

import Config

config :my_app, MyApp.Repo,
  # fetch_env!/1 raises at boot if DATABASE_URL is unset, instead of passing nil
  url: System.fetch_env!("DATABASE_URL"),
  pool_size: 10

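Now MyApp.Repo.insert/1, MyApp.Repo.get/2, MyApp.Repo.all/1, and friends are how you talk to Postgres. The Repo runs as a supervised process, so add MyApp.Repo to your application's supervision tree to start it with the app. There's no global database connection: the Repo holds the pool, and every function you call routes through it.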

This explicitness matters. In Rails, calling User.find(1) in a model goes through the singleton database connection. In Ecto, you write Repo.get(User, 1). The function signature tells you it touches the database. There are no hidden queries.

The ecto_sql package ships adapters for PostgreSQL (most common), MySQL, and MSSQL. SQLite (for embedded use cases) and a few others are available via separate packages. PostgreSQL is the default for new Phoenix projects and gets the most attention.

Schemas: Tables to Structs

A schema is a module that describes a database table. It defines the fields, their types, and any associations. The schema is also what select queries return.

defmodule MyApp.Accounts.User do
  use Ecto.Schema

  schema "users" do
    field :email, :string
    field :name, :string
    field :age, :integer
    field :admin, :boolean, default: false
    field :metadata, :map

    has_many :posts, MyApp.Blog.Post
    belongs_to :company, MyApp.Companies.Company

    timestamps()
  end
end

schema "users" says "this struct maps to the users table." Each field declares a column with a type. The types are Elixir-side types: :string, :integer, :boolean, :float, :decimal, :date, :time, :utc_datetime, :naive_datetime, :map, :binary, plus Ecto-specific ones like Ecto.UUID.

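timestamps() adds inserted_at and updated_at fields to the schema (the matching columns come from your migration). Ecto sets them automatically on insert and update. By default they are :naive_datetime values.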

The schema also defines a struct: you can do %User{name: "Alice"} and pass it around. It has all the schema fields plus __meta__ (Ecto's bookkeeping), id (the primary key), the association fields, and the company_id foreign key that belongs_to defines.
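As a rough illustration of what the schema macro generates, the sketch below defines a trimmed schema and inspects the resulting struct and reflection data. It assumes a standalone script run with `elixir` on a machine that can reach Hex; the `Demo.*` module names are hypothetical stand-ins for the modules above, and no database is involved.

```elixir
# Assumption: standalone script; Mix.install/1 fetches Ecto from Hex.
Mix.install([{:ecto, "~> 3.10"}])

defmodule Demo.Company do
  use Ecto.Schema

  schema "companies" do
    field :name, :string
  end
end

defmodule Demo.User do
  use Ecto.Schema

  schema "users" do
    field :email, :string
    field :admin, :boolean, default: false
    belongs_to :company, Demo.Company
    timestamps()
  end
end

defmodule Demo.Inspect do
  # Collects a few facts about the struct and the schema reflection functions.
  def run do
    user = %Demo.User{email: "alice@example.com"}

    %{
      admin_default: user.admin,
      id: user.id,
      state: user.__meta__.state,
      source: Demo.User.__schema__(:source),
      primary_key: Demo.User.__schema__(:primary_key),
      fields: Demo.User.__schema__(:fields)
    }
  end
end

IO.inspect(Demo.Inspect.run())
```

The struct is plain data: building it does not touch the database, and its metadata records that it has only been built, not loaded or persisted.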

Inserting

{:ok, user} = Repo.insert(%User{email: "alice@example.com", name: "Alice"})

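This is the no-validation path: it just inserts the row and returns {:ok, user} on success. A bare struct carries no constraint declarations, so a database-level failure such as a unique constraint violation raises Ecto.ConstraintError instead of returning an error tuple. You get {:error, changeset} only when you insert a changeset that is invalid or that declares the violated constraint (with unique_constraint/3, for example).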

Direct struct inserts are mostly for seed data and tests. In real code, you go through a changeset (covered in the changesets file) so you get validation, casting, and error reporting.

Reading

The basic reads are get/2, get!/2, get_by/2, all/1, and one/1.

# By primary key, returns nil if missing
user = Repo.get(User, 42)

# Same, but raises if missing
user = Repo.get!(User, 42)

# By a field
user = Repo.get_by(User, email: "alice@example.com")

# All rows
users = Repo.all(User)

# Single result from a query (from/2 needs `import Ecto.Query`); nil if none, raises if more than one
admin = Repo.one(from u in User, where: u.email == "admin@example.com")

The bang versions (get!, one!) raise Ecto.NoResultsError when nothing matches. Use them in controllers where the route would 404 anyway: let it crash, and Phoenix (through the phoenix_ecto package) renders a 404. Use the non-bang versions when you need to handle the missing case explicitly.
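One way to handle the missing case explicitly is to convert the nil into a tagged tuple at the edge of your context. The helper below is a minimal sketch; `MyApp.Result` and `from_nullable/1` are hypothetical names, not part of Ecto.

```elixir
defmodule MyApp.Result do
  @moduledoc "Hypothetical helper, not part of Ecto."

  # Repo.get/2 and Repo.get_by/2 return nil when no row matches.
  def from_nullable(nil), do: {:error, :not_found}
  def from_nullable(record), do: {:ok, record}
end

# Plain maps stand in for schema structs here.
IO.inspect(MyApp.Result.from_nullable(nil))
IO.inspect(MyApp.Result.from_nullable(%{id: 42}))
```

In a context function this would read something like User |> Repo.get(id) |> MyApp.Result.from_nullable(), which lets callers pattern match on {:ok, user} or {:error, :not_found}.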

Updating

user = Repo.get!(User, 42)
{:ok, updated} = user |> Ecto.Changeset.change(name: "Alice Smith") |> Repo.update()

Notice you can't update a struct directly — you wrap it in a changeset first. This is intentional: Ecto wants to know what changed, so it only sends the modified columns to the database. A struct update would have to either send everything or guess.

The Ecto.Changeset.change/2 form is the simplest changeset: it skips casting and validation. In real code you'd use a changeset function defined on your schema, but for ad-hoc updates with trusted values this works.
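To see the change tracking in isolation, the following sketch builds two changesets without touching a database. It assumes a standalone script with Hex access; `Demo.User` and `Demo.Changes` are hypothetical, trimmed-down stand-ins.

```elixir
# Assumption: standalone script; Mix.install/1 fetches Ecto from Hex.
Mix.install([{:ecto, "~> 3.10"}])

defmodule Demo.User do
  use Ecto.Schema

  schema "users" do
    field :name, :string
    field :age, :integer
  end
end

defmodule Demo.Changes do
  def run do
    user = %Demo.User{id: 42, name: "Alice", age: 30}

    # Only the field that differs from the struct is recorded as a change.
    renamed = Ecto.Changeset.change(user, name: "Alice Smith")

    # A value equal to the current one is dropped from the changes.
    unchanged = Ecto.Changeset.change(user, name: "Alice")

    {renamed.changes, unchanged.changes}
  end
end

IO.inspect(Demo.Changes.run())
```

The changes map is what Repo.update/1 uses to decide which columns to write.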

There's also Repo.update_all/2 for bulk updates that don't need per-row logic:

from(u in User, where: u.admin == false)
|> Repo.update_all(set: [admin: true])

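That fires a single SQL UPDATE and returns {count, nil}, where count is the number of rows updated. It bypasses changesets entirely, so nothing is validated and updated_at is not touched unless you set it yourself. Use it when you have a clear bulk operation, not as a shortcut around validation.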

Deleting

user = Repo.get!(User, 42)
{:ok, _} = Repo.delete(user)

And the bulk version:

one_year_ago = NaiveDateTime.add(NaiveDateTime.utc_now(), -365 * 24 * 60 * 60)
from(u in User, where: u.inserted_at < ^one_year_ago) |> Repo.delete_all()

Same caveat: delete_all/1 skips changesets and any validation. It's the right tool for cleanup jobs.

A Practical Example

A typical context module wraps these operations behind named functions, so the rest of your app doesn't reach into Repo directly. Phoenix generators encourage this pattern.

defmodule MyApp.Accounts do
  alias MyApp.Repo
  alias MyApp.Accounts.User

  def list_users, do: Repo.all(User)

  def get_user!(id), do: Repo.get!(User, id)

  def get_user_by_email(email), do: Repo.get_by(User, email: email)

  def create_user(attrs) do
    %User{}
    |> User.changeset(attrs)
    |> Repo.insert()
  end

  def update_user(user, attrs) do
    user
    |> User.changeset(attrs)
    |> Repo.update()
  end

  def delete_user(user), do: Repo.delete(user)
end

This is the seam between your business logic and your storage. LiveViews and controllers call Accounts.create_user/1, not Repo.insert/1 with a User struct. If you ever need to swap Postgres for something else, or add caching, auditing, or events, you have one place to do it.
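The context above calls User.changeset/2, which this file does not define. A minimal sketch of what such a function might look like follows; the specific validation rules are assumptions chosen for illustration, the `Demo.Accounts.*` names are hypothetical, and unique_constraint/2 presumes a unique index on email exists in the database.

```elixir
# Assumption: standalone script; Mix.install/1 fetches Ecto from Hex.
Mix.install([{:ecto, "~> 3.10"}])

defmodule Demo.Accounts.User do
  use Ecto.Schema
  import Ecto.Changeset

  schema "users" do
    field :email, :string
    field :name, :string
    field :age, :integer
    timestamps()
  end

  # Casts external params, validates them, and declares the unique
  # constraint so a violation becomes {:error, changeset} on insert.
  def changeset(user, attrs) do
    user
    |> cast(attrs, [:email, :name, :age])
    |> validate_required([:email, :name])
    |> validate_format(:email, ~r/@/)
    |> validate_number(:age, greater_than_or_equal_to: 0)
    |> unique_constraint(:email)
  end
end

defmodule Demo.Accounts.Check do
  alias Demo.Accounts.User

  def run do
    good =
      User.changeset(%User{}, %{
        "email" => "alice@example.com",
        "name" => "Alice",
        "age" => "30"
      })

    bad = User.changeset(%User{}, %{"email" => "not-an-email"})

    {good.valid?, good.changes.age, bad.valid?, Enum.sort(Keyword.keys(bad.errors))}
  end
end

IO.inspect(Demo.Accounts.Check.run())
```

Because validation happens before the Repo call, an invalid changeset never reaches the database: Repo.insert/1 returns {:error, changeset} straight away.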

Why Explicit Queries Matter

Ecto's design choice to make queries explicit — no lazy loading, no implicit N+1 — gets pushback from people coming from Rails or Django. The trade is real: there's no magic where you call user.posts and get a list of posts. You have to Repo.preload(user, :posts) first.
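The absence of lazy loading is visible on the struct itself. In the sketch below (a standalone script, assuming Hex access; `Demo.*` names are hypothetical), an association that has not been preloaded holds a placeholder rather than triggering a query. The second struct is filled in by hand as a stand-in for what Repo.preload/2 would return.

```elixir
# Assumption: standalone script; Mix.install/1 fetches Ecto from Hex.
Mix.install([{:ecto, "~> 3.10"}])

defmodule Demo.Post do
  use Ecto.Schema

  schema "posts" do
    field :title, :string
    field :user_id, :id
  end
end

defmodule Demo.User do
  use Ecto.Schema

  schema "users" do
    field :name, :string
    has_many :posts, Demo.Post, foreign_key: :user_id
  end
end

defmodule Demo.Lazy do
  def run do
    user = %Demo.User{name: "Alice"}

    # Hand-built stand-in for the result of Repo.preload(user, :posts).
    preloaded = %{user | posts: [%Demo.Post{title: "Hello"}]}

    {Ecto.assoc_loaded?(user.posts), Ecto.assoc_loaded?(preloaded.posts)}
  end
end

IO.inspect(Demo.Lazy.run())
```

Reading user.posts before a preload gives you an Ecto.Association.NotLoaded struct, so code that forgets the preload fails loudly instead of quietly issuing a query.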

The win is that you can read code and predict what hits the database. In a Rails app, a single page render can fire fifty queries you didn't write. In an Ecto app, every query is a function call you can grep for. When something gets slow, you can see why.

This matters more as the team and codebase grow. When every query is an intentional function call and every preload is chosen, database load can be reviewed at the call site instead of discovered in production.

Transactions

Multi-step writes that need to be atomic go through Repo.transaction/1:

Repo.transaction(fn ->
  user = Repo.insert!(user_changeset)
  Repo.insert!(profile_changeset_for(user))
  Repo.insert!(welcome_event_for(user))
  user
end)

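If any step raises, the transaction rolls back and the exception propagates to the caller. When the function returns normally, you get {:ok, value}. To abort without raising, call Repo.rollback(reason) inside the function, which makes the transaction return {:error, reason}.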

For more structured multi-step workflows, Ecto.Multi is cleaner. It builds a list of operations, then runs them atomically with named results:

Ecto.Multi.new()
|> Ecto.Multi.insert(:user, user_changeset)
|> Ecto.Multi.insert(:profile, fn %{user: user} -> profile_changeset(user) end)
|> Ecto.Multi.run(:notify, fn _repo, %{user: user} ->
  # The function must return {:ok, value} or {:error, reason}
  Notifications.welcome(user)
end)
|> Repo.transaction()

The result is {:ok, %{user: ..., profile: ..., notify: ...}} on success, or {:error, failed_step, failed_value, changes_so_far} on the first failure. Multi shines when you have several writes that depend on each other and you want named, inspectable failures.
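Handling that result usually comes down to pattern matching on the two shapes. The module below is a minimal sketch with hypothetical names (`MyApp.Signup`, `handle_result/1`); it takes the tuple returned by Repo.transaction/1 for the Multi above and needs no database to try out.

```elixir
defmodule MyApp.Signup do
  @moduledoc "Hypothetical helper that interprets the result of running the Multi."

  # Success: every named step is available in the map.
  def handle_result({:ok, %{user: user}}), do: {:ok, user}

  # Failure: failed_step is the name given to the operation (:user, :profile, :notify).
  def handle_result({:error, failed_step, failed_value, _changes_so_far}) do
    {:error, "#{failed_step} failed: #{inspect(failed_value)}"}
  end
end

# Hand-written tuples stand in for real transaction results.
IO.inspect(MyApp.Signup.handle_result({:ok, %{user: %{id: 1}, profile: %{}, notify: :sent}}))
IO.inspect(MyApp.Signup.handle_result({:error, :profile, :invalid, %{user: %{id: 1}}}))
```

Because the failed step is named, the caller can branch on it, for example to show changeset errors for :user but log and retry for :notify.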

Working With Postgres-Specific Features

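Ecto exposes JSON columns and arrays as native Elixir values. A :map field round-trips through a JSONB column (keys come back as strings), and a {:array, :string} field maps to a Postgres text[]. Ranges and other Postgres-specific types have no built-in Ecto type; they need a custom Ecto.Type or a library that provides one.

schema "events" do
  field :tags, {:array, :string}
  field :payload, :map
  # Ecto ships no range type. MyApp.Types.TsRange is a hypothetical custom Ecto.Type.
  field :occurred_during, MyApp.Types.TsRange
end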

Some of these have direct support in the query DSL: for example, ^tag in e.tags checks array membership. Others, such as array containment (@>), JSON operators, and full-text search, go through fragment, which embeds a piece of raw SQL in an otherwise composable query.
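The sketch below builds three such queries against the events table without executing them. It assumes a standalone script with Hex access; `Demo.EventQueries` and its function names are hypothetical, the queries are schemaless (they use the table name directly), and the 'kind' JSON key is an invented example.

```elixir
# Assumption: standalone script; Mix.install/1 fetches Ecto from Hex.
Mix.install([{:ecto, "~> 3.10"}])

defmodule Demo.EventQueries do
  import Ecto.Query

  # Array membership is supported directly by the DSL's in operator.
  def tagged(tag) do
    from e in "events", where: ^tag in e.tags, select: e.id
  end

  # Array containment uses the Postgres @> operator through a fragment.
  def tagged_with_all(tags) do
    from e in "events",
      where: fragment("? @> ?", e.tags, ^tags),
      select: e.id
  end

  # JSON field access with ->> through a fragment; 'kind' is a made-up key.
  def with_payload_kind(kind) do
    from e in "events",
      where: fragment("?->>'kind' = ?", e.payload, ^kind),
      select: e.id
  end
end

IO.inspect(Demo.EventQueries.tagged("elixir"))
```

Each function returns an Ecto.Query struct. Nothing reaches Postgres until the query is passed to a Repo function such as Repo.all/1.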

Common Pitfalls

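Calling Repo.get with an ID that can't be cast to the primary key type. With an integer key, Repo.get(User, "42") works because the string is cast, but Repo.get(User, "abc") raises Ecto.Query.CastError and Repo.get(User, nil) raises ArgumentError. Neither returns nil. Validate or parse IDs that come from user input before they reach the Repo.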

Inserting a struct without a changeset, then being surprised that bad data made it in. Direct inserts skip validation. Always go through a changeset for user input.

Forgetting that Repo.preload triggers more queries. Repo.preload(users, :posts) fires one extra query for the whole list, plus one more for each nested association you preload. For lists, that's fine and often cleaner than a join. Calling it once per user inside a loop is an N+1.

Not setting pool_size for the workload. The default of 10 is fine for development, often too low for production. Match it to your concurrency: how many simultaneous queries do you expect? Add headroom.

Treating schemas as your domain model. Schemas are database mappings. Your domain might have richer types — wrap them in modules that have schemas as a backing store, don't expose User structs everywhere.
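For the pool_size pitfall above, one common approach is to read the value from the environment so each deployment can tune it. The fragment below is a sketch for config/runtime.exs; POOL_SIZE is an assumed variable name, not something Ecto itself reads.

```elixir
# config/runtime.exs
import Config

config :my_app, MyApp.Repo,
  url: System.fetch_env!("DATABASE_URL"),
  # POOL_SIZE is an assumed variable name; falls back to Ecto's default of 10.
  pool_size: String.to_integer(System.get_env("POOL_SIZE", "10"))
```

Keep the total across all application instances below the connection limit of the database server.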

Key Takeaways

Ecto separates the database (Repo) from the table mapping (Schema) from validation (Changeset) from querying (Query). Every database operation is a function call, never magic. Use Repo.get!/2 and Repo.one!/1 when you want a missing row to crash, the non-bang versions when you'll handle nil. Wrap Repo calls in context modules so the rest of your app doesn't depend on Ecto details. The explicitness is the point — it makes performance and correctness inspectable instead of hidden.