Changesets

The first time you use Ecto, changesets feel like ceremony. Why do I have to wrap a struct just to update a field? Why can't I just call User.update(user, name: "new name") and call it a day?

The reason is that almost every database write involves the same problems: external data needs to be validated, types need to be coerced, only the actually-changed fields should be sent to the database, errors need to be reported back to the user with field-level granularity, and the whole thing needs to be testable without hitting the database. Changesets solve all of those, in one composable abstraction. Once you internalize the pattern, you'll start using changesets for things that have nothing to do with databases.

What a Changeset Is

A changeset is a struct that holds the original data, a map of proposed changes, a list of errors, a valid? flag, and metadata such as the required fields, the validations and constraints that were declared, and the raw params it was built from. You build one with cast/3, pipe it through validation and constraint functions, and hand the result to Repo.insert/1 or Repo.update/1.

defmodule MyApp.Accounts.User do
  use Ecto.Schema
  import Ecto.Changeset

  schema "users" do
    field :email, :string
    field :name, :string
    field :age, :integer
    field :password, :string, virtual: true
    field :password_hash, :string

    timestamps()
  end

  def changeset(user, attrs) do
    user
    |> cast(attrs, [:email, :name, :age, :password])
    |> validate_required([:email, :name])
    |> validate_format(:email, ~r/@/)
    |> validate_length(:password, min: 8)
    |> validate_number(:age, greater_than_or_equal_to: 13)
    |> unique_constraint(:email)
    |> hash_password()
  end

  defp hash_password(%Ecto.Changeset{valid?: true, changes: %{password: pw}} = cs) do
    put_change(cs, :password_hash, Bcrypt.hash_pwd_salt(pw))
  end

  defp hash_password(cs), do: cs
end

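Calling User.changeset(%User{}, %{"email" => "alice@example.com", "name" => "Alice"}) returns a changeset you can inspect. If valid? is true, it passed every validation and can be handed to Repo.insert/1; database constraints such as unique_constraint are only checked when the insert actually runs. If not, the errors field tells you what's wrong.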
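To make "a changeset you can inspect" concrete, here is a minimal sketch. Demo.Signup is a hypothetical, table-less stand-in for the User schema above (it leaves out the password fields so the example needs no Bcrypt dependency), and the Mix.install line assumes you run it as a standalone script.

```elixir
# Assumption: standalone script; inside a Mix project that already depends on
# Ecto, drop the Mix.install line.
Mix.install([{:ecto, "~> 3.10"}])

defmodule Demo.Signup do
  # Hypothetical stand-in for the User schema; embedded_schema needs no table.
  use Ecto.Schema
  import Ecto.Changeset

  embedded_schema do
    field :email, :string
    field :name, :string
    field :age, :integer
  end

  def changeset(attrs) do
    %__MODULE__{}
    |> cast(attrs, [:email, :name, :age])
    |> validate_required([:email, :name])
    |> validate_format(:email, ~r/@/)
    |> validate_number(:age, greater_than_or_equal_to: 13)
  end
end

ok = Demo.Signup.changeset(%{"email" => "alice@example.com", "name" => "Alice"})
IO.inspect(ok.valid?, label: "valid?")
IO.inspect(ok.changes, label: "changes")
IO.inspect(ok.errors, label: "errors")

# Missing name, malformed email, and an age below the minimum.
bad = Demo.Signup.changeset(%{"email" => "not-an-email", "age" => "12"})
IO.inspect(bad.valid?, label: "valid?")
IO.inspect(Keyword.keys(bad.errors), label: "fields with errors")
```

Nothing here touches a Repo, which is the point: the whole pipeline is a pure function from params to a changeset.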

cast/3: Type Coercion and Field Whitelisting

cast(struct_or_changeset, params, allowed_fields) is where external data enters Ecto. It does three things: filters params to only the listed fields (so a malicious client can't sneak admin: true into a request), coerces values to each field's declared type ("42" becomes 42, "true" becomes true, ISO 8601 strings become DateTime or NaiveDateTime structs, depending on the field type), and stores the result in changes.

%User{} |> cast(%{"age" => "30", "name" => "Alice", "admin" => true}, [:age, :name])
# changes: %{age: 30, name: "Alice"}, admin filtered out

If a cast fails (e.g., "thirty" can't become an integer), the changeset gets an error on that field and valid? becomes false.
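The three jobs of cast/3 are easy to see side by side in a small sketch. Demo.CastDemo and its field types are made up for illustration, and it uses the data-plus-types tuple described later under Changesets Without Tables so that no schema or table is required.

```elixir
# Assumption: standalone script; inside a Mix project that already depends on
# Ecto, drop the Mix.install line.
Mix.install([{:ecto, "~> 3.10"}])

defmodule Demo.CastDemo do
  import Ecto.Changeset

  # Hypothetical field types, chosen only to show different coercions.
  @types %{age: :integer, name: :string, active: :boolean, joined_at: :utc_datetime}

  def cast_params(params), do: cast({%{}, @types}, params, Map.keys(@types))
end

cs =
  Demo.CastDemo.cast_params(%{
    "age" => "30",
    "name" => "Alice",
    "active" => "true",
    "joined_at" => "2024-01-15T09:30:00Z",
    "admin" => true
  })

# age is now the integer 30, active the boolean true, joined_at a DateTime,
# and "admin" is gone because it is not in the permitted list.
IO.inspect(cs.changes, label: "changes")

failed = Demo.CastDemo.cast_params(%{"age" => "thirty"})
IO.inspect(failed.valid?, label: "valid?")
IO.inspect(failed.errors, label: "errors")
```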

cast/3 is also where the Phoenix form pattern starts: a controller receives string-keyed params from a form, calls User.changeset/2, and the changeset is what the template renders to show errors. A failed insert returns {:error, changeset}, and the form re-renders from that changeset with the user's input intact.

Validations

Ecto ships a validation function for most common cases:

|> validate_required([:email, :name])
|> validate_length(:email, max: 160)
|> validate_format(:email, ~r/@/)
|> validate_number(:age, greater_than: 0, less_than: 150)
|> validate_inclusion(:role, ~w(admin user guest))
|> validate_exclusion(:username, ~w(root admin system))
|> validate_acceptance(:terms_of_service)
|> validate_confirmation(:password)
|> validate_subset(:tags, ~w(elixir phoenix ecto))

Custom validations are functions that take a changeset and return a changeset:

defp validate_email_domain(changeset, allowed_domains) do
  validate_change(changeset, :email, fn :email, email ->
    domain = email |> String.split("@") |> List.last()
    if domain in allowed_domains, do: [], else: [email: "domain not allowed"]
  end)
end

validate_change/3 only runs when the field is present in changes with a non-nil value, which is the right default: you don't want to re-validate untouched fields on every update.
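A quick way to convince yourself of the "only if changed" behavior is to start from data that would fail the validator and then change a different field. In this sketch the Demo.Contact module, its fields, and the allowed domain are all hypothetical; the existing map plays the role of a row already loaded from the database.

```elixir
# Assumption: standalone script; inside a Mix project that already depends on
# Ecto, drop the Mix.install line.
Mix.install([{:ecto, "~> 3.10"}])

defmodule Demo.Contact do
  import Ecto.Changeset

  @types %{email: :string, name: :string}

  def changeset(existing, attrs) do
    {existing, @types}
    |> cast(attrs, [:email, :name])
    |> validate_email_domain(["example.com"])
  end

  defp validate_email_domain(changeset, allowed_domains) do
    validate_change(changeset, :email, fn :email, email ->
      domain = email |> String.split("@") |> List.last()
      if domain in allowed_domains, do: [], else: [email: "domain not allowed"]
    end)
  end
end

# The stored email is on a domain the validator would reject.
existing = %{email: "bob@legacy.org", name: "Bob"}

# Only the name changes, so the email validator never runs.
renamed = Demo.Contact.changeset(existing, %{"name" => "Robert"})
IO.inspect(renamed.valid?, label: "rename valid?")

# The email changes, so the validator runs and rejects the new domain.
moved = Demo.Contact.changeset(existing, %{"email" => "bob@other.org"})
IO.inspect(moved.errors, label: "email change errors")
```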

Database Constraints vs Validations

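There's a subtle but important distinction: validations run in your application before the database is touched, while constraints are enforced by the database during the write. Uniqueness is the classic case. You can't reliably check it in application code, because between a query that finds no duplicate and the insert itself, another process can insert the same email. Only the unique index can guarantee it. Ecto handles this by letting you declare the constraint: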

|> unique_constraint(:email)

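If the insert fails with a Postgres unique violation on the email index, Ecto translates that into a changeset error on the email field. You get the same error shape whether the duplicate was caught by validation (rare, requires a pre-check such as unsafe_validate_unique/3) or by the database (the common path). Because the Repo never reaches the database with an invalid changeset, constraint errors only show up once every validation has passed.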

Other constraint helpers:

|> foreign_key_constraint(:company_id)
|> check_constraint(:age, name: :age_positive)
|> exclusion_constraint(:room_booking, name: :no_overlap)

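The constraint name in the database must match the name the helper expects, or you have to pass name: explicitly. By default, unique_constraint(:email) on the users table looks for users_email_index and foreign_key_constraint(:company_id) looks for users_company_id_fkey; check_constraint always requires name:. This is one of those details that bites you the first time.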
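You can see which database names a changeset expects without running a single query, because the helpers only record a declaration in the constraints field. The sketch below uses a hypothetical Demo.User schema and prints the name each helper will try to match when the database reports a violation.

```elixir
# Assumption: standalone script; inside a Mix project that already depends on
# Ecto, drop the Mix.install line.
Mix.install([{:ecto, "~> 3.10"}])

defmodule Demo.User do
  # Hypothetical schema; no Repo or migration is needed to inspect constraints.
  use Ecto.Schema
  import Ecto.Changeset

  schema "users" do
    field :email, :string
    field :age, :integer
    field :company_id, :integer
  end

  def changeset(attrs) do
    %__MODULE__{}
    |> cast(attrs, [:email, :age, :company_id])
    |> unique_constraint(:email)
    |> foreign_key_constraint(:company_id)
    |> check_constraint(:age, name: :age_positive)
  end
end

cs = Demo.User.changeset(%{"email" => "alice@example.com"})

# Each entry records the constraint type, the field that receives the error,
# and the database-side name Ecto will look for.
for c <- cs.constraints do
  IO.puts("#{c.type} on #{c.field} expects #{c.constraint}")
end
```

If the index or constraint in your migration has a different name, that is the value to pass as name: in the helper.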

Virtual Fields

field :password, :string, virtual: true declares a field on the schema that doesn't exist in the database. It exists on the struct, can be cast, can be validated, and is dropped before the SQL goes out.

This is how you handle things like raw passwords (cast and validate, then transform into password_hash before insert), confirmation fields (password_confirmation for validate_confirmation), and any other "input only" data.

schema "users" do
  field :email, :string
  field :password, :string, virtual: true
  field :password_confirmation, :string, virtual: true
  field :password_hash, :string
end

def changeset(user, attrs) do
  user
  |> cast(attrs, [:email, :password, :password_confirmation])
  |> validate_required([:email, :password])
  |> validate_confirmation(:password)
  |> put_password_hash()
end

defp put_password_hash(%{valid?: true, changes: %{password: pw}} = cs) do
  cs
  |> put_change(:password_hash, Bcrypt.hash_pwd_salt(pw))
  |> delete_change(:password)
end

defp put_password_hash(cs), do: cs
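Here is a runnable sketch of the same flow so you can see what ends up in changes. Demo.Account is hypothetical, and fake_hash/1 is a deliberately fake placeholder standing in for Bcrypt.hash_pwd_salt/1 so the script has no native dependency; never use it for real passwords.

```elixir
# Assumption: standalone script; inside a Mix project that already depends on
# Ecto, drop the Mix.install line.
Mix.install([{:ecto, "~> 3.10"}])

defmodule Demo.Account do
  use Ecto.Schema
  import Ecto.Changeset

  schema "users" do
    field :email, :string
    field :password, :string, virtual: true
    field :password_hash, :string
  end

  def changeset(attrs) do
    %__MODULE__{}
    |> cast(attrs, [:email, :password])
    |> validate_required([:email, :password])
    |> validate_length(:password, min: 8)
    |> put_password_hash()
  end

  # Only hash when everything else passed and a password was supplied.
  defp put_password_hash(%{valid?: true, changes: %{password: pw}} = cs) do
    cs
    |> put_change(:password_hash, fake_hash(pw))
    |> delete_change(:password)
  end

  defp put_password_hash(cs), do: cs

  # Placeholder only: NOT a real hash. Stands in for Bcrypt.hash_pwd_salt/1.
  defp fake_hash(pw), do: "fake$" <> String.reverse(pw)
end

cs = Demo.Account.changeset(%{"email" => "a@example.com", "password" => "correct horse"})
IO.inspect(Map.keys(cs.changes), label: "changed fields")

# The virtual field is not among the persisted fields of the schema.
IO.inspect(Demo.Account.__schema__(:fields), label: "persisted fields")
```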

Embedded Schemas

Sometimes you want a structured field — say, an address with street, city, zip — without making it a separate table. embeds_one and embeds_many map a JSON column (jsonb on PostgreSQL) to a schema-like struct.

defmodule MyApp.Accounts.Address do
  use Ecto.Schema
  import Ecto.Changeset

  embedded_schema do
    field :street, :string
    field :city, :string
    field :zip, :string
  end

  def changeset(address, attrs) do
    address
    |> cast(attrs, [:street, :city, :zip])
    |> validate_required([:street, :city, :zip])
  end
end

defmodule MyApp.Accounts.User do
  use Ecto.Schema
  import Ecto.Changeset

  schema "users" do
    field :email, :string
    embeds_one :address, MyApp.Accounts.Address
    embeds_many :phone_numbers, MyApp.Accounts.PhoneNumber
  end

  def changeset(user, attrs) do
    user
    |> cast(attrs, [:email])
    |> cast_embed(:address)
    |> cast_embed(:phone_numbers)
  end
end

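cast_embed calls the embedded schema's changeset/2 function by default (pass the with: option to use a different one). Errors stay on the nested changesets, and Ecto.Changeset.traverse_errors/2 returns them as a nested map, so a Phoenix form can render user[address][street] errors correctly.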

Embedded schemas are also useful for non-database use cases. Need to validate a JSON payload from a webhook? Define an embedded schema, call cast with the body, and use Ecto's validation machinery without ever touching a Repo.
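As a sketch of the webhook idea, the following validates a nested payload with two embedded schemas and no Repo. The module names, the payload shape, and the parse/1 and errors/1 helpers are all invented for the example.

```elixir
# Assumption: standalone script; inside a Mix project that already depends on
# Ecto, drop the Mix.install line.
Mix.install([{:ecto, "~> 3.10"}])

defmodule Demo.Address do
  use Ecto.Schema
  import Ecto.Changeset

  embedded_schema do
    field :city, :string
    field :zip, :string
  end

  def changeset(address, attrs) do
    address
    |> cast(attrs, [:city, :zip])
    |> validate_required([:city, :zip])
  end
end

defmodule Demo.OrderPayload do
  # Hypothetical webhook body with a nested address.
  use Ecto.Schema
  import Ecto.Changeset

  embedded_schema do
    field :order_id, :string
    embeds_one :address, Demo.Address
  end

  # Returns {:ok, struct} or {:error, changeset} without touching a database.
  def parse(body) do
    %__MODULE__{}
    |> cast(body, [:order_id])
    |> validate_required([:order_id])
    |> cast_embed(:address, required: true)
    |> apply_action(:validate)
  end

  # Collects messages from the parent and nested changesets into one map.
  def errors(changeset), do: traverse_errors(changeset, fn {msg, _opts} -> msg end)
end

{:ok, payload} =
  Demo.OrderPayload.parse(%{"order_id" => "o-1", "address" => %{"city" => "Oslo", "zip" => "0150"}})

IO.inspect(payload.address.city, label: "city")

{:error, cs} =
  Demo.OrderPayload.parse(%{"order_id" => "o-2", "address" => %{"city" => "Oslo"}})

# The missing zip is reported under the :address key, mirroring the nesting.
IO.inspect(Demo.OrderPayload.errors(cs), label: "errors")
```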

Changesets Without Tables

This is the move that makes changesets click. They don't have to map to a table. A "search form" or "settings update" or "API request validation" can be a changeset.

defmodule MyApp.SearchParams do
  import Ecto.Changeset

  @types %{
    query: :string,
    min_price: :decimal,
    max_price: :decimal,
    category: :string,
    sort: :string
  }

  def changeset(params) do
    {%{}, @types}
    |> cast(params, Map.keys(@types))
    |> validate_inclusion(:sort, ~w(price newest popular))
    |> validate_number(:min_price, greater_than_or_equal_to: 0)
  end
end

The {%{}, @types} form creates a "data + types" pair that acts like a schema. You get all the cast and validate machinery without a database table.
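One way to consume a schemaless changeset is to finish the pipeline with apply_action/2, which hands back either the validated map or the changeset with its errors. This sketch trims the module above to three fields; the parse/1 name and the sample params are my own.

```elixir
# Assumption: standalone script; inside a Mix project that already depends on
# Ecto, drop the Mix.install line.
Mix.install([{:ecto, "~> 3.10"}])

defmodule Demo.SearchParams do
  import Ecto.Changeset

  @types %{query: :string, min_price: :decimal, sort: :string}

  # Returns {:ok, map_of_typed_values} or {:error, changeset}.
  def parse(params) do
    {%{}, @types}
    |> cast(params, Map.keys(@types))
    |> validate_inclusion(:sort, ~w(price newest popular))
    |> validate_number(:min_price, greater_than_or_equal_to: 0)
    |> apply_action(:validate)
  end
end

case Demo.SearchParams.parse(%{"query" => "tent", "min_price" => "10.50", "sort" => "price"}) do
  {:ok, search} -> IO.inspect(search, label: "validated")
  {:error, cs} -> IO.inspect(cs.errors, label: "rejected")
end

case Demo.SearchParams.parse(%{"sort" => "random", "min_price" => "-1"}) do
  {:ok, search} -> IO.inspect(search, label: "validated")
  {:error, cs} -> IO.inspect(Keyword.keys(cs.errors), label: "rejected fields")
end
```

The caller gets a plain map with real types (min_price is a Decimal), so downstream query code never has to parse strings again.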

This is why I said you'd start using changesets for things unrelated to databases. Validating a Stripe webhook payload, parsing a CSV row, normalizing user input from a JSON API — all of it benefits from cast plus validation plus error reporting.

Inspecting Errors

When a changeset is invalid, errors live in cs.errors. The shape is [{field, {message, opts}}].

%{errors: [email: {"has invalid format", [validation: :format]}]} = cs

For rendering, use Ecto.Changeset.traverse_errors/2:

def errors_on(changeset) do
  Ecto.Changeset.traverse_errors(changeset, fn {msg, opts} ->
    Regex.replace(~r"%{(\w+)}", msg, fn _, key ->
      opts |> Keyword.get(String.to_existing_atom(key), key) |> to_string()
    end)
  end)
end

Phoenix forms read the errors from the changeset for you, and the generated <.error> components render the right message per field.
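Outside of Phoenix, errors_on is mostly useful in tests and JSON APIs. The sketch below wraps it in a hypothetical Demo.Errors module and shows the placeholder interpolation at work: the raw message for a too-short password contains %{count}, and the helper fills it in from the error's opts.

```elixir
# Assumption: standalone script; inside a Mix project that already depends on
# Ecto, drop the Mix.install line.
Mix.install([{:ecto, "~> 3.10"}])

defmodule Demo.Errors do
  import Ecto.Changeset

  @types %{email: :string, password: :string}

  def changeset(attrs) do
    {%{}, @types}
    |> cast(attrs, [:email, :password])
    |> validate_format(:email, ~r/@/)
    |> validate_length(:password, min: 8)
  end

  # Same helper as above: replaces %{key} placeholders with values from opts.
  def errors_on(changeset) do
    traverse_errors(changeset, fn {msg, opts} ->
      Regex.replace(~r"%{(\w+)}", msg, fn _, key ->
        opts |> Keyword.get(String.to_existing_atom(key), key) |> to_string()
      end)
    end)
  end
end

cs = Demo.Errors.changeset(%{"email" => "nope", "password" => "short"})

# Raw form: the message still contains the %{count} placeholder.
IO.inspect(cs.errors[:password], label: "raw")

# Rendered form: a map of field => list of finished messages.
IO.inspect(Demo.Errors.errors_on(cs), label: "rendered")
```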

Common Pitfalls

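Casting a field but not adding it to validate_required. The cast happens, the changeset is valid, and you either insert a row with NULL in a column that was supposed to hold a value or, if the column is declared NOT NULL, the insert raises a database error instead of returning a tidy changeset error.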

Forgetting that cast filters fields by the third argument. If a new field shows up in the schema but you don't add it to the cast list, it's silently ignored. This is a feature for security, but it bites you when you're adding fields.

Confusing virtual fields with the absence of validation. Virtual fields still get cast and validated; they just don't get persisted.

Adding unique_constraint without a matching unique index in the database. The constraint helper translates database errors into changeset errors — if there's no unique index, there's no error to translate, and duplicates slip in.

Performing several changeset-based writes that must succeed or fail together without using Ecto.Multi. A changeset describes a single write. If you need atomicity across multiple writes, use Multi, not nested changesets.

Putting business logic in the changeset that should be in the context. Changesets are for casting, validating, and tracking changes. Sending an email, charging a card, calling a webhook — those go in the context module that calls Repo.insert/1.
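For the Ecto.Multi pitfall above, this is roughly what "use Multi" looks like. The sketch only builds the transaction and lists its steps; the Demo modules are hypothetical, and actually running it would need a repo (shown in a comment as a hypothetical MyApp.Repo).

```elixir
# Assumption: standalone script; inside a Mix project that already depends on
# Ecto, drop the Mix.install line.
Mix.install([{:ecto, "~> 3.10"}])

defmodule Demo.Member do
  use Ecto.Schema
  import Ecto.Changeset

  schema "members" do
    field :email, :string
  end

  def changeset(attrs) do
    %__MODULE__{} |> cast(attrs, [:email]) |> validate_required([:email])
  end
end

defmodule Demo.AuditEntry do
  use Ecto.Schema
  import Ecto.Changeset

  schema "audit_entries" do
    field :event, :string
  end

  def changeset(attrs) do
    %__MODULE__{} |> cast(attrs, [:event]) |> validate_required([:event])
  end
end

defmodule Demo.Registration do
  alias Ecto.Multi

  # Describes two writes that should commit or roll back together.
  # Each step keeps its own ordinary changeset.
  def build(attrs) do
    Multi.new()
    |> Multi.insert(:member, Demo.Member.changeset(attrs))
    |> Multi.insert(:audit, Demo.AuditEntry.changeset(%{"event" => "member_registered"}))
  end
end

multi = Demo.Registration.build(%{"email" => "alice@example.com"})
IO.inspect(Keyword.keys(Ecto.Multi.to_list(multi)), label: "steps")

# Running it requires a repo, for example a hypothetical MyApp.Repo:
#   MyApp.Repo.transaction(multi)
```

A function like build/1 belongs in the context module, next to the side effects, which keeps the changesets themselves limited to casting and validation.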

Key Takeaways

Changesets are the funnel for any data entering your system: they cast strings to types, filter unsafe fields, validate business rules, and translate database constraint errors into form errors. Use virtual fields for input-only data like raw passwords. Use embedded schemas for structured fields without tables. Use schemaless changesets ({%{}, @types}) for validating API payloads, search forms, and any other "external data, structured" problem. The pattern transfers — once you see changesets clearly, you'll reach for them outside of database code too.