Validation Patterns

Either short-circuits. That is the right behaviour when steps depend on each other — there is no point trying to charge an account if you could not authenticate the user. But it is the wrong behaviour for forms, configuration files, and API request bodies, where you want to tell the user about every problem in one pass.

This page is about the patterns Haskell uses to encode "this data has been validated" in the type system: applicative validation that accumulates errors, smart constructors, refined types, and the parse-don't-validate philosophy that quietly underlies all of them.

Validation: applicative validation that accumulates

The validation package on Hackage (originally by Tony Morris; its main module is Data.Validation) provides a sibling to Either:

data Validation e a = Failure e | Success a

It looks identical to Either. The difference is the Applicative instance:

instance Semigroup e => Applicative (Validation e) where
  pure = Success
  Failure e1 <*> Failure e2 = Failure (e1 <> e2)
  Failure e1 <*> _          = Failure e1
  _          <*> Failure e2 = Failure e2
  Success f  <*> Success a  = Success (f a)

Two Failures combine via the semigroup of e. If e is [Error] or NonEmpty Error, you accumulate the list. If it is Set Error, you take the union.

Crucially, there is no Monad instance for Validation, by design. A monad's >>= lets the next computation depend on the previous result, but if the previous step failed there is no result to depend on, so >>= can only short-circuit. The monad laws also require <*> to agree with ap, which is defined through >>=, so a lawful Monad instance would force <*> to short-circuit as well. That defeats the whole purpose. validation resists the temptation and stays Applicative-only.
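To see the difference in behaviour side by side, here is a minimal sketch that needs only base. It re-declares Validation locally (mirroring the definition above rather than importing the package), and the nonBlank and nonBlankE checks are hypothetical helpers invented for the illustration.

```haskell
-- Local stand-in for the package's type, so the sketch runs with base alone.
data Validation e a = Failure e | Success a
  deriving (Eq, Show)

instance Functor (Validation e) where
  fmap _ (Failure e) = Failure e
  fmap f (Success a) = Success (f a)

instance Semigroup e => Applicative (Validation e) where
  pure = Success
  Failure e1 <*> Failure e2 = Failure (e1 <> e2)
  Failure e1 <*> Success _  = Failure e1
  Success _  <*> Failure e2 = Failure e2
  Success f  <*> Success a  = Success (f a)

-- Hypothetical check: a field must not be blank.
nonBlank :: String -> String -> Validation [String] String
nonBlank label "" = Failure [label ++ " is blank"]
nonBlank _     s  = Success s

-- The same check expressed with Either, for contrast.
nonBlankE :: String -> String -> Either [String] String
nonBlankE label "" = Left [label ++ " is blank"]
nonBlankE _     s  = Right s

-- Both fields are checked and both errors survive.
bothV :: Validation [String] (String, String)
bothV = (,) <$> nonBlank "name" "" <*> nonBlank "email" ""

-- Either stops at the first Left; the second error is never reported.
bothE :: Either [String] (String, String)
bothE = (,) <$> nonBlankE "name" "" <*> nonBlankE "email" ""
```

Evaluating bothV yields a Failure carrying both messages, while bothE yields a Left carrying only the first. The code shape is identical; only the Applicative instance differs.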

Using it

{-# LANGUAGE OverloadedStrings #-}
import Data.Validation
import Data.List.NonEmpty (NonEmpty(..))
import qualified Data.Text as T

data SignupError
  = NameTooShort
  | NameTooLong
  | EmailMissingAt
  | PasswordTooShort
  | PasswordNoDigit
  deriving Show

validateName :: T.Text -> Validation (NonEmpty SignupError) T.Text
validateName n
  | T.length n < 2  = Failure (NameTooShort :| [])
  | T.length n > 64 = Failure (NameTooLong  :| [])
  | otherwise       = Success n

validateEmail :: T.Text -> Validation (NonEmpty SignupError) T.Text
validateEmail e
  | T.any (== '@') e = Success e
  | otherwise        = Failure (EmailMissingAt :| [])

validatePassword :: T.Text -> Validation (NonEmpty SignupError) T.Text
validatePassword p =
  let errs = concat
        [ [PasswordTooShort | T.length p < 8]
        , [PasswordNoDigit  | not (T.any (`elem` ("0123456789" :: String)) p)]
        ]
  in case errs of
       []     -> Success p
       (x:xs) -> Failure (x :| xs)

data Signup = Signup T.Text T.Text T.Text deriving Show

mkSignup :: T.Text -> T.Text -> T.Text
         -> Validation (NonEmpty SignupError) Signup
mkSignup n e p =
  Signup <$> validateName n <*> validateEmail e <*> validatePassword p

ghci> :set -XOverloadedStrings
ghci> mkSignup "" "no-at" "abc"
Failure (NameTooShort :| [EmailMissingAt,PasswordTooShort,PasswordNoDigit])
ghci> mkSignup "Alice" "alice@example.com" "secret123"
Success (Signup "Alice" "alice@example.com" "secret123")

Three failed fields, four errors collected, all in one pass. The user sees everything wrong on the first try instead of fixing one field, hitting submit, fixing the next, repeating.

When you need to bridge Validation to Either (for further short-circuiting steps), the package provides the conversions toEither and fromEither, plus validation, an eliminator analogous to either.
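A common shape is an accumulating phase followed by a short-circuiting phase. The sketch below is base-only: it defines local toEither and fromEither with the same shape as the package's functions instead of importing them, and reserveName is a hypothetical dependent step made up for the example.

```haskell
-- Local stand-in for the package's type, so the sketch runs with base alone.
data Validation e a = Failure e | Success a
  deriving (Eq, Show)

-- Local copies with the same shape as Data.Validation's conversions.
toEither :: Validation e a -> Either e a
toEither (Failure e) = Left e
toEither (Success a) = Right a

fromEither :: Either e a -> Validation e a
fromEither (Left e)  = Failure e
fromEither (Right a) = Success a

-- Hypothetical dependent step: it only makes sense once the input is valid.
reserveName :: String -> Either [String] Int
reserveName "admin" = Left ["name is reserved"]
reserveName n       = Right (length n)

-- Accumulate first, then continue in Either, where >>= short-circuits.
register :: Validation [String] String -> Either [String] Int
register v = toEither v >>= reserveName
```

If the accumulating phase failed, register returns all of its errors and never calls reserveName. If it succeeded, the Either monad takes over from there.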

Smart constructors

Once data is validated, you want the type system to remember that. The pattern: hide the constructor, expose only a function that validates, and the result type is now usable everywhere without re-checking.

module Email
  ( Email           -- export type but not constructor
  , mkEmail
  , unEmail
  ) where

import qualified Data.Text as T

newtype Email = Email T.Text  -- constructor not exported

mkEmail :: T.Text -> Maybe Email
mkEmail t
  -- An empty Text contains no '@', so no separate null check is needed.
  | T.any (== '@') t = Just (Email (T.toLower t))
  | otherwise        = Nothing

unEmail :: Email -> T.Text
unEmail (Email t) = t

Outside the module, the only way to obtain an Email is through mkEmail. Functions that take an Email parameter are guaranteed valid input — no defensive re-validation, no documentation comments saying "must contain @". The compiler enforces it.

This is the simplest and most common type-driven validation pattern in Haskell, and you will see it everywhere in production code:

newtype UserId = UserId Int            -- non-negative
newtype OrderTotal = OrderTotal Money  -- positive
newtype TrustedHtml = TrustedHtml Text -- already escaped

For domain modelling, the rule is: any time invariants exist on a value, encode them as a smart-constructed newtype. The cost is a tiny amount of plumbing; the payoff is errors at construction, not at use.
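As a second illustration of the same pattern, here is a small base-only sketch for a bounded number. Percentage, PercentageError, mkPercentage, unPercentage, and complement are all hypothetical names chosen for the example. It returns Either rather than Maybe so the caller learns why construction failed.

```haskell
-- In a real project this would live in its own module with an export list
-- such as (Percentage, PercentageError(..), mkPercentage, unPercentage),
-- which keeps the Percentage constructor private.

newtype Percentage = Percentage Int
  deriving (Eq, Ord, Show)

data PercentageError
  = BelowZero Int
  | AboveHundred Int
  deriving (Eq, Show)

-- The only intended way to build a Percentage.
mkPercentage :: Int -> Either PercentageError Percentage
mkPercentage n
  | n < 0     = Left (BelowZero n)
  | n > 100   = Left (AboveHundred n)
  | otherwise = Right (Percentage n)

unPercentage :: Percentage -> Int
unPercentage (Percentage n) = n

-- Code inside the module may rely on the 0..100 range without re-checking:
-- 100 - n stays within 0..100 whenever n does.
complement :: Percentage -> Percentage
complement (Percentage n) = Percentage (100 - n)
```

The range check runs once, in mkPercentage. Operations such as complement preserve the invariant by construction, so they need no error case of their own.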

refined: types with predicate guarantees

Smart constructors give you "validated", but the predicate lives in your head. The refined library encodes the predicate in the type:

{-# LANGUAGE DataKinds #-}
import Refined

type Port = Refined (FromTo 1 65535) Int

mkPort :: Int -> Either RefineException Port
mkPort = refine

The type Refined (FromTo 1 65535) Int carries the constraint in the type itself. The check still runs at run time, inside refine, but any function taking a Port knows the value is in range. The standard predicates include Positive, NonNegative, From, To, FromTo, SizeEqualTo, SizeLessThan, SizeGreaterThan, and the logical combinators And, Or, and Not.

Custom predicates are possible:

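{-# LANGUAGE MultiParamTypeClasses, OverloadedStrings #-}
import Data.Typeable (typeRep)
import qualified Data.Text as T
import Refined

data NonEmptyText

-- Assumes the refined >= 0.5 API, where validate returns Maybe RefineException.
instance Predicate NonEmptyText T.Text where
  validate p t
    | T.null t  = throwRefineOtherException (typeRep p) "empty"
    | otherwise = Nothing   -- Nothing means the predicate holds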

When literal values are known at compile time, refined's Template Haskell helpers check them during compilation:

-- requires the TemplateHaskell extension
$$(refineTH 8080) :: Port  -- compile error if out of range

refined is genuinely useful when you have many small range/length/format constraints. It is overkill for a one-off "must contain @". Pick the lightest tool that captures the invariant.
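To make "the predicate lives in the type" concrete, here is a toy re-implementation of the idea using only base. It is not the refined library's code or API: Checked, Pred, Between, and check are hypothetical names standing in for the roles that Refined, Predicate, FromTo, and refine play in the library.

```haskell
{-# LANGUAGE DataKinds, KindSignatures, ScopedTypeVariables #-}
{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances #-}

import Data.Proxy (Proxy (..))
import GHC.TypeLits (KnownNat, Nat, natVal)

-- A predicate is just a type-level tag; it has no values.
data Between (lo :: Nat) (hi :: Nat)

-- A value of type a that is known to satisfy predicate p.
-- In a real module the Checked constructor would not be exported.
newtype Checked p a = Checked a
  deriving (Eq, Show)

-- How to test predicate p against a run-time value of type a.
class Pred p a where
  holds :: Proxy p -> a -> Bool

-- The bounds are read back from the type with natVal.
instance (KnownNat lo, KnownNat hi) => Pred (Between lo hi) Int where
  holds _ n =
    let lower = fromIntegral (natVal (Proxy :: Proxy lo))
        upper = fromIntegral (natVal (Proxy :: Proxy hi))
    in lower <= n && n <= upper

-- The single entry point: run the check once, remember it in the type.
check :: forall p a. Pred p a => a -> Maybe (Checked p a)
check x
  | holds (Proxy :: Proxy p) x = Just (Checked x)
  | otherwise                  = Nothing

type Port = Checked (Between 1 65535) Int

mkPort :: Int -> Maybe Port
mkPort = check
```

Note that mkPort contains no range logic of its own. The type synonym selects the instance, and the instance recovers the bounds from the type-level numbers.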

Parse, don't validate (Alexis King)

Alexis King's 2019 essay "Parse, don't validate" is one of the most-cited pieces of Haskell-adjacent writing. The thesis: a "validation" function returns a Boolean — pass or fail. A "parse" function returns a more precise type that carries the proof of validity.

A function that takes a String and returns Bool answering "is this an email" is validation:

isEmail :: String -> Bool

After it returns True, the caller still has a String. Nothing in the type system remembers it was checked. Five lines later someone passes the same string to dropWhile and the validity is lost.

A function that takes a String and returns Maybe Email is parsing:

parseEmail :: String -> Maybe Email

After it returns Just, you have an Email. The validity is in the type. You cannot lose it.

The principle generalises:

  • "Has at least one element" — return NonEmpty a, not [a].
  • "Sorted ascending" — wrap in a newtype, expose only sort-preserving operations.
  • "UTF-8 valid bytes" — return Text, not ByteString.
  • "Already authenticated" — pass an AuthenticatedUser, not a UserId plus a comment.

Code that follows this pattern has fewer bugs because invalid states stop being expressible. The Email newtype above is parse-don't-validate. So is mkPort returning Refined. So is using NonEmpty for "must have at least one error".

The second-order benefit: deep functions stop needing to defensively re-check their inputs. sendEmail :: Email -> IO () does not run a regex on its argument. The type already says it is valid.
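The "has at least one element" case can be sketched with base alone, since Data.List.NonEmpty already provides the parser. The function names hasScores, parseScores, best, and firstScore are hypothetical.

```haskell
import Data.List.NonEmpty (NonEmpty, nonEmpty)
import qualified Data.List.NonEmpty as NE

-- Validation style: the caller learns a Bool and keeps a plain list,
-- so every later use has to trust that the check happened.
hasScores :: [Int] -> Bool
hasScores = not . null

-- Parsing style: the check happens once and the result type records it.
parseScores :: [Int] -> Maybe (NonEmpty Int)
parseScores = nonEmpty

-- No empty case to handle: a NonEmpty always has a maximum.
best :: NonEmpty Int -> Int
best = maximum

-- NE.head is total, unlike Prelude's head on lists.
firstScore :: NonEmpty Int -> Int
firstScore = NE.head
```

Callers deal with the Maybe exactly once, at the boundary where parseScores is called. Everything downstream takes NonEmpty Int and needs neither a check nor an error branch.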

Data.Validity

The validity package (Tom Sydney Kerckhove) is a related but different angle: instead of preventing invalid construction, it provides a Validity typeclass for asserting validity of arbitrary values, useful for property tests and runtime sanity checks.

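import Data.Validity
import qualified Data.Text as T

-- Assumes a User record with fields userName :: T.Text and userAge :: Int.
instance Validity User where
  validate u = mconcat
    [ check (not (T.null (userName u))) "name non-empty"
    , check (userAge u >= 0) "age non-negative"
    ]

-- Provided by the library:
-- isValid :: Validity a => a -> Bool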

Combined with genvalidity-hspec, you can write properties that say "any generated User should be valid", and the library generates the random test cases for you. This is a good complement to smart constructors when you want to verify that, say, your JSON deserializer never produces an invalid User.

When to pick which

Rough guidance:

  • Form-like input, multiple independent fields: Validation (accumulate errors) plus smart constructors for the result type.
  • Single-field invariants on a primitive (port number, percentage, non-empty string): smart-constructed newtype, or refined if you have many such.
  • Domain types (Email, UserId, OrderTotal): smart-constructed newtype.
  • You want compile-time-proven values from literals: refined with refineTH.
  • You want to test that your producers always emit valid data: validity plus property tests.
  • You want to prevent invalid states entirely: parse-don't-validate as a design discipline. Construct types so that "invalid" is unrepresentable.

These compose. A form handler accumulates errors with Validation to produce a Signup value made of smart-constructed newtypes — and downstream code never re-checks any of it because the types already carry the proof.

A worked example

Pulling it all together: a signup endpoint that validates everything, accumulates errors, and produces a strongly typed result.

{-# LANGUAGE OverloadedStrings #-}

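module Signup
  ( Username, Email, Password   -- types only; constructors stay private
  , SignupError(..)
  , Signup(..)
  , mkUsername, mkEmail, mkPassword
  , parseSignup
  ) where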

import Data.Validation
import Data.List.NonEmpty (NonEmpty(..))
import qualified Data.Text as T

-- Smart-constructed newtypes
newtype Username = Username T.Text deriving Show
newtype Email    = Email T.Text deriving Show
newtype Password = Password T.Text deriving Show

mkUsername :: T.Text -> Validation (NonEmpty SignupError) Username
mkUsername t
  | T.length t < 3   = Failure (NameTooShort :| [])
  | T.length t > 32  = Failure (NameTooLong  :| [])
  | T.any badChar t  = Failure (NameBadChar  :| [])
  | otherwise        = Success (Username t)
  where
    badChar c = not (c == '_' || c `elem` (['a'..'z'] ++ ['A'..'Z'] ++ ['0'..'9']))

mkEmail :: T.Text -> Validation (NonEmpty SignupError) Email
mkEmail t
  | T.any (== '@') t = Success (Email (T.toLower t))
  | otherwise        = Failure (EmailMissingAt :| [])

mkPassword :: T.Text -> Validation (NonEmpty SignupError) Password
mkPassword t =
  let errs = concat
        [ [PasswordTooShort | T.length t < 8]
        , [PasswordNoDigit  | not (T.any (`elem` ("0123456789" :: String)) t)]
        ]
  in case errs of
       []     -> Success (Password t)
       (x:xs) -> Failure (x :| xs)

data SignupError
  = NameTooShort | NameTooLong | NameBadChar
  | EmailMissingAt
  | PasswordTooShort | PasswordNoDigit
  deriving Show

data Signup = Signup Username Email Password deriving Show

parseSignup :: T.Text -> T.Text -> T.Text
            -> Validation (NonEmpty SignupError) Signup
parseSignup n e p =
  Signup <$> mkUsername n <*> mkEmail e <*> mkPassword p

The result type Signup is built from validated components. Any function downstream — createAccount :: Signup -> IO UserId — does not re-validate. The types do the bookkeeping.

Common pitfalls

Returning Bool instead of a more precise type. isValid :: a -> Bool lets the proof of validity escape. Return Maybe ValidA or Either Err ValidA so the type encodes the check.

Exposing the data constructor of a smart-constructed newtype. Exporting Email(..) instead of just Email defeats the whole pattern, because callers can then wrap any Text without going through mkEmail. Export the type and helpers, hide the constructor.

Trying to make Validation a Monad. It is not. There is a function called bindValidation for occasional use, but if you find yourself wanting >>=, you probably want Either or a small ad-hoc combinator.
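One legitimate occasional use is a cross-field rule that can only run after the individual fields have parsed. The sketch below is base-only: it re-declares Validation and a bindValidation with the same shape as the package's function, and parseDay, ordered, and parseRange are hypothetical helpers for the example.

```haskell
-- Local stand-in for the package's type, so the sketch runs with base alone.
data Validation e a = Failure e | Success a
  deriving (Eq, Show)

instance Functor (Validation e) where
  fmap _ (Failure e) = Failure e
  fmap f (Success a) = Success (f a)

instance Semigroup e => Applicative (Validation e) where
  pure = Success
  Failure e1 <*> Failure e2 = Failure (e1 <> e2)
  Failure e1 <*> Success _  = Failure e1
  Success _  <*> Failure e2 = Failure e2
  Success f  <*> Success a  = Success (f a)

-- Local copy with the same shape as the package's bindValidation.
-- It short-circuits, which is why it is a plain function and not >>=.
bindValidation :: Validation e a -> (a -> Validation e b) -> Validation e b
bindValidation (Failure e) _ = Failure e
bindValidation (Success a) f = f a

-- Independent per-field check: a day of the month, 1 to 31.
parseDay :: String -> String -> Validation [String] Int
parseDay label s = case reads s of
  [(n, "")] | n >= 1 && n <= 31 -> Success n
  _                             -> Failure [label ++ ": not a day of month"]

-- Cross-field rule: needs both parsed values, so it cannot run in parallel.
ordered :: (Int, Int) -> Validation [String] (Int, Int)
ordered (from, to)
  | from <= to = Success (from, to)
  | otherwise  = Failure ["from is after to"]

-- Accumulate the field errors first, then apply the dependent rule.
parseRange :: String -> String -> Validation [String] (Int, Int)
parseRange a b =
  ((,) <$> parseDay "from" a <*> parseDay "to" b) `bindValidation` ordered
```

Field errors still accumulate, and the ordering rule is only consulted when both fields are individually valid. Keeping the short-circuiting step explicit at the call site makes it obvious where accumulation stops.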

Reaching for refined when a newtype would do. refined is excellent when you need many predicate-typed values, but for a single domain newtype it is heavier than a hand-written smart constructor.

Doing validation in the controller and then passing strings around. Runtime checks at the edge plus String types inside is validation by convention, the way you would write it in C: nothing stops an unchecked string from reaching the core. Lift the checks into the type system and let the checked types flow through the program.

Validating the same data twice. A symptom of not using parse-don't-validate. If you find a function defensively re-checking its arguments, the input type is too weak.

Key takeaways

  • Validation e a is Either's sibling that accumulates errors via a Semigroup. Use it for forms, configuration, and any situation where multiple parallel checks should all be reported.
  • Validation is deliberately not a Monad — that incompatibility is what makes accumulation work. Embrace it.
  • Smart constructors (newtype + private constructor + validating builder) are the simplest, most common type-driven validation pattern. Use them everywhere there is a non-trivial invariant.
  • refined lifts predicates into types when you have many constraints, especially numeric ranges. refineTH checks at compile time.
  • Data.Validity complements smart constructors by giving you property-test infrastructure for arbitrary types.
  • Alexis King's "parse, don't validate" is the philosophy behind all of this: produce a more precise type, do not return a Boolean. Make invalid states unrepresentable.
  • These patterns compose. Accumulate errors with Validation, produce smart-constructed types as the success result, and the rest of the program runs on values that are valid by construction.