Filtering, Sorting & Search
Letting Clients Get What They Need
Most API consumers do not need every record in a collection. Filtering, sorting, and search let clients narrow results to exactly what they need, reducing bandwidth, improving response times, and simplifying client-side logic.
Filtering
Filtering restricts the result set based on field values. The simplest and most common pattern uses query parameters.
Basic Field Filtering
GET /api/v1/orders?status=active
GET /api/v1/orders?status=active&currency=usd
GET /api/v1/users?role=admin
Each query parameter maps to a field on the resource. The server applies an equality check: return only records where the field matches the value.
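As a rough illustration of that equality check, the sketch below turns query parameters into a parameterized WHERE clause. The allowlist FILTERABLE_FIELDS and the function name equality_where are assumptions made for this example, not part of any particular framework.

```python
from urllib.parse import parse_qsl, urlsplit

# Hypothetical allowlist of filterable columns for an "orders" resource.
FILTERABLE_FIELDS = {"status", "currency"}


def equality_where(url: str) -> tuple[str, list[str]]:
    """Build a parameterized WHERE clause from the URL's query parameters."""
    params = parse_qsl(urlsplit(url).query)
    unknown = [name for name, _ in params if name not in FILTERABLE_FIELDS]
    if unknown:
        # Reject typos instead of silently returning unfiltered results.
        raise ValueError(f"Unknown filter parameter(s): {', '.join(unknown)}")
    if not params:
        return "", []
    # Column names come from the allowlist; values stay as bound parameters.
    conditions = " AND ".join(f"{name} = ?" for name, _ in params)
    return f"WHERE {conditions}", [value for _, value in params]
```

Only allowlisted names are ever placed into the SQL text, and the values travel separately as bound parameters, so client input is never interpolated into the query.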
Range Filters
For numeric and date fields, equality is rarely enough. Support range operators with suffixed parameter names:
GET /api/v1/orders?created_after=2024-01-01&created_before=2024-03-31
GET /api/v1/products?price_min=10&price_max=50
GET /api/v1/users?age_gte=18&age_lte=65
Common suffix conventions:
_gt greater than
_gte greater than or equal
_lt less than
_lte less than or equal
_after alias for _gt on date fields
_before alias for _lt on date fields
Stripe expresses the same operators with bracket syntax rather than suffixes when filtering by date:
GET /v1/charges?created[gte]=1609459200&created[lte]=1640995199
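One way to implement the suffix convention is a small lookup table from suffix to comparison operator. This is a minimal sketch; SUFFIX_OPERATORS and parse_range_param are names invented for the example.

```python
# Suffix-to-operator table mirroring the suffix conventions listed above.
SUFFIX_OPERATORS = {
    "_gte": ">=",
    "_lte": "<=",
    "_gt": ">",
    "_lt": "<",
    "_after": ">",
    "_before": "<",
}


def parse_range_param(name: str, value: str) -> tuple[str, str, str]:
    """Split e.g. 'age_gte' into ('age', '>=', value); fall back to equality."""
    for suffix, operator in SUFFIX_OPERATORS.items():
        # Require a non-empty field name in front of the suffix.
        if name.endswith(suffix) and len(name) > len(suffix):
            return name[: -len(suffix)], operator, value
    return name, "=", value
```

A real field whose name happens to end in one of these suffixes would be misread by this sketch, so the extracted field name should still be checked against the list of filterable fields.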
Multiple Values (IN Filter)
Allow comma-separated values for "any of" queries:
GET /api/v1/orders?status=pending,processing,shipped
The server interprets this as: return orders where status is pending OR processing OR shipped.
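A comma-separated value maps naturally onto a SQL IN condition. The helper below is a sketch under the assumption that the field name has already been validated against an allowlist; in_clause is a hypothetical name.

```python
def in_clause(field: str, raw_value: str) -> tuple[str, list[str]]:
    """Turn 'pending,processing,shipped' into a parameterized IN condition."""
    values = [v.strip() for v in raw_value.split(",") if v.strip()]
    if not values:
        raise ValueError(f"'{field}' requires at least one value")
    # One placeholder per value keeps the values out of the SQL text.
    placeholders = ", ".join("?" for _ in values)
    return f"{field} IN ({placeholders})", values
```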
Null & Existence Checks
GET /api/v1/users?email=null (users with no email)
GET /api/v1/users?email=!null (users with an email set)
GET /api/v1/orders?coupon_code=exists (orders that used a coupon)
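The exact semantics of these sentinel values are not defined here, so the following is only one plausible reading: null maps to IS NULL, while !null and exists both map to IS NOT NULL. The function name existence_condition is hypothetical.

```python
from typing import Optional


def existence_condition(field: str, raw_value: str) -> Optional[str]:
    """Map sentinel values to NULL checks; return None for ordinary values."""
    if raw_value == "null":
        return f"{field} IS NULL"
    if raw_value in ("!null", "exists"):
        return f"{field} IS NOT NULL"
    # Not a sentinel: the caller should treat it as a normal equality filter.
    return None
```

Reserving these words means a field can no longer be filtered on the literal strings "null" or "exists", which is worth documenting for API consumers.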
Boolean Filters
GET /api/v1/users?verified=true
GET /api/v1/products?in_stock=false
Real-World Example: GitHub Issues API
GitHub provides rich filtering on its issues endpoint:
GET /repos/octocat/hello-world/issues?state=open&labels=bug,urgent&assignee=octocat&since=2024-01-01T00:00:00Z&sort=created&direction=desc
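This returns open issues that carry both the "bug" and "urgent" labels, are assigned to octocat, and have been updated since January 1, 2024, sorted by creation date descending. Note that GitHub treats comma-separated labels as AND rather than OR, and that since filters on the last update time, not the creation time.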
Sorting
Sorting controls the order of results. Most APIs support sorting by one or more fields.
Single-Field Sorting
GET /api/v1/users?sort=created_at&order=desc
GET /api/v1/products?sort=price&order=asc
Alternative Sort Syntax
Some APIs use a more compact syntax with a prefix for direction:
GET /api/v1/users?sort=-created_at (descending)
GET /api/v1/users?sort=name (ascending, default)
The - prefix for descending is used by JSON:API and several other API standards.
Multi-Field Sorting
Sort by multiple fields to break ties:
GET /api/v1/orders?sort=status,-created_at
This sorts by status ascending first, then by creation date descending within each status group.
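Parsing the compact syntax takes only a few lines. This sketch assumes the - prefix convention described above; parse_sort is a name chosen for the example.

```python
def parse_sort(sort_param: str) -> list[tuple[str, str]]:
    """Parse 'status,-created_at' into (field, direction) pairs, in order."""
    keys = []
    for token in sort_param.split(","):
        token = token.strip()
        if not token:
            continue  # tolerate stray commas
        if token.startswith("-"):
            keys.append((token[1:], "DESC"))
        else:
            keys.append((token, "ASC"))
    return keys
```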
Sort Stability
Always include a unique field (like id) as the final sort key, even if the client does not request it. This keeps the ordering deterministic when the primary sort key has duplicate values, which is critical for reliable pagination: without a tiebreaker, rows with equal keys can be repeated or skipped across pages.
Client requests: sort=-created_at
Server applies: ORDER BY created_at DESC, id DESC
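A server can add the tiebreaker mechanically. The sketch below assumes sort keys arrive as (field, direction) pairs and that id is the unique column; the tiebreaker reuses the direction of the last requested key, which is a design choice for this example rather than a requirement.

```python
def order_by_with_tiebreaker(
    keys: list[tuple[str, str]], unique_field: str = "id"
) -> str:
    """Render an ORDER BY clause, appending the unique field if it is missing."""
    keys = list(keys)  # avoid mutating the caller's list
    if not any(field == unique_field for field, _ in keys):
        direction = keys[-1][1] if keys else "ASC"
        keys.append((unique_field, direction))
    return "ORDER BY " + ", ".join(f"{field} {direction}" for field, direction in keys)
```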
Allowed Sort Fields
Not every field should be sortable. Sorting requires database indexes, and sorting by unindexed fields on large tables causes slow queries. Explicitly whitelist sortable fields and reject unknown ones:
{
"error": {
"type": "invalid_parameter",
"status": 400,
"message": "Cannot sort by 'biography'. Allowed sort fields: created_at, name, email, updated_at."
}
}
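A check along these lines could produce that error body. The field list is taken from the message above; the function name validate_sort_field is an assumption made for illustration.

```python
from typing import Optional

# Sortable fields, matching the error message in the example response.
ALLOWED_SORT_FIELDS = ["created_at", "name", "email", "updated_at"]


def validate_sort_field(field: str) -> Optional[dict]:
    """Return None if the field is sortable, otherwise a 400 error payload."""
    if field in ALLOWED_SORT_FIELDS:
        return None
    allowed = ", ".join(ALLOWED_SORT_FIELDS)
    return {
        "error": {
            "type": "invalid_parameter",
            "status": 400,
            "message": f"Cannot sort by '{field}'. Allowed sort fields: {allowed}.",
        }
    }
```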
Search
Search is fundamentally different from filtering. Filtering matches exact or range values on specific fields. Search matches text across one or more fields, often with relevance ranking.
Simple Text Search
GET /api/v1/users?q=john
The q parameter is the conventional name for search queries. The server searches across relevant text fields (name, email, bio) and returns matches ranked by relevance.
Search vs Filter
Search and filters can be combined in a single request:
GET /api/v1/users?q=john&role=admin&sort=-created_at
This searches for "john" among admin users, sorted by creation date descending. Conceptually, the result is the set of records that match both the search and the filters; the explicit sort then replaces relevance ordering.
Full-Text Search Considerations
Simple LIKE '%john%' queries work for small datasets but do not scale. For production search:
- Use a dedicated search engine (Elasticsearch, Typesense, Meilisearch)
- Support quoted phrases: q="john smith"
- Handle typos and fuzzy matching
- Return relevance scores if useful to the client
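Quoted-phrase support starts with tokenizing the q value before it is handed to the search backend. The sketch below is only the parsing step, with a hypothetical parse_search_query helper; how phrases and terms are then matched depends on the search engine in use.

```python
import re

# A quoted phrase, or else a run of non-whitespace characters.
TOKEN_PATTERN = re.compile(r'"([^"]+)"|(\S+)')


def parse_search_query(q: str) -> list[tuple[str, str]]:
    """Split a query into ('phrase', text) and ('term', text) tokens."""
    tokens = []
    for phrase, term in TOKEN_PATTERN.findall(q):
        if phrase:
            tokens.append(("phrase", phrase.lower()))
        else:
            tokens.append(("term", term.lower()))
    return tokens
```

Lowercasing the tokens here is one simple way to make matching case-insensitive when the indexed text is normalized the same way.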
GitHub Code Search
GitHub's code search API demonstrates a sophisticated search interface:
GET /search/code?q=addClass+in:file+language:js+repo:jquery/jquery
This searches for "addClass" in files, filtered to JavaScript, in the jQuery repository. The search syntax is powerful but specific to GitHub's use case.
Field Selection (Sparse Fieldsets)
Let clients request only the fields they need. This reduces payload size and database load.
Basic Field Selection
GET /api/v1/users?fields=id,name,email
{
"data": [
{"id": "user_123", "name": "Jane Smith", "email": "jane@example.com"},
{"id": "user_456", "name": "John Doe", "email": "john@example.com"}
]
}
Without field selection, the response might include 20 fields per user. With it, the client gets only what it needs.
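At its simplest, field selection is a projection applied to each record before serialization. A minimal sketch, assuming records are plain dictionaries and using a hypothetical select_fields helper:

```python
def select_fields(records: list[dict], fields_param: str) -> list[dict]:
    """Keep only the requested keys in each record, in the requested order."""
    wanted = [f.strip() for f in fields_param.split(",") if f.strip()]
    return [{key: record[key] for key in wanted if key in record} for record in records]
```

This version drops unknown field names quietly; an API that validates parameters strictly would reject them with a 400 instead.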
Google APIs Approach
Google uses the fields parameter with parenthesized sub-selections for nested objects:
GET /gmail/v1/users/me/messages?fields=messages(id,snippet,labelIds)
JSON:API Sparse Fieldsets
JSON:API uses a per-type syntax:
GET /api/v1/articles?fields[articles]=title,body&fields[author]=name
Performance Benefits
Field selection is not just about bandwidth. If the server maps selected fields to the SQL query, it can avoid expensive joins or computed fields:
fields=id,name -> SELECT id, name FROM users (fast)
fields=id,name,stats -> SELECT id, name, ... + join to stats table (slower)
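The mapping from requested fields to SQL might look like the sketch below. The table user_stats, its join condition, and the names USER_COLUMNS, JOINED_FIELDS, and build_user_query are all assumptions invented for this example.

```python
# Hypothetical schema: plain columns on users, plus fields that need a join.
USER_COLUMNS = {"id", "name", "email"}
JOINED_FIELDS = {"stats": "LEFT JOIN user_stats ON user_stats.user_id = users.id"}


def build_user_query(fields_param: str) -> str:
    """Select only the requested columns and add joins only when needed."""
    requested = [f.strip() for f in fields_param.split(",") if f.strip()]
    if not requested:
        raise ValueError("At least one field is required")
    unknown = [f for f in requested if f not in USER_COLUMNS and f not in JOINED_FIELDS]
    if unknown:
        raise ValueError(f"Unknown fields: {', '.join(unknown)}")
    columns = [f"users.{f}" for f in requested if f in USER_COLUMNS]
    joins = [JOINED_FIELDS[f] for f in requested if f in JOINED_FIELDS]
    if joins:
        columns.append("user_stats.*")
    return " ".join([f"SELECT {', '.join(columns)} FROM users", *joins])
```

The join is paid for only by clients that ask for stats; everyone else gets the cheap single-table query.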
Nested Filtering
For APIs with related resources, allow filtering on nested object properties:
GET /api/v1/orders?customer.country=US
GET /api/v1/articles?author.name=John
Implementation Considerations
Nested filtering requires joins, which adds complexity:
GET /api/v1/orders?customer.country=US
SQL: SELECT orders.* FROM orders
JOIN customers ON orders.customer_id = customers.id
WHERE customers.country = 'US'
Limit nesting to one level deep. Deeply nested filters (?order.customer.address.city=London) create complex queries and make the API harder to understand.
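The one-level limit can be enforced while translating the parameter into a join. This is a sketch for the orders example only; the RELATIONS map and nested_filter_sql are hypothetical names.

```python
# Hypothetical relation map for "orders":
# relation name -> (table, join condition, filterable columns).
RELATIONS = {
    "customer": ("customers", "orders.customer_id = customers.id", {"country"}),
}


def nested_filter_sql(param: str, value: str) -> tuple[str, list[str]]:
    """Translate 'customer.country' into a one-level JOIN with a bound value."""
    parts = param.split(".")
    if len(parts) != 2:
        raise ValueError("Nested filters are limited to one level: relation.field")
    relation, field = parts
    if relation not in RELATIONS:
        raise ValueError(f"Unknown relation '{relation}'")
    table, join_on, fields = RELATIONS[relation]
    if field not in fields:
        raise ValueError(f"Cannot filter on '{param}'")
    sql = (
        f"SELECT orders.* FROM orders JOIN {table} ON {join_on} "
        f"WHERE {table}.{field} = ?"
    )
    return sql, [value]
```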
Keep It Simple
The temptation is to build a query language. Resist it unless you are building a query API.
The Slippery Slope
Level 1: ?status=active (good)
Level 2: ?status=active&created_after=2024-01 (good)
Level 3: ?filter[status][eq]=active (getting complex)
Level 4: ?filter=status eq 'active' and (price gt 10 or category in ('a','b')) (OData)
Level 5: Custom query DSL with nested boolean logic (you built a database)
Most APIs should stop at Level 2. If clients need Level 4+ query capabilities, consider GraphQL or a dedicated query endpoint.
When Complex Filtering Is Justified
- Analytics APIs where ad-hoc querying is the core use case
- Search-as-a-service APIs (Algolia, Elasticsearch)
- Data warehouse APIs where users build custom reports
For these, a structured filter object in a POST request body is cleaner than overloading query parameters.
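To make the idea concrete, here is one possible shape for such a filter object together with a tiny in-memory evaluator. The and/or/field/op/value layout is an assumption for this sketch, not an established standard; filter_body encodes the Level 4 expression from the list above.

```python
import operator

# Hypothetical operator names accepted in the filter object.
OPERATORS = {
    "eq": operator.eq,
    "gt": operator.gt,
    "lt": operator.lt,
    "in": lambda actual, allowed: actual in allowed,
}


def matches(record: dict, node: dict) -> bool:
    """Evaluate a nested filter object against a single record."""
    if "and" in node:
        return all(matches(record, child) for child in node["and"])
    if "or" in node:
        return any(matches(record, child) for child in node["or"])
    return OPERATORS[node["op"]](record[node["field"]], node["value"])


# status eq 'active' and (price gt 10 or category in ('a', 'b'))
filter_body = {
    "and": [
        {"field": "status", "op": "eq", "value": "active"},
        {
            "or": [
                {"field": "price", "op": "gt", "value": 10},
                {"field": "category", "op": "in", "value": ["a", "b"]},
            ]
        },
    ]
}
```

Because the structure is plain JSON, it can be validated against a schema before evaluation, which is much harder to do with an expression packed into a query string.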
Common Pitfalls
Allowing filters on unindexed fields. Filtering on a column without an index causes a full table scan. Only expose filters for indexed fields, and document which fields are filterable.
Silently ignoring unknown filter parameters. If a client sends ?stauts=active (typo), silently ignoring it returns all records. Validate parameter names and return 400 for unknown ones.
Not validating filter values. ?price_min=abc should return a 400 error, not a database error or empty results. Validate types before querying.
Building a query language when you do not need one. Simple equality and range filters cover the large majority of use cases. Adding boolean operators, nested conditions, and custom syntax increases complexity for both the server and the client.
Returning all fields by default. If your resource has 50 fields including expensive computed ones, returning everything by default wastes bandwidth and server resources. Consider a reasonable default field set, with opt-in for additional fields.
Making search case-sensitive. Users expect ?q=john to find "John." Use case-insensitive search by default.
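Two of these pitfalls, unknown parameter names and unvalidated values, can be caught by a single validation pass before any query is built. The sketch below assumes a hypothetical PARAM_TYPES schema; a handler would return 400 with the collected messages whenever the error list is non-empty.

```python
from urllib.parse import parse_qsl

# Hypothetical schema: parameter name -> converter that raises ValueError on bad input.
PARAM_TYPES = {"status": str, "price_min": float, "price_max": float, "q": str}


def validate_params(query_string: str) -> tuple[dict, list[str]]:
    """Return (parsed values, error messages) for a raw query string."""
    values, errors = {}, []
    for name, raw in parse_qsl(query_string, keep_blank_values=True):
        converter = PARAM_TYPES.get(name)
        if converter is None:
            errors.append(f"Unknown parameter '{name}'")
            continue
        try:
            values[name] = converter(raw)
        except ValueError:
            errors.append(f"Invalid value for '{name}': {raw!r}")
    return values, errors
```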
Key Takeaways
- Use query parameters for filtering with simple equality, range, and multi-value patterns; follow conventions like Stripe's bracket syntax or suffix-based operators.
- Support sorting with explicit ascending/descending direction; always add a unique tiebreaker for stable pagination.
- Keep search separate from filtering; use the q parameter for text search and dedicated search infrastructure for scale.
- Offer field selection to reduce payload size and server load; map selected fields to the database query when possible.
- Keep the query interface simple for most APIs; reserve complex filter DSLs for analytics or search-focused APIs where ad-hoc querying is the core use case.