Pagination Strategies

Why Pagination Matters

Returning an entire dataset in a single response is unsustainable. A table with 10 million rows serialized to JSON can produce gigabytes of data, overwhelming both the server and the client. Pagination breaks large datasets into manageable pages, controlling memory usage, response times, and bandwidth.

Offset Pagination

The simplest and most widely understood approach. The client specifies a page number and page size.

How It Works

GET /api/v1/users?page=3&limit=20

Translation: skip the first 40 results (page 1: 0-19, page 2: 20-39), return results 40-59

The server translates this into a database query:

SELECT * FROM users ORDER BY created_at DESC LIMIT 20 OFFSET 40

Response Format

{
  "data": [
    {"id": "user_41", "name": "Alice Chen"},
    {"id": "user_42", "name": "Bob Martinez"}
  ],
  "meta": {
    "page": 3,
    "per_page": 20,
    "total": 1547,
    "total_pages": 78
  },
  "links": {
    "first": "/api/v1/users?page=1&limit=20",
    "prev": "/api/v1/users?page=2&limit=20",
    "next": "/api/v1/users?page=4&limit=20",
    "last": "/api/v1/users?page=78&limit=20"
  }
}

The Problem with Offset Pagination

Offset pagination breaks when the underlying data changes between requests.

Scenario: Inserted rows shift results

Page 1 request: returns users 1-20
New user is inserted at position 5
Page 2 request: returns users 21-40

Problem: user 20 was at position 20, now at position 21. 
User 20 appears on both pages. User 21 is skipped entirely.

Scenario: Deleted rows shift results

Page 1 request: returns users 1-20
User 10 is deleted
Page 2 request: returns users 20-39

Problem: user 21 moved to position 20.
User 21 is skipped. No page ever returns it.

Performance Degrades at Scale

Large offsets are expensive. OFFSET 1000000 in SQL forces the database to scan and discard 1 million rows before returning the next 20. Performance degrades linearly with the page number.

Page 1:      OFFSET 0        -> fast (milliseconds)
Page 100:    OFFSET 1980     -> fine (milliseconds)
Page 10000:  OFFSET 199980   -> slow (seconds)
Page 100000: OFFSET 1999980  -> very slow (may timeout)

When to Use Offset Pagination

Despite its flaws, offset pagination works for:

Small, relatively static datasets (< 10,000 rows)
Admin dashboards where data changes are infrequent
Use cases where "jump to page N" is a requirement
Situations where the total count matters for the UI

WordPress, many traditional web applications, and internal tools use offset pagination because it maps naturally to page number UIs.

Cursor Pagination

Cursor pagination uses an opaque token (the cursor) that points to a specific position in the dataset. The client sends the cursor to get the next page.

How It Works

GET /api/v1/users?limit=20

Response includes a cursor for the next page:
{
  "data": [...20 users...],
  "meta": {
    "has_more": true
  },
  "cursors": {
    "after": "eyJpZCI6InVzZXJfMjAiLCJjcmVhdGVkX2F0IjoiMjAyNC0wMy0yMCJ9"
  }
}

Next request:
GET /api/v1/users?after=eyJpZCI6InVzZXJfMjAiLCJjcmVhdGVkX2F0IjoiMjAyNC0wMy0yMCJ9&limit=20

The cursor is typically a Base64-encoded representation of the last item's sort key:

{
  "id": "user_20",
  "created_at": "2024-03-20T10:30:00Z"
}

Why Cursors Are Stable

Cursors do not use offsets. Instead, the query looks for records after the cursor position:

SELECT * FROM users
WHERE (created_at, id) > ('2024-03-20T10:30:00Z', 'user_20')
ORDER BY created_at DESC, id DESC
LIMIT 20

If new rows are inserted or existing rows are deleted, the cursor still points to the correct position. No items are skipped or duplicated.

Performance Is Consistent

The database uses an index to jump directly to the cursor position. Whether you are on the first page or the ten-thousandth page, the query performance is the same.

Page 1:      WHERE (created_at, id) > (start)   -> fast
Page 100:    WHERE (created_at, id) > (cursor)   -> fast
Page 10000:  WHERE (created_at, id) > (cursor)   -> still fast

Real-World Usage

Stripe uses cursor pagination for all list endpoints:

GET /v1/charges?limit=10&starting_after=ch_abc123

{
  "object": "list",
  "data": [...],
  "has_more": true,
  "url": "/v1/charges"
}

Facebook Graph API uses cursors extensively:

{
  "data": [...],
  "paging": {
    "cursors": {
      "before": "MTAxNTExOTQ1MjAwNzI5NDE=",
      "after": "NDMyNzQyODI3OTQw"
    },
    "next": "https://graph.facebook.com/v19.0/me/posts?after=NDMyNzQyODI3OTQw&limit=25"
  }
}

Twitter API v2 uses pagination tokens:

{
  "data": [...],
  "meta": {
    "next_token": "7140w",
    "result_count": 10
  }
}

Limitations

Cursor pagination does not support "jump to page 5" — the client must traverse pages sequentially. It also cannot easily provide a total count without a separate query.

Keyset Pagination

Keyset pagination is cursor-based but uses the sort key directly instead of an opaque token. The cursor is the value of the sort column for the last item on the current page.

How It Works

GET /api/v1/users?sort=created_at&limit=20

Response: last user has created_at = 2024-03-15T08:00:00Z

Next request:
GET /api/v1/users?sort=created_at&created_after=2024-03-15T08:00:00Z&limit=20

This translates to:

SELECT * FROM users
WHERE created_at > '2024-03-15T08:00:00Z'
ORDER BY created_at ASC
LIMIT 20

Handling Non-Unique Sort Keys

If multiple records share the same sort key value, the keyset query can skip or duplicate records. Solve this by including a unique tiebreaker (usually the primary key):

GET /api/v1/users?created_after=2024-03-15T08:00:00Z&after_id=user_123&limit=20

SELECT * FROM users
WHERE (created_at, id) > ('2024-03-15T08:00:00Z', 'user_123')
ORDER BY created_at ASC, id ASC
LIMIT 20

Keyset vs Opaque Cursor

Keyset pagination exposes the sort values in the URL. Opaque cursors hide them behind Base64 encoding. The tradeoffs:

	Keyset	Opaque Cursor
Transparency	Client sees sort values	Opaque to client
Flexibility	Client can construct arbitrary cursors	Server controls cursor format
Stability	Can break if sort key changes	Resilient to internal changes
Simplicity	No encoding/decoding	Requires encoding logic

Most public APIs prefer opaque cursors because they can change the cursor format without breaking clients.

Choosing the Right Strategy

Criteria	Offset	Cursor	Keyset
Jump to page N	Yes	No	No
Total count	Easy	Separate query	Separate query
Large datasets	Poor performance	Excellent	Excellent
Data consistency	Breaks with changes	Stable	Stable
Real-time data	Problematic	Works well	Works well
Implementation	Simple	Moderate	Moderate

Decision Guide

Use cursor pagination for most APIs, especially those with:

Large datasets (thousands to millions of records)
Frequently changing data (new records, deletions)
Infinite scroll or "load more" UIs
Real-time feeds (activity streams, notifications)

Use offset pagination only when:

The dataset is small and relatively static (< 10,000 rows)
Users need to jump to a specific page
You need a total count for the UI
Backward compatibility requires it

Use keyset pagination when:

You want cursor-like behavior but prefer transparent URLs
The sort key is meaningful to the client
You are building an internal API where cursor format stability is less critical

Common Pitfalls

Using offset pagination for large, dynamic datasets. Performance degrades at high page numbers, and data shifts cause skipped or duplicated items. Switch to cursor pagination.

Exposing internal database IDs as cursors. If your cursor is ?after=42 (a raw database ID), you leak information about your data size and growth rate. Use opaque, encoded cursors.

Not enforcing a maximum page size. A client requesting limit=1000000 can bring down your server. Always cap page size.

Computing total count on every request. COUNT(*) on a million-row table adds hundreds of milliseconds. Make it opt-in or return estimates.

Returning a cursor that expires. Cursors should point to a position in the dataset, not depend on server-side state. A cursor that becomes invalid after a timeout forces clients to restart pagination from the beginning.

Ignoring sort stability. If your sort key is not unique (e.g., created_at with duplicate timestamps), pagination can skip or repeat records. Always include a unique tiebreaker column.

Key Takeaways

Offset pagination is simple but breaks under scale and data changes; use it only for small, static datasets or admin interfaces.
Cursor pagination is the best default for most APIs; it is stable, performant, and works correctly with real-time data.
Keyset pagination is a transparent variant of cursor pagination; prefer opaque cursors for public APIs to maintain format flexibility.
Always enforce a maximum page size, use a unique tiebreaker in sort keys, and make total count opt-in to avoid performance problems.
Every pagination response should include enough metadata for the client to know whether more data exists and how to fetch it.