When to Shard
Partitioning divides data within a single database. Sharding distributes data across multiple databases or servers. Sharding solves problems that a single PostgreSQL server cannot — but it introduces complexity that most teams underestimate.
Partitioning vs Sharding
| Aspect | Partitioning | Sharding |
|---|---|---|
| Data location | One database | Multiple databases/servers |
| Transparency | Fully transparent to queries | Requires routing logic |
| Cross-shard queries | All data is local | Requires scatter-gather or federation |
| Transactions | Standard ACID | Distributed transactions (hard) |
| Operational complexity | Moderate | High |
| When to use | Large tables, archival | Single server is not enough |
Partitioning is a database feature. Sharding is an architectural decision.
When a Single Server Is Not Enough
A well-configured PostgreSQL server can handle more than most people think:
- Storage: multi-terabyte databases are routine on modern SSDs.
- Read throughput: streaming replicas scale reads horizontally.
- Write throughput: a single server on good hardware can handle tens of thousands of write transactions per second.
- Connections: PgBouncer handles thousands of application connections with a small pool of server connections.
You need sharding when:
- Write throughput exceeds what one server can handle. Replicas only help with reads. If you are saturating the primary's I/O or CPU with writes, you need to split the writes.
- Data volume exceeds practical single-server storage. When VACUUM, backups, and recovery take unacceptably long on the full dataset.
- Regulatory or latency requirements demand data locality. Data must reside in specific geographic regions.
Most applications reach these limits later than expected. Optimize the single server first: indexes, query tuning, connection pooling, partitioning, hardware upgrades. Each step is cheaper and simpler than sharding.
Citus: Transparent Sharding
Citus is a PostgreSQL extension (now part of Microsoft) that adds distributed query execution to PostgreSQL. It makes sharding look like regular PostgreSQL.
How Citus Works
- One coordinator node receives queries.
- Multiple worker nodes store data shards.
- Tables are distributed across workers by a distribution column (shard key).
- The coordinator routes queries to the relevant workers and merges results.
Setting Up Distributed Tables
-- On the coordinator
CREATE EXTENSION citus;
-- Add worker nodes
SELECT citus_add_node('worker1', 5432);
SELECT citus_add_node('worker2', 5432);
-- Create a regular table
CREATE TABLE orders (
id bigserial,
tenant_id int NOT NULL,
customer_id int NOT NULL,
amount numeric NOT NULL,
created_at timestamptz NOT NULL
);
-- Distribute it by tenant_id
SELECT create_distributed_table('orders', 'tenant_id');
Queries That Work Well
Queries that include the distribution column are routed to a single shard:
-- Routed to one worker (fast)
SELECT * FROM orders WHERE tenant_id = 42 AND created_at > '2024-01-01';
-- Join on the distribution column (co-located join, fast)
SELECT o.*, oi.*
FROM orders o
JOIN order_items oi ON oi.tenant_id = o.tenant_id AND oi.order_id = o.id
WHERE o.tenant_id = 42;
Queries That Are Expensive
Queries without the distribution column require scatter-gather across all shards:
-- Scatter-gather: hits every worker (slow)
SELECT * FROM orders WHERE customer_id = 99;
-- Cross-shard join (very expensive)
SELECT * FROM orders o
JOIN customers c ON c.id = o.customer_id;
Reference Tables
Small tables that every shard needs (e.g., lookup tables) are replicated to every worker:
SELECT create_reference_table('countries');
Reference tables can be joined with distributed tables without cross-shard communication.
Application-Level Sharding
Instead of using an extension, route queries in the application. Each shard is a separate PostgreSQL database.
The Routing Layer
Application
|
v
Shard Router (application code or middleware)
|
+-- Shard 0: PostgreSQL on host-0 (tenant_id % 4 == 0)
+-- Shard 1: PostgreSQL on host-1 (tenant_id % 4 == 1)
+-- Shard 2: PostgreSQL on host-2 (tenant_id % 4 == 2)
+-- Shard 3: PostgreSQL on host-3 (tenant_id % 4 == 3)
Implementing the Router
-- Pseudocode for the routing function
-- shard_id = tenant_id % num_shards
-- connection = connection_pool[shard_id]
-- result = connection.execute(query)
The application must include the shard key in every query. Cross-shard queries require querying multiple databases and merging results in the application.
Pros & Cons
Pros:
- No extensions or special PostgreSQL builds.
- Full control over routing logic.
- Each shard is a standard PostgreSQL database — standard tooling, backups, monitoring.
Cons:
- Cross-shard queries are the application's problem.
- Schema migrations must be applied to every shard.
- Rebalancing shards (moving data between them) is a major operation.
- No distributed transactions without additional tooling.
Tenant-Based Sharding for Multi-Tenant SaaS
The most natural sharding strategy for SaaS applications. Each tenant's data lives on a specific shard. Since tenants rarely need to interact with each other's data, cross-shard queries are rare.
Schema Design
-- Every table includes tenant_id
CREATE TABLE orders (
id bigserial,
tenant_id int NOT NULL,
customer_id int NOT NULL,
amount numeric NOT NULL,
created_at timestamptz NOT NULL
);
-- All indexes include tenant_id
CREATE INDEX idx_orders_tenant_customer
ON orders (tenant_id, customer_id);
-- All queries include tenant_id
SELECT * FROM orders WHERE tenant_id = 42 AND status = 'pending';
Shard Mapping
-- A central mapping table (on a metadata database)
CREATE TABLE shard_mapping (
tenant_id int PRIMARY KEY,
shard_id int NOT NULL
);
-- Application looks up the shard on login or request
SELECT shard_id FROM shard_mapping WHERE tenant_id = 42;
-- Then routes all queries for that request to the correct shard
Handling Large Tenants
Some tenants may have 100x the data of others. Options:
- Dedicated shard: give large tenants their own shard.
- Rebalancing: move tenants between shards to even out the load.
- Hybrid: keep small tenants together, isolate large ones.
Rebalancing Shards
Moving data between shards is operationally difficult. Plan for it from the start.
With Citus
-- Citus can rebalance shards across workers
SELECT citus_rebalance_start();
Citus handles the data movement and query routing during rebalancing.
With Application-Level Sharding
- Create the tenant's tables on the new shard.
- Start dual-writing (write to both old and new shard).
- Copy historical data from the old shard to the new shard.
- Verify data consistency.
- Update the shard mapping to point to the new shard.
- Stop writing to the old shard.
- Clean up old data.
This is a multi-step process that requires careful coordination and is a common source of outages if done incorrectly.
The Complexity Cost
Sharding affects every part of your system:
Development
- Every query must include the shard key.
- JOINs across shards are not possible (or require scatter-gather).
- Unique constraints are per-shard, not global. Use UUIDs for global uniqueness.
- Aggregations across all data require querying every shard.
Operations
- Schema migrations must be applied to every shard. One failure means shards are out of sync.
- Backups and restores are per-shard. Point-in-time recovery across shards is not atomic.
- Monitoring must cover every shard. A problem on one shard affects one set of tenants.
Testing
- You cannot reproduce production behavior with a single-shard test environment.
- Cross-shard edge cases only appear under realistic data distribution.
Decision Framework
Before sharding, exhaust these options in order:
- Optimize queries: indexes, query rewrites, EXPLAIN ANALYZE.
- Upgrade hardware: faster SSDs, more RAM, more CPU cores.
- Add read replicas: scale reads without sharding writes.
- Partition tables: manage large tables within a single server.
- Connection pooling: PgBouncer removes the connection bottleneck.
- Archive old data: reduce the active dataset size.
If you have done all of this and the single server still cannot handle the write throughput or data volume:
- Shard with Citus if you want transparent distributed queries.
- Shard at the application level if you want full control and standard PostgreSQL on each shard.
Common Pitfalls
- Sharding prematurely. The complexity cost is permanent. A single PostgreSQL server handles far more than most applications need. Shard only when you have evidence that a single server is not enough.
- Choosing a bad shard key. The shard key must be present in every query. If you shard by user_id but most queries filter by email, you have a problem. The shard key should be the most common filter in your workload.
- Ignoring the cross-shard query problem. Queries that span shards are orders of magnitude slower. Design your schema and application to minimize them.
- Not planning for rebalancing. Shards will become uneven. If you have no way to move tenants between shards, you end up with hot shards and cold shards.
- Underestimating the migration effort. Moving from a single database to a sharded architecture touches every layer of the application. Budget months, not weeks.
- Running different schema versions across shards. Apply migrations atomically across all shards or use a migration framework that handles this.
Key Takeaways
- Sharding is for when a single PostgreSQL server cannot handle the write throughput or data volume. This happens later than most teams expect.
- Citus provides transparent sharding with distributed queries. Application-level sharding gives full control but requires more development effort.
- Tenant-based sharding is the natural fit for multi-tenant SaaS — tenant_id is the shard key, and cross-tenant queries are rare.
- The complexity cost of sharding is high and permanent: cross-shard queries, distributed migrations, per-shard operations, and rebalancing.
- Exhaust single-server optimizations before sharding: indexes, hardware, replicas, partitioning, connection pooling, and archival.
- If you must shard, choose the shard key carefully — it determines which queries are fast and which are slow.