Streaming Replication
Streaming replication sends WAL (Write-Ahead Log) records from a primary server to one or more replicas in real time. The replicas apply these records to maintain an identical copy of the database. This is PostgreSQL's built-in mechanism for read scaling and high availability.
How It Works
- The primary writes WAL records to disk as part of normal operation.
- A WAL sender process on the primary streams these records to each replica.
- A WAL receiver process on the replica receives and writes the records.
- The replica's startup process replays the WAL records to update the data.
This is physical replication — the replica is a byte-for-byte copy of the primary, including all databases, schemas, and system catalogs.
Synchronous vs Asynchronous
Asynchronous Replication (Default)
The primary does not wait for replicas to confirm they received the WAL. This means:
- Zero impact on primary write performance.
- There is a small window where data committed on the primary has not reached the replica. If the primary crashes, that data is lost.
# postgresql.conf on the primary (default — no special setting needed)
synchronous_commit = on # commits wait for local WAL flush, not replica
Synchronous Replication
The primary waits for at least one replica to confirm before reporting a commit as successful.
# postgresql.conf on the primary
synchronous_standby_names = 'replica1'
synchronous_commit = on
Synchronous modes (set via synchronous_commit):
- on: waits for local WAL flush (default).
- remote_write: waits for replica to receive and write WAL to OS cache.
- remote_apply: waits for replica to apply the WAL (strongest — reads on the replica see committed data immediately).
# Strongest synchronous mode
synchronous_commit = remote_apply
synchronous_standby_names = 'FIRST 1 (replica1, replica2)'
Synchronous replication adds latency to every commit (network round trip to the replica). Use it when data loss on failover is unacceptable.
Setting Up Streaming Replication
Step 1: Configure the Primary
# postgresql.conf
wal_level = replica
max_wal_senders = 5
wal_keep_size = 1GB # or use replication slots
# pg_hba.conf — allow the replica to connect for replication
host replication replicator 10.0.1.0/24 scram-sha-256
Create the replication user:
CREATE ROLE replicator WITH REPLICATION LOGIN PASSWORD 'strong_password';
Step 2: Take a Base Backup
On the replica server:
pg_basebackup -h primary-host -U replicator -D /var/lib/postgresql/16/main \
--checkpoint=fast --wal-method=stream -R
The -R flag creates a standby.signal file and writes the connection info to postgresql.auto.conf. This tells PostgreSQL to start in standby mode.
Step 3: Configure the Replica
The -R flag from pg_basebackup handles this automatically. The replica's postgresql.auto.conf will contain:
primary_conninfo = 'host=primary-host user=replicator password=strong_password'
If you need to set it manually:
# postgresql.conf on the replica
hot_standby = on # allow read queries on the replica
Create the standby signal file:
touch /var/lib/postgresql/16/main/standby.signal
Step 4: Start the Replica
Start PostgreSQL on the replica. It will connect to the primary and begin streaming WAL.
Step 5: Verify
On the primary:
SELECT client_addr, state, sent_lsn, write_lsn, flush_lsn, replay_lsn
FROM pg_stat_replication;
client_addr | state | sent_lsn | write_lsn | flush_lsn | replay_lsn
-------------+-----------+------------+------------+------------+------------
10.0.1.5 | streaming | 0/5000060 | 0/5000060 | 0/5000060 | 0/5000060
Replication Slots
Replication slots ensure the primary retains WAL needed by the replica, even if the replica is disconnected.
-- On the primary
SELECT pg_create_physical_replication_slot('replica1_slot');
# On the replica (in postgresql.auto.conf or primary_conninfo)
primary_slot_name = 'replica1_slot'
Warning: If a replica goes down for a long time, the slot causes WAL to accumulate on the primary, potentially filling the disk. Monitor slot lag:
SELECT slot_name,
pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) AS lag_bytes
FROM pg_replication_slots;
Set max_slot_wal_keep_size to limit WAL retention per slot.
Monitoring Replication Lag
Replication lag is the delay between a change on the primary and its availability on the replica.
On the Primary
SELECT client_addr,
state,
pg_wal_lsn_diff(sent_lsn, replay_lsn) AS replay_lag_bytes,
replay_lag
FROM pg_stat_replication;
On the Replica
SELECT now() - pg_last_xact_replay_timestamp() AS replication_delay;
This is an approximation — it measures the time since the last replayed transaction. During quiet periods, it grows even though the replica is caught up.
A more accurate check on the replica:
SELECT pg_is_in_recovery(); -- true if this is a replica
SELECT pg_last_wal_receive_lsn(); -- last WAL position received
SELECT pg_last_wal_replay_lsn(); -- last WAL position applied
SELECT pg_wal_lsn_diff(
pg_last_wal_receive_lsn(),
pg_last_wal_replay_lsn()
) AS apply_lag_bytes;
Read Replicas for Scaling Reads
With hot_standby = on, replicas accept read-only queries. This lets you scale reads by sending SELECT queries to replicas.
-- On the replica: read queries work
SELECT * FROM orders WHERE customer_id = 42;
-- Write queries fail
INSERT INTO orders (customer_id, amount) VALUES (42, 99.99);
-- ERROR: cannot execute INSERT in a read-only transaction
Application-Level Read Routing
Configure your application to send writes to the primary and reads to a replica:
Primary: host=primary-host port=5432 (writes)
Replica: host=replica-host port=5432 (reads)
Be aware of replication lag — a read sent to the replica immediately after a write to the primary may return stale data.
Replica Limitations & Workarounds
The replica is read-only. Some limitations:
- No temp tables by default. Workaround: use CTEs or subqueries instead. PostgreSQL 14+ supports temp tables on standbys with some restrictions.
- No CREATE INDEX. Any schema changes must happen on the primary.
- Queries may be canceled due to recovery conflicts. If the primary vacuums a page that a replica query is reading, the replica must cancel the query or delay WAL replay.
# On the replica: allow queries more time before cancellation
max_standby_streaming_delay = 30s # default is 30s
hot_standby_feedback = on # tell primary about replica queries
hot_standby_feedback prevents the primary from vacuuming rows still needed by replica queries, but it can cause bloat on the primary.
Common Pitfalls
- Not monitoring replication slot WAL retention. A disconnected replica with a slot can fill the primary's disk with accumulated WAL.
- Sending reads to replicas without accounting for lag. If a user creates a record and immediately reads it back from a replica, they may not see it.
- Relying on timestamp-based lag measurement during quiet periods. The pg_last_xact_replay_timestamp() approach shows increasing lag when there are simply no transactions to replay.
- Setting synchronous_commit = remote_apply without understanding the latency impact. Every commit now waits for a network round trip plus replay time.
- Forgetting to configure pg_hba.conf for replication connections. The replica connection will fail with an authentication error.
- Not testing replica promotion. The first time you promote a replica should not be during an actual outage.
Key Takeaways
- Streaming replication sends WAL from primary to replicas continuously. Replicas are byte-for-byte copies of the primary.
- Asynchronous replication has no write performance impact but allows a small data loss window. Synchronous replication eliminates that window at the cost of commit latency.
- Replication slots ensure WAL retention for disconnected replicas but must be monitored to prevent disk exhaustion.
- Read replicas scale read throughput. Account for replication lag in your application.
- Use hot_standby_feedback to reduce query cancellations on replicas, with the trade-off of potential bloat on the primary.
- Always test replica promotion before you need it in production.