Database Scaling

A single database server has limits on connections, disk I/O, and CPU. Three techniques push past those limits: read replicas, vertical sharding, and horizontal sharding. Read replicas copy data from the primary asynchronously. Reads go to replicas; writes go to the primary. This works when reads outnumber writes — which describes most web applications. Vertical sharding splits different tables across different databases: users in one, orders in another. Horizontal sharding splits one table across multiple database servers by a shard key. It is powerful but complex — cross-shard joins are impossible, and a bad shard key creates hot spots that defeat the purpose.

Before

Single primary — bottleneck under read load

// All reads and writes hit one server
Application → Postgres Primary (reads + writes)

Problems at scale:
→ CPU saturates on complex read queries
→ Connection pool exhausted
→ One failure domain — no redundancy

After

Primary + read replicas

// Writes → primary only. Reads spread across replicas.
Application → Postgres Primary      (58 writes/sec)
           ↘ Replica 1  (200 reads/sec)
           ↘ Replica 2  (200 reads/sec)
           ↘ Replica 3  (180 reads/sec)

// Rule: always use primary for writes and
//       reads that require the latest data.

Key Takeaway

Add read replicas first — they solve 90% of database scale problems with minimal complexity.