Database sharding is a horizontal scaling technique that distributes data across multiple database instances using a shard key. The guide covers the four main sharding strategies—hash-based, range-based, directory-based, and geographic—along with how to pick an effective shard key based on cardinality, distribution, and query alignment. It emphasizes that sharding should be a last resort after exhausting vertical scaling, read replicas, indexing, query optimization, and partitioning. The real cost of sharding lies in operations: cross-shard scatter-gather queries, live rebalancing, coordinated schema migrations across all shards, and per-shard monitoring that multiplies metric cardinality. Database-specific options like Citus for PostgreSQL, Vitess for MySQL, and built-in sharding in MongoDB and distributed SQL databases (CockroachDB, TiDB, YugabyteDB) are also compared.
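As a rough illustration of how a shard key maps a row to a database instance, here is a minimal sketch of two of the strategies named above, hash-based and range-based routing. The function names, the shard count, and the range boundaries are hypothetical and chosen only for this example; they are not taken from the guide or from any particular database.

```python
import bisect
import hashlib


def hash_shard(shard_key: str, num_shards: int) -> int:
    """Hash-based routing: hash the shard key, then take it modulo the shard count.

    A stable hash (SHA-256 here) is used instead of Python's built-in hash(),
    which is randomized per process for strings.
    """
    digest = hashlib.sha256(shard_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def range_shard(shard_key: int, boundaries: list[int]) -> int:
    """Range-based routing: boundaries are sorted, exclusive upper bounds.

    With boundaries [1000, 2000], keys below 1000 go to shard 0,
    keys from 1000 to 1999 go to shard 1, and everything else to shard 2.
    """
    return bisect.bisect_right(boundaries, shard_key)


if __name__ == "__main__":
    # Hypothetical setup: 4 hash shards, 3 range shards.
    print(hash_shard("customer-42", 4))
    print(range_shard(1500, [1000, 2000]))
```

Note that plain modulo hashing remaps most keys when the shard count changes, which is one reason the live rebalancing mentioned above is costly; range-based routing keeps neighboring keys together but can concentrate load on one shard if the key is monotonically increasing.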

9m read time. From last9.io
Table of contents
How Database Sharding Works
Sharding Strategies
Sharding vs Partitioning
Picking a Shard Key
When to Shard (and When Not To)
Operational Cost of Sharding
Database-Specific Sharding Options
Key Takeaways
See Every Database in Your Cloud from One Place
FAQs
