Yelp's Database Reliability Engineering team upgraded over a thousand Cassandra nodes from version 3.11 to 4.1 with zero downtime. The post covers their upgrade strategy including benchmarking (4% p99 latency improvement, 11% throughput gain, up to 58% p99 reduction on key clusters), compatibility challenges with Stargate proxy
Table of contents
MotivationComponents1. Verifying Public Benchmarks2. Avoid Hard Blocking3. Seamless Upgrade4. Production Qualification Criteria5. Automate As Much As PossibleUpgrade strategyCompatibility ChangesOverall Upgrade ProcessPerformance ImpactSchema DisagreementImproved PerformanceFaster and More Stable RestartsSeamless Upgrade1 Comment
Sort: