Database schema migrations are a major source of outages and engineering pain, as illustrated by incidents at a major Italian bank, Linear, and GitHub. Common pitfalls include difficulty testing outside production, lengthy migration times, lack of reliable rollback, and cascading impact on integrations. Standard mitigation tactics (off-peak windows, schema-as-code tools like Liquibase, shadow tables) help but still leave gaps, especially around renaming columns, backward compatibility with old clients, and zero-downtime guarantees. The post proposes a proxy-based approach: schema changes are applied to a replica, production traffic is shadowed to that replica with queries translated on the fly, and the replica is promoted only after validation. This enables live experiments, instant rollback, and single-step schema changes without downtime.
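
To make the shadowing idea concrete, here is a minimal sketch of translating a query for a renamed column and comparing results between the two databases. It is not the post's implementation. It assumes a single hypothetical rename (`fullname` to `display_name` on a `users` table), uses two in-memory SQLite databases as stand-ins for the production primary and the migrated replica, and substitutes a naive whole-word regex for a real SQL parser. The names `RENAMES`, `translate`, `make_db`, and `shadow` are illustrative only.

```python
import re
import sqlite3

# Hypothetical rename already applied on the replica: fullname -> display_name.
RENAMES = {"fullname": "display_name"}


def translate(sql: str, renames: dict) -> str:
    """Rewrite old column names to new ones using whole-word matching.

    Assumption: a regex stands in for a real SQL parser, so it would also
    rewrite matching words inside string literals.
    """
    for old, new in renames.items():
        sql = re.sub(rf"\b{re.escape(old)}\b", new, sql)
    return sql


def make_db(name_column: str) -> sqlite3.Connection:
    """Create an in-memory users table whose name column is `name_column`."""
    conn = sqlite3.connect(":memory:")
    conn.execute(f"CREATE TABLE users (id INTEGER PRIMARY KEY, {name_column} TEXT)")
    conn.executemany(
        f"INSERT INTO users (id, {name_column}) VALUES (?, ?)",
        [(1, "Ada"), (2, "Linus")],
    )
    return conn


def shadow(sql: str, primary, replica, renames: dict) -> bool:
    """Run `sql` on the primary and its translation on the replica.

    Returns True when both return the same rows, which is the signal a
    validation step could use before promoting the replica.
    """
    expected = primary.execute(sql).fetchall()
    actual = replica.execute(translate(sql, renames)).fetchall()
    return expected == actual


primary = make_db("fullname")      # old schema, serves real traffic
replica = make_db("display_name")  # new schema, receives shadowed traffic

query = "SELECT id, fullname FROM users ORDER BY id"
translated = translate(query, RENAMES)
matches = shadow(query, primary, replica, RENAMES)
```

In this sketch the old client keeps issuing queries against `fullname`, and only the shadow path sees the new column name. A production proxy would also need to map result column names back for old clients and handle writes, which this example leaves out.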

9m read time · From quesma.com
Table of contents
What is schema migration?
Typical migration problems
Typical mitigation risk tactics
State-of-the-art mitigation tactics
Why is even state-of-the-art not good enough, and how can we address it?
Quesma schema migration vision
Resources used
