Wix Engineering describes how they built DB Mover, an internal Python microservice for zero-downtime database migrations between MySQL clusters at scale. The system uses Debezium for CDC streaming, Amazon MSK (Kafka) for event transport with Avro serialization, and a staged cutover process (read-only first, then writes) to eliminate downtime. Key design decisions include always reading from replica nodes, fixing Kafka partition counts before migration starts, using Avro over JSON for binary efficiency, and keeping the entire process transparent to application teams with no code changes required. The post also covers why alternatives like MySQL Dump, Amazon DMS, and multi-source replication were rejected, and outlines future improvements including multi-region MSK and parallel Debezium snapshots.
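One plausible reason for fixing the partition count up front (the summary does not give the post's own rationale) is that Kafka assigns a keyed record to a partition by hashing the key modulo the number of partitions. If the count changes mid-migration, later events for a row can land on a different partition than earlier ones, and per-row ordering is no longer guaranteed. The sketch below illustrates the effect. The key names and partition counts are hypothetical, and `zlib.crc32` stands in for the murmur2 hash that Kafka's default partitioner uses.

```python
import zlib


def partition_for(key: bytes, num_partitions: int) -> int:
    """Map a record key to a partition by hashing modulo the partition count.

    crc32 is a stand-in hash for illustration; Kafka's default partitioner
    uses murmur2, but the modulo step behaves the same way.
    """
    return zlib.crc32(key) % num_partitions


# Hypothetical change-event keys, e.g. one per primary key of a migrated table.
keys = [f"orders:{i}".encode() for i in range(1000)]

# Partition assignment before and after a hypothetical resize from 6 to 8.
before = {k: partition_for(k, 6) for k in keys}
after = {k: partition_for(k, 8) for k in keys}

# Count keys whose events would now be routed to a different partition.
moved = sum(1 for k in keys if before[k] != after[k])
print(f"{moved} of {len(keys)} keys changed partition after the resize")
```

With a fixed partition count, every event for a given key maps to the same partition for the whole migration, so a consumer applying changes to the target cluster sees each row's events in order.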
Table of contents
The Challenge
DB Topology
Good Tools, Doesn't Fit
How DB Mover Works
Debezium Connector
Amazon MSK
The Python Service
The Result
Summary: 6 Key Focal Points