Giant Swarm replaced its custom-built Kubernetes cluster management system with Cluster API and its AWS provider (CAPA), live-migrating hundreds of enterprise AWS production clusters without downtime or data loss. The post details the technical mechanics: a CLI-based migration tool, a two-phase process covering CR migration and node transition,
Table of contents
Where we started: a custom operator stack
Why Cluster API
Choosing to migrate live
The migration mechanics
What we learned
When to replace custom-built with open source
Looking back