Form3, a UK payments platform processing billions of pounds annually, built a triple-active multi-cloud architecture spanning AWS, Google Cloud, and Azure in response to UK banking regulators' concerns about cloud concentration risk. Their V2 platform uses independent Kubernetes clusters per cloud connected via private links, NATS JetStream as a cross-cloud message broker, and CockroachDB for distributed storage. Key engineering challenges included a custom DNS pseudo-suffix scheme for cross-cluster CockroachDB bootstrapping, a custom XPDB operator for cross-cluster pod disruption budgets, and a Cluster Lifecycle Operator to manage node pool updates at scale. The architecture proved its value during a major GCP outage, when payments continued uninterrupted. However, when Form3 expanded to the US, customers expected traditional East/West geographic failover rather than multi-cloud redundancy, and the latency of crossing the continent made a CockroachDB quorum spanning both regions impractical. They fell back to an active-standby model with AWS East and GCP West, and are now adding CockroachDB logical replication and NATS stream replication to reduce recovery time. The key lesson: triple-active multi-cloud is powerful but only worthwhile if your market demands it, your budget supports it, and you have a strong platform engineering team.
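To make the quorum point concrete, here is a minimal sketch of why replica distance matters for a consensus-replicated database such as CockroachDB, which replicates via Raft. In Raft-style replication a write commits once a majority of replicas have acknowledged it, so commit latency is roughly the round trip to the slowest follower the leader still needs for that majority. The function name and all round-trip times below are hypothetical illustrations, not Form3 measurements.

```python
def quorum_commit_latency_ms(follower_rtts_ms):
    """Approximate time for a Raft-style leader to commit one write.

    follower_rtts_ms: round-trip times (ms) from the leader to each follower.
    The leader counts as one vote, so it must wait for enough follower
    acknowledgements to reach a majority of all replicas.
    """
    replicas = len(follower_rtts_ms) + 1      # followers plus the leader
    majority = replicas // 2 + 1
    acks_needed = majority - 1                # the leader acknowledges itself
    if acks_needed == 0:
        return 0.0
    # The commit waits for the slowest of the fastest `acks_needed` followers.
    return sorted(follower_rtts_ms)[acks_needed - 1]


# Hypothetical: three replicas in nearby data centres of different clouds.
nearby = quorum_commit_latency_ms([2.0, 3.0])

# Hypothetical: three replicas spread across a continent.
spread = quorum_commit_latency_ms([65.0, 35.0])

print(f"nearby replicas: ~{nearby} ms per commit")
print(f"continent-wide replicas: ~{spread} ms per commit")
```

Under these assumed numbers every write in the spread-out layout pays tens of milliseconds before it can commit, which is the kind of cost that pushes a design toward active-standby with asynchronous replication.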