Morgan Stanley's platform engineers share a five-year journey scaling Flux-based GitOps across 500+ clusters, 2,000+ nodes, and tens of thousands of Flux resources. Starting from push-based CI/CD pipelines plagued by configuration drift and fragile recovery, they built a self-service onboarding platform leveraging Flux's RBAC-based multi-tenancy. At scale, they tuned reconciliation intervals, controller concurrency, and resource limits, and migrated from a self-hosted Git provider to S3 buckets for high availability. Observability was addressed via centralized Grafana dashboards extended with kube-state-metrics and Flux's Notification Controller. Future plans include Flux sharding, OCI artifacts as the primary source, and progressive delivery with Flagger.

6m read time. From fluxcd.io
Table of contents
The Early Days: Pushing Limits
Step 1: Security and Self-Service
Step 2: Operating at Scale
Step 3: Observability and Feedback Loops
Looking Ahead
Watch the Full Talk
Join Us at FluxCon Europe 2026
