Netflix reduced transient deployment failures from 4% to 0.0001% by migrating cloud operation orchestration from Spinnaker's homegrown system to Temporal's durable execution platform. The original Clouddriver service suffered from complex internal orchestration, instance-local state, and unreliable retry logic. By implementing cloud operations as Temporal workflows with activities, Netflix eliminated tight coupling between services, removed thousands of lines of custom orchestration code, and gained automatic retries, state persistence, and better observability. The migration used abstraction layers and dynamic configuration to transparently onboard all applications within two quarters.
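As a rough illustration of why durable execution helps with transient failures, the sketch below models a workflow whose activities are retried automatically and whose results are persisted, so a re-run replays recorded results instead of repeating completed work. This is a standard-library toy and not Temporal's SDK or Netflix's code: `run_activity`, `TransientError`, `create_server_group`, `attach_load_balancer`, and the JSON history file are all hypothetical names chosen for the example, and a real retry policy would also back off between attempts.

```python
import json
import tempfile
from pathlib import Path


class TransientError(Exception):
    """A failure worth retrying, e.g. a throttled cloud API call (hypothetical)."""


def run_activity(history_path, name, fn, max_attempts=3):
    """Return the recorded result for `name` if present; otherwise run `fn`
    with retries and persist the result before returning it."""
    history = json.loads(history_path.read_text()) if history_path.exists() else {}
    if name in history:
        return history[name]  # replay: the activity already completed
    for attempt in range(1, max_attempts + 1):
        try:
            result = fn()
            break
        except TransientError:
            if attempt == max_attempts:
                raise  # retries exhausted, surface the failure
    history[name] = result
    history_path.write_text(json.dumps(history))  # state outlives this process
    return result


calls = {"create": 0, "attach": 0}  # counts real invocations of each activity


def create_server_group():
    calls["create"] += 1
    if calls["create"] < 3:  # fail twice to simulate a transient error
        raise TransientError("throttled")
    return "sg-v001"


def attach_load_balancer():
    calls["attach"] += 1
    return "attached"


def deploy_workflow(history_path):
    """Orchestration only: each step is delegated to a retried activity."""
    group = run_activity(history_path, "create", create_server_group)
    status = run_activity(history_path, "attach", attach_load_balancer)
    return group, status


history_path = Path(tempfile.mkdtemp()) / "history.json"
first = deploy_workflow(history_path)     # retries the flaky step, records results
replayed = deploy_workflow(history_path)  # served entirely from recorded history
```

In this toy, the second call to `deploy_workflow` invokes neither activity again. That is the property the summary attributes to persisted state: orchestration logic can resume after a crash without custom bookkeeping in the calling service.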

14m read time · From netflixtechblog.com
Table of contents
Temporal: Basic Concepts
Cloud Operations with Temporal
Results and Lessons Learned from the Migration
