Slack improved the safety of its Chef infrastructure by splitting a single production environment into six isolated buckets (prod-1 through prod-6) mapped to availability zones and staggering rollouts across them under a release train model. Slack also built Chef Summoner, a service that triggers Chef runs based on S3 signals rather than

15m read time From slack.engineering
Table of contents
Splitting Chef Environments
What changed in the way we trigger Chef?
What’s Next?
