Slack reduced customer-impacting incidents by 90% through its Deploy Safety Program, which focused on automated detection and rollback within 10 minutes. The initiative responded to the finding that 73% of customer-facing incidents were triggered by code deployments. Key strategies included metrics-based deployment monitoring, automatic rollbacks, and iterative investment in projects that showed measurable results. The program aimed to maintain development velocity while improving reliability, relying on automated guardrails rather than manual processes. Success required executive alignment, patience with trailing metrics, direct team engagement, and consistent tooling adoption across hundreds of internal services.
Table of contents
Defining the problem
The metric
Which projects to invest in?
Results
Lessons learned
Into the future
Acknowledgements