Here's a recent comment on LinkedIn from John Allspaw, on a post by Gandhi Mathi Nathan Kumar about availability. Allspaw's comment is a succinct description of a safety model proposed by the Danish resilience engineering researcher Erik Hollnagel: Safety-II. Hollnagel has described Safety-II in his book Safety-I and Safety-II: The Past and Future of Safety…

Surfing Complexity

John Allspaw's LinkedIn comment prompts a discussion of Erik Hollnagel's Safety-II model and its implications for software reliability. Safety-II reframes reliability not as the absence of failures but as the result of active, everyday work that continuously prevents incidents. Rather than focusing solely on what went wrong during incidents (Safety-I), Safety-II asks organizations to study how normal work consistently goes right and to amplify those practices. The author argues this is a radical shift that cuts against industry intuitions, notes that almost no tech organizations currently practice it, and acknowledges that the resilience-in-software community is trying to push incident analysis in this direction but still has a long way to go.

The normal work of creating reliability