Testing distributed systems for correctness requires a formal definition of what 'correct' means. Linearizability is introduced as a strong consistency model where every operation appears to execute atomically at some point between its invocation and response. While linearizability checking is NP-complete (reducible from subset sum), it works well in practice on small histories. Ad-hoc testing misses subtle bugs, whereas recording full operation histories and checking them for linearizability catches all correctness violations. The author built Porcupine, a Go-based linearizability checker thousands of times faster than the existing Knossos tool, enabling testing of histories with thousands of events in seconds.

10m read timeFrom anishathalye.com
Post cover image
Table of contents
Testing Distributed Systems for LinearizabilityCorrectnessTestingEffectiveness

Sort: