Testing distributed systems is challenging due to concurrency and nondeterminism.

Anish Athalye

Testing distributed systems for correctness requires a formal definition of what 'correct' means. Linearizability is introduced as a strong consistency model where every operation appears to execute atomically at some point between its invocation and response. While linearizability checking is NP-complete (reducible from subset sum), it works well in practice on small histories. Ad-hoc testing misses subtle bugs, whereas recording full operation histories and checking them for linearizability catches all correctness violations. The author built Porcupine, a Go-based linearizability checker thousands of times faster than the existing Knossos tool, enabling testing of histories with thousands of events in seconds.

Testing Distributed Systems for Linearizability