How a Single Line of Code Made a 24-core Server Slower Than a Laptop

Piotr Kołaczkowski wrote a program for a pleasingly parallel problem: each thread does its own independent piece of work, and the threads don't need to coordinate except to join the results at the end.

13 min read time. From pkolaczk.github.io
Table of contents: Rune scripting; Benchmarking the benchmarking program; Running an empty loop on 24 cores; Investigation; The problem; The fix; Final results; Takeaways
