The Rails Infrastructure team built an open-source benchmarking toolkit (bundler-perf-toolkit) to reliably measure Bundler performance improvements, achieving a 3x speedup on cold installs for large Gemfiles. The post details the variables affecting bundle install time, how 'cold' and 'warm' cache scenarios were precisely defined, and how the toolkit uses hyperfine, a fake gemserver, and profiling tools like Vernier and Samply. A key lesson: Claude was useful for scaffolding setup scripts and the initial benchmark, but missed a subtle cache isolation bug (wrong BUNDLE_USER_HOME env var) that only domain expertise could catch. The takeaway is that AI can accelerate boilerplate work but cannot replace engineering judgment when numbers don't add up.
Table of contents
What affects bundle install timeWhat we builtHow we defined what to measureUsing the benchmark scriptWhat we learnedSort: