pgit, a Git-like CLI that stores repository history in PostgreSQL using delta compression, successfully imported the entire Linux kernel history: 1,428,882 commits, 24.4 million file versions, and 20 years of development, in 2 hours on a Hetzner dedicated server. The resulting 6.6 GB PostgreSQL database (2.7 GB of actual data) enables SQL queries across the full history in seconds. The analysis reveals 38,506 unique authors with a 25:1 contributor-to-committer ratio, 90% of commits touching 5 or fewer files, Intel i915 and Btrfs as the most tightly coupled subsystems, David S. Miller merging 7.9% of all commits, and Intel leading corporate contributions. Quirkier findings include 7 f-bombs in commit messages (from 2 people), 665 bug fixes pointing to the initial Git import commit, and bcachefs taking 13 years to merge into mainline.

20m read time. From oseifert.ch
Table of contents
The import
Compression
What 1.4 million commits reveal
Query performance
Links
