Meilisearch's CTO details how they patched LMDB to support nested read transactions on uncommitted writes, eliminating full database scans in their HNSW-based vector store. The previous approach used a full scan to collect memory-mapped pointers into a HashMap for multi-threaded access. Two key optimizations were made: removing unnecessary mean-degree stat computation (cutting 559s to ~2-3s), and replacing the ImmutableItems/ImmutableLinks full-scan approach with a new FrozenReader that lazily fetches graph nodes via nested read transactions distributed across rayon threads using thread-local storage. The LMDB patch required implementing an atomically reference-counted system in C11 to allow multiple concurrent nested read-only transactions from a parent write transaction. The result: inserting embeddings went from ~6/sec to ~20/sec, a 3x+ improvement, while also improving disk cache efficiency by loading only necessary pages.
Table of contents
Reminder of the factsStop computing useless statsFull scans are harmfulBut what's the new trick?Results: over 3x fasterSort: