A practical guide to using bigram_index to accelerate phrase queries in Manticore Search, with clear explanations of the all, first_freq, and both_freq modes and a reproducible manticore-load benchmark.


A practical guide to using bigram_index in Manticore Search to accelerate phrase queries. It covers all four modes: the default (no bigrams), all (every adjacent token pair is indexed), first_freq (a pair is stored when its first token is in a frequent-word list), and both_freq (a pair is stored only when both tokens are frequent). In a 1M-document benchmark, bigram_index='all' delivers roughly 2.9x higher QPS and roughly 3.2x lower latency for phrase queries, at the cost of slower indexing (~17k docs/sec versus ~45k docs/sec for the baseline). The guide includes reproducible manticore-load benchmark commands and guidance on choosing the right mode for different workloads.
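To make the difference between the modes concrete, here is a minimal Python sketch of which adjacent token pairs each mode would keep, following the descriptions above. It is an illustrative model only, not Manticore's implementation: the function name `stored_bigrams`, its parameters, and the sample frequent-word set are all hypothetical, and `"none"` is used here to stand for the default no-bigram mode.

```python
from typing import List, Set, Tuple


def stored_bigrams(tokens: List[str], frequent: Set[str], mode: str) -> List[Tuple[str, str]]:
    """Return the adjacent token pairs a given mode would keep (illustrative model)."""
    # Every adjacent pair of tokens, in document order.
    pairs = list(zip(tokens, tokens[1:]))
    if mode == "none":
        # Default: no bigrams are stored.
        return []
    if mode == "all":
        # Every adjacent pair is stored.
        return pairs
    if mode == "first_freq":
        # Keep a pair only when its first token is in the frequent-word list.
        return [(a, b) for a, b in pairs if a in frequent]
    if mode == "both_freq":
        # Keep a pair only when both tokens are in the frequent-word list.
        return [(a, b) for a, b in pairs if a in frequent and b in frequent]
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    # Hypothetical sample data, assumed already tokenized.
    tokens = "the quick fox of the forest".split()
    frequent = {"the", "of"}
    for mode in ("none", "all", "first_freq", "both_freq"):
        print(mode, stored_bigrams(tokens, frequent, mode))
```

With this sample, all keeps five pairs, first_freq keeps the three pairs that start with "the" or "of", and both_freq keeps only ("of", "the"). That narrowing is the trade-off between the modes: fewer stored pairs means less indexing work, but fewer phrase queries that can benefit.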

How to Speed Up Phrase Search with bigram

Important caveat: bigrams work at tokenization level

Which performance mode should you choose?

Benchmark: does bigram indexing really speed up phrase search?