A real-world bug report about a customer-support RAG agent returning stale policy documents reveals a fundamental architectural flaw: vector similarity alone cannot enforce recency, tenant isolation, or document scope. The solution is hybrid search — combining vector distance functions with SQL WHERE clauses, JOINs, and GROUP BY in a single database query. Three concrete SQL patterns are shown: recency filtering (pruning stale docs), tenant isolation via permission table joins (eliminating cross-tenant data leaks), and category ranking with aggregation. Benchmarks on a 10M-row corpus show hybrid search improves Recall@5 from 72% to 94%, Precision@5 from 58% to 87%, reduces stale doc rate from 23% to under 1%, and eliminates cross-tenant leaks entirely, at a latency cost of only 15–30ms. The post also critiques the 'vector sidecar' anti-pattern of running a separate vector database alongside a primary relational DB, arguing that native vector support in a single database (like TiDB) eliminates sync pipelines, consistency windows, and operational complexity.
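To make the recency and tenant-isolation patterns concrete, here is a minimal sketch of one query that combines a vector distance with a permission JOIN and a WHERE clause. It is not the post's own code. It uses an in-memory SQLite database as a stand-in for a database with native vector support, and everything named here is an assumption made for illustration: the `docs` and `permissions` tables, the sample rows, and the user-defined `cosine_distance` function.

```python
import json
import math
import sqlite3


def cosine_distance(a_json, b_json):
    """Return 1 - cosine similarity for two JSON-encoded vectors."""
    a, b = json.loads(a_json), json.loads(b_json)
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norms


def build_demo_db():
    """Create a tiny hypothetical corpus; the schema is an assumption."""
    conn = sqlite3.connect(":memory:")
    # SQLite has no vector type, so the distance is registered as a SQL function.
    conn.create_function("cosine_distance", 2, cosine_distance)
    conn.executescript(
        """
        CREATE TABLE docs (
            id INTEGER PRIMARY KEY,
            tenant_id TEXT,
            title TEXT,
            updated_at TEXT,   -- ISO dates compare correctly as strings
            embedding TEXT     -- JSON-encoded vector
        );
        CREATE TABLE permissions (user_id TEXT, tenant_id TEXT);
        """
    )
    rows = [
        (1, "acme", "Refund policy 2022", "2022-01-10", [0.9, 0.1]),
        (2, "acme", "Refund policy 2025", "2025-03-01", [0.8, 0.2]),
        (3, "globex", "Globex refund policy", "2025-04-01", [0.9, 0.1]),
        (4, "acme", "Shipping FAQ", "2025-02-01", [0.1, 0.9]),
    ]
    conn.executemany(
        "INSERT INTO docs VALUES (?, ?, ?, ?, ?)",
        [(i, t, title, day, json.dumps(vec)) for i, t, title, day, vec in rows],
    )
    conn.execute("INSERT INTO permissions VALUES ('alice', 'acme')")
    return conn


def vector_only(conn, query_vec, k=2):
    """Rank by similarity alone: no recency or tenant constraints."""
    sql = """
        SELECT title FROM docs
        ORDER BY cosine_distance(embedding, ?), id
        LIMIT ?
    """
    return [r[0] for r in conn.execute(sql, (json.dumps(query_vec), k))]


def hybrid(conn, query_vec, user_id, since, k=2):
    """Filter by permissions and recency first, then rank by similarity."""
    sql = """
        SELECT d.title
        FROM docs AS d
        JOIN permissions AS p ON p.tenant_id = d.tenant_id
        WHERE p.user_id = ? AND d.updated_at >= ?
        ORDER BY cosine_distance(d.embedding, ?), d.id
        LIMIT ?
    """
    params = (user_id, since, json.dumps(query_vec), k)
    return [r[0] for r in conn.execute(sql, params)]
```

On this toy data, `vector_only(conn, [0.9, 0.1])` returns the stale 2022 policy and another tenant's document, because both are the nearest vectors. `hybrid(conn, [0.9, 0.1], "alice", "2024-01-01")` returns only current documents that the user's tenant is allowed to see. In a database with native vector support, the user-defined function would be replaced by the engine's built-in distance function and a vector index.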
Table of contents
And how to fix it with hybrid search
The gap nobody talks about
What I mean by hybrid search
What the numbers look like
The “vector sidecar” anti-pattern
Why SQL compatibility matters here
When you don’t need hybrid search
The middle layer