Hybrid search in RAG pipelines combines semantic similarity search with keyword-based BM25 search to improve retrieval quality. BM25 (Best Matching 25) is a bag-of-words ranking function built on TF-IDF that scores documents by keyword frequency relative to a query. Unlike plain TF-IDF, BM25 addresses two weaknesses: it applies
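
As an illustration of the keyword side of hybrid search, here is a minimal sketch of the standard Okapi BM25 scoring formula in plain Python. It is not taken from the article: the function name `bm25_scores`, the toy documents, and the parameter values `k1=1.5` and `b=0.75` are assumptions chosen for this example, and the IDF uses the smoothed, non-negative variant that Lucene also uses.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document in `docs` against the tokenized `query`.

    `bm25_scores`, `k1=1.5`, and `b=0.75` are illustrative assumptions.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: how many documents contain each term at least once.
    df = Counter(term for d in docs for term in set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Smoothed IDF: rarer terms weigh more, and the value stays >= 0.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # k1 controls how quickly repeated terms stop adding score;
            # b controls how strongly longer documents are penalized.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Toy corpus, tokenized by whitespace (an assumption for this sketch).
docs = [
    "hybrid search combines keyword and semantic retrieval".split(),
    "bm25 is a keyword ranking function".split(),
    "cats sleep most of the day".split(),
]
scores = bm25_scores("keyword ranking".split(), docs)
```

In this toy corpus, the second document matches both query terms and should rank first, while the third matches none and scores zero. In a hybrid setup, a ranking like this would be combined with the ranking from a vector similarity search.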

10 min read. From towardsdatascience.com
Table of contents
Starting simple with TF-IDF
Understanding BM25 score
RAG with Hybrid Search
On my mind
