Best of RAG — January 2026

1
Article
Daily Dose of Data Science | Avi Chawla | Substack·19w
6 Components of Context Engineering
Context engineering is the practice of optimizing how information flows to AI models, comprising six core components: prompting techniques (few-shot, chain-of-thought), query augmentation (rewriting, expansion, decomposition), long-term memory (vector/graph databases for episodic, semantic, and procedural memory), short-term memory (conversation history management), knowledge base retrieval (RAG pipelines with pre-retrieval, retrieval, and augmentation layers), and tools/agents (single and multi-agent architectures, MCPs). While model selection and prompts contribute only 25% to output quality, the remaining 75% comes from properly engineering these context components to deliver the right information at the right time in the right format.
87
3
2
Article
Netflix TechBlog·17w
The AI Evolution of Graph Search
Netflix evolved their Graph Search platform to support natural language queries by integrating LLMs. The system converts user questions into structured DSL filter statements through a multi-stage process: RAG-based context engineering to identify relevant fields and controlled vocabulary values, LLM-based generation with carefully crafted instructions, and deterministic validation for syntactic and semantic correctness. Key innovations include field and vocabulary RAG to manage context size, UI visualization of generated filters as chips and facets, and @mention functionality for explicit entity selection. This approach bridges the gap between complex federated graph queries and intuitive user intent while maintaining trust through transparency.
64
3
Article
Faun·17w
20 Free & Open-Source AI Tools to Run Production-Grade Agents Without Paying LLM APIs in 2026
A curated list of 20 open-source tools for running AI agents locally without relying on paid LLM APIs. Covers inference engines (Ollama, vLLM, LiteLLM), agent orchestrators (LangGraph, CrewAI, AutoGen), RAG and vector databases (LlamaIndex, ChromaDB, Qdrant), development tools (Continue.dev, Promptfoo), and multimodal processing (Whisper.cpp, Diffusers). Includes a quickstart stack using Docker and pip for deploying production-grade agents with zero marginal cost after hardware investment.
53
2
4
Article
Niklas Heidloff·20w
From Zero to Agentic Search in 15 Minutes with OpenRAG
OpenRAG is an IBM-led open-source package that bundles Docling (document text extraction), Langflow (visual flow builder), and OpenSearch (semantic search) to enable rapid development of agentic RAG applications. It supports multiple model providers (OpenAI, Anthropic, watsonx.ai, Ollama) and cloud storage integrations (AWS, Google, Microsoft), allowing developers to build and customize RAG pipelines locally within minutes.
48
2
5
Article
Daily Dose of Data Science | Avi Chawla | Substack·17w
Your RAG System Has a Hidden UX Problem
RAG systems often use semantic retrieval but fall back to keyword-based highlighting, creating a UX disconnect where users can't see why documents are relevant. Zilliz released an open-source semantic highlighting model that identifies semantically relevant text spans instead of just keyword matches. The bilingual model (English/Chinese) handles 8K context windows, runs fast enough for production use, and outperforms existing solutions on both in-domain and out-of-domain benchmarks. It's being integrated into Milvus as a native API and is available on Hugging Face under MIT license.
26
6
Article
databricks·19w
How 7‑Eleven Transformed Maintenance Technician Knowledge Access with Databricks Agent Bricks
7-Eleven built an AI-powered Technician's Maintenance Assistant using Databricks Agent Bricks to help field technicians quickly access equipment documentation. The system uses vector search with embeddings, routing agents for document and image queries, and integrates with Microsoft Teams. By migrating from AWS services to a unified Databricks platform, they reduced response times from minutes to seconds, eliminated manual reindexing, and improved first-time-fix rates while lowering infrastructure overhead.
23
7
Article
Weaviate·18w
Announcing the Weaviate C# Client
Weaviate has released an official C# client library (v1.0.0) for .NET developers. The client features a collection-centric API design, strong typing with generic support, fluent chainable filtering, integrated vector search and RAG capabilities, dependency injection support, and comprehensive error handling. Key features include automatic connection management, type-safe schema generation from C# classes, backup/restore functionality, and seamless integration with modern .NET applications. The library is available via NuGet and aims to provide a native, intuitive experience for building AI-powered applications in the .NET ecosystem.
19

See all RAG archives