Best of RAGJanuary 2026

  1. 1
    Article
    Avatar of dailydoseofdsDaily Dose of Data Science | Avi Chawla | Substack·19w

    6 Components of Context Engineering

    Context engineering is the practice of optimizing how information flows to AI models, comprising six core components: prompting techniques (few-shot, chain-of-thought), query augmentation (rewriting, expansion, decomposition), long-term memory (vector/graph databases for episodic, semantic, and procedural memory), short-term memory (conversation history management), knowledge base retrieval (RAG pipelines with pre-retrieval, retrieval, and augmentation layers), and tools/agents (single and multi-agent architectures, MCPs). While model selection and prompts contribute only 25% to output quality, the remaining 75% comes from properly engineering these context components to deliver the right information at the right time in the right format.

  2. 2
    Article
    Avatar of netflixNetflix TechBlog·17w

    The AI Evolution of Graph Search

    Netflix evolved their Graph Search platform to support natural language queries by integrating LLMs. The system converts user questions into structured DSL filter statements through a multi-stage process: RAG-based context engineering to identify relevant fields and controlled vocabulary values, LLM-based generation with carefully crafted instructions, and deterministic validation for syntactic and semantic correctness. Key innovations include field and vocabulary RAG to manage context size, UI visualization of generated filters as chips and facets, and @mention functionality for explicit entity selection. This approach bridges the gap between complex federated graph queries and intuitive user intent while maintaining trust through transparency.

  3. 3
    Article
    Avatar of faunFaun·17w

    20 Free & Open-Source AI Tools to Run Production-Grade Agents Without Paying LLM APIs in 2026

    A curated list of 20 open-source tools for running AI agents locally without relying on paid LLM APIs. Covers inference engines (Ollama, vLLM, LiteLLM), agent orchestrators (LangGraph, CrewAI, AutoGen), RAG and vector databases (LlamaIndex, ChromaDB, Qdrant), development tools (Continue.dev, Promptfoo), and multimodal processing (Whisper.cpp, Diffusers). Includes a quickstart stack using Docker and pip for deploying production-grade agents with zero marginal cost after hardware investment.

  4. 4
    Article
    Avatar of heidloffNiklas Heidloff·20w

    From Zero to Agentic Search in 15 Minutes with OpenRAG

    OpenRAG is an IBM-led open-source package that bundles Docling (document text extraction), Langflow (visual flow builder), and OpenSearch (semantic search) to enable rapid development of agentic RAG applications. It supports multiple model providers (OpenAI, Anthropic, watsonx.ai, Ollama) and cloud storage integrations (AWS, Google, Microsoft), allowing developers to build and customize RAG pipelines locally within minutes.

  5. 5
    Article
    Avatar of dailydoseofdsDaily Dose of Data Science | Avi Chawla | Substack·17w

    Your RAG System Has a Hidden UX Problem

    RAG systems often use semantic retrieval but fall back to keyword-based highlighting, creating a UX disconnect where users can't see why documents are relevant. Zilliz released an open-source semantic highlighting model that identifies semantically relevant text spans instead of just keyword matches. The bilingual model (English/Chinese) handles 8K context windows, runs fast enough for production use, and outperforms existing solutions on both in-domain and out-of-domain benchmarks. It's being integrated into Milvus as a native API and is available on Hugging Face under MIT license.

  6. 6
    Article
    Avatar of databricksdatabricks·19w

    How 7‑Eleven Transformed Maintenance Technician Knowledge Access with Databricks Agent Bricks

    7-Eleven built an AI-powered Technician's Maintenance Assistant using Databricks Agent Bricks to help field technicians quickly access equipment documentation. The system uses vector search with embeddings, routing agents for document and image queries, and integrates with Microsoft Teams. By migrating from AWS services to a unified Databricks platform, they reduced response times from minutes to seconds, eliminated manual reindexing, and improved first-time-fix rates while lowering infrastructure overhead.

  7. 7
    Article
    Avatar of weaviateWeaviate·18w

    Announcing the Weaviate C# Client

    Weaviate has released an official C# client library (v1.0.0) for .NET developers. The client features a collection-centric API design, strong typing with generic support, fluent chainable filtering, integrated vector search and RAG capabilities, dependency injection support, and comprehensive error handling. Key features include automatic connection management, type-safe schema generation from C# classes, backup/restore functionality, and seamless integration with modern .NET applications. The library is available via NuGet and aims to provide a native, intuitive experience for building AI-powered applications in the .NET ecosystem.