Best of RAG: November 2025

  1.
    Article
    Daily Dose of Data Science | Avi Chawla | Substack · 29w

    RAG vs. CAG, Explained Visually!

    Cache-Augmented Generation (CAG) improves upon traditional RAG by caching static, rarely changing information directly in the model's key-value (KV) cache, while continuing to retrieve dynamic data from vector databases. This hybrid approach reduces redundant fetches, lowers costs, and speeds up inference by separating stable "cold" data (cacheable) from frequently updated "hot" data (retrievable). Providers such as OpenAI and Anthropic already support the technique through the prompt caching features of their APIs.

  2.
    Article
    System Design Newsletter · 29w

    A Beginner’s Field Guide to Large Language Models: From Tokens to Agents

    A comprehensive beginner's guide explaining 33 fundamental LLM concepts without mathematics. Covers core mechanics like tokens, embeddings, and parameters; training processes including pre-training and fine-tuning; interaction patterns through prompts and context windows; architectural extensions like RAG and agentic AI; model types and deployment options; performance measurement through benchmarks and metrics; and common failure modes like hallucination and bias with their mitigation strategies. Emphasizes practical understanding over technical depth to help readers use LLMs effectively and recognize their limitations.

  3.
    Article
    Javarevisited · 28w

    Top 9 Books to Learn RAG and AI Agents in 2025

    A curated collection of 9 technical books for learning Retrieval-Augmented Generation (RAG) and AI agent development. Covers foundational topics like data engineering, statistics, and NLP transformers, then progresses to production-focused ML system design, LLM engineering, and frameworks like LangChain. Emphasizes practical, production-ready knowledge from industry experts including Chip Huyen's works on ML system design and AI engineering, alongside hands-on guides for building and deploying LLM-powered applications.
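
The cold/hot split described in the first entry (RAG vs. CAG) can be sketched roughly as follows. This is a minimal illustration, not the article's implementation: it assumes a cache keyed on an identical prompt prefix, and it simulates that cache with a plain dictionary instead of calling any provider API. The names `_prefix_cache`, `build_prompt`, and `answer` are hypothetical.

```python
import hashlib

# Hypothetical in-memory stand-in for a provider-side prompt cache.
# A real system would hold the model's KV states, not the raw text.
_prefix_cache: dict[str, str] = {}


def build_prompt(cold_docs, hot_docs, question):
    """Place stable context first so the prefix is identical across requests."""
    prefix = "\n\n".join(cold_docs)  # "cold": rarely changes, cacheable
    suffix = "\n\n".join(hot_docs) + "\n\nQuestion: " + question  # "hot": retrieved per request
    return prefix, suffix


def answer(cold_docs, hot_docs, question):
    """Assemble the prompt and report whether the cold prefix was already cached."""
    prefix, suffix = build_prompt(cold_docs, hot_docs, question)
    key = hashlib.sha256(prefix.encode("utf-8")).hexdigest()
    cache_hit = key in _prefix_cache
    if not cache_hit:
        _prefix_cache[key] = prefix  # first request pays the full cost
    return {"cache_hit": cache_hit, "prompt": prefix + "\n\n" + suffix}
```

In this sketch, two requests that share the same cold documents but carry different retrieved documents reuse the cached prefix on the second call, while any change to the cold documents produces a new key and a cache miss. That is the reason for putting stable content at the front of the prompt.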