Best of RAG — 2025

1
Article
Daily Dose of Data Science | Avi Chawla | Substack·50w
9 MCP Projects for AI Engineers
A comprehensive collection of 9 Model Context Protocol (MCP) projects designed for AI engineers, covering various applications from local MCP clients and agentic RAG systems to voice agents and synthetic data generators. The projects demonstrate how to integrate MCP with popular tools like Claude Desktop and Cursor IDE, enabling developers to build more sophisticated AI applications with enhanced tool connectivity and context sharing capabilities.
614
12
2
Article
GoPenAI·1y
How to Build a Local RAG with DeepSeek-R1, LangChain, and Ollama (Step-by-Step Guide)
Learn how to build a local Retrieval-Augmented Generation (RAG) system using DeepSeek-R1, LangChain, and Ollama. This guide details the installation, setup, and deployment of a RAG pipeline that processes PDFs locally, ensuring data privacy, cost efficiency, and customizability. The solution utilizes ChromaDB for document retrieval and Streamlit for a user-friendly interface.
589
3
3
Article
Daily Dose of Data Science | Avi Chawla | Substack·1y
9 RAG, LLM, and AI Agent Cheat Sheets
This post provides visual cheat sheets for AI engineers covering various topics, including Transformer vs. Mixture of Experts in LLMs, fine-tuning techniques, RAG vs Agentic RAG, strategies for chunking in RAG, levels of agentic AI systems, and more. These resources are designed to help cultivate essential skills for developing impactful AI and ML systems in the industry.
334
1
4
Article
SwirlAI·1y
The evolution of Modern RAG Architectures.
The post delves into the evolution of Retrieval Augmented Generation (RAG) architectures, discussing their development from Naive RAG to advanced techniques, including Cache Augmented Generation (CAG) and Agentic RAG. It highlights the challenges addressed at each stage, advanced methods to improve accuracy, and the potential future advancements in RAG systems.
308
5
5
Article
Medium·49w
How to Build Production Ready AI Agents in 5 Steps
A comprehensive 5-step guide for building production-ready AI agents, covering Python foundations with FastAPI and async programming, implementing robust testing and logging, mastering RAG for knowledge retrieval, designing scalable agent architectures with frameworks like LangGraph, and establishing continuous monitoring and improvement processes. The guide emphasizes moving beyond prototype demos to create reliable, maintainable systems that can handle real-world production environments.
277
5
6
Article
ByteByteGo·49w
EP167: Top 20 AI Concepts You Should Know
A comprehensive overview of 20 essential AI concepts including machine learning, deep learning, neural networks, NLP, computer vision, and transformers. Also covers the AI application stack for building RAG applications, featuring components like large language models, frameworks, vector databases, data extraction tools, and text embeddings. Additionally includes insights into Shopify's tech stack architecture and job opportunities in AI and software engineering.
204
1
7
Article
SwirlAI·51w
Breaking into AI Engineering in 2025.
A comprehensive roadmap for becoming an AI Engineer in 2025, covering essential skills from Python fundamentals and LLM APIs to advanced topics like AI agents, RAG systems, and observability. The guide emphasizes learning fundamentals while building practical skills, starting with basic LLM integration and progressing through vector databases, prompt engineering, agentic systems, infrastructure deployment, and security considerations. Key recommendations include mastering FastAPI and Pydantic, understanding different LLM types and structured outputs, implementing RAG with proper data preprocessing, and learning agent design patterns like ReAct and task decomposition.
204
2
8
Article
Daily Dose of Data Science | Avi Chawla | Substack·40w
8 RAG Architectures for AI Engineers
Eight different RAG (Retrieval-Augmented Generation) architectures are explained with their specific use cases: Simple Vector RAG for basic semantic matching, Multi-modal RAG for cross-modal retrieval, HyDE for handling dissimilar queries, Self-RAG for validation against trusted sources, Graph RAG for structured relationships, Hybrid RAG combining vector and graph approaches, Adaptive RAG for dynamic query handling, and Agentic RAG for complex workflows with AI agents.
172
2
9
Article
freeCodeCamp·23w
Learn n8n to Design, Develop, and Deploy Production-Grade AI Agents
n8n is an open-source visual workflow automation tool for connecting applications, APIs, and AI models. A comprehensive beginner course covers building practical AI agents including email automation, research workflows with OpenAI and Perplexity, and a customer support RAG agent using vector databases like Pinecone. The training includes advanced topics like modular component patterns, multi-workflow builds for coordinating agent teams, and deployment options including cloud, Docker, and self-hosting with local LLMs like Ollama.
170
2
10
Article
Daily Dose of Data Science | Avi Chawla | Substack·34w
A 100% Open-source Alternative to n8n!
Sim is an open-source drag-and-drop platform for building agentic workflows that runs locally with any LLM. The article demonstrates building a finance assistant connected to Telegram using agents, MCP servers, and APIs. It also covers four RAG indexing strategies: chunk indexing (splitting documents into embedded chunks), sub-chunk indexing (breaking chunks into finer pieces while retrieving larger context), query indexing (generating hypothetical questions for better semantic matching), and summary indexing (using LLM-generated summaries for dense data).
156
7
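A minimal sketch of the sub-chunk indexing strategy described in the summary above: match the query against small sub-chunks, but return the larger parent chunk as context. This is an illustration under stated assumptions, not the article's code. The function names, the chunk sizes, and the word-overlap score (a stand-in for real embedding similarity) are all hypothetical.

```python
def split_words(text, size):
    """Split text into pieces of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def build_sub_chunk_index(document, chunk_size=12, sub_chunk_size=4):
    """Return (chunks, index), where index maps each sub-chunk to its parent chunk id."""
    chunks = split_words(document, chunk_size)
    index = []
    for chunk_id, chunk in enumerate(chunks):
        for sub_chunk in split_words(chunk, sub_chunk_size):
            index.append((sub_chunk, chunk_id))
    return chunks, index


def overlap_score(query, text):
    """Count shared words. Assumption: a real system would compare embeddings instead."""
    return len(set(query.lower().split()) & set(text.lower().split()))


def retrieve(query, chunks, index):
    """Find the best-matching sub-chunk, then return its larger parent chunk."""
    _, parent_id = max(index, key=lambda entry: overlap_score(query, entry[0]))
    return chunks[parent_id]


# Toy document, invented for illustration.
document = (
    "invoices are due within thirty days of the billing date each month "
    "refunds are processed within five business days after approval by finance"
)
chunks, index = build_sub_chunk_index(document)
context = retrieve("refunds processed", chunks, index)
```

Matching on the fine-grained piece keeps the comparison precise, while returning the parent chunk gives the LLM the surrounding context.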
11
Article
Community Picks·1y
Building a Local RAG Chat App with Reflex, LangChain, Huggingface, and Ollama
Learn how to build a privacy-focused RAG-powered chat app using Reflex, LangChain, Hugging Face, FAISS, and Ollama. This step-by-step guide covers setting up a local environment, creating an interactive chat UI, adding embedding search, and integrating a local LLM, which removes cloud dependencies and the need for frontend expertise.
143
12
Article
DigitalOcean Community·46w
LangChain Explained: The Ultimate Framework for Building LLM Applications
LangChain is an open-source Python framework that simplifies building LLM applications by providing standard interfaces for chat models, embeddings, and vector stores. It offers key components like chains for sequential operations, agents for autonomous decision-making, memory for conversation context, tools for external integrations, and vector stores for retrieval-augmented generation. The framework abstracts away complexity when connecting LLMs to external data sources and APIs, making it easier to build chatbots, question-answering systems, and other AI applications without reinventing common functionality.
142
13
Article
Daily Dose of Data Science | Avi Chawla | Substack·36w
The Open-source RAG Stack
A comprehensive guide to building production-ready RAG systems using open-source tools. Covers the complete technology stack from frontend frameworks to data ingestion, including LLM orchestration tools like LangChain and CrewAI, vector databases like Milvus and Chroma, embedding models, and retrieval systems. Also showcases 9 practical MCP (Model Context Protocol) projects for AI engineers, ranging from local MCP clients to voice agents and financial analysts.
135
14
Article
Daily Dose of Data Science | Avi Chawla | Substack·21w
The AI Engineering Guidebook
A comprehensive 350+ page guidebook covering the engineering fundamentals of LLM systems, including model architecture, training, prompt engineering, RAG systems, fine-tuning techniques like LoRA, AI agents, Model Context Protocol, optimization strategies, and deployment considerations. The resource focuses on practical engineering decisions, system design tradeoffs, and real-world implementation patterns rather than surface-level usage.
125
5
15
Article
ByteByteGo·47w
EP169: RAG vs Agentic RAG
RAG (Retrieval Augmented Generation) combines information retrieval with large language models, but traditional RAG has limitations in adaptability and real-time processing. Agentic RAG introduces AI agents that make decisions, select tools, and refine queries for more accurate responses. The issue also covers Kubernetes fundamentals, including control planes, nodes, and key resources like Pods and Deployments. Six space-efficient data structures are highlighted: Bloom Filter, HyperLogLog, Cuckoo Filter, MinHash, SkipList, and Count-Min Sketch. Database normalization forms from 1NF to 4NF are explained for eliminating redundancy and enforcing data integrity.
121
1
16
Article
Machine Learning Mastery·50w
Implementing Vector Search from Scratch: A Step-by-Step Tutorial
A comprehensive tutorial demonstrating how to build a vector search engine from scratch using Python. Covers the three core steps of vector search: converting text to numerical vectors, calculating similarity using cosine similarity, and retrieving the most relevant results. Includes practical code examples with NumPy and Matplotlib, visualization of vector spaces, and explains the connection to RAG systems. The implementation uses simplified word embeddings and averaging techniques to make concepts accessible while maintaining the fundamental principles of semantic search.
118
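The three steps named in the summary above (text to vectors, cosine similarity, top-result retrieval) can be sketched as follows. This is an illustrative sketch, not the tutorial's code: it uses plain Python instead of NumPy so it runs without dependencies, and the tiny 3-dimensional word vectors are invented for the example.

```python
import math

# Hypothetical word vectors, made up for illustration only.
WORD_VECTORS = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.0],
    "pet": [0.7, 0.3, 0.1],
    "car": [0.0, 0.9, 0.2],
    "engine": [0.1, 0.8, 0.3],
    "road": [0.0, 0.7, 0.5],
}


def embed(text):
    """Step 1: average the vectors of known words (unknown words are skipped)."""
    vectors = [WORD_VECTORS[w] for w in text.lower().split() if w in WORD_VECTORS]
    if not vectors:
        return [0.0, 0.0, 0.0]
    return [sum(dim) / len(vectors) for dim in zip(*vectors)]


def cosine_similarity(a, b):
    """Step 2: cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query, documents, top_k=2):
    """Step 3: rank documents by similarity to the query and keep the top_k."""
    query_vec = embed(query)
    scored = [(cosine_similarity(query_vec, embed(doc)), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


documents = ["cat dog pet", "car engine road", "dog car"]
results = search("pet cat", documents)
```

In a RAG system the retrieved documents would then be passed to the LLM as context. Real embeddings come from a trained model rather than a hand-written table.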
17
Article
Hacker News·47w
The New Skill in AI is Not Prompting, It's Context Engineering
Context Engineering emerges as a more comprehensive approach than prompt engineering for building effective AI agents. Rather than focusing solely on crafting perfect prompts, it involves designing dynamic systems that provide LLMs with the right information, tools, and format at the right time. The concept encompasses system prompts, user inputs, conversation history, long-term memory, retrieved information (RAG), available tools, and structured outputs. The key difference between basic and sophisticated AI agents lies not in code complexity but in context quality - successful agents gather comprehensive contextual information before generating responses, while failures often stem from inadequate context rather than model limitations.
110
2
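As a rough sketch of the idea in the summary above, the context components it lists can be gathered in one structure and rendered into the text an LLM receives. The `AgentContext` class, its field names, and the section layout are assumptions made for illustration and do not come from the article.

```python
from dataclasses import dataclass, field


@dataclass
class AgentContext:
    """Hypothetical container for the context components listed in the summary."""
    system_prompt: str
    user_input: str
    history: list = field(default_factory=list)
    long_term_memory: list = field(default_factory=list)
    retrieved_docs: list = field(default_factory=list)
    tools: list = field(default_factory=list)


def render_context(ctx):
    """Join the non-empty components into one prompt, ending with the user input."""
    sections = [
        ("System", [ctx.system_prompt]),
        ("Available tools", ctx.tools),
        ("Long-term memory", ctx.long_term_memory),
        ("Retrieved information", ctx.retrieved_docs),
        ("Conversation history", ctx.history),
        ("User", [ctx.user_input]),
    ]
    parts = []
    for title, items in sections:
        if items:  # skip components that have nothing to contribute
            parts.append(f"## {title}\n" + "\n".join(items))
    return "\n\n".join(parts)


ctx = AgentContext(
    system_prompt="You are a scheduling assistant.",
    user_input="Can we meet tomorrow?",
    retrieved_docs=["Calendar: tomorrow 10:00-11:00 is free."],
    tools=["send_invite(email, time)"],
)
prompt = render_context(ctx)
```

The point of the design is that the prompt is assembled dynamically from whatever information is available at request time, rather than written once by hand.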
18
Article
Javarevisited·44w
Top 5 Books to Learn LLMs (Large Language Models) in Depth
A curated list of five essential books for learning Large Language Models in depth, covering everything from basic engineering concepts to production deployment. The recommendations include practical guides for building LLM applications, training models from scratch, and deploying them at scale. Each book targets different aspects of LLM development, from foundational architecture and prompt engineering to production monitoring and evaluation strategies.
109
3
19
Article
BigData Boutique blog·1y
Building RAG Apps Without the Bloat: Meet Shraga
Discover Shraga, an open-source framework designed by BigData Boutique to simplify and scale GenAI applications without the overhead of popular RAG frameworks. Shraga offers minimal boilerplate, supports multiple LLM providers, and provides reusable utilities for EDA, data cleaning, and embedding. It features modular flows for easy debugging and a FastAPI layer for quick deployments. Essential components like DocHandler and BaseEmbedder ensure efficient data ingestion and embedding, while the Shraga-UI provides a ready-to-use chat interface. This setup enables fast prototyping and production-ready GenAI systems.
107
20
Article
Daily Dose of Data Science | Avi Chawla | Substack·48w
10 MCP, RAG and AI Agents Projects
A curated collection of 10 advanced AI engineering projects covering MCP-powered applications, RAG systems, and AI agents. Projects include video RAG with exact timestamp retrieval, corrective RAG with self-assessment, multi-agent flight booking systems, voice-enabled RAG agents, and local alternatives to ChatGPT's research features. The repository contains 70+ hands-on tutorials focusing on real-world implementations of LLMs, memory-enabled agents, multimodal document processing, and performance optimization techniques like binary quantization for 40x faster RAG systems.
99
21
Article
freeCodeCamp·44w
How AI Agents Remember Things: The Role of Vector Stores in LLM Memory
Large language models don't have inherent memory, but vector stores enable AI agents to simulate memory by converting text into numerical embeddings and storing them in specialized databases. When users interact with AI, the system searches for semantically similar stored vectors to retrieve relevant past information. Popular vector databases include FAISS for local deployments and Pinecone for cloud-based solutions. This approach, called retrieval-augmented generation (RAG), allows AI to appear contextually aware despite technical limitations around similarity-based matching and static embeddings.
98
22
Video
YouTube·1y
I Built the ULTIMATE n8n RAG AI Agent Template
The video discusses the implementation and enhancement of a Retrieval-Augmented Generation (RAG) AI agent template using n8n, a no-code tool. The author explains the limitations of typical RAG setups, particularly in handling context and data analysis, and introduces an improved agentic RAG solution. The enhanced RAG agent can reason about how to explore knowledge bases, handle different file types, and execute complex queries. The video also includes a guide on setting up the workflow in n8n and integrating tools like Google Drive and Supabase for better data management.
94
23
Article
Daily Dose of Data Science | Avi Chawla | Substack·1y
[Hands-on] RAG Over GitHub Repos
Ragie Connect offers a comprehensive infrastructure for building RAG applications over user data by handling authentication, authorization, and syncing from sources like Google Drive and Salesforce. This guide demonstrates how to create a RAG app over GitHub repositories using GitIngest to parse the repo and Llama-3.2 as the LLM. The process involves parsing the GitHub repo, setting up the LLM, embedding the data, and creating an index for interaction, with a neat interface and a promise for more advanced techniques in future guides.
83
1
24
Article
SingleStore·45w
How to Build a RAG Knowledge Base in Python for Customer Support
A comprehensive guide to building a Retrieval-Augmented Generation (RAG) system for customer support using Python, LangChain, OpenAI, and SingleStore. The tutorial covers setting up a vector database, converting documents into embeddings, implementing semantic search, and generating contextual answers. Real-world case studies show a 28.6% reduction in issue resolution time. The step-by-step implementation includes environment setup, database configuration, embedding creation, and API endpoint development for instant, accurate support responses.
81
25
Article
Machine Learning Mastery·47w
5 Advanced RAG Architectures Beyond Traditional Methods
Five advanced RAG architectures that go beyond traditional retrieval-generation pipelines: Dual-Encoder Multi-Hop Retrieval breaks down complex queries into layered searches; Context-Aware Feedback Loops enable iterative self-improvement through confidence evaluation; Modular Memory-Augmented RAG maintains persistent, contextual memory across sessions; Agentic RAG integrates tool usage for active reasoning and real-time data processing; and Graph-Structured Context Retrieval uses knowledge graphs to find interconnected information rather than simple similarity matches.
78
