Best of RAG — 2024

1
Article
KDnuggets·2y
Free AI Courses from NVIDIA: For All Levels
Free AI courses from NVIDIA are available to help you learn and build AI applications. Topics include Generative AI, building a neural network, augmenting LLMs using Retrieval Augmented Generation (RAG), and building RAG Agents with LLMs.
1.5K
50
2
Article
freeCodeCamp·2y
Learn RAG Fundamentals and Advanced Techniques
Learn about Retrieval-Augmented Generation (RAG) through a comprehensive course by Paulo Dichone on the freeCodeCamp.org YouTube channel. The course covers fundamental concepts, system building, advanced techniques like query expansion, and hands-on projects. By the end, you'll be equipped with the knowledge and skills to build and enhance RAG systems.
147
1
3
Article
Daily Dose of Data Science | Avi Chawla | Substack·1y
A Crash Course on Building RAG Systems – Part 4
Part 4 of the crash course on building RAG systems focuses on implementing RAG on multimodal data, specifically complex documents with tables, texts, and images. This series covers foundational components, evaluation methods, optimization techniques, and handling large data sets, making it highly beginner-friendly. Understanding how to build reliable RAG systems can reduce costs and enhance scalability for enterprises, bypassing the need for fine-tuning large language models (LLMs).
118
4
Article
freeCodeCamp·2y
Mastering RAG from Scratch
Learn how to implement Retrieval-Augmented Generation (RAG) from scratch with an in-depth course on the freeCodeCamp.org YouTube channel. RAG combines retrieval systems with advanced natural language generation and is valuable in chatbot development and other fields.
92
5
Article
Towards AI·2y
The Best Practices of RAG
This post explores the process of retrieval-augmented generation (RAG) and outlines best practices for its components. It discusses query classification, efficient document retrieval, re-ranking for relevance, re-packing into structured formats, and summarization to extract key information. The post also provides a comprehensive evaluation of these practices and concludes with insights and recommendations.
90
3
6
Article
Machine Learning News·2y
Korvus: An All-in-One Open-Source RAG (Retrieval-Augmented Generation) Pipeline Built for Postgres
Korvus aims to simplify the Retrieval-Augmented Generation (RAG) pipeline by executing the entire process within a Postgres database using PostgresML. This approach eliminates the need for multiple external tools, reduces development complexity, and improves efficiency by leveraging in-database machine learning for tasks like embedding generation and data retrieval. Korvus supports multiple programming languages, facilitating easier integration and maintenance of search applications, although its performance metrics are yet to be quantified.
89
7
Article
GoPenAI·2y
Building an Effective RAG Pipeline: A Guide to Integrating Self-RAG, Corrective RAG, and Adaptive RAG
A comprehensive guide to building an effective Retrieval-Augmented Generation (RAG) pipeline by integrating Self-RAG, Corrective RAG, and Adaptive RAG. This pipeline aims to intelligently handle questions of varying complexity, ensure information accuracy, and generate useful answers. It leverages LangGraph for stateful, multi-agent workflows, and includes methods for routing questions, retrieving documents, evaluating relevance, and grading output quality.
78
2
8
Article
Daily Dose of Data Science | Avi Chawla | Substack·2y
5 Chunking Strategies For RAG
Chunking is a critical step in designing a Retrieval-Augmented Generation (RAG) application as it enhances the efficiency and accuracy of the retrieval process. The post discusses five chunking strategies: fixed-size, semantic, recursive, document structure-based, and LLM-based chunking. Each method has its unique benefits and trade-offs, focusing on maintaining semantic integrity and computational efficiency. The choice of technique depends on document structure, model capabilities, and computational resources.
74
1
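As an illustration of the first strategy listed above, here is a minimal sketch of fixed-size chunking. It is not code from the linked post: the function name `chunk_fixed`, the character-based window, and the `overlap` parameter are assumptions made for this example.

```python
def chunk_fixed(text: str, size: int = 200, overlap: int = 20) -> list[str]:
    """Split text into windows of `size` characters.

    Consecutive windows share `overlap` characters so that a sentence cut
    at a boundary still appears intact in at least one chunk.
    (Hypothetical helper; the overlap is an assumption, not from the post.)
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    if not text:
        return []
    step = size - overlap
    # Stop once the previous window has already reached the end of the text.
    last_start = max(len(text) - overlap, 1)
    return [text[i:i + size] for i in range(0, last_start, step)]


chunks = chunk_fixed("abcdefghij", size=4, overlap=1)
print(chunks)
```

Fixed-size windows are cheap to compute but ignore sentence and section boundaries, which is the trade-off the semantic and structure-based strategies aim to address.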
9
Article
GoPenAI·2y
Build an Advanced RAG App: Query Routing
This post explores how to build an advanced RAG application using a technique called Query Routing. Query Routing enables the application to make decisions based on a user's query, selecting the most appropriate action from predefined choices such as retrieving context from multiple data sources, using different indexes, or performing a web search. Various types of Query Routers are discussed, including LLM Selector Router, LLM Function Calling Router, Semantic Router, and more. Example implementations demonstrate how to create Query Routers and enhance the decision-making capabilities of RAG applications.
71
3
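The routing idea described above can be sketched in a few lines. This is a toy under stated assumptions, not the post's implementation: the route names (`docs_index`, `web_search`), their descriptions, and the threshold are hypothetical, and a bag-of-words cosine similarity stands in for the embedding model a real semantic router would use.

```python
import math
import re
from collections import Counter

# Hypothetical routes: each name maps to a short description of what it handles.
ROUTES = {
    "docs_index": "product documentation manual install configure setup guide",
    "web_search": "latest news today current weather price recent events",
}


def _vec(text: str) -> Counter:
    """Bag-of-words vector; a stand-in for a real embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def route(query: str, routes=ROUTES, fallback: str = "web_search",
          threshold: float = 0.1) -> str:
    """Return the route whose description is most similar to the query."""
    q = _vec(query)
    best, score = max(
        ((name, _cosine(q, _vec(desc))) for name, desc in routes.items()),
        key=lambda pair: pair[1],
    )
    # Fall back when no route is a convincing match.
    return best if score >= threshold else fallback


print(route("How do I install and configure the product?"))
```

The same shape applies to the LLM-based routers mentioned above: only the scoring step changes, from a similarity measure to a model call that picks among the predefined choices.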
10
Article
Weaviate·2y
What is Agentic RAG
Agentic RAG is an advanced AI framework that enhances traditional Retrieval-Augmented Generation (RAG) pipelines by incorporating AI agents. These agents possess memory, planning, and tool capabilities to perform various actions beyond simple information retrieval. The architecture can range from single-agent systems acting as routers to complex multi-agent setups coordinating multiple specialists. This approach addresses the limitations of vanilla RAG by providing tools, multi-step retrieval, and validation, thereby improving response accuracy and robustness, though it can introduce the latency and reliability issues inherent to LLMs.
65
11
Article
Machine Learning News·2y
Chunking Techniques for Retrieval-Augmented Generation (RAG): A Comprehensive Guide to Optimizing Text Segmentation
Retrieval-Augmented Generation (RAG) enhances information retrieval and contextual text generation by combining generative models with retrieval techniques. Crucial to RAG's performance is how text data is segmented or 'chunked'. Various chunking methods—Fixed-Length, Sentence-Based, Paragraph-Based, Recursive, Semantic, Sliding Window, and Document-Based—each offer unique benefits and limitations. Choosing the appropriate chunking technique can significantly impact the efficacy of RAG, depending on factors like text nature, application requirements, and computational efficiency.
42
12
Article
Daily Dose of Data Science | Avi Chawla | Substack·1y
[Hands-on] Tool calling in LLMs
Tool calling allows language models to perform specific tasks by invoking external tools or APIs. The process involves recognizing when an external tool is needed, invoking the tool, and integrating its output into the model's response. This enhances the flexibility and capability of LLMs. A demo is provided to build a stock price retrieval assistant using the yfinance library.
41
1
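The three steps described above (recognize that a tool is needed, invoke it, integrate its output) can be sketched without any external service. Everything here is an assumption for illustration: `fake_model` is a stand-in for an LLM, `get_stock_price` returns a hard-coded value instead of calling yfinance, and the JSON call format is made up for this example.

```python
import json


def get_stock_price(ticker: str) -> float:
    """Hypothetical tool; a real one would query a market-data source."""
    prices = {"ACME": 12.5}  # made-up data for the sketch
    return prices[ticker]


TOOLS = {"get_stock_price": get_stock_price}


def fake_model(prompt: str) -> str:
    """Stand-in for an LLM: emits a JSON tool call for price questions."""
    if "price of" in prompt:
        ticker = prompt.rsplit("price of", 1)[1].strip(" ?.").upper()
        return json.dumps({"tool": "get_stock_price", "args": {"ticker": ticker}})
    return "I can answer that directly."


def answer(prompt: str) -> str:
    reply = fake_model(prompt)
    # Step 1: decide whether the reply is a tool call or a plain answer.
    try:
        call = json.loads(reply)
    except json.JSONDecodeError:
        return reply
    if not isinstance(call, dict) or call.get("tool") not in TOOLS:
        return reply
    # Step 2: invoke the requested tool with the model-supplied arguments.
    result = TOOLS[call["tool"]](**call["args"])
    # Step 3: fold the tool output into the final response.
    return f"{call['args']['ticker']} is trading at {result}"


print(answer("What is the price of ACME?"))
```

In a real assistant the third step would usually pass the tool result back to the model so it can phrase the final answer itself.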
13
Article
Machine Learning News·2y
AutoRAG: An Automated Tool for Optimizing Retrieval-Augmented Generation Pipelines
AutoRAG is a tool designed to optimize Retrieval-Augmented Generation (RAG) pipelines by evaluating various RAG modules with self-evaluation data to identify the best configuration for specific use cases. It automates data creation, performs optimization experiments, and supports deployment using a single YAML file. AutoRAG structures the pipeline into interconnected nodes and uses synthetic data from large language models (LLMs) for effective evaluation. Currently in its alpha phase, it shows promising potential for future development.
41
14
Article
Towards AI·2y
Improving RAG Answer Quality Through Complex Reasoning
Multi-hop retrieval enhances the capabilities of Retrieval-Augmented Generation (RAG) systems by enabling complex reasoning over multiple pieces of information. This method is especially powerful for advanced question-answering systems. The post demonstrates building a Q&A chatbot for the healthcare domain using Indexify, OpenAI, and DSPy, showcasing how multi-hop retrieval can significantly improve answer quality in complex queries.
40
1
15
Article
Hacker News·2y
vanna-ai/vanna: 🤖 Chat with your SQL database 📊. Accurate Text-to-SQL Generation via LLMs using RAG 🔄.
Vanna is an open-source Python framework for SQL generation using RAG. It allows users to train a model and ask questions to generate SQL queries for their database. The framework offers high accuracy, security, and self-learning capabilities, and supports any SQL database. Users can also extend Vanna to use their own LLM or vector database.
39
16
Article
Community Picks·2y
Building an Advanced RAG System With Self-Querying Retrieval
Learn how to build an advanced Retrieval Augmented Generation (RAG) system that leverages self-querying retrieval to improve search relevance. This tutorial covers extracting metadata filters from natural language queries, combining metadata filtering with vector search, and generating structured outputs using LLMs. The guide focuses on developing an investment assistant to answer financial questions using MongoDB as the vector store and LangGraph for orchestration.
37
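Self-querying retrieval, as summarized above, first turns part of the question into metadata filters and only then ranks the remaining documents. The sketch below shows that two-stage shape under loud assumptions: the documents, the `year` and `sector` fields, and the regex-based extractor are all hypothetical, and word overlap stands in for vector search. The tutorial itself uses an LLM for extraction, MongoDB as the vector store, and LangGraph for orchestration, none of which appear here.

```python
import re

# Made-up corpus with metadata fields (assumptions for this sketch).
DOCS = [
    {"text": "Acme revenue grew on strong cloud demand", "year": 2023, "sector": "tech"},
    {"text": "Acme revenue fell as cloud demand slowed", "year": 2022, "sector": "tech"},
    {"text": "Borealis revenue grew on higher oil prices", "year": 2023, "sector": "energy"},
]
SECTORS = {"tech", "energy"}


def extract_filters(query: str) -> dict:
    """Pull metadata filters out of the query (an LLM would do this in practice)."""
    filters = {}
    year = re.search(r"\b(19|20)\d{2}\b", query)
    if year:
        filters["year"] = int(year.group())
    for word in re.findall(r"[a-z]+", query.lower()):
        if word in SECTORS:
            filters["sector"] = word
    return filters


def search(query: str, docs=DOCS, k: int = 1) -> list[dict]:
    filters = extract_filters(query)
    # Stage 1: keep only documents whose metadata matches every filter.
    candidates = [d for d in docs if all(d[key] == val for key, val in filters.items())]
    # Stage 2: rank the survivors; word overlap stands in for vector similarity.
    q_words = set(re.findall(r"[a-z]+", query.lower()))

    def score(doc: dict) -> int:
        return len(q_words & set(re.findall(r"[a-z]+", doc["text"].lower())))

    return sorted(candidates, key=score, reverse=True)[:k]


print(search("tech revenue in 2023"))
```

Filtering before ranking keeps documents from the wrong year or sector out of the results even when their text is superficially similar to the question.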
17
Article
Community Picks·2y
How We Saved 10s of Thousands of Dollars Deploying Low Cost Open Source AI Technologies At Scale with Kubernetes
Learn how to deploy low-cost open source AI technologies at scale with Kubernetes for generative AI applications, using alternatives to OpenAI and running vLLM locally and on Kubernetes.
36
4
18
Article
KDnuggets·2y
Llama, Llama, Llama: 3 Simple Steps to Local RAG with Your Content
Learn how to build a local RAG system using Ollama, Llama 3, and LlamaIndex in just 3 simple steps.
35
19
Article
Hacker News·2y
infiniflow/ragflow: RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
RAGFlow is an open-source RAG engine based on deep document understanding. It offers a streamlined workflow for businesses, supports various data formats, and provides truthful question-answering capabilities.
33
1
20
Article
GoPenAI·2y
Dynamic Routing in RAG: Directing User Queries to the Right Vector Store with Open Source Models
Generative AI applications can be optimized by integrating a semantic routing mechanism in the Retrieval-Augmented Generation (RAG) framework. This involves analyzing user queries and directing them to the most relevant vector stores, enhancing both accuracy and efficiency. The post demonstrates implementing a semantic router using a Nomic embedding model and Llama 3.1 for embeddings, covering machine learning, computer science, and economics topics. Advanced techniques like Multi-query translation and HyDE further refine the process, ensuring users receive pertinent information from diverse sources.
31
21
Article
Towards Data Science·2y
Improving RAG Answer Quality Through Complex Reasoning
Explore how multi-hop retrieval can enhance the quality of answers in Retrieval-Augmented Generation (RAG) systems, particularly in complex reasoning tasks. Using DSPy and Indexify, the post demonstrates the construction of a question-answering chatbot for the healthcare domain. The setup includes the installation of necessary packages, data ingestion, and creating multi-hop retrieval logic for efficient question handling. The integration allows for dynamic context generation, deduplication, and chain-of-thought reasoning, showcasing significant improvements in handling complex queries.
27
1
22
Article
InfoWorld·2y
Using PostgreSQL as a vector database in RAG
Learn how to build a local retrieval-augmented generation (RAG) application using PostgreSQL with the pgvector extension, Ollama, and the Llama 3 large language model. This guide describes how Postgres can store both vector and tabular data, making it a versatile option for medium-sized RAG applications. It covers setting up a vector database, ingesting text from multiple sources, conducting similarity searches, and querying a large language model to generate answers. Practical coding examples and step-by-step instructions are provided to help developers get started quickly.
26
23
Article
freeCodeCamp·2y
How to Build a RAG Chatbot with Agent Cloud and Google Sheets
Learn how to build a Retrieval-Augmented Generation (RAG) chatbot using Agent Cloud and Google Sheets. Understand the complexities of setting up RAG chat applications and how Agent Cloud simplifies the process through automated data retrieval, natural language processing, and scalable infrastructure. This guide covers setting up Agent Cloud via Docker, adding models, creating a Google Cloud Platform service account key, enabling the Google Sheets API, and building an interactive chat application to communicate with data sourced from Google Sheets.
26
24
Article
neo4j·2y
Get Started With GraphRAG: Neo4j’s Ecosystem Tools
Neo4j’s GraphRAG Ecosystem Tools provide open-source resources to enhance GenAI applications using knowledge graphs. GraphRAG addresses issues like hallucination and lack of domain-specific context by combining retrieval-augmented generation with structured and semi-structured data. The tools include the LLM Knowledge Graph Builder for transforming unstructured text into knowledge graphs, and NeoConverse for generating Cypher graph queries from natural language questions. These tools integrate seamlessly with various programming languages and frameworks, making it easier to build and optimize GenAI applications.
26
25
Article
GoPenAI·2y
Refining RAG Accuracy with TrueLens: An Evaluation Guide
In today's AI landscape, Retrieval-Augmented Generation (RAG) significantly enhances Large Language Models (LLMs) by leveraging user-specific data for context-driven responses. To ensure quality, rigorous evaluation frameworks like TruLens are essential. This guide explores the use of TruLens's feedback functions to assess context relevance, groundedness, and answer relevance, helping to improve RAG pipelines by minimizing risks such as hallucinations and biases. The step-by-step instructions illustrate how to set up and evaluate a RAG pipeline, ensuring consistency and high performance in AI-driven responses.
25
3
