Best of Vector Search — August 2025

1
Article
Daily Dose of Data Science | Avi Chawla | Substack·41w
8 RAG Architectures for AI Engineers
Eight different RAG (Retrieval-Augmented Generation) architectures are explained with their specific use cases: Simple Vector RAG for basic semantic matching, Multi-modal RAG for cross-modal retrieval, HyDE for handling dissimilar queries, Self-RAG for validation against trusted sources, Graph RAG for structured relationships, Hybrid RAG combining vector and graph approaches, Adaptive RAG for dynamic query handling, and Agentic RAG for complex workflows with AI agents.
172
2
2
Article
Reinier·40w
Build & Deploy an AI-Powered Ecommerce Search Engine with Next.js 15, Hugging Face & Pinecone
A tutorial on building an AI-powered ecommerce search engine with Next.js 15, the Pinecone vector database, and Hugging Face. It covers implementing semantic search that understands natural-language queries instead of relying on exact keyword matching, and building a production-ready application with TypeScript and Tailwind CSS.
76
3
Article
Javarevisited·40w
Top 5 Vector Databases to Learn in 2025 (with Courses and Books to Master Them)
Vector databases have become essential infrastructure for AI applications in 2025, powering semantic search, RAG systems, and recommendation engines. The top 5 to learn are Pinecone (production-ready managed service), Weaviate (open-source with hybrid search), ChromaDB (lightweight local option), FAISS (industry-standard similarity search library), and Qdrant (high-performance Rust-based solution). The article describes each one's specific strengths and pairs it with learning resources, including Udemy courses, Coursera programs, and technical books, for developers building modern GenAI applications.
60
4
Article
Meilisearch·40w
9 advanced RAG techniques to know & how to implement them
Advanced RAG techniques optimize retrieval-augmented generation systems beyond basic implementations. Nine key techniques include text chunking (semantic vs fixed-size), reranking with cross-encoders, metadata filtering, hybrid search combining keyword and vector methods, query rewriting for better intent understanding, autocut for dynamic text trimming, context distillation for focused summaries, and fine-tuning both LLMs and embedding models. These methods address common issues like noisy results, irrelevant context, and poor ranking. Implementation tools include Meilisearch for hybrid search, LangChain for workflow orchestration, Weaviate for vector search, and Pinecone for scalable vector databases. Evaluation focuses on retrieval accuracy, latency, precision-recall balance, and user satisfaction metrics.
55
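The fixed-size variant of text chunking mentioned in the summary above can be sketched in a few lines. This is a minimal illustration, not the article's implementation: the function name `chunk_fixed`, the character-based sizing, and the `overlap` parameter are all assumptions made for the example.

```python
from typing import List


def chunk_fixed(text: str, size: int = 200, overlap: int = 50) -> List[str]:
    """Split text into chunks of at most `size` characters.

    Consecutive chunks share `overlap` characters (an assumed parameter)
    so that a sentence cut at a boundary still appears whole in one chunk.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    if not text:
        return []
    step = size - overlap
    # Stop early enough that the last chunk is not just a tail of the previous one.
    last_start = max(len(text) - overlap, 1)
    return [text[i:i + size] for i in range(0, last_start, step)]


chunks = chunk_fixed("abcdefghij", size=4, overlap=2)
```

Semantic chunking, the alternative named in the summary, would instead split on meaning boundaries (for example sentences or topics) rather than on a fixed character count.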
5
Article
Daily Dose of Data Science | Avi Chawla | Substack·42w
Make RAG systems 32x Memory Efficient!
Binary quantization can make RAG systems 32x more memory efficient by replacing each float32 embedding value (32 bits) with a single bit. The pipeline ingests documents, generates binary embeddings, stores them in a vector database such as Milvus, and retrieves with Hamming distance. A complete implementation demonstrates querying 36M+ vectors in under 30 ms using LlamaIndex, Milvus, and Groq for inference, with deployment via Beam Cloud.
54
1
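The 32x figure in the binary quantization entry above follows from storing 1 bit per dimension instead of a 32-bit float. The sketch below shows the idea with NumPy only. It is not the article's Milvus/LlamaIndex pipeline: thresholding at zero, the random stand-in data, and the helper names `binarize` and `hamming_distances` are assumptions for illustration.

```python
import numpy as np


def binarize(embeddings: np.ndarray) -> np.ndarray:
    """Keep only the sign of each dimension (assumed threshold: 0), packed 8 bits per byte."""
    bits = (embeddings > 0).astype(np.uint8)
    return np.packbits(bits, axis=-1)


def hamming_distances(query_code: np.ndarray, codes: np.ndarray) -> np.ndarray:
    """Count differing bits between one packed query code and every packed document code."""
    xor = np.bitwise_xor(codes, query_code)
    return np.unpackbits(xor, axis=-1).sum(axis=-1)


rng = np.random.default_rng(0)
# Stand-in for real document embeddings: 1000 vectors of 1024 float32 dimensions.
docs = rng.standard_normal((1000, 1024)).astype(np.float32)
codes = binarize(docs)

# 4 bytes per dimension shrink to 1/8 byte per dimension.
compression_ratio = docs.nbytes / codes.nbytes

# A slightly perturbed copy of document 42 serves as the query.
query = docs[42] + 0.1 * rng.standard_normal(1024).astype(np.float32)
top5 = np.argsort(hamming_distances(binarize(query), codes))[:5]
```

In this toy setup the nearest code by Hamming distance is document 42, since small noise flips only the bits of dimensions that were already close to zero.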
6
Article
Weaviate·41w
Elysia: Building an end-to-end agentic RAG app
Elysia is an open-source agentic RAG framework that goes beyond traditional text-only AI assistants by using a decision-tree architecture, dynamic data display formats, and intelligent data analysis. Built with Python and powered by Weaviate, it features transparent decision-making processes, chunk-on-demand document processing, personalized feedback learning, and multi-model routing. The framework can be used as both a web application and a Python library, offering customizable tools and real-time observability of AI reasoning processes.
37
7
Article
Meilisearch·40w
10 Best RAG Tools and Platforms: Full Comparison [2025]
A comprehensive comparison of 10 RAG tools and platforms for 2025, including Meilisearch, LangChain, RAGatouille, Verba, Haystack, Embedchain, LlamaIndex, MongoDB, Pinecone, and Vespa. Each tool is analyzed with key features, pricing, integrations, pros/cons based on user reviews, and ideal use cases. The guide covers open-source options, enterprise solutions, and search engine tools, providing selection criteria including retrieval methods, performance, scalability, integration ease, deployment options, cost considerations, and community support.
30
8
Article
Daily Dose of Data Science | Avi Chawla | Substack·39w
Corrective RAG Agentic Workflow
Corrective RAG (CRAG) enhances traditional RAG systems by adding a self-assessment step that evaluates retrieved document relevance before generating responses. The workflow searches documents, uses an LLM to assess context relevance, retains only relevant information, performs web search when needed, and aggregates context for final response generation. The implementation uses a tech stack including Firecrawl for web search, Milvus for vector storage, Beam for deployment, and LlamaIndex workflows for orchestration, with observability through CometML's Opik.
26
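The Corrective RAG control flow described in the entry above (retrieve, grade, filter, fall back to web search, generate) can be sketched independently of any particular stack. This is a minimal outline, not the article's LlamaIndex workflow: every callable is a hypothetical placeholder supplied by the caller, and the `min_relevant` threshold is an assumed way of deciding when web search is "needed".

```python
from typing import Callable, List


def corrective_rag(
    query: str,
    retrieve: Callable[[str], List[str]],          # hypothetical: vector-store lookup
    is_relevant: Callable[[str, str], bool],       # hypothetical: LLM relevance grader
    web_search: Callable[[str], List[str]],        # hypothetical: web search fallback
    generate: Callable[[str, List[str]], str],     # hypothetical: final LLM call
    min_relevant: int = 1,                         # assumed trigger for the fallback
) -> str:
    docs = retrieve(query)
    # Self-assessment step: keep only the documents the grader accepts.
    context = [d for d in docs if is_relevant(query, d)]
    # Corrective step: too little relevant context triggers a web search.
    if len(context) < min_relevant:
        context += web_search(query)
    return generate(query, context)


# Toy stand-ins so the flow can be exercised without any external service.
corpus = ["Milvus stores vectors.", "Bananas are yellow."]


def toy_retrieve(q: str) -> List[str]:
    return corpus


def toy_is_relevant(q: str, d: str) -> bool:
    return any(word in d.lower() for word in q.lower().split())


def toy_web_search(q: str) -> List[str]:
    return [f"web result for: {q}"]


def toy_generate(q: str, ctx: List[str]) -> str:
    return " | ".join(ctx)


answer = corrective_rag("milvus vectors", toy_retrieve, toy_is_relevant, toy_web_search, toy_generate)
fallback = corrective_rag("qdrant", toy_retrieve, toy_is_relevant, toy_web_search, toy_generate)
```

The first call finds a relevant local document and skips the web; the second finds none and uses only the web search result.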
9
Article
Windows Blogs·40w
Copilot on Windows: Semantic Search and new homepage begin rolling out to Windows Insiders
Microsoft is rolling out a new Copilot app update for Windows Insiders that introduces semantic file search for Copilot+ PCs and a redesigned homepage. The semantic search feature allows users to find files using natural language queries like "find images of bridges at sunset" instead of exact file names. The new homepage displays recent apps, files, and conversations, making it easier to access guided help and interact with documents. The update references the Windows Recent folder to show recently opened files and only processes files when explicitly shared by users.
18
10
Article
Daily Dose of Data Science | Avi Chawla | Substack·42w
Build a Multimodal Agentic RAG
A guide to building a multimodal agentic RAG system that ingests both documents and audio files and accepts spoken queries. The tutorial covers the complete workflow, from data ingestion and audio transcription with AssemblyAI, to embedding storage in the Milvus vector database, to orchestration with CrewAI Flows. Users query by voice, and agents retrieve relevant context and generate cited responses. The implementation includes deployment using Beam for serverless containers and a Streamlit interface for user interaction.
16
