Best of Vector Search 2025

  1.
    Article
    BigData Boutique blog · 1y

    Elasticsearch vs OpenSearch - 2025 update

    An in-depth 2025 update comparing Elasticsearch and OpenSearch, touching on project status, performance, licensing, vector search capabilities, cost efficiency, and ecosystem solutions. OpenSearch has gained traction with open-source governance and additional vector search engines, while Elasticsearch maintains proprietary features and extensive integration solutions.

  2.
    Article
    Daily Dose of Data Science | Avi Chawla | Substack · 41w

    8 RAG Architectures for AI Engineers

    Eight different RAG (Retrieval-Augmented Generation) architectures are explained with their specific use cases: Simple Vector RAG for basic semantic matching, Multi-modal RAG for cross-modal retrieval, HyDE for handling dissimilar queries, Self-RAG for validation against trusted sources, Graph RAG for structured relationships, Hybrid RAG combining vector and graph approaches, Adaptive RAG for dynamic query handling, and Agentic RAG for complex workflows with AI agents.

  3.
    Article
    Daily Dose of Data Science | Avi Chawla | Substack · 34w

    A 100% Open-source Alternative to n8n!

    Sim is an open-source drag-and-drop platform for building agentic workflows that runs locally with any LLM. The article demonstrates building a finance assistant connected to Telegram using agents, MCP servers, and APIs. It also covers four RAG indexing strategies: chunk indexing (splitting documents into embedded chunks), sub-chunk indexing (breaking chunks into finer pieces while retrieving larger context), query indexing (generating hypothetical questions for better semantic matching), and summary indexing (using LLM-generated summaries for dense data).
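
    Of the four indexing strategies above, sub-chunk indexing is the least obvious, so here is a minimal sketch of the idea: match the query against small sub-chunks, then return the larger parent chunk as context. The helper names, the chunk sizes, and the word-overlap score (a toy stand-in for embedding similarity) are all assumptions made for illustration, not code from the article.

```python
def split_words(text, size):
    """Split text into pieces of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def overlap_score(query, text):
    """Toy stand-in for embedding similarity: Jaccard overlap of word sets."""
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t) / len(q | t) if (q | t) else 0.0

def build_index(document, chunk_size=12, sub_chunk_size=4):
    """Index fine-grained sub-chunks, each pointing back to its parent chunk."""
    chunks = split_words(document, chunk_size)
    index = []
    for chunk_id, chunk in enumerate(chunks):
        for sub_chunk in split_words(chunk, sub_chunk_size):
            index.append((sub_chunk, chunk_id))
    return chunks, index

def retrieve(query, chunks, index):
    """Match on the sub-chunk, but hand back the larger parent chunk."""
    best_sub, chunk_id = max(index, key=lambda entry: overlap_score(query, entry[0]))
    return best_sub, chunks[chunk_id]

document = (
    "Invoices are sent on the first day of each month by email. "
    "Refunds are processed within five business days after the request is approved."
)
chunks, index = build_index(document)
matched, context = retrieve("five business days", chunks, index)
print(matched)  # the small piece that matched the query
print(context)  # the larger parent chunk passed on as context
```

    In a real pipeline the overlap score would be replaced by a similarity search over sub-chunk embeddings; the parent lookup stays the same.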

  4.
    Article
    Daily Dose of Data Science | Avi Chawla | Substack · 36w

    The Open-source RAG Stack

    A comprehensive guide to building production-ready RAG systems using open-source tools. Covers the complete technology stack from frontend frameworks to data ingestion, including LLM orchestration tools like LangChain and CrewAI, vector databases like Milvus and Chroma, embedding models, and retrieval systems. Also showcases 9 practical MCP (Model Context Protocol) projects for AI engineers, ranging from local MCP clients to voice agents and financial analysts.

  5.
    Video
    Tech With Tim · 1y

    How to Build a Local AI Agent With Python (Ollama, LangChain & RAG)

    Learn how to build a local AI agent using Python, LangChain, Ollama, and ChromaDB. The project demonstrates setting up an AI to query and interpret data from a CSV file, such as restaurant reviews, using retrieval augmented generation. All processes run locally without requiring external accounts or cloud services, making it a highly accessible project.

  6.
    Article
    Machine Learning Mastery · 50w

    Implementing Vector Search from Scratch: A Step-by-Step Tutorial

    A comprehensive tutorial demonstrating how to build a vector search engine from scratch using Python. Covers the three core steps of vector search: converting text to numerical vectors, calculating similarity using cosine similarity, and retrieving the most relevant results. Includes practical code examples with NumPy and Matplotlib, visualization of vector spaces, and explains the connection to RAG systems. The implementation uses simplified word embeddings and averaging techniques to make concepts accessible while maintaining the fundamental principles of semantic search.
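
    The three steps described above (text to vectors, cosine similarity, top results) can be sketched in a few lines of NumPy. This is a rough illustration, not the tutorial's own code: the three-dimensional word vectors are made up, and `embed`, `cosine_similarity`, and `search` are hypothetical helper names.

```python
import numpy as np

# Made-up 3-dimensional word embeddings, for illustration only.
WORD_VECTORS = {
    "cat": np.array([0.9, 0.1, 0.0]),
    "dog": np.array([0.8, 0.2, 0.0]),
    "pet": np.array([0.7, 0.3, 0.1]),
    "car": np.array([0.0, 0.9, 0.3]),
    "engine": np.array([0.1, 0.8, 0.5]),
    "road": np.array([0.0, 0.7, 0.6]),
}

def embed(text):
    """Step 1: average the vectors of known words; unknown words are skipped."""
    vectors = [WORD_VECTORS[w] for w in text.lower().split() if w in WORD_VECTORS]
    if not vectors:
        return np.zeros(3)
    return np.mean(vectors, axis=0)

def cosine_similarity(a, b):
    """Step 2: cosine of the angle between two vectors (0.0 if either is zero)."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

def search(query, documents, top_k=2):
    """Step 3: score every document against the query and keep the best top_k."""
    q = embed(query)
    scored = [(cosine_similarity(q, embed(doc)), doc) for doc in documents]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:top_k]

documents = ["cat dog pet", "car engine road", "dog pet"]
results = search("cat", documents)
for score, doc in results:
    print(f"{score:.3f}  {doc}")
```

    Scoring every document is a brute-force scan; it is fine for small collections, while larger ones typically rely on an approximate index.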

  7.
    Article
    freeCodeCamp · 45w

    How AI Agents Remember Things: The Role of Vector Stores in LLM Memory

    Large language models don't have inherent memory, but vector stores enable AI agents to simulate memory by converting text into numerical embeddings and storing them in specialized databases. When users interact with AI, the system searches for semantically similar stored vectors to retrieve relevant past information. Popular vector databases include FAISS for local deployments and Pinecone for cloud-based solutions. This approach, called retrieval-augmented generation (RAG), allows AI to appear contextually aware despite technical limitations around similarity-based matching and static embeddings.

  8.
    Article
    The New Stack · 1y

    How To Master Vector Databases

    Vector databases are specialized systems designed to handle high-dimensional data, such as images, text, and audio embeddings, effectively and efficiently. They excel in similarity searches and are integral to applications like recommendation systems, image retrieval, and anomaly detection. This guide offers insights into selecting the right vector database, understanding vector embeddings, and optimizing performance, featuring examples from popular vector databases like Milvus, Pinecone, and Weaviate.

  9.
    Article
    SingleStore · 46w

    How to Build a RAG Knowledge Base in Python for Customer Support

    A comprehensive guide to building a Retrieval-Augmented Generation (RAG) system for customer support using Python, LangChain, OpenAI, and SingleStore. The tutorial covers setting up a vector database, converting documents into embeddings, implementing semantic search, and generating contextual answers. The real-world case studies it cites report a 28.6% reduction in issue resolution time. The step-by-step implementation includes environment setup, database configuration, embedding creation, and API endpoint development for instant, accurate support responses.

  10.
    Article
    Reinier · 40w

    Build & Deploy an AI-Powered Ecommerce Search Engine with Next.js 15, Hugging Face & Pinecone

    A comprehensive tutorial demonstrating how to build an AI-powered ecommerce search engine using Next.js 15, Pinecone vector database, and Hugging Face. The guide covers implementing semantic search capabilities that understand natural language queries instead of relying on exact keyword matching, creating a production-ready application with modern web technologies including TypeScript and Tailwind CSS.

  11.
    Article
    Javarevisited · 40w

    Top 5 Vector Databases to Learn in 2025 (with Courses and Books to Master Them)

    Vector databases have become essential infrastructure for AI applications in 2025, powering semantic search, RAG systems, and recommendation engines. The top 5 vector databases to learn are Pinecone (production-ready managed service), Weaviate (open-source with hybrid search), ChromaDB (lightweight local option), FAISS (industry-standard similarity search library), and Qdrant (high-performance Rust-based solution). Each database has specific strengths and learning resources including Udemy courses, Coursera programs, and technical books to help developers master these technologies for building modern GenAI applications.

  12.
    Article
    Meilisearch · 40w

    9 advanced RAG techniques to know & how to implement them

    Advanced RAG techniques optimize retrieval-augmented generation systems beyond basic implementations. Nine key techniques include text chunking (semantic vs fixed-size), reranking with cross-encoders, metadata filtering, hybrid search combining keyword and vector methods, query rewriting for better intent understanding, autocut for dynamic text trimming, context distillation for focused summaries, and fine-tuning both LLMs and embedding models. These methods address common issues like noisy results, irrelevant context, and poor ranking. Implementation tools include Meilisearch for hybrid search, LangChain for workflow orchestration, Weaviate for vector search, and Pinecone for scalable vector databases. Evaluation focuses on retrieval accuracy, latency, precision-recall balance, and user satisfaction metrics.
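
    Fixed-size chunking, the simpler of the two chunking styles mentioned above, might look like the sketch below. The overlap between neighbouring chunks is an assumption added here (a common choice, not something the summary specifies), and `chunk_fixed` is a hypothetical helper name.

```python
def chunk_fixed(text, chunk_size=200, overlap=50):
    """Split text into character windows of chunk_size that share `overlap` characters."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    if not text:
        return []
    step = chunk_size - overlap
    # Stop once the remaining tail is already covered by the previous window.
    last_start = max(len(text) - overlap, 1)
    return [text[i:i + chunk_size] for i in range(0, last_start, step)]

sample = "".join(str(i % 10) for i in range(500))
pieces = chunk_fixed(sample)
print(len(pieces), [len(p) for p in pieces])
```

    Semantic chunking would instead pick boundaries from the content itself (sentences, headings, or embedding shifts) rather than from a fixed character count.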

  13.
    Article
    Hacker News · 37w

    Will Amazon S3 Vectors Kill Vector Databases—or Save Them?

    AWS S3 Vectors offers 90% cost savings for vector storage but won't replace dedicated vector databases like Milvus. Instead, it fills the cold storage tier in a three-tier architecture (hot/warm/cold) that balances latency, cost, and scale. S3 Vectors excels at low-QPS scenarios and archival storage but struggles with high-performance search, frequent updates, and complex queries. The future lies in tiered vector storage where different solutions serve different performance and cost requirements.

  14.
    Article
    Daily Dose of Data Science | Avi Chawla | Substack · 42w

    Make RAG systems 32x Memory Efficient!

    Binary quantization can make RAG systems 32x more memory efficient by converting float32 embeddings to binary vectors. The technique involves ingesting documents, generating binary embeddings, storing them in a vector database like Milvus, and using Hamming distance for retrieval. A complete implementation demonstrates querying 36M+ vectors in under 30ms using LlamaIndex, Milvus, and Groq for inference, with deployment via Beam Cloud.
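
    The 32x figure follows from storing one bit per dimension instead of a 32-bit float. A minimal NumPy sketch of that arithmetic, and of Hamming-distance retrieval, is shown below; it uses random vectors and a brute-force scan rather than Milvus or LlamaIndex, and `binarize` and `hamming_search` are hypothetical helper names.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 1024
docs = rng.standard_normal((1000, dim)).astype(np.float32)  # stand-in embeddings

def binarize(vectors):
    """Keep only the sign of each dimension and pack 8 bits into each byte."""
    bits = (vectors > 0).astype(np.uint8)
    return np.packbits(bits, axis=-1)

def hamming_search(query_code, codes, top_k=5):
    """Rank stored codes by the number of bits that differ from the query code."""
    diff = np.bitwise_xor(codes, query_code)           # 1-bits mark disagreements
    distances = np.unpackbits(diff, axis=1).sum(axis=1)
    order = np.argsort(distances, kind="stable")[:top_k]
    return order, distances

codes = binarize(docs)
ratio = docs.nbytes / codes.nbytes
print(f"float32: {docs.nbytes} bytes, binary: {codes.nbytes} bytes, ratio: {ratio:.0f}x")

# A slightly perturbed copy of document 42 should still land closest to it.
query = docs[42] + 0.1 * rng.standard_normal(dim).astype(np.float32)
top_ids, distances = hamming_search(binarize(query), codes)
print(top_ids[0], distances[top_ids[0]])
```

    The memory saving comes at the cost of precision, which is why binary retrieval is often followed by a rescoring pass over the top candidates using the original float vectors.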

  15.
    Article
    Keshav Ashiya · 23w

    Docify: Building a Production RAG System for Knowledge Management

    Docify is an open-source RAG system that processes documents locally while maintaining AI capabilities. The architecture uses 11 specialized services including async embedding generation with Celery, hybrid search combining pgvector and BM25, multi-factor ranking with citation verification, and token-aware context assembly. Built with PostgreSQL pgvector for vector storage, Redis for task queuing, and Ollama for local LLM inference, it supports heterogeneous document formats and implements deduplication via SHA-256 hashing. The system uses HNSW indexing for sub-200ms vector search, reciprocal rank fusion for search result merging, and citation verification to reduce hallucinations.
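
    Reciprocal rank fusion, mentioned above as the merging step for vector and BM25 results, gives each document a score of 1 / (k + rank) per result list and sums the scores. The sketch below is a generic version, not Docify's code: the document IDs are invented, and k = 60 is the commonly used default rather than a value confirmed by the article.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one fused ranking."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)

vector_hits = ["doc_a", "doc_b", "doc_c"]  # hypothetical pgvector results
bm25_hits = ["doc_b", "doc_d", "doc_a"]    # hypothetical BM25 results
fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
print(fused)
```

    Because only ranks are used, the two retrievers' raw scores never need to be put on a common scale.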

  16.
    Article
    openSUSE · 33w

    GSoC 2025, Building a Semantic Search Engine for Any Video

    A GSoC 2025 project that built an end-to-end semantic video search engine capable of finding specific moments within videos using natural language queries. The system uses a two-part architecture: an ingestion pipeline that processes videos with AI models (TransNetV2, WhisperX, BLIP, VideoMAE) to extract shots, transcripts, captions, and actions, then segments them intelligently and enriches them with LLM-generated summaries; and a search application with FastAPI backend that performs hybrid text-visual searches using ChromaDB vector database and Reciprocal Rank Fusion for result ranking, paired with a Streamlit frontend for user interaction.

  17.
    Article
    Daily Dose of Data Science | Avi Chawla | Substack · 29w

    RAG vs. CAG, Explained Visually!

    Cache-Augmented Generation (CAG) improves upon traditional RAG by caching static, rarely-changing information directly in the model's key-value memory, while continuing to retrieve dynamic data from vector databases. This hybrid approach reduces redundant fetches, lowers costs, and speeds up inference by separating stable "cold" data (cacheable) from frequently updated "hot" data (retrievable). Providers such as OpenAI and Anthropic already support the technique through the prompt caching features of their APIs.

  18.
    Article
    Weaviate · 41w

    Elysia: Building an end-to-end agentic RAG app

    Elysia is an open-source agentic RAG framework that goes beyond traditional text-only AI assistants by using a decision-tree architecture, dynamic data display formats, and intelligent data analysis. Built with Python and powered by Weaviate, it features transparent decision-making processes, chunk-on-demand document processing, personalized feedback learning, and multi-model routing. The framework can be used as both a web application and a Python library, offering customizable tools and real-time observability of AI reasoning processes.

  19.
    Article
    Meilisearch · 1y

    Why you shouldn't use vector databases for RAG

    This post argues against the use of vector databases for retrieval augmented generation (RAG) systems, highlighting their limitations in query refinement and precision. It suggests a more effective approach using hybrid search that combines full-text and semantic capabilities, mirroring human search behaviors to improve relevance and simplicity.

  20.
    Article
    Meilisearch · 40w

    10 Best RAG Tools and Platforms: Full Comparison [2025]

    A comprehensive comparison of 10 RAG tools and platforms for 2025, including Meilisearch, LangChain, RAGatouille, Verba, Haystack, Embedchain, LlamaIndex, MongoDB, Pinecone, and Vespa. Each tool is analyzed with key features, pricing, integrations, pros/cons based on user reviews, and ideal use cases. The guide covers open-source options, enterprise solutions, and search engine tools, providing selection criteria including retrieval methods, performance, scalability, integration ease, deployment options, cost considerations, and community support.

  21.
    Article
    Javarevisited · 48w

    Top 6 Udemy Courses to Learn Vector Databases for AI and LLM Projects (2025)

    Vector databases have become essential for AI applications involving large language models, recommendation engines, and retrieval-augmented generation systems. Six Udemy courses are recommended for learning vector databases like Pinecone, FAISS, ChromaDB, and Qdrant in 2025. The courses range from beginner-friendly fundamentals to advanced AI engineering, covering practical implementations with Python, LangChain integration, RAG system development, and real-world AI project building. These affordable courses provide hands-on experience with modern vector database tools and their applications in semantic search, chatbots, and GenAI products.

  22.
    Article
    Community Picks · 50w

    This Open Source Tool Could Save Your Data Team Hundreds of Hours

    CocoIndex introduces automatic schema inference for Qdrant vector databases, eliminating manual collection setup. The tool uses declarative dataflow programming to automatically infer and manage target schemas from flow definitions, supporting incremental processing with a high-performance Rust stack. Developers can now define data transformations in ~100 lines of Python without manually configuring collections, tables, or indexes across multiple storage systems including Postgres, Neo4j, and Kuzu.

  23.
    Article
    Daily Dose of Data Science | Avi Chawla | Substack · 39w

    Corrective RAG Agentic Workflow

    Corrective RAG (CRAG) enhances traditional RAG systems by adding a self-assessment step that evaluates retrieved document relevance before generating responses. The workflow searches documents, uses an LLM to assess context relevance, retains only relevant information, performs web search when needed, and aggregates context for final response generation. The implementation uses a tech stack including Firecrawl for web search, Milvus for vector storage, Beam for deployment, and LlamaIndex workflows for orchestration, with observability through CometML's Opik.
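
    The control flow of the workflow above (retrieve, grade, filter, fall back to web search, generate) can be sketched with plain functions. Everything here is a stand-in: the retriever, the keyword-based grader (in place of an LLM judge), the web search stub, and the rule that web search runs only when no retrieved passage is judged relevant are assumptions for illustration, not the Firecrawl, Milvus, or LlamaIndex APIs.

```python
KNOWLEDGE_BASE = [
    "Milvus is an open-source vector database.",
    "LlamaIndex provides workflow orchestration for LLM apps.",
]

def retrieve(query):
    """Placeholder retriever: returns every stored passage as a candidate."""
    return list(KNOWLEDGE_BASE)

def is_relevant(query, document):
    """Stand-in for the LLM relevance grader: any longer query word appears in the passage."""
    keywords = [w.strip("?.,!").lower() for w in query.split()]
    keywords = [w for w in keywords if len(w) > 3]
    return any(w in document.lower() for w in keywords)

def web_search(query):
    """Stub for an external web search call."""
    return [f"(web result for: {query})"]

def generate(query, context):
    """Stub for the final LLM call."""
    return f"Answer to {query!r} using {len(context)} context passage(s)."

def corrective_rag(query):
    retrieved = retrieve(query)
    relevant = [doc for doc in retrieved if is_relevant(query, doc)]  # keep only relevant passages
    used_web_search = not relevant  # assumed trigger: nothing relevant was retrieved
    context = relevant + (web_search(query) if used_web_search else [])
    return {
        "answer": generate(query, context),
        "context": context,
        "used_web_search": used_web_search,
    }

in_scope = corrective_rag("What is Milvus?")
out_of_scope = corrective_rag("Who won the 2022 World Cup?")
print(in_scope["used_web_search"], out_of_scope["used_web_search"])
```

    The grading step is what distinguishes this from plain RAG: irrelevant passages are dropped before generation instead of being passed to the model.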

  24.
    Article
    Meilisearch · 48w

    How to Build a RAG Pipeline: A Step-by-Step Guide

    RAG (Retrieval-Augmented Generation) pipelines combine search engines with large language models to provide accurate, grounded responses by retrieving relevant information before generating answers. The guide covers building a complete RAG system from data ingestion and chunking through embedding generation, vector storage with Meilisearch, and integration with generative models. Key considerations include choosing appropriate tools, optimizing chunking strategies, monitoring performance, managing costs, and implementing security measures for production deployments.

  25.
    Article
    AI · 1y

    Want AI to Actually Understand Your Code? This Tool Says It Can Help

    CocoIndex is a tool for indexing and querying a codebase by building a data pipeline over it. It uses Tree-sitter to chunk code along syntax boundaries and provides built-in Rust integration. The process involves reading code files, extracting their file extensions, chunking the code, generating embeddings, and storing them in a vector database. CocoIndex supports various languages and allows users to embed code using models from Hugging Face. It uses Postgres to manage the data index, with plans to support other databases.