Best of Vector Search, July 2024

  1. Article
    Community Picks · 2y

    A Broke B**ch’s Guide to Tech Start-up: Choosing Vector Database — Part 1 Self-Hosted

    Vector databases are crucial for GenAI applications: they give language models an augmented knowledge base and support fuzzy search over text or media embeddings. The post evaluates self-hosted vector databases including MongoDB, ChromaDB, Weaviate, Milvus, Neo4j, KDB.AI, PostgreSQL, and SQLite. It recommends Docker for ease of setup and highlights the benefits and limitations of each option. The guide emphasizes starting with self-hosted instances to control costs while prototyping and suggests evaluating multiple databases to find the best setup for your application.

  2. Article
    RUBYLAND · 2y

    Building a Personal RAG Application for PDF-Based Question Answering

    Learn about building a Retrieval-Augmented Generation (RAG) application for PDF-based question answering using LLMs, embedding models, and vector databases. This guide utilizes Meta's Llama3, Qdrant VectorDB, and Llama Index for embedding with Python, providing a way to interact with PDF content through natural, conversational queries.

  3. Article
    Community Picks · 2y

    Semantic search using OpenAI, pg_embedding and Neon

    Learn how to build a semantic search app using OpenAI, Neon, and pg_embedding. The app transforms user queries into vector embeddings to perform vector similarity searches, retrieving the most relevant results based on meaning instead of keyword matches. The methodology includes generating embeddings, storing them in a Postgres database using pg_embedding, and retrieving similar items through vector similarity search. Step-by-step instructions and code are included for building the app, from gathering data to deploying the frontend and API.
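The retrieval step described above can be sketched in plain Python. This is only an illustration under stated assumptions: the three-dimensional vectors are made up and stand in for real OpenAI embeddings, the names `cosine_similarity`, `top_k`, and `DOCS` are hypothetical, and in the app from the post the comparison would run inside Postgres rather than in application code.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, items, k=2):
    """Return the k texts whose vectors are most similar to query_vec."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in items]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Made-up toy vectors; a real app would store model-generated embeddings.
DOCS = [
    ("how to reset a password", [0.9, 0.1, 0.0]),
    ("recovering a locked account", [0.8, 0.3, 0.1]),
    ("quarterly sales report", [0.0, 0.2, 0.9]),
]

# Pretend this is the embedding of the user's query.
query = [1.0, 0.0, 0.0]
results = top_k(query, DOCS, k=2)
print(results)
```

The ranking depends on vector direction rather than shared keywords, which is the property that lets results match on meaning.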

  4. Article
    Towards AI · 2y

    Prompt Like a Pro Using DSPy: A Guide to Build a Better Local RAG Model using DSPy, Qdrant, and Ollama

    The post argues that manual prompting is becoming obsolete and presents DSPy as a way to optimize language model (LM) prompts algorithmically. DSPy uses signatures, modules, metrics, and optimizers to attain consistent and reproducible results across different LMs. This guide details the step-by-step process of integrating DSPy with Qdrant as the vector database and Ollama for local LLM deployment. Highlights include dataset loading, creating a vector database, and implementing Chain of Thought reasoning with a RAG model.

  5. Article
    Community Picks · 2y

    Optimizing vector search performance with pgvector

    pgvector is a Postgres extension for vector similarity search, widely used in AI-powered applications. The post discusses the use of sequential scans and the Inverted File Index (ivfflat) for optimizing vector search performance. Sequential scans guarantee 100% recall but can be costly with larger datasets. ivfflat provides a faster, approximate nearest neighbor search by partitioning the dataset into clusters, which can be fine-tuned using lists and probes parameters. Experimentation with these parameters is crucial for achieving optimal search performance tailored to specific datasets.
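The trade-off between a sequential scan and an inverted file index can be shown with a toy sketch. It is not pgvector's implementation: the centroids are fixed by hand instead of being learned with k-means, and the names `build_index`, `search`, and `exact_search` are hypothetical. The number of centroids plays the role of `lists`, and `probes` is how many of the nearest lists are scanned.

```python
import math


def dist(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_index(vectors, centroids):
    """Assign each vector to the list of its nearest centroid."""
    lists = [[] for _ in centroids]
    for vec in vectors:
        nearest = min(range(len(centroids)), key=lambda i: dist(vec, centroids[i]))
        lists[nearest].append(vec)
    return lists


def search(query, centroids, lists, probes=1, k=1):
    """Approximate search: scan only the `probes` lists closest to the query."""
    order = sorted(range(len(centroids)), key=lambda i: dist(query, centroids[i]))
    candidates = [vec for i in order[:probes] for vec in lists[i]]
    return sorted(candidates, key=lambda v: dist(query, v))[:k]


def exact_search(query, vectors, k=1):
    """Sequential scan: compares the query with every vector."""
    return sorted(vectors, key=lambda v: dist(query, v))[:k]


# Hand-picked centroids (assumption: a real index would learn these).
CENTROIDS = [(0.0, 0.0), (10.0, 0.0)]
VECTORS = [(0.0, 0.0), (4.9, 0.0), (9.0, 0.0), (10.0, 1.0)]
LISTS = build_index(VECTORS, CENTROIDS)

query = (5.2, 0.0)
exact = exact_search(query, VECTORS)
one_probe = search(query, CENTROIDS, LISTS, probes=1)
two_probes = search(query, CENTROIDS, LISTS, probes=2)
print(exact, one_probe, two_probes)
```

With one probe the true nearest neighbor is missed because it sits just across a cluster boundary; scanning both lists recovers it at the cost of more comparisons, which is why the two parameters need tuning per dataset.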

  6. Article
    Hacker News · 2y

    arunsupe/semantic-grep: grep for words with similar meaning to the query

    sgrep is a command-line tool designed for semantic searches on text using word embeddings. It finds semantically similar matches to a query rather than just string matches. Key features include configurable similarity threshold, context display, color-coded output, and support for reading from files or standard input. It's configurable via JSON or command-line arguments and requires a Word2Vec model. Installation is possible via binary download or building from source. Contributions are welcome under the MIT license.
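The idea of matching by meaning with a similarity threshold can be sketched in a few lines of Python. This is not sgrep's code: the two-dimensional `TOY_EMBEDDINGS` table is made up and stands in for a Word2Vec model, `semantic_grep` is a hypothetical name, and skipping words that have no embedding is an assumption of this sketch.

```python
import math

# Made-up vectors standing in for a real Word2Vec model.
TOY_EMBEDDINGS = {
    "death": [0.9, 0.1],
    "dying": [0.85, 0.2],
    "grave": [0.8, 0.3],
    "garden": [0.1, 0.9],
    "flower": [0.05, 0.95],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def semantic_grep(lines, query, embeddings, threshold=0.9):
    """Return lines containing a word at least `threshold` similar to `query`."""
    query_vec = embeddings[query]
    matches = []
    for line in lines:
        for word in line.lower().split():
            vec = embeddings.get(word.strip(".,;:!?"))
            # Words without an embedding are skipped (assumption of this sketch).
            if vec is not None and cosine(query_vec, vec) >= threshold:
                matches.append(line)
                break
    return matches


TEXT = [
    "He was dying of thirst.",
    "The garden was full of flowers.",
    "She stood by the grave.",
]
hits = semantic_grep(TEXT, "death", TOY_EMBEDDINGS, threshold=0.9)
print(hits)
```

Neither matching line contains the string "death", which is the difference from a plain string match.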

  7. Article
    Community Picks · 2y

    pgvector: 30x Faster Index Build for your Vector Embeddings

    pgvector 0.6.0, an extension for Postgres, speeds up HNSW index builds by up to 30x thanks to a parallel index build feature. With Neon's elastic capabilities, users can scale their database resources for efficient AI applications. pgvector is crucial for semantic search and Retrieval Augmented Generation (RAG) applications, enhancing the long-term memory of large language models. Key Postgres settings such as `maintenance_work_mem` and `max_parallel_maintenance_workers` can further optimize index build performance.
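
As a rough sketch of how those two settings might be applied before an index build, the helper below only assembles the SQL statements; it does not connect to a database. The function name `hnsw_build_statements`, the table `items`, the column `embedding`, and the values `8GB` and `7` are illustrative assumptions, not recommendations from the post.

```python
def hnsw_build_statements(table, column, work_mem="8GB", parallel_workers=7):
    """Return SQL for a parallel HNSW index build.

    The defaults are illustrative only. Identifiers are interpolated
    directly, so pass trusted table and column names.
    """
    return [
        # More memory lets more of the graph be built without spilling to disk.
        f"SET maintenance_work_mem = '{work_mem}';",
        # Upper bound on extra workers Postgres may use for the build.
        f"SET max_parallel_maintenance_workers = {parallel_workers};",
        f"CREATE INDEX ON {table} USING hnsw ({column} vector_cosine_ops);",
    ]


statements = hnsw_build_statements("items", "embedding")
for statement in statements:
    print(statement)
```

The effective worker count is also capped by the server's `max_worker_processes` and `max_parallel_workers`, so raising only the session setting may not be enough.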