Vector search is the retrieval step in retrieval-augmented generation (RAG) systems: it finds relevant passages in a large document collection before they are passed to an LLM. Text is embedded into high-dimensional numerical vectors by a transformer-based model, and cosine similarity between those vectors identifies semantically similar passages. A practical demo chunks a 170-page NIST key management PDF, embeds all the chunks into a ChromaDB vector database, queries it with a question about cryptography, and uses a Mistral 7B model to answer based only on the retrieved context. Because matching is by meaning rather than exact wording, the approach handles typos and paraphrasing gracefully, and when combined with a strict prompt, the model can correctly say 'I don't know' when the answer isn't in the retrieved documents.
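The retrieval and prompting steps described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the three-dimensional vectors are hand-written stand-ins for real embeddings, the chunk texts are invented placeholders rather than quotes from the NIST document, and the names `cosine_similarity`, `top_k`, and `build_strict_prompt` are hypothetical helpers, not part of ChromaDB or of the demo's code. In the demo, a transformer-based model produces the vectors and ChromaDB performs the search.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction; treat it as unrelated
    return dot / (norm_a * norm_b)


def top_k(query_vec, chunk_vecs, k=2):
    """Return the indices of the k chunks most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), idx)
              for idx, vec in enumerate(chunk_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [idx for _, idx in scored[:k]]


def build_strict_prompt(question, contexts):
    """Assemble a prompt that restricts the model to the retrieved context."""
    joined = "\n---\n".join(contexts)
    return (
        "Answer using ONLY the context below. "
        "If the answer is not in the context, reply \"I don't know\".\n\n"
        f"Context:\n{joined}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Placeholder chunks and toy "embeddings" (assumptions for illustration only).
chunks = [
    "Placeholder chunk about rotating cryptographic keys.",
    "Placeholder chunk about choosing key lengths.",
    "Placeholder chunk about physical facility access.",
]
chunk_vectors = [
    [0.9, 0.1, 0.0],
    [0.7, 0.6, 0.1],
    [0.0, 0.1, 0.9],
]

question = "How often should keys be rotated?"
query_vector = [0.8, 0.3, 0.0]  # stand-in for the embedded question

best = top_k(query_vector, chunk_vectors, k=2)
prompt = build_strict_prompt(question, [chunks[i] for i in best])
```

The resulting `prompt` string is what would be sent to the language model. Keeping the refusal instruction inside the prompt is what lets the model decline to answer when none of the retrieved chunks contains the answer.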
•20m watch time