A step-by-step guide to building a RAG application on Vespa Cloud using the out-of-the-box RAG Blueprint. The setup combines hybrid retrieval (BM25 plus vector search over binary-quantized embeddings) with multiple ranking profiles, including LightGBM gradient-boosted decision trees (GBDT), to select high-quality context. The guide covers deploying the blueprint from the Vespa Cloud console, installing NyRAG (a Python tool that handles data ingestion, chunking, and embedding and provides a chat UI), configuring credentials, indexing local documents or web pages, and querying through a chat interface. It explains four query profiles (hybrid, hybrid-with-gbdt, deepresearch, and deepresearch-with-gbdt), each offering a different tradeoff between speed and retrieval quality.

16 min read time
From blog.vespa.ai
Table of contents
The Challenge: The Quality of the Context Window
The Solution: Out-of-the-Box RAG on Vespa Cloud
Deploy Vespa RAG Blueprint to Vespa Cloud
Behind the Scenes: What You Just Deployed
Chat with Your Data
Bonus: Try Web Crawling Mode
Troubleshooting
Conclusion
