NVIDIA has developed a new system architecture for question-and-answer workflows built on retrieval-augmented generation (RAG). The team found that users want more than RAG alone and value features such as web search and summarization. By integrating Perplexity's search API, LlamaIndex, NVIDIA NIM microservices, and Chainlit, they built a versatile chat application. The post gives detailed instructions for setting up and deploying the system and highlights how NVIDIA's tools simplify development.

11m read time
From developer.nvidia.com
Table of contents
NIM microservices for LLM deployment
Setting up the project environment, dependencies, and installation
Building the core functionality
Extra features
Explore advanced chat functionality with the NVIDIA and LlamaIndex Developer Contest
