Gemma 2 has been released in multiple formats, including Keras, PyTorch, and Hugging Face Transformers. This post details the author's experience running the 9B and 27B models in Ollama, highlighting the 9B model's better suitability for real-time responses. A straightforward script is provided to build a fully local Retrieval-Augmented Generation (RAG) system using Gemma 2, Nomic embeddings, and ChromaDB, all run from VS Code. The steps involve setting up an indexer, embedding transcripts from Alex Hormozi's YouTube channel, and handling text splitting. Debugging tips and additional add-ons for the RAG system are also discussed.
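The text-splitting step described above can be sketched as a simple fixed-size splitter with character overlap. This is a minimal illustration, not the post's actual script; the function name and the chunk-size/overlap values are assumptions chosen for the example:

```python
def split_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping chunks of roughly chunk_size characters.

    Overlap keeps context that spans a chunk boundary retrievable
    from both neighboring chunks.
    """
    if chunk_size <= overlap:
        raise ValueError("chunk_size must be larger than overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

# Stand-in for a YouTube transcript (the real pipeline would load
# transcript text fetched from the channel).
transcript = "word " * 300
chunks = split_text(transcript, chunk_size=500, overlap=50)
print(len(chunks))  # → 4
```

Each chunk would then be embedded (e.g. with a Nomic embedding model served by Ollama) and stored in a ChromaDB collection for retrieval at query time.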
