The post explains how to install and run the DeepSeek-R1 model, highlighting the importance of adding BOS and EOS tokens in interactions. It provides detailed setup instructions using commands like `apt-get update` for dependencies, downloading the model via `huggingface_hub`, and outlines how to configure GPU offloading based on available memory. Additionally, there's guidance on quantizing the model's K cache to 4bit and running the model using those configurations.
Table of contents
Jan 27, 2025 • By Daniel & MichaelJan 27, 2025•By Daniel & MichaelDeepSeek Original1.58-bit Version🦙 Run in Ollama/vLLM3 Comments
Sort: