Ollama and Hugging Face have announced a collaboration that lets Ollama run GGUF models directly from the Hugging Face Hub, roughly 45,000 models in total. Users start a model with the ollama run command and can choose a quantization level from 2-bit to 8-bit. The post gives guidance on selecting a quantization format by weighing performance against output quality. The feature makes it quicker to deploy a wide range of models.
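As a rough illustration of the run command described above, the sketch below assembles an ollama run invocation for a Hub-hosted GGUF repository. It assumes the hf.co/{username}/{repository}[:{quantization}] reference form. The helper name build_ollama_command is hypothetical, and the repository and quantization tag are examples only; which tags exist depends on the GGUF files the repository publishes.

```python
import shlex
from typing import List, Optional


def build_ollama_command(repo_id: str, quant: Optional[str] = None) -> List[str]:
    """Build the argv for `ollama run` against a Hugging Face GGUF repo.

    Hypothetical helper. Assumes the model reference has the form
    hf.co/<username>/<repository>[:<quantization>].
    """
    parts = repo_id.split("/")
    # A Hub repository ID is expected to look like "username/repository".
    if len(parts) != 2 or not all(parts):
        raise ValueError(f"expected 'username/repository', got {repo_id!r}")

    model_ref = f"hf.co/{repo_id}"
    if quant:
        # The tag after the colon selects a quantization, e.g. Q4_K_M or Q8_0.
        model_ref = f"{model_ref}:{quant}"
    return ["ollama", "run", model_ref]


if __name__ == "__main__":
    # Example repository and tag; substitute any GGUF repo from the Hub.
    cmd = build_ollama_command("bartowski/Llama-3.2-3B-Instruct-GGUF", "Q4_K_M")
    print(shlex.join(cmd))
    # To actually launch the model (requires Ollama to be installed):
    #   import subprocess; subprocess.run(cmd, check=True)
```

Omitting the quantization tag leaves the choice to Ollama's default for that repository. Lower-bit tags generally trade output quality for a smaller download and lower memory use, which is the trade-off the post discusses.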