Michael-A-Kuykendall/shimmy: ⚡ Python-free Rust inference server — OpenAI-API compatible. GGUF + SafeTensors, hot model swap, auto-discovery, single binary. FREE now, FREE forever.
Shimmy is a 5.1 MB Rust-based inference server that provides 100% OpenAI-compatible API endpoints for local GGUF models. It offers zero-configuration setup, auto-discovering models from the Hugging Face cache, Ollama, and local directories. The tool enables privacy-first AI development by running models locally without external