unknown

Shimmy is a 5.1MB Rust-based inference server that provides 100% OpenAI-compatible API endpoints for local GGUF models. It offers zero-configuration setup with auto-discovery of models from Hugging Face cache, Ollama, and local directories. The tool enables privacy-first AI development by running models locally without external API calls, supporting hot model swapping and automatic port allocation. Compatible with existing OpenAI SDKs and popular development tools like VSCode, Cursor, and Continue.dev.

Michael-A-Kuykendall/shimmy: ⚡ Python-free Rust inference server — OpenAI-API compatible. GGUF + SafeTensors, hot model swap, auto-discovery, single binary. FREE now, FREE forever.

Become a cool developer with Dev Source! Your ultimate dev source for resources, insights, and a thriving community to learn, grow, and stay ahead every single day!