llmfit is a Rust-based terminal tool that detects your system's RAM, CPU, and GPU specs, then scores hundreds of LLM models across quality, speed, fit, and context dimensions to tell you which ones will actually run on your hardware. It offers both an interactive TUI and a classic CLI mode, and supports multi-GPU setups, MoE architectures (with proper expert offloading for models like Mixtral and DeepSeek), dynamic quantization selection, and per-backend speed estimation (CUDA, Metal, ROCm, etc.). It integrates with the local runtime providers Ollama, llama.cpp, and MLX, allowing direct model downloads from the TUI. A Plan mode reverses the analysis, estimating what hardware a given model configuration requires.
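To make the scoring idea concrete, here is a minimal Rust sketch of the kind of logic such a tool might use: estimating a quantized model's memory footprint, checking it against available VRAM, and combining dimension scores into one ranking value. The function names, the 90% headroom budget, and the weights are all hypothetical assumptions for illustration, not llmfit's actual implementation.

```rust
// Hypothetical sketch, NOT llmfit's real code: size estimate, fit score,
// and a weighted combination of the four dimensions the tool scores.

/// Approximate in-memory size of a model's weights in GB:
/// billions of parameters times bytes per parameter.
fn model_size_gb(params_billions: f64, bits_per_weight: f64) -> f64 {
    params_billions * bits_per_weight / 8.0
}

/// Fit score in [0, 1]: full score while the model stays within ~90% of
/// VRAM (assumed headroom for KV cache and activations), tapering off above.
fn fit_score(size_gb: f64, vram_gb: f64) -> f64 {
    let budget = vram_gb * 0.9;
    if size_gb <= budget {
        1.0
    } else {
        (budget / size_gb).clamp(0.0, 1.0)
    }
}

/// Weighted blend of quality, speed, fit, and context scores
/// (weights are illustrative, not llmfit's).
fn overall(quality: f64, speed: f64, fit: f64, context: f64) -> f64 {
    0.35 * quality + 0.25 * speed + 0.25 * fit + 0.15 * context
}

fn main() {
    // Example: a 7B model at 4-bit quantization on a 12 GB GPU.
    let size = model_size_gb(7.0, 4.0); // 3.5 GB
    let fit = fit_score(size, 12.0);    // well within budget -> 1.0
    println!(
        "size: {:.1} GB, fit: {:.2}, overall: {:.2}",
        size,
        fit,
        overall(0.8, 0.7, fit, 0.6)
    );
}
```

The same size/budget arithmetic, run in reverse (solving for the VRAM needed by a chosen model and quantization), is essentially what a "Plan mode" style reverse analysis amounts to.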

From github.com
