lm.rs runs language model inference locally on the CPU, using Rust. The project now supports multimodal models such as PHI-3.5-vision, in addition to text-only models such as PHI-3.5-mini and Llama 3.2. Image processing is currently being optimized to reduce latency. The guide covers converting models to the LMRS format, compiling the Rust code, and running both the CLI and the WebUI. Planned work includes adding sampling methods, testing larger models, and improving quantization support.

From github.com