Minimal LLM inference in Rust.

Hacker News

lm.rs runs language model inference locally on the CPU and is written in Rust. In addition to text-only models such as Phi-3.5-mini and Llama 3.2, the project now supports multimodal models such as Phi-3.5-vision. Image processing is currently being optimized to reduce latency. The guide covers converting models to the LMRS format, compiling the Rust code, and running both the CLI and the WebUI. Planned work includes adding sampling methods, testing larger models, and improving quantization support.
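
Since quantization support comes up above, here is a minimal sketch of one common scheme, symmetric group-wise int8 quantization, to show what the term means in practice. This is a generic illustration and not code from lm.rs: the names `QuantGroup`, `quantize_q8`, and `dequantize_q8` are hypothetical, and the group size and layout the project actually uses may differ.

```rust
/// One quantized group of weights (hypothetical type, for illustration only).
/// `scale` maps the stored i8 values back to approximate f32 weights.
#[derive(Debug, Clone, PartialEq)]
pub struct QuantGroup {
    pub scale: f32,
    pub values: Vec<i8>,
}

/// Quantize `weights` in chunks of `group_size` with symmetric int8:
/// scale = max(|w|) / 127 and q = round(w / scale), per group.
pub fn quantize_q8(weights: &[f32], group_size: usize) -> Vec<QuantGroup> {
    assert!(group_size > 0, "group_size must be positive");
    weights
        .chunks(group_size)
        .map(|chunk| {
            let max_abs = chunk.iter().fold(0.0f32, |m, w| m.max(w.abs()));
            // An all-zero group gets scale 1.0 to avoid dividing by zero.
            let scale = if max_abs == 0.0 { 1.0 } else { max_abs / 127.0 };
            let values = chunk
                .iter()
                .map(|w| (w / scale).round().clamp(-127.0, 127.0) as i8)
                .collect();
            QuantGroup { scale, values }
        })
        .collect()
}

/// Recover approximate f32 weights: w ≈ q * scale.
pub fn dequantize_q8(groups: &[QuantGroup]) -> Vec<f32> {
    groups
        .iter()
        .flat_map(|g| g.values.iter().map(move |&q| q as f32 * g.scale))
        .collect()
}
```

Calling `quantize_q8(&weights, 64)` and then `dequantize_q8` on the result returns a vector of the same length, with each weight off by at most about half of its group's scale. Smaller groups track local weight ranges more closely at the cost of storing more scales.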

samuel-vitorino/lm.rs: Minimal LLM inference in Rust