Hacker News

PrismML has released Ternary Bonsai, a family of 1.58-bit language models in 8B, 4B, and 1.7B parameter sizes. The models use ternary weights {-1, 0, +1} throughout the entire network (embeddings, attention, MLPs, and LM head), giving a memory footprint roughly 9x smaller than standard 16-bit models. The 8B variant fits in 1.75 GB and scores 75.5 on average across benchmarks, outperforming all comparable-size models except Qwen3 8B (which is ~9x larger in memory). On Apple M4 Pro hardware, the 8B model runs at 82 tokens/sec with ~5x better energy efficiency than 16-bit counterparts. The models are released under Apache 2.0 and run natively on Apple devices via MLX. The release builds on PrismML's earlier 1-bit Bonsai family and offers a new tradeoff point between memory and performance.
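The "1.58-bit" label and the ~9x figure can be sanity-checked with a little arithmetic. The sketch below is only a back-of-the-envelope check, not PrismML's accounting: it assumes a nominal parameter count of exactly 8 billion and decimal gigabytes, and the function name `footprint_gb` is made up for illustration. A weight with three possible values carries log2(3) ≈ 1.585 bits of information, which is where the 1.58 comes from.

```python
import math

# Information content of one ternary weight: three states {-1, 0, +1}.
BITS_PER_TERNARY_WEIGHT = math.log2(3)  # ≈ 1.585, hence "1.58-bit"


def footprint_gb(n_params: float, bits_per_weight: float) -> float:
    """Weight storage in decimal gigabytes (hypothetical helper)."""
    return n_params * bits_per_weight / 8 / 1e9


# Assumption: "8B" is taken as exactly 8e9 parameters.
n_params = 8e9

fp16_gb = footprint_gb(n_params, 16)
ideal_ternary_gb = footprint_gb(n_params, BITS_PER_TERNARY_WEIGHT)

# Figure quoted in the summary for the 8B variant.
reported_gb = 1.75

ratio_vs_reported = fp16_gb / reported_gb
ratio_vs_ideal = fp16_gb / ideal_ternary_gb

print(f"bits per ternary weight: {BITS_PER_TERNARY_WEIGHT:.3f}")
print(f"16-bit footprint:        {fp16_gb:.2f} GB")
print(f"ideal ternary footprint: {ideal_ternary_gb:.2f} GB")
print(f"16-bit / reported size:  {ratio_vs_reported:.1f}x")
print(f"16-bit / ideal ternary:  {ratio_vs_ideal:.1f}x")
```

Under these assumptions the 16-bit baseline comes to 16 GB, so the reported 1.75 GB is consistent with the "roughly 9x" claim. The reported size sits slightly above the information-theoretic minimum of about 1.58 GB; the summary does not say what accounts for the difference, so any explanation (packing format, scale factors, metadata) would be a guess.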

PrismML — Introducing Ternary Bonsai: Top Intelligence at 1.58 Bits