Hacker News

Kimi Linear is a hybrid linear attention architecture built around Kimi Delta Attention (KDA), a refinement of Gated DeltaNet with a finer-grained gating mechanism. The model has 48B total parameters (3B activated per token) and supports a 1M-token context length. It interleaves KDA layers with global MLA (Multi-Head Latent Attention) layers at a 3:1 ratio, which cuts KV cache requirements by up to 75% and gives up to 6x faster decoding throughput than full attention at long context lengths. The open-source release includes model checkpoints trained on 5.7T tokens, and the model is reported to outperform full attention on long-context tasks while remaining more efficient.
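The 75% KV cache figure lines up with the 3:1 ratio: if three of every four layers are KDA layers that keep a fixed-size state instead of a per-token cache, only one layer in four still grows its cache with context length. A minimal sketch of that arithmetic follows. The function names, the 8-layer depth, and the layer ordering are hypothetical, chosen for illustration rather than taken from the release, and the calculation assumes every global layer has the same per-token cache cost.

```python
def hybrid_layer_pattern(num_layers: int, kda_per_global: int = 3) -> list[str]:
    """Repeat blocks of `kda_per_global` KDA layers followed by one global MLA layer.

    The ordering inside each block is an assumption made for this sketch.
    """
    block = kda_per_global + 1
    return ["MLA" if (i + 1) % block == 0 else "KDA" for i in range(num_layers)]


def kv_cache_reduction(pattern: list[str]) -> float:
    """Fraction of per-token KV cache saved versus a stack of only global layers.

    Assumes KDA layers hold constant-size state (no per-token cache) and that
    all global layers cost the same per token.
    """
    global_layers = sum(1 for kind in pattern if kind == "MLA")
    return 1.0 - global_layers / len(pattern)


# Hypothetical 8-layer stack, used only to illustrate the ratio.
pattern = hybrid_layer_pattern(num_layers=8)
print(pattern)
print(kv_cache_reduction(pattern))  # 0.75
```

The result depends only on the ratio, not on the depth, so any layer count that is a multiple of four gives the same 75% under these assumptions.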

MoonshotAI/Kimi-Linear

Moonshot has been shipping. Good on them.