Hacker News

An analysis of METR's research on LLM coding performance challenges the narrative of continuous improvement. When models are compared by merge rate (whether maintainers would actually accept the code) rather than by test-passing rate, the data shows no meaningful improvement in LLM programming ability since early 2025. Under leave-one-out cross-validation scored with Brier scores, a constant function fits the merge rate data better than a linear upward trend, suggesting that LLMs have plateaued in real-world coding quality for over a year despite ongoing hype about capability gains.
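The model comparison described above can be sketched roughly as follows. This is a minimal illustration of leave-one-out cross-validation with a Brier score, not the analysis's actual code: the `months` and `merged` values are made-up toy data, and the function names (`fit_constant`, `fit_linear`, `loo_brier`) are hypothetical. It assumes each observation is a binary outcome (merged or not) paired with a model release time.

```python
from statistics import mean


def fit_constant(xs, ys):
    """Fit a constant model: always predict the mean outcome of the training set."""
    m = mean(ys)
    return lambda x: m


def fit_linear(xs, ys):
    """Fit y = a + b*x by ordinary least squares, clipped to a valid probability."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        # All x values identical: the slope is undefined, fall back to the mean.
        return lambda x: my
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    return lambda x: min(1.0, max(0.0, a + b * x))


def loo_brier(fit, xs, ys):
    """Leave-one-out Brier score: hold out each point, fit on the rest,
    and average the squared error of the held-out prediction."""
    errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        predict = fit(train_x, train_y)
        errors.append((predict(xs[i]) - ys[i]) ** 2)
    return mean(errors)


# Toy data, invented for illustration only (NOT METR's data):
# months since an arbitrary start date, and whether a patch was merged (1) or not (0).
months = [0, 0, 2, 2, 4, 4, 6, 6, 8, 8, 10, 10]
merged = [1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1]

constant_score = loo_brier(fit_constant, months, merged)
linear_score = loo_brier(fit_linear, months, merged)

# Lower Brier score means better held-out predictions.
print("constant:", constant_score)
print("linear:  ", linear_score)
```

On this deliberately flat toy data, the constant model gets the lower (better) score, because the extra slope parameter only fits noise in each training fold. Whether the same holds for real merge rate data depends entirely on that data.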

Are LLMs not getting better?