An investigation into why Tencent's Hy3 preview model tops OpenRouter's AI model rankings, ahead of Claude by over 50%, despite mediocre benchmark results. The analysis explores LLM pricing economics in 2026, including how prompt caching dramatically changes effective costs: DeepSeek V4 Flash, when served directly by DeepSeek, has a cache read cost of 2% of its normal input price, making its effective price ~$0.018/1M tokens vs. Hy3's $0.034/1M. With 98% of LLM API costs now coming from input tokens that are aggressively cached, stated prices are increasingly misleading. The author concludes that Hy3's popularity likely stems from a single large app using it as a data-processing backbone, and predicts that DeepSeek V4 Flash will gain popularity once users understand its true effective pricing.
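The caching arithmetic behind "effective price" can be sketched roughly as follows. This is a minimal illustration and not the author's actual calculation: the function name `effective_input_price`, the cache hit rates, and the $1.00 list price are all hypothetical placeholders; only the 2% cache read ratio comes from the summary above.

```python
def effective_input_price(list_price: float,
                          cache_read_ratio: float,
                          cache_hit_rate: float) -> float:
    """Blended price per 1M input tokens under prompt caching.

    list_price:       stated price per 1M uncached input tokens
    cache_read_ratio: cached-token price as a fraction of list price
                      (0.02 for the 2% figure quoted in the summary)
    cache_hit_rate:   fraction of input tokens served from cache
                      (an assumption; it depends on the workload)
    """
    if not 0.0 <= cache_read_ratio <= 1.0:
        raise ValueError("cache_read_ratio must be between 0 and 1")
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0 and 1")

    cached_cost = cache_hit_rate * list_price * cache_read_ratio
    uncached_cost = (1.0 - cache_hit_rate) * list_price
    return cached_cost + uncached_cost


if __name__ == "__main__":
    # Hypothetical list price, chosen only to make the ratios easy to read.
    LIST_PRICE = 1.00
    for hit_rate in (0.0, 0.5, 0.9, 1.0):
        price = effective_input_price(LIST_PRICE, 0.02, hit_rate)
        print(f"hit rate {hit_rate:.0%}: ${price:.4f} per 1M input tokens")
```

The higher the cache hit rate, the closer the blended price gets to the cache read price, which is why a stated list price alone says little about what a heavily cached workload actually pays.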

9m read time. From minimaxir.com