Kimi is open-sourcing the Kimi Vendor Verifier (KVV), a tool that helps users of open-source models verify the correctness of their inference implementations. The project was motivated by widespread discrepancies between benchmark scores from the official Kimi API and those from third-party inference providers, often caused by misused decoding parameters or deviations in engineering implementation. KVV includes six critical benchmarks selected to expose specific infrastructure failures. The effort also covers upstream collaboration with vLLM, SGLang, and KTransformers to fix root causes, pre-release validation access for infrastructure providers, and a public leaderboard for continuous benchmarking transparency. A full evaluation run on two NVIDIA H20 8-GPU servers takes approximately 15 hours.
Table of contents
Official Evaluation Results
Why We Built KVV
Our Solution
Testing Cost Estimation
An Open Invitation