PinchBench v2 is now in active development and seeking community contributors through April 15th, 2026. PinchBench is an open benchmark for evaluating LLMs as OpenClaw coding agents, and it was recently featured by NVIDIA CEO Jensen Huang in a keynote. The v2 release targets 100 tasks with longer task horizons, better verification, and broader domain coverage. Contributors can help in two areas: skills (new tasks, task improvements, success rate coverage) or the leaderboard (UI/UX improvements including filtering, model pages, and scoring). Contributions are made via GitHub PRs, and accepted contributors will be recognized in the v2 release.

5m read time. From blog.kilo.ai
Table of contents
The Remarkable Rise of PinchBench
What We’re Building
Open Call for Contributions
How to Contribute
Recognition
Get Involved
