Four companies (Harvey, bolt.new, Shopify, and Lovable) share their experiences testing Claude Opus 4.6 before its public launch. At Harvey, the model scored 90.2% on the company's legal benchmark, making it the first Anthropic model to break 90%. bolt.new saw the model diagnose complex bugs on the first attempt, where previous versions had failed even after five tries.

6m read time From claude.com
Table of contents
Getting ready for model testing
When the results start coming in
What it's like on the other side
