Four customer teams tested Opus 4.6 before anyone else. See their testing approaches, technical breakthroughs, and the feedback that shaped the release.

Claude

Four companies (Harvey, bolt.new, Shopify, and Lovable) share their experiences testing Claude Opus 4.6 before its public launch. On Harvey's legal benchmark, Opus 4.6 scored 90.2%, making it the first Anthropic model to break 90%. bolt.new saw the model diagnose complex bugs on the first attempt that previous versions had failed to diagnose after five tries. Shopify engineers noted improved instruction following and more autonomous behavior, with the model anticipating needs beyond explicit requests. Lovable observed enhanced design quality and greater autonomy. The early access period allows customers to stress-test the model against real workloads and provide candid feedback that shapes the final release.

Harvey, Bolt, Shopify & Lovable Test Claude Opus 4.6 Pre-Launch