Simon Willison compares Qwen3.6-35B-A3B (a 21GB quantized model running locally via LM Studio on a MacBook Pro M5) against Anthropic's newly released Claude Opus 4.7 using his informal 'pelican riding a bicycle' SVG generation benchmark. The local Qwen model wins both the pelican and a flamingo-on-unicycle test. Willison acknowledges this doesn't mean Qwen is generally more capable than Opus 4.7, but notes the result breaks the loose correlation his benchmark previously had with overall model quality.