Anthropic released Claude Opus 4.6, introducing two key architectural features for long-running agentic workflows: Adaptive Thinking (four granular reasoning effort levels: low, medium, high, max) and Context Compaction (automatic summarization of earlier context to prevent 'context rot'). The model supports a 1M token context window in beta with 76% multi-needle retrieval accuracy at that scale, and doubles max output to 128K tokens. Thinking tokens are billed at $25/million output tokens, with a long-context premium kicking in above 200K tokens. The release also includes Agent Teams in Claude Code (research preview) for parallel multi-agent coordination, and Claude integration into PowerPoint. Anthropic claims top benchmark scores on Terminal-Bench 2.0, Humanity's Last Exam, GDPval-AA, and BrowseComp. Independent testing by Quesma found only 49% binary backdoor detection accuracy, and Hacker News users reported regressions from Opus 4.5. The model is available on Azure Foundry, AWS Bedrock, Google Vertex AI, and GitHub Copilot.
Sort: