A UK AI Security Institute evaluation shows Anthropic's Claude Mythos Preview can execute multi-stage cyberattacks and solve expert-level CTF challenges.

The New Stack

The UK AI Security Institute (AISI) evaluated Anthropic's Claude Mythos Preview and found it capable of autonomously executing multi-stage cyberattacks. It became the first AI model to complete a 32-step corporate network takeover simulation, succeeding in 3 of 10 attempts and averaging 22 of 32 steps, and it solved expert-level CTF challenges 73% of the time. Access to the model is restricted to select organizations through Anthropic's Project Glasswing initiative. AISI cautions that the results apply to weakly defended systems and that real-world performance against hardened targets remains uncertain.

Claude Mythos Preview completes full cyberattack simulation for the first time

Claude Mythos Preview: Too hot to handle?

The first AI model to autonomously execute a 32-step corporate network takeover

It completed expert-level tasks 73% of the time

What the results do — and don’t — mean