Anthropic's new Mythos model claims dangerous capabilities in finding security vulnerabilities. The author argues the hype is partially warranted but contextualizes the risk: costs of $10k-20k per vulnerability make it unlikely to be run broadly, and it's best viewed as a pentest add-on. A key insight is that Mythos succeeds largely because of oracles like AddressSanitizer that filter false positives — the same reason agentic AI coding works (type checkers, linters, test suites). Without oracles, LLM-based vulnerability finders drown in false positives. The author warns that AI tools won't fix the root causes of poor software security; real solutions require memory-safe languages, capability-based security, and slower, more deliberate development — not faster AI-assisted code generation.

8m read timeFrom neilmadden.blog
Post cover image
Table of contents
Is it all just hype?The importance of oraclesWhat should we do?

Sort: