Mozilla shares a detailed behind-the-scenes account of how it used Claude Mythos Preview and other AI models to identify and fix 271 security bugs in Firefox. The team built an agentic harness on top of its existing fuzzing infrastructure that can dynamically create and run reproducible test cases to confirm real bugs and dismiss false positives. Starting with Claude Opus 4.6, they parallelized jobs across ephemeral VMs, each targeting a specific code area. The pipeline covers the full security bug lifecycle: discovery, deduplication, triage, patching, and release. In total, 423 security bugs were fixed across the April releases. Of these, the 271 bugs identified with Claude Mythos Preview break down into 180 sec-high, 80 sec-moderate, and 11 sec-low. Mozilla recommends that other projects adopt similar agentic harnesses now, noting that simple initial prompts can be iterated into a scalable pipeline, and plans to integrate patch-based scanning into its CI system.
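The confirm-then-deduplicate step described above can be illustrated with a minimal sketch. This is not Mozilla's harness: the `Candidate` shape, the `signature` field, and the `reproduces` callback are all hypothetical names introduced here, and a real pipeline would run each test case in an isolated VM rather than through an in-process function.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Candidate:
    """A model-reported finding (hypothetical shape, for illustration only)."""
    area: str       # code area the job targeted
    signature: str  # crash signature used as the deduplication key
    testcase: str   # reproducible test case produced by the agent


def confirm_and_dedupe(
    candidates: Iterable[Candidate],
    reproduces: Callable[[str], bool],
) -> List[Candidate]:
    """Keep only findings whose test case reproduces, one per signature."""
    confirmed = {}
    for cand in candidates:
        # A finding that cannot be reproduced is treated as a false positive.
        if not reproduces(cand.testcase):
            continue
        # The first confirmed finding for a signature wins; later ones are duplicates.
        confirmed.setdefault(cand.signature, cand)
    return list(confirmed.values())
```

The design choice in this sketch is to run the reproduction check before deduplication, so that a false positive never masks a real bug that happens to share its signature.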
Table of contents
Suddenly, the bugs are very good
Harnessing Models to Build a Hardening Pipeline
Upgrading the Models
Takeaways
FAQ
About Brian Grinstead
About Christian Holler
About Frederik Braun