Cloudflare shares findings from Project Glasswing, in which it tested Mythos Preview, a security-focused LLM from Anthropic, against 50+ of its own repositories. The model stands out for exploit chain construction (chaining multiple low-severity bugs into a working exploit) and automated proof generation (writing, compiling, and running PoC code in a loop). Key observations: memory-unsafe languages (C/C++) produce more false positives, the model's bias toward over-reporting requires multi-stage validation, and a single coding agent is the wrong shape for broad vulnerability coverage. To achieve meaningful coverage, Cloudflare built a multi-stage harness (Recon → Hunt → Validate → Gapfill → Dedupe → Trace → Feedback → Report) that runs ~50 parallel narrow-scoped agents. The post also cautions against the industry's focus on speed alone: patching faster without fixing the underlying architecture (defense-in-depth, blast radius reduction, instant rollout) misses the point. The same capabilities that help defenders will accelerate attackers against every application on the internet.
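The stage order and parallel fan-out named in the summary above can be pictured with a minimal Python sketch. Only the stage names and the rough figure of 50 parallel agents come from the summary. The `Task` shape, `run_stage`, `run_pipeline`, and the choice to finish each stage for all tasks before starting the next are assumptions made for illustration, and the stage bodies are no-op placeholders rather than Cloudflare's implementation.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field

# Stage order as listed in the summary. What each stage does internally is
# not described there, so the stages below are no-op placeholders.
STAGES = ["Recon", "Hunt", "Validate", "Gapfill",
          "Dedupe", "Trace", "Feedback", "Report"]

MAX_PARALLEL_AGENTS = 50  # the summary mentions roughly 50 parallel agents


@dataclass
class Task:
    """One narrow scope handed to a single agent (hypothetical shape)."""
    repo: str
    scope: str  # e.g. one directory or one component
    history: list = field(default_factory=list)


def run_stage(stage: str, task: Task) -> Task:
    """Placeholder for an agent call: it only records that the stage ran."""
    task.history.append(stage)
    return task


def run_pipeline(tasks: list[Task]) -> list[Task]:
    """Run the stages in order, fanning out across tasks within each stage.

    Assumption: each stage completes for every task before the next begins.
    """
    with ThreadPoolExecutor(max_workers=MAX_PARALLEL_AGENTS) as pool:
        for stage in STAGES:
            tasks = list(pool.map(lambda t, s=stage: run_stage(s, t), tasks))
    return tasks
```

For example, `run_pipeline([Task("example-repo", f"component-{i}") for i in range(120)])` pushes 120 narrow scopes through all eight stages with at most 50 running at once. The point of the shape is that breadth comes from many small, bounded tasks rather than one agent reading a whole repository.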
Table of contents
What changed with Mythos Preview
Model refusals in legitimate vulnerability research
The signal-to-noise problem
Why pointing a generic coding agent at a repo doesn't work
What a harness actually fixes
Our vulnerability discovery harness
What this means for security teams