Project Glasswing: what Mythos showed us

Cloudflare shares findings from Project Glasswing, in which it tested Mythos Preview, a security-focused LLM from Anthropic, against more than 50 of its own repositories. The model stands out for exploit chain construction (combining multiple low-severity bugs into a working exploit) and automated proof generation (writing, compiling, and running PoC code in a loop). Key observations: memory-unsafe languages (C/C++) produce more false positives, the model's bias toward over-reporting requires multi-stage validation, and a single coding agent is the wrong shape for broad vulnerability coverage. To achieve meaningful coverage, Cloudflare built a multi-stage harness (Recon → Hunt → Validate → Gapfill → Dedupe → Trace → Feedback → Report) that runs ~50 parallel narrow-scoped agents. The post also cautions against the industry's focus on speed alone: patching faster without fixing the underlying architecture (defense-in-depth, blast radius reduction, instant rollout) misses the point. The same capabilities that help defenders will accelerate attackers against every application on the internet.
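
To make the harness shape concrete, here is a minimal Python sketch of the idea of fanning out many narrow-scoped agents in parallel and then validating and deduplicating their output. Only the stage names and the figure of roughly 50 parallel agents come from the summary above. The `Finding` type, the `hunt`, `validate`, `dedupe`, and `run_harness` functions, and the canned data are all hypothetical stand-ins, not Cloudflare's implementation, and only three of the eight stages are stubbed out.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass

# Stage names as listed in the summary. What each stage does internally is not
# described there, so the function bodies below are illustrative assumptions.
STAGES = ["Recon", "Hunt", "Validate", "Gapfill", "Dedupe", "Trace", "Feedback", "Report"]


@dataclass(frozen=True)
class Finding:
    """Hypothetical record for one reported issue."""
    file: str
    bug_class: str
    confirmed: bool


def hunt(scope: tuple[str, str]) -> list[Finding]:
    """Hypothetical narrow-scoped agent: one file, one bug class.

    A real agent would call a model here; this stub returns canned data and
    pretends that only parser.c contains a reproducible issue.
    """
    file, bug_class = scope
    return [Finding(file, bug_class, confirmed=(file == "parser.c"))]


def validate(findings: list[Finding]) -> list[Finding]:
    """Drop findings that a later stage could not confirm (over-reports)."""
    return [f for f in findings if f.confirmed]


def dedupe(findings: list[Finding]) -> list[Finding]:
    """Collapse identical findings, keeping first-seen order."""
    return list(dict.fromkeys(findings))


def run_harness(scopes: list[tuple[str, str]], max_workers: int = 50) -> list[Finding]:
    """Fan out one agent per scope, then validate and dedupe the results."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        batches = list(pool.map(hunt, scopes))
    raw = [f for batch in batches for f in batch]
    return dedupe(validate(raw))


if __name__ == "__main__":
    scopes = [("parser.c", "overflow"), ("parser.c", "overflow"), ("util.c", "overflow")]
    for finding in run_harness(scopes):
        print(finding)
```

The point of the sketch is the structure rather than the stubs: each agent sees a small scope, and no finding reaches the report without passing a separate validation step and a deduplication step.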

#llm

#ai-agents

#vulnerability

May 18 • 14m read time • From blog.cloudflare.com

Table of contents

What changed with Mythos Preview
Model refusals in legitimate vulnerability research
The signal-to-noise problem
Why pointing a generic coding agent at a repo doesn't work
What a harness actually fixes
Our vulnerability discovery harness
What this means for security teams
