Anthropic's Claude Opus 4.6 reportedly found over 500 previously unknown high-severity vulnerabilities in open-source libraries using reasoning-based code analysis rather than traditional fuzzing. While this represents real progress in automated vulnerability discovery, the post argues that discovery is rarely the hardest part of security work. The real challenges (reachability analysis, exploitability validation, triage, and safe remediation) remain system-level problems that LLMs alone cannot solve. The post also warns that model upgrades introduce their own risks: version drift, changing outputs, and compounded error rates in agentic setups all call for controlled benchmarking and continuous evaluation. Reviewing code is not the same as validating exploitability, and runtime context is still essential for reliable security workflows.
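
The point about compounded error rates in agentic setups can be made concrete with a small calculation. The sketch below is illustrative only: it assumes every step in an agent chain succeeds independently with the same probability, which real pipelines rarely satisfy exactly, and the function name `chain_success_rate` and the sample figures are hypothetical rather than taken from the post.

```python
def chain_success_rate(per_step_success: float, steps: int) -> float:
    """Probability that all steps in a chain succeed.

    Assumption (not from the post): steps fail independently and share
    the same per-step success probability.
    """
    if not 0.0 <= per_step_success <= 1.0:
        raise ValueError("per_step_success must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return per_step_success ** steps


if __name__ == "__main__":
    # Hypothetical per-step accuracy of 95%, chained over more and more steps.
    for steps in (1, 5, 10, 20):
        rate = chain_success_rate(0.95, steps)
        print(f"{steps:>2} steps -> {rate:.1%} end-to-end success")
```

Under these assumptions, a step that is right 95% of the time yields an end-to-end success rate of roughly 60% after ten chained steps. This is one reason a small per-step regression after a model upgrade can matter more in an agentic workflow than in a single-shot review.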

5m read time · From aikido.dev
Table of contents
Discovery Isn’t the Only Bottleneck
What Actually Changes For Software Security
Why Model Upgrades Introduce Risk
Reviewing Code Is Not Validating Exploitability
What Actually Changes for Teams
