GitHub Security Lab's open-source Taskflow Agent framework uses LLMs to automate security auditing of codebases. The framework works through a multi-stage pipeline: a threat modeling stage that identifies components and entry points, an issue suggestion stage in which the LLM brainstorms likely vulnerability types, and a rigorous audit stage that verifies findings with concrete evidence. Run against 40+ repositories, it surfaced 80+ vulnerabilities, including a privilege escalation in Outline, PII exposure in the WooCommerce and Spree ecommerce apps, and a critical authentication bypass in Rocket.Chat caused by a missing `await` on a bcrypt Promise. The framework is particularly effective at finding logic bugs such as IDORs and auth bypasses, achieving a ~21% high-severity true positive rate. It is open source, requires a GitHub Copilot license, and can be run with a single shell command against any public repo.
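To illustrate the bug class behind the Rocket.Chat finding, here is a minimal sketch of how a missing `await` can turn a password check into an authentication bypass. It is not Rocket.Chat's code and does not use the real bcrypt library: `comparePassword`, `loginBuggy`, and `loginFixed` are hypothetical names, and the comparison is a plain string equality standing in for an asynchronous hash comparison.

```typescript
// Hypothetical stand-in for an asynchronous password comparison.
// Assumption: like many hashing libraries, it returns a Promise<boolean>.
async function comparePassword(plain: string, stored: string): Promise<boolean> {
  return plain === stored;
}

// Buggy version: the Promise is never awaited.
// `ok` is typed as `unknown` to mimic untyped JavaScript, where the
// compiler cannot flag the mistake.
async function loginBuggy(plain: string, stored: string): Promise<boolean> {
  const ok: unknown = comparePassword(plain, stored); // missing `await`
  if (!ok) {
    // Never reached: a Promise object is always truthy.
    return false;
  }
  return true; // every password is accepted
}

// Fixed version: await the Promise so the check sees the boolean result.
async function loginFixed(plain: string, stored: string): Promise<boolean> {
  const ok = await comparePassword(plain, stored);
  return ok;
}
```

Because a pending Promise is an object, it is truthy regardless of what it eventually resolves to, so `loginBuggy` accepts any password while `loginFixed` rejects a wrong one. This kind of logic bug contains no obviously dangerous call, which is one reason pattern-based scanners can miss it.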

27 min read · From github.blog
Table of contents
How to run the taskflows on your own project
Introduction to taskflows
Taskflows for general security code audits
General taskflow design
Threat modeling stage
Issue suggestion stage
Issue audit stage
Three examples of vulnerabilities found by the taskflows
What we learned
Get involved!
