This work shows that LLM agents can autonomously exploit one-day vulnerabilities in real-world systems. GPT-4 performs well at exploiting these vulnerabilities when it is given their CVE descriptions.