A conference talk by Jason Haddix covering offensive security methodology for AI systems. Drawing on 2+ years of AI pentesting experience, it walks through a custom LLM assessment methodology covering input identification, ecosystem attacks, prompt engineering leakage, RAG data exfiltration, and pivoting to internal systems. Real-world case studies include bypassing Amazon Rufus guardrails via ASCII encoding, exfiltrating system prompts and API keys from healthcare and automotive enterprise AI systems, and attacking a SIEM vendor's AI integration. The talk introduces a Metasploit-inspired taxonomy of prompt injection techniques (intents, techniques, evasions, utilities), demonstrates tools like Parcel Tongue for evasion generation, and provides resources including 23+ practice labs, bug bounty programs, and CTF competitions for learning AI red teaming.
Sort: