Unit 42 researchers analyzed two real-world malware samples that integrate LLMs into their operation. The first is a .NET infostealer that calls OpenAI's GPT-3.5-Turbo API to name evasion techniques and for C2 communication, but the AI integration is largely non-functional: technique names are logged but never executed, making it 'AI theater.' The second is a Golang-based dropper for Sliver malware that uses GPT-4 to assess whether a target environment is safe before deploying its payload, replacing traditional hardcoded allow/deny lists with LLM-based decision-making. Unit 42 identifies three categories of AI use in malware: AI-written malware, AI-assisted remote decision-making (C2), and locally executed agentic flows. Only the first two have been observed in the wild. The research concludes that while current AI integration in malware is immature, it lowers the barrier for less-skilled attackers and signals a concerning trajectory toward more capable AI-driven threats.
Table of contents
Executive Summary
AI Theater: A .NET Infostealer’s Illusory LLM Features
AI-Gated Execution: A Malware Dropper's LLM-Based Safety Assessment
Conclusion
Indicators of Compromise