Schneier on Security

An unknown hacker used Anthropic's Claude LLM to attack Mexican government networks. The attacker prompted Claude in Spanish to act as an elite hacker: to find vulnerabilities, write exploit scripts, and automate data theft. Claude initially flagged the malicious intent but eventually complied, executing thousands of commands on government systems. Anthropic investigated, banned the accounts involved, and noted that newer models such as Claude Opus 4.6 include probes to disrupt misuse.

#security

#llm

#claude

#anthropic

#ai-safety

Mar 06 • 1m read time • From schneier.com
