Scary Agent Skills: Hidden Unicode Instructions in Skills ...And How To Catch Them · Embrace The Red
AI agent Skills can be backdoored with invisible Unicode Tag instructions that survive human review. The attack exploits how certain LLMs (Gemini, Claude, Grok) interpret hidden Unicode codepoints as executable instructions. A demonstration shows backdooring OpenAI's security-best-practices Skill to execute arbitrary bash
8m read time • From embracethered.com
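As a rough illustration of the detection side, the sketch below scans text for characters in the Unicode Tags block (U+E0000 to U+E007F) and decodes each run back to the ASCII it mirrors. This is not the scanner from the article. The function names `find_hidden_tags` and `hide_as_tags` are hypothetical, and the sample Skill text and payload are made up for the demo.

```python
# Minimal sketch, assuming the hidden text is encoded with Unicode Tag
# characters (U+E0000-U+E007F), which mirror ASCII at a fixed offset.
TAG_START = 0xE0000
TAG_END = 0xE007F


def find_hidden_tags(text: str) -> list[tuple[int, str]]:
    """Return (start_index, decoded_text) for each run of Tag characters."""
    runs: list[tuple[int, str]] = []
    current: list[str] = []
    start = None
    for i, ch in enumerate(text):
        cp = ord(ch)
        if TAG_START <= cp <= TAG_END:
            if start is None:
                start = i
            ascii_cp = cp - TAG_START
            # Printable ASCII decodes directly; anything else becomes "?".
            current.append(chr(ascii_cp) if 0x20 <= ascii_cp <= 0x7E else "?")
        elif start is not None:
            runs.append((start, "".join(current)))
            current, start = [], None
    if start is not None:
        runs.append((start, "".join(current)))
    return runs


def hide_as_tags(message: str) -> str:
    """Hypothetical helper for testing: map printable ASCII to Tag characters."""
    return "".join(chr(TAG_START + ord(c)) for c in message)


if __name__ == "__main__":
    # Made-up Skill text with a harmless hidden payload, for demonstration only.
    skill_text = "Use safe defaults." + hide_as_tags("print hello") + " Done."
    for index, decoded in find_hidden_tags(skill_text):
        print(f"hidden text at index {index}: {decoded!r}")
```

Running a check like this over Skill files before installing them would flag text that looks clean in an editor or diff view but still carries characters a model may read as instructions.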
Table of contents
Attack Surface
What is an Agent Skill?
Scary Skills
Writing a Simple Skill
Prompt Injection Attack Vectors
Agent(s) Overwriting Skills on the Fly
Using Invisible Instructions in Skills
Adding a Backdoor to A Legitimate Skill
End to End Video
Notes, Testing Observations and Mitigations
A Scanner to Catch Attacks
Conclusion
References
Appendix