Agent Skills is a feature, announced by Anthropic in October 2025, that extends AI agents with task-specific skills packaged as folders containing SKILL.md files. This deep dive covers the security threats and mitigations associated with adopting Agent Skills: filesystem permission hardening, malicious or vulnerable scripts in skills, prompt injection risks and guardrail controls, sandboxed execution environments, credential management best practices, and the experimental allowed-tools field. Tools such as TrustyAI, malcontent, and TruffleHog are highlighted as practical controls. The piece emphasizes combining traditional secure development practices with AI-specific controls and governance.

Table of contents
How Agent Skills works
Improve security of the skill files
Malicious skills
Security vulnerabilities
Prompt injection
Credentials management
Final thoughts