Agent Skills, announced by Anthropic in October 2025, is a feature that extends AI agents through skill-based task orchestration, with each skill packaged as a folder containing a SKILL.md file. This deep-dive covers the security threats that come with adopting Agent Skills and the mitigations for them: filesystem permission hardening, malicious or vulnerable scripts in skills, prompt injection risks and guardrail controls, sandboxed execution environments, credential management best practices, and the experimental allowed-tools field. It highlights TrustyAI, malcontent, and TruffleHog as practical controls. The piece emphasizes combining traditional secure development practices with AI-specific controls and governance.

10m read time | From developers.redhat.com
Table of contents
How Agent Skills works
Improve security of the skill files
Malicious skills
Security vulnerabilities
Prompt injection
Credentials management
Final thoughts
