Cisco researchers evaluated the security posture of OpenAI's gpt-oss and gpt-oss-safeguard models (20b and 120b variants) against adversarial prompt injection and jailbreak attacks. Key findings: multi-turn attacks are the dominant threat, causing attack success rates to jump 5x–8.5x compared to single-turn attacks across all
Table of contents
Evaluating gpt-oss model security
Key findings
Recommendations for secure deployment
Conclusion