Abstract page for arXiv paper 2511.15304: Adversarial Poetry as a Universal Single-Turn Jailbreak Mechanism in Large Language Models

Hacker News

Research shows that rewriting harmful prompts as poetry can bypass the safety mechanisms of large language models, with attack success rates of up to 90%. In tests across 25 frontier models, poetic framing achieved jailbreak success rates of 62% for hand-crafted poems and 43% for automated conversions, up to 18 times higher than the prose baselines. The vulnerability spans multiple risk domains, including CBRN (chemical, biological, radiological, and nuclear), manipulation, and cyber-offense, and exposes fundamental limitations in current alignment methods and safety training.