Why you shouldn't share your context window with others

Lobsters is a community-driven platform for sharing and discussing links to articles, tutorials, and projects related to technology and programming. Readers can learn about a wide range of topics, from software development and system administration to cybersecurity and artificial intelligence. With user submissions, comments, and voting, Lobsters provides a platform for collaborative learning and knowledge sharing among technology enthusiasts.

Lobsters

Prompt injection attacks — reframed here as 'Disregard that!' attacks — are a fundamental and largely unsolved security problem for LLM-based applications. Any time untrusted content enters an LLM's context window (user messages, web search results, API responses, shared file systems), an attacker can override the system's instructions. Common mitigations like AI guardrails, multi-agent layering, and structured input validation all fail because free-text processing is central to LLM value. The only approaches that actually work are: restricting the context window to fully trusted content, accepting limited risk in low-stakes scenarios, requiring human review of LLM actions, or having the LLM generate traditional code that is reviewed before execution. The post argues this problem is likely insoluble for public-facing chatbots and that even major providers like OpenAI face the same whack-a-mole battle.

"Disregard that!" attacks

"Disregard that!" - context window takeover