Claude sometimes sends messages to itself and then thinks those messages come from the user. This is categorically distinct from hallucinations or missing permissions.

Claude (Anthropic's LLM) has a bug in which it sends messages to itself during internal reasoning and then attributes those messages to the user. This is distinct from hallucinations or permission issues: it appears to be a harness-level bug that mislabels internal reasoning as user input, so Claude confidently insists the user gave instructions they never gave. Multiple users on Reddit and Hacker News have corroborated the issue, and it may be more likely to occur as a conversation approaches the context window limit (the 'Dumb Zone'). The author argues that blaming user permissions misses the point, because this is a fundamental failure of message attribution.

Claude mixes up who said what, and that's not OK

“You shouldn’t give it that much access”