The post argues that watermarking, the practice of embedding detectable patterns in AI-generated text, is ineffective at distinguishing AI-generated from human-written content. It identifies three conditions that undermine watermarking: the existence of capable LLMs that do not implement it, user control over token selection, and the availability of open-source models. The author questions the true goals behind watermarking and suggests that detecting harmful content directly would do more to reduce AI-generated harms.
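To make "embedding detectable patterns" concrete, here is a minimal sketch of one common watermarking scheme (a green-list approach in the style of Kirchenbauer et al., which the post does not name; the hashing details here are illustrative assumptions, not the author's method). Generation biases sampling toward a pseudo-random "green" subset of the vocabulary keyed on the previous token; detection counts green tokens and computes a z-score against the rate expected in unwatermarked text.

```python
import hashlib
import math

def is_green(prev_token: str, token: str, green_fraction: float = 0.5) -> bool:
    # Pseudo-randomly assign a fixed fraction of the vocabulary to the
    # "green list", conditioned on the previous token. A watermarking
    # sampler would boost the probability of green tokens at generation time.
    digest = hashlib.sha256(f"{prev_token}|{token}".encode()).digest()
    return digest[0] / 256.0 < green_fraction

def z_score(tokens: list[str], green_fraction: float = 0.5) -> float:
    # Detection: count how many tokens landed in the green list given their
    # predecessor. Unwatermarked text hits the green list at roughly the
    # base rate, so the hit count is ~ Binomial(n, green_fraction) under
    # the null; a large z-score suggests watermarked text.
    n = len(tokens) - 1
    hits = sum(is_green(p, t) for p, t in zip(tokens, tokens[1:]))
    return (hits - green_fraction * n) / math.sqrt(
        n * green_fraction * (1 - green_fraction)
    )
```

This also illustrates the post's point about token selection: a user who can pick tokens themselves (or paraphrase the output) breaks the green-token statistics, and a model that never applied the bias produces text indistinguishable from human writing under this test.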
Table of contents

- Why LLM watermarking will never work
- What is watermarking?
- So, we can identify text from an LLM?
- Why will watermarking never work?