The post argues that watermarking, the practice of embedding detectable statistical patterns in AI-generated text, cannot reliably distinguish AI-generated from human-written content. It identifies three conditions, any one of which renders watermarking ineffective: the existence of capable LLMs that don't implement watermarking, user control over token selection, and the availability of open-source models. The author questions the true goals behind watermarking and argues that detecting harmful content directly would do more to reduce AI-generated harms.
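To make the summary concrete, here is a minimal sketch of one common watermarking scheme (a "green list" approach in the style of Kirchenbauer et al., not necessarily the exact scheme the post discusses). Each token is deterministically classed as "green" or "red" based on a hash of the preceding token; a watermarking generator biases sampling toward green tokens, and a detector simply counts how many green tokens appear. All function names here are illustrative, not from the post.

```python
import hashlib
import math

def is_green(prev_token: str, token: str, fraction: float = 0.5) -> bool:
    """A token is 'green' if a hash of (prev_token, token) falls below
    `fraction`. A watermarking generator would bias sampling toward
    green tokens; a detector just counts them."""
    digest = hashlib.sha256(f"{prev_token}|{token}".encode()).hexdigest()
    # hexdigest is 64 hex chars, so the max value is 16**64 - 1
    return int(digest, 16) / 16**64 < fraction

def green_count(tokens: list[str]) -> int:
    """Number of tokens classed as green, given their predecessors."""
    return sum(is_green(a, b) for a, b in zip(tokens, tokens[1:]))

def z_score(tokens: list[str], fraction: float = 0.5) -> float:
    """How many standard deviations the observed green count sits above
    what unwatermarked text would produce by chance. A large positive
    value suggests the text was sampled with the green-list bias."""
    n = len(tokens) - 1
    hits = green_count(tokens)
    return (hits - fraction * n) / math.sqrt(n * fraction * (1 - fraction))
```

This also illustrates why the post's three conditions matter: if a user can pick tokens themselves (or paraphrase the output), the green-token statistic regresses to chance, and an unwatermarked or open-source model never biases sampling in the first place, so the detector has nothing to find.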

26 min read. From david-gilbertson.medium.com
Table of contents

- Why LLM watermarking will never work
- What is watermarking?
- So, we can identify text from an LLM?
- Why will watermarking never work?