Never Trust a Monkey! Can We Trust AI-Generated Code? by Baruch Sadogursky
A conference talk exploring the trust problem with AI-generated code, using the 'infinite monkey theorem' as a metaphor for LLMs. The speaker shares a personal experience in which AI produced 300 passing tests with 95% coverage, yet the code didn't actually work, which illustrates the 'intent to code chasm.' The talk argues that spec-driven development alone is insufficient because it creates circular verification: the AI writes the code, and the AI also writes the tests that check it. The proposed solution is an 'intent integrity chain': humans write prompts, LLMs generate human-readable specs, algorithms convert the specs into locked (immutable) tests, and LLMs implement code that must pass those tests. Locking the test assertions prevents the AI from gaming the verification process. The speaker introduces an open-source tool called 'Intent Integrity Kit' that implements this workflow.