We’re in the middle of another leap in abstraction.

Like compilers, cloud, and containers before it, AI coding agents arrived with hype, fear, and broken assumptions. We gave the monkeys GPUs. Sometimes they output Shakespeare. Other times, they confidently ship code that compiles, passes tests, and still does the wrong thing.

The problem is simple: intent gets lost between what we mean, what we ask for, and what actually runs.

This talk delivers a practical model for software development with AI coding agents built on three equally essential ideas:

The Chasm: the divide between human intent and what is actually expressed to an AI coding agent.
The Context: the shared, explicit, and reusable knowledge an AI coding agent operates within. APIs, conventions, constraints, and domain rules replace guessing.
The Chain: the Intent Integrity Chain. A structured flow of prompt → spec → test → code, in which each stage produces a verifiable artifact that is validated externally and grounded in a shared context.

Together, these form a system where intent survives implementation. Natural language becomes specifications. Specifications become tests. Tests become code. Every step is grounded in a shared context instead of assumptions, and no step is validated by the same model that produced it. This approach is informed by recurring failure patterns observed in real AI agent development workflows: systems passed tests, shipped successfully, yet still failed to meet intent.
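
As a rough illustration of the "validated externally" rule, here is a minimal Python sketch. Only the four stage names come from the abstract. The Artifact record, the producer and validator labels, and the chain_violations helper are hypothetical names invented for this example, not part of any tool mentioned in the talk.

```python
from dataclasses import dataclass

# Stage order taken from the abstract: prompt -> spec -> test -> code.
STAGES = ("prompt", "spec", "test", "code")


@dataclass(frozen=True)
class Artifact:
    """One verifiable artifact in the chain (hypothetical structure)."""
    stage: str      # one of STAGES
    producer: str   # label for whoever produced the artifact
    validator: str  # label for whoever validated the artifact


def chain_violations(artifacts):
    """Return a list of problems; an empty list means the chain holds."""
    problems = []
    stages = tuple(a.stage for a in artifacts)
    if stages != STAGES:
        problems.append(f"expected stages {STAGES}, got {stages}")
    for a in artifacts:
        # Circular verification: the producer also signed off on its own work.
        if a.producer == a.validator:
            problems.append(f"{a.stage}: validated by its own producer ({a.producer})")
    return problems


# Illustrative labels only; real workflows would name actual people and tools.
good_chain = [
    Artifact("prompt", producer="human", validator="reviewer"),
    Artifact("spec", producer="model-a", validator="human"),
    Artifact("test", producer="generator", validator="human"),
    Artifact("code", producer="model-a", validator="test-runner"),
]
circular_chain = good_chain[:3] + [
    Artifact("code", producer="model-a", validator="model-a"),
]

print(chain_violations(good_chain))
print(chain_violations(circular_chain))
```

The check is deliberately simple: it only compares labels, so it catches the circular case where one model both writes and approves an artifact, but it says nothing about how good the external validation is.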

Devoxx

A conference talk exploring the trust problem with AI-generated code, using the 'infinite monkey theorem' as a metaphor for LLMs. The speaker shares a personal experience where AI produced 300 passing tests with 95% coverage, yet the code didn't actually work, illustrating the 'intent-to-code chasm.' The talk argues that spec-driven development alone is insufficient because it creates circular verification (AI writes code, AI tests code). The proposed solution is an 'intent integrity chain': humans write prompts, LLMs generate human-readable specs, algorithms convert specs into locked/immutable tests, and LLMs implement code that must pass those tests. Locking test assertions prevents AI from gaming the verification process. The speaker introduces an open-source tool called 'Intent Integrity Kit' that implements this workflow.
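
One way to picture "locked" tests is to fingerprint the test sources when they are generated and refuse any implementation run in which a fingerprint has changed. The Python sketch below assumes a SHA-256 hash per test file. This is an illustrative technique chosen for the example, not a description of how Intent Integrity Kit works, and the function names and the sample test file are hypothetical.

```python
import hashlib


def fingerprint(test_source: str) -> str:
    """Return the SHA-256 hex digest of a test file's source text."""
    return hashlib.sha256(test_source.encode("utf-8")).hexdigest()


def lock(tests: dict[str, str]) -> dict[str, str]:
    """Record one fingerprint per test file at the moment the tests are locked."""
    return {name: fingerprint(source) for name, source in tests.items()}


def tampered(locked: dict[str, str], current: dict[str, str]) -> list[str]:
    """List test files that were changed, removed, or added since locking."""
    changed = [
        name for name, digest in locked.items()
        if name not in current or fingerprint(current[name]) != digest
    ]
    added = [name for name in current if name not in locked]
    return sorted(changed + added)


# Hypothetical test file generated from a spec.
original = {"test_cart.py": "assert total([1, 2]) == 3\n"}
locked = lock(original)

# An agent that weakens the assertion to make the suite pass is detected.
weakened = {"test_cart.py": "assert True\n"}

print(tampered(locked, original))
print(tampered(locked, weakened))
```

In this sketch the lock is only as trustworthy as the place the fingerprints are stored: if the coding agent can rewrite the recorded digests, the check no longer protects anything.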

Never Trust a Monkey! Can We Trust AI-Generated Code? by Baruch Sadogursky