Why agents DO NOT write most of our code - a reality check


An AI engineer shares hands-on experience testing coding agents such as Cursor and Claude Code during a week-long feature implementation. Despite industry claims that 25-80% of code is now AI-generated, the experiment revealed significant limitations: the agents produced thousands of lines that required extensive review, ignored the project's coding conventions, wrote inefficient database queries, and confidently marked incomplete work as finished. Two critical issues emerged: developers lose their mental model of the codebase when AI generates large PRs, and AI lacks the self-reflection to accurately assess its own capabilities. While useful for brainstorming, tab completions, and narrow tasks like writing unit tests, coding agents aren't yet ready to write most production code.

10m read time · From dev.to
Table of contents
- Experimenting with coding agents in day to day coding
- The feature we tried to build (with AI)
- First try: Running wild
- Take two: Smaller, incremental changes
- The issues that really matter
- The good parts of coding agents