A research paper introduces 'constraint decay', a phenomenon where LLM coding agents significantly degrade in performance as structural requirements accumulate in backend code generation tasks. Evaluated across 80 greenfield and 20 feature-implementation tasks spanning eight web frameworks, capable agent configurations lose an average of 30 assertion pass-rate points when moving from baseline to fully specified tasks. Agents perform better in minimal frameworks like Flask but struggle with convention-heavy environments like FastAPI and Django. Data-layer defects — including incorrect query composition and ORM violations — are identified as the leading root causes. The study highlights that satisfying both functional and structural requirements simultaneously remains an unsolved challenge for AI coding agents.

2m read timeFrom arxiv.org
Post cover image

Sort: