A research paper introduces 'constraint decay', a phenomenon where LLM coding agents significantly degrade in performance as structural requirements accumulate in backend code generation tasks. Evaluated across 80 greenfield and 20 feature-implementation tasks spanning eight web frameworks, capable agent configurations lose an average of 30 assertion pass-rate points when moving from baseline to fully specified tasks. Agents perform better in minimal frameworks like Flask but struggle with convention-heavy environments like FastAPI and Django. Data-layer defects — including incorrect query composition and ORM violations — are identified as the leading root causes. The study highlights that satisfying both functional and structural requirements simultaneously remains an unsolved challenge for AI coding agents.
Sort: