A new ETH Zurich paper challenges the widespread practice of using AGENTS.md context files with AI coding agents. Testing four agents (Claude 3.5 Sonnet, GPT-5.2, GPT-5.1 mini, Qwen Code) on 138 real-world Python tasks, researchers found LLM-generated context files reduce task success rates by ~3% and increase inference costs by over 20%. Human-written files offer only a marginal 4% success rate improvement while also raising costs by up to 19%. The core issue: context files cause agents to run more tests, read more files, and perform more checks than necessary, without improving final output quality. Researchers recommend omitting LLM-generated files entirely and limiting human-written ones to non-inferable details like custom build commands. Community reaction is mixed, with some developers arguing the study actually validates high-quality, domain-specific AGENTS.md files for larger closed-source projects.

5m read time. From infoq.com