Background Coding Agents: Predictable Results Through Strong Feedback Loops (Part 3)

Spotify's background coding agents rely on strong verification loops to make automated code changes reliable at scale. The system implements multiple layers of validation: deterministic verifiers that check formatting, builds, and tests, plus an LLM-based judge that keeps agents from going beyond their instructions. This architecture addresses three failure modes: the agent fails to produce a PR, the PR fails CI, or the PR passes CI but is functionally incorrect. The verification loops give the agent incremental feedback while keeping the complexity of the underlying checks out of its context window. The judge vetoes about 25% of agent sessions, and agents successfully course-correct in about half of those cases. Future plans include expanding the verifier infrastructure to support more platforms, integrating more deeply with CI/CD, and implementing structured evaluations.
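The layered loop described above can be illustrated with a minimal Python sketch. This is not Spotify's implementation: every name here (`Feedback`, `run_verifiers`, `verification_loop`, `format_check`, `scope_judge`, `make_scripted_agent`) is hypothetical, the "agent" is a scripted list of candidate changes, and the "judge" is a plain string check standing in for an LLM call. It only shows the control flow: deterministic checks run first, the judge runs only once they pass, and the agent receives one short feedback message per attempt.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class Feedback:
    verifier: str  # which check produced this result
    passed: bool
    message: str   # short summary only, so little reaches the agent's context


# Hypothetical types: a change is modeled as a plain string.
Verifier = Callable[[str], Feedback]
AgentStep = Callable[[Optional[Feedback]], str]


def run_verifiers(change: str, verifiers: List[Verifier]) -> Optional[Feedback]:
    """Run deterministic checks in order; return the first failure, else None."""
    for verify in verifiers:
        result = verify(change)
        if not result.passed:
            return result
    return None


def verification_loop(agent_step: AgentStep, verifiers: List[Verifier],
                      judge: Verifier, max_attempts: int = 3) -> Optional[str]:
    """Ask the agent for a change until it passes all checks and the judge."""
    feedback: Optional[Feedback] = None
    for _ in range(max_attempts):
        change = agent_step(feedback)           # agent sees only the last feedback
        feedback = run_verifiers(change, verifiers)
        if feedback is None:
            verdict = judge(change)             # judge runs after deterministic checks
            if verdict.passed:
                return change
            feedback = verdict                  # a veto becomes feedback for a retry
    return None                                 # give up: no change is proposed


def format_check(change: str) -> Feedback:
    """Toy deterministic verifier: reject trailing whitespace."""
    ok = all(line == line.rstrip() for line in change.splitlines())
    return Feedback("format", ok, "ok" if ok else "trailing whitespace found")


def scope_judge(change: str) -> Feedback:
    """Toy stand-in for an LLM judge: veto work outside the instructions."""
    ok = "unrelated_refactor" not in change
    return Feedback("judge", ok, "ok" if ok else "change goes beyond the instructions")


def make_scripted_agent(candidates: Iterable[str]) -> AgentStep:
    """Toy agent that replays prepared candidates, ignoring the feedback."""
    it = iter(candidates)
    return lambda feedback: next(it)


if __name__ == "__main__":
    agent = make_scripted_agent(["x = 1  ", "x = 1\nunrelated_refactor()", "x = 1"])
    print(verification_loop(agent, [format_check], scope_judge))
```

In this toy run the first candidate fails the formatting check, the second is vetoed by the judge, and the third is accepted. Returning only the first failure per attempt is one simple way to keep feedback incremental; a real system would need to decide how much verifier output to surface.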

#ai #automation #llm

Dec 09, 2025 • 7m read time • From engineering.atspotify.com

Table of contents

How things fail
Designing for predictability: verification loops
Using LLMs in the verification loops
Keeping the Agent Focused
The Future
