AI Agent Fleet for CI/CD: How Docker Ships Faster

Docker's Coding Agent Sandboxes team built a "Fleet" of seven AI agent roles that run autonomously in CI to test, triage, and fix code. The Fleet is built on Claude Code skills (markdown role-description files). It includes a CLI tester with 52+ scenarios across 14 tiers, a project manager for deduplication and issue tracking, a product owner for daily release notes, performance and upgrade testers, and a software engineer that auto-fixes labeled issues and reduces tech debt weekly. The system uses a "Ralph-loop" pattern, a worker/reviewer iteration cycle, to generate and evaluate code changes, producing pull requests for human review. Key design principles: build skills as roles, not scripts; develop locally before CI; compose skills like team members; and always separate generation from evaluation. The Fleet creates PRs but never merges them: merge authority stays with humans.
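The worker/reviewer cycle described above can be illustrated with a minimal sketch. It is not Docker's implementation: the names `ralph_loop`, `Review`, `LoopResult`, `stub_worker`, `stub_reviewer`, and the `max_rounds` limit are all hypothetical, and the stubs stand in for the agent calls that would do the real work. The sketch only shows the two properties the summary names: generation and evaluation are separate steps, and the loop ends by flagging a change as ready for a pull request, never by merging it.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Review:
    approved: bool
    feedback: str = ""


@dataclass
class LoopResult:
    change: Optional[str]  # approved change, or None if no attempt was approved
    rounds: int            # number of worker/reviewer rounds that ran
    ready_for_pr: bool     # True only when the reviewer approved; merging stays with humans


def ralph_loop(
    task: str,
    worker: Callable[[str, str], str],
    reviewer: Callable[[str, str], Review],
    max_rounds: int = 3,  # assumed limit, not taken from the source
) -> LoopResult:
    """Alternate a generating worker and an evaluating reviewer until approval."""
    feedback = ""
    for round_no in range(1, max_rounds + 1):
        change = worker(task, feedback)   # generation step
        review = reviewer(task, change)   # evaluation step, kept separate
        if review.approved:
            return LoopResult(change, round_no, True)
        feedback = review.feedback        # feed the critique into the next attempt
    return LoopResult(None, max_rounds, False)


# Hypothetical stand-ins for the agent calls.
def stub_worker(task: str, feedback: str) -> str:
    suffix = " with tests" if "tests" in feedback else ""
    return f"patch for {task}{suffix}"


def stub_reviewer(task: str, change: str) -> Review:
    if "with tests" in change:
        return Review(True)
    return Review(False, "add tests")


result = ralph_loop("example-task", stub_worker, stub_reviewer)
print(result.ready_for_pr, result.rounds)
```

Passing the worker and reviewer in as separate callables keeps the generator from grading its own output, which is the point of the "separate generation from evaluation" principle.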

#testing

#docker

#cicd

#ai-agents

#claude-code

May 01•14m read time•From docker.com

Table of contents

Local First, CI Second
The Roster
Skills That Compose
The Ralph-Loop Is the Engine
What the Fleet Ships
What We Don’t Automate
What We Learnt Building the Fleet
