Rubber Duck uses a second model from a different AI family to evaluate the primary agent’s plans, question assumptions, and raise concerns.

InfoWorld

GitHub has introduced an experimental 'Rubber Duck' mode in GitHub Copilot CLI that uses a second AI model from a different family to independently review the primary agent's plans before execution. Acting as a focused review agent, Rubber Duck flags missed details, questionable assumptions, and edge cases. On the SWE-Bench Pro benchmark, pairing Claude Sonnet 4.6 with a Rubber Duck reviewer running GPT-5.4 closed 74.7% of the performance gap between Sonnet running alone and Opus, with the biggest gains on complex multi-file problems. Developers can enable it with the /experimental slash command in Copilot CLI.
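
A "share of the gap closed" figure such as the 74.7% above is usually read as a ratio: the improvement of the paired setup over the weaker model, divided by the distance between the weaker and the stronger model. The article does not spell out GitHub's formula, so the sketch below is an assumption about how such a number is typically derived. The function name `gap_closed` and the scores passed to it are hypothetical and are not benchmark results.

```python
def gap_closed(baseline: float, paired: float, ceiling: float) -> float:
    """Return the fraction of the baseline-to-ceiling gap recovered by the paired setup.

    baseline: score of the weaker model alone (assumed role: Sonnet)
    paired:   score of the weaker model plus the reviewer (assumed role: Sonnet + Rubber Duck)
    ceiling:  score of the stronger model alone (assumed role: Opus)
    """
    if ceiling == baseline:
        raise ValueError("baseline and ceiling scores are equal; there is no gap to close")
    return (paired - baseline) / (ceiling - baseline)


# Hypothetical scores chosen only to illustrate the arithmetic.
print(f"{gap_closed(baseline=50.0, paired=56.0, ceiling=58.0):.1%}")  # prints 75.0%
```

Read this way, a value near 100% would mean the paired setup performs about as well as the stronger model, and a value near 0% would mean the reviewer adds almost nothing.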

GitHub Copilot CLI adds Rubber Duck review agent