In experimental mode, GitHub Copilot CLI introduces 'Rubber Duck', a cross-model review agent that uses a model from a different AI family to critique the primary agent's work. When a Claude model is the orchestrator, Rubber Duck runs GPT-5.4 to independently review plans, implementations, and tests. Benchmarks on
Table of contents
The problem: Confident mistakes can compound
Rubber Duck adds a second perspective
Getting started