GitHub has introduced an experimental Copilot CLI feature called Rubber Duck that pairs a primary AI model with a reviewer from a different AI family to catch errors the primary model might miss. When Claude Sonnet 4.6 is the primary orchestrator, GPT-5.4 acts as the reviewer. Testing on SWE-Bench Pro showed this pairing closes
Table of contents
What Rubber Duck Does
The Performance Numbers
When Rubber Duck Kicks In
What This Means for Development Teams
How to Try It