Swimm tested Claude Code (Opus 4.6) against its own deterministic platform on real CMS Medicare COBOL programs to evaluate business rule extraction quality. Results showed Claude covered only 24–35% of paragraphs on the larger program, with up to 42% variance between identical runs, and missed 27.5% of business rules entirely.
13 min read · From swimm.io
Table of contents

- We ran Claude Code and Swimm on the same COBOL programs
- Modernization means getting every rule right – not some of them, some of the time
- Diving into the tests
- What we found
- This is an architectural problem – “we’ll validate it” is not a solution
- Understanding is more than AI, it needs workflows
- What enterprise modernization actually requires