Swimm tested Claude Code (Opus 4.6) against its own deterministic platform on real CMS Medicare COBOL programs to evaluate business rule extraction quality. The results: Claude covered only 24–35% of paragraphs on the larger program, with up to 42% variance between identical runs; it missed 27.5% of business rules entirely, dropped critical conditions (e.g., UNITS2 > 0), and hallucinated incorrect regulatory dollar amounts. Swimm's deterministic static analysis achieved 100% coverage and accuracy on both programs. The post argues that LLM-based extraction is architecturally unsuited for mainframe modernization because it navigates code probabilistically rather than parsing the full AST, making it unreliable for regulated industries like healthcare and banking, where every rule must be exact.
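To make the distinction concrete, here is a minimal, hypothetical sketch of why deterministic extraction behaves differently from probabilistic sampling: a parser-style pass visits every line and emits the same rule list on every run. This is not Swimm's implementation (which parses a full AST), and the COBOL fragment and paragraph names are invented for illustration, apart from the UNITS2 > 0 condition mentioned above.

```python
import re

# Hypothetical COBOL fragment (only "UNITS2 > 0" comes from the post;
# everything else is illustrative).
COBOL = """
 2000-EDIT-THE-BILL.
     IF UNITS2 > 0
         PERFORM 3000-CALC-PAYMENT.
     IF B-CHARGES > 999999.99
         MOVE 'E' TO PAY-FLAG.
"""

def extract_conditions(source: str) -> list[str]:
    # Deterministic pass: scan every line, collect each IF condition.
    # No sampling, no context window, no run-to-run variance.
    conds = []
    for line in source.splitlines():
        m = re.match(r"\s*IF\s+(.+)", line)
        if m:
            conds.append(m.group(1).rstrip("."))
    return conds

print(extract_conditions(COBOL))
# Identical output on every invocation, including the UNITS2 > 0 rule.
```

A real static analyzer replaces the regex with a grammar-driven parser, but the property the post relies on is the same: full traversal of the source guarantees coverage, and determinism guarantees repeatability.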
Table of contents
- We ran Claude Code and Swimm on the same COBOL programs
- Modernization means getting every rule right – not some of them, some of the time
- Diving into the tests
- What we found
- This is an architectural problem – “we’ll validate it” is not a solution
- Understanding is more than AI, it needs workflows
- What enterprise modernization actually requires