An exploration of using Claude Code to generate API tests for a Spring Boot banking application, starting from scratch with only a prompt. Claude produced 23 passing tests in minutes, achieving 95% line coverage and 91% mutation coverage as measured by PITest. However, analysis revealed gaps: several critical code paths were missed (HTTP 500 handling, empty account list, boundary values), and 4 of the 23 tests (17%) were dead weight that didn't uniquely contribute to coverage. The author emphasizes that evaluating AI-generated tests requires domain knowledge, testing experience, and tools like mutation testing — simply accepting passing tests at face value risks a test suite full of holes.
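The article measures test quality with PITest mutation testing. For readers unfamiliar with the setup, a minimal Maven plugin configuration for running PITest against a Spring Boot project might look like the sketch below; the version numbers and the `com.example.bank` package name are illustrative assumptions, not taken from the article.

```xml
<!-- Sketch of a pitest-maven setup; versions and packages are assumptions -->
<plugin>
    <groupId>org.pitest</groupId>
    <artifactId>pitest-maven</artifactId>
    <version>1.15.3</version>
    <dependencies>
        <!-- Required so PITest can discover JUnit 5 tests -->
        <dependency>
            <groupId>org.pitest</groupId>
            <artifactId>pitest-junit5-plugin</artifactId>
            <version>1.2.1</version>
        </dependency>
    </dependencies>
    <configuration>
        <!-- Hypothetical application and test packages -->
        <targetClasses>
            <param>com.example.bank.*</param>
        </targetClasses>
        <targetTests>
            <param>com.example.bank.*</param>
        </targetTests>
    </configuration>
</plugin>
```

With this in place, `mvn test-compile org.pitest:pitest-maven:mutationCoverage` runs the analysis and reports which mutants each test kills, which is the kind of data the article uses to spot surviving mutants and redundant tests.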

12 min read · From ontestautomation.com
Table of contents

- The starting point
- A first look at the tests
- Testing the generated tests with mutation testing
- Looking at the surviving mutants
- Identifying dead weight in our test suite
- Conclusions