The author rewrote the parser in pycparser, a widely used C parser for Python with 20M daily downloads, replacing its PLY-based YACC parser with a hand-written recursive descent parser. The rewrite was carried out with the help of an LLM coding agent (Codex) and was motivated by PLY's abandonment, a growing number of parsing conflicts, and maintenance challenges. Although the agent completed the initial port in over an hour and passed all 2500+ test cases, significant manual refinement was needed to improve code quality, readability, and performance. The resulting parser is 30% faster and eliminates the PLY dependency; the rewrite took only 4-5 hours of human effort, versus an estimated 30-40 hours without AI assistance.
Table of contents
The issues with the existing parser implementation
The mental roadblock
Why would this even work? Tests
The initial port
A quick note on reviews and branches
The long tail of goofs
The end result
Followup - static typing
Conclusions