Tim Bray shares lessons learned while implementing the `+` and `*` regexp features in Quamina, his open-source pattern-matching library. Key takeaways include sample-driven development against 992 pre-existing test cases, decomposing regexp features into incremental deliverables, consulting Thompson's Construction (and Russ Cox's writeup) before rolling your own, a pragmatic "list crushing" technique that avoids O(2^N) state explosion in NFA traversal, and an honest reflection on a benchmark where merging 13K wildcard patterns still performs poorly. The post also previews features that remain unimplemented, such as `{lo,hi}` quantifiers, Unicode property matchers, and complementary character classes, and closes with personal reflections on coding in later life.

7m read time · From tbray.org