Gusto's engineering team describes how they replaced fragile, one-off document parsers with a Universal Document Processing (UDP) platform. Facing millions of documents per year across payroll, benefits, and compliance workflows, they decomposed the problem into five composable API-backed stages: ingestion, classification, extraction, validation, and mapping. AI is used as a model-agnostic abstraction layer rather than a hard dependency, with confidence scores and fallback routing as first-class concepts. The result is a self-service platform where teams can onboard new document types through configuration rather than code, business users own mapping logic, and the system gracefully handles failures. Key lessons include prioritizing domain understanding over model selection, investing in confidence calibration, and treating the mapping layer as a high-complexity concern.
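
The five-stage decomposition with confidence-based fallback routing can be sketched as below. This is a minimal illustration, not Gusto's implementation: the `Document` fields, the stage bodies, the 0.8 threshold, and the `manual_review` route name are all assumptions made for the example, and the keyword-based classifier merely stands in for whatever model sits behind the abstraction layer.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Hypothetical threshold; the source does not state actual values.
CONFIDENCE_THRESHOLD = 0.8


@dataclass
class Document:
    raw: str
    doc_type: Optional[str] = None
    fields: Dict[str, str] = field(default_factory=dict)
    confidence: float = 1.0          # lowest confidence seen so far
    route: str = "automated"         # or "manual_review" after a fallback
    history: List[str] = field(default_factory=list)


def ingest(doc: Document) -> Document:
    # Normalize the raw input before any other stage sees it.
    doc.raw = doc.raw.strip()
    return doc


def classify(doc: Document) -> Document:
    # Stand-in for a model call: a keyword rule with made-up scores.
    if "W-2" in doc.raw:
        doc.doc_type, score = "w2", 0.95
    else:
        doc.doc_type, score = "unknown", 0.40
    doc.confidence = min(doc.confidence, score)
    return doc


def extract(doc: Document) -> Document:
    # Pull "key: value" lines into a field dictionary.
    for line in doc.raw.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            doc.fields[key.strip().lower()] = value.strip()
    return doc


def validate(doc: Document) -> Document:
    # A missing required field lowers confidence instead of raising.
    if "wages" not in doc.fields:
        doc.confidence = min(doc.confidence, 0.3)
    return doc


def map_fields(doc: Document) -> Document:
    # Mapping expressed as data, so it could live in configuration.
    mapping = {"wages": "gross_wages"}
    doc.fields = {mapping.get(k, k): v for k, v in doc.fields.items()}
    return doc


STAGES: List[Callable[[Document], Document]] = [
    ingest, classify, extract, validate, map_fields,
]


def run_pipeline(doc: Document, threshold: float = CONFIDENCE_THRESHOLD) -> Document:
    for stage in STAGES:
        doc = stage(doc)
        doc.history.append(stage.__name__)
        if doc.confidence < threshold:
            # Fallback routing: stop automated processing and hand off.
            doc.route = "manual_review"
            break
    return doc
```

Because each stage takes and returns the same `Document` type, a stage can be swapped (for example, a different classifier) without touching the others, and a low-confidence result at any point short-circuits to the fallback route rather than failing.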

9m read time · From engineering.gusto.com
Table of contents
Introduction
The Problem: It Wasn’t Documents — It Was Scale
Why We Bet on AI as an Abstraction Layer
Architecture: Compartmentalized, API-First
The Five Core Stages
Why This Decomposition Matters
From Solution to Platform
What’s Now Possible
Lessons Learned
What’s Next
