Instacart built Maple, an internal service for large-scale LLM batch processing that handles millions of prompts efficiently. The system automates batching, file management, retries, and cost tracking while reducing LLM costs by up to 50% compared to real-time APIs. Built with Temporal for fault tolerance and using Parquet files for efficient storage, Maple processes jobs averaging 2.6 prompts per second with most batches completing under 12 hours. The service abstracts complexity from development teams and supports both batch and real-time LLM providers through a unified interface.
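The excerpt does not show Maple's actual API, but the "unified interface" over batch and real-time providers it describes is a common abstraction. A minimal sketch of what such an interface might look like, with all class and method names hypothetical:

```python
from abc import ABC, abstractmethod


class LLMProvider(ABC):
    """Hypothetical unified interface over batch and real-time LLM backends."""

    @abstractmethod
    def submit(self, prompts: list[str]) -> list[str]:
        """Return one completion per prompt, in order."""


class RealTimeProvider(LLMProvider):
    # Stand-in for a synchronous completion API called one prompt at a time.
    def submit(self, prompts: list[str]) -> list[str]:
        return [f"completion for: {p}" for p in prompts]


class BatchProvider(LLMProvider):
    # Stand-in for a batch API: groups prompts into fixed-size batches,
    # as a batch endpoint with per-file or per-request limits would require.
    def __init__(self, batch_size: int = 2):
        self.batch_size = batch_size

    def submit(self, prompts: list[str]) -> list[str]:
        results = []
        for i in range(0, len(prompts), self.batch_size):
            batch = prompts[i:i + self.batch_size]
            results.extend(f"completion for: {p}" for p in batch)
        return results


def run_job(provider: LLMProvider, prompts: list[str]) -> list[str]:
    # Callers depend only on the shared interface, so switching between
    # cheaper batch processing and real-time calls needs no caller changes.
    return provider.submit(prompts)
```

Under this design, a team can trade latency for cost (the article cites up to 50% savings from batch APIs) by swapping the provider object alone.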

10 min read · From tech.instacart.com · By Paul Baranowski
Table of contents

- Why We Built Maple
- How Maple Works
- Where Maple Fits in the AI Stack
- Under the Hood
- LLM Batch Processing: How fast is it?
- Lessons Learned
- Extending Maple to Additional LLM Providers
- Adoption and Impact
