Instacart rebuilt its query understanding system using LLMs to better handle long-tail searches and ambiguous queries. The team progressed from context engineering with RAG to fine-tuning smaller models like Llama-3-8B, consolidating multiple specialized models into a unified system, and implemented a hybrid architecture to serve it in production.
Table of contents

- Introduction
- Challenges in Traditional Query Understanding
- The Advantages of LLMs
- LLM as QU: Our Strategy in Action
  1. Query Category Classification
  2. Query Rewrites
  3. Semantic Role Labeling (SRL)
- Building a New Foundation: Fine-Tuning for Real-Time Inference
- Distilling Knowledge via Fine-Tuning
- The Path to Production: Taming Real-Time Latency
- Key Takeaways
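The table of contents lists three query-understanding (QU) tasks that the unified model handles: category classification, query rewrites, and semantic role labeling. As a rough illustration of what "consolidating multiple specialized models" can look like, the sketch below asks for all three outputs in one structured completion and validates the result. The prompt wording, JSON schema, and field names are assumptions for illustration, not Instacart's actual format:

```python
import json

# Hypothetical sketch: one prompt that covers all three QU tasks
# (category classification, rewrites, SRL) instead of three separate
# specialized models. Schema and field names are illustrative assumptions.
QU_PROMPT_TEMPLATE = """You are a grocery search query-understanding model.
For the query below, return a JSON object with:
- "categories": taxonomy categories the query maps to
- "rewrites": alternative query formulations to improve recall
- "srl": semantic roles, e.g. {{"product": ..., "brand": ..., "attribute": ...}}

Query: {query}
JSON:"""

def parse_qu_output(raw: str) -> dict:
    """Parse and lightly validate the model's JSON completion."""
    result = json.loads(raw)
    for key in ("categories", "rewrites", "srl"):
        if key not in result:
            raise ValueError(f"missing field: {key}")
    return result

# A made-up example completion for the query "organic 2% milk":
raw_completion = """{
  "categories": ["Dairy > Milk"],
  "rewrites": ["organic reduced fat milk", "2 percent organic milk"],
  "srl": {"product": "milk", "brand": null, "attribute": "organic 2%"}
}"""

parsed = parse_qu_output(raw_completion)
```

Producing all three outputs in one call is what makes a single fine-tuned model (rather than one model per task) practical: the downstream parser stays the same regardless of which task a consumer cares about.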