DoorDash developed DashCLIP, a multimodal embedding framework that combines text and image encoders to generate semantic representations of products and user queries. The system uses contrastive learning on 400,000 products, domain adaptation from BLIP-14M, and LLM-augmented relevance datasets to improve ad ranking and retrieval. Online A/B tests showed significant improvements in engagement and revenue, with the model now serving 100% of traffic. The embeddings also generalize well to other e-commerce tasks like category prediction and relevance scoring.

7m read timeFrom careersatdoordash.com
Post cover image
Table of contents
DashCLIP overviewApplications beyond rankingFuture work and takeawaysJoin Us

Sort: