How Grab Built an AI Foundation Model To Understand Customers Better

Grab developed a foundation model based on a transformer architecture to unify user understanding across its superapp ecosystem. The model processes both tabular data (user profiles, transaction history) and time-series data (clickstream interactions), routing each modality (text, IDs, locations, numerical values) through a specialized adapter. It is pre-trained without labels using masked language modeling and next-action prediction, and it generates dual embeddings (long-term and short-term) for users, merchants, and drivers. Hierarchical classification lets it handle massive ID vocabularies. The model supports both fine-tuning for specific tasks and embedding extraction for general-purpose features, and it currently powers ad optimization, fraud detection, and churn prediction across Grab's platform.
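
To make the hierarchical classification idea concrete, here is a minimal sketch in plain Python. It assumes a two-level scheme in which each ID is split into a coarse bucket and an offset within that bucket, so the model scores a bucket first and then an offset, rather than scoring every ID at once. The bucketing by integer division, the function names, and the bucket size are all illustrative assumptions; the summary does not say how Grab actually groups its IDs.

```python
import math


def split_id(item_id: int, bucket_size: int) -> tuple[int, int]:
    """Map a flat ID to (bucket, offset). Hypothetical grouping by integer division."""
    return divmod(item_id, bucket_size)


def softmax(logits: list[float]) -> list[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def hierarchical_log_prob(
    item_id: int,
    bucket_logits: list[float],
    offset_logits: list[float],
    bucket_size: int,
) -> float:
    """log p(id) = log p(bucket) + log p(offset | bucket).

    offset_logits are assumed to come from a head that is already
    conditioned on the target bucket.
    """
    bucket, offset = split_id(item_id, bucket_size)
    p_bucket = softmax(bucket_logits)[bucket]
    p_offset = softmax(offset_logits)[offset]
    return math.log(p_bucket) + math.log(p_offset)


def output_units(vocab_size: int, bucket_size: int) -> int:
    """Number of logits computed per prediction under the two-level scheme."""
    n_buckets = math.ceil(vocab_size / bucket_size)
    return n_buckets + bucket_size
```

With a hypothetical vocabulary of 1,000,000 IDs and a bucket size of 1,000, `output_units` gives 2,000 logits per prediction instead of 1,000,000 for a flat softmax, which is the motivation for the hierarchy.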

#machine-learning

#embeddings

#recommendation-systems

Nov 17, 2025 • 18m read time • From blog.bytebytego.com

Table of contents

Data Foundation

Key Challenges in Model Design

Architecture Overview - Transformer Backbone

Adapter-based Modality Handling

Unsupervised Pre-Training

Handling of Massive ID Vocabularies

Embedding Extraction vs Fine-Tuning

Conclusion
