Agoda built a Multimodal Content System that bridges the gap between hotel images and guest reviews by introducing a shared topic taxonomy. The system clusters images and reviews under common topics (e.g., Breakfast, Pool, Room Quality), enabling a unified view with relevant images, review snippets, and sentiment scores. At scale, it processes 450M+ images and millions of multilingual reviews daily using PySpark jobs orchestrated via Kubeflow, with results served from Couchbase. Three core pipelines handle image curation, review snippet extraction, and sentiment aggregation per topic. A/B tests showed doubled click rates on multimodal highlights on mobile and consistently positive engagement metrics across platforms. The next step is consolidating all content signals through LLMs into a single unified model.
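The idea of a shared topic taxonomy can be illustrated with a minimal plain-Python sketch. It assumes images and review snippets have already been tagged with a topic, and it groups them per hotel and topic into one package with images, snippets, and an averaged sentiment score. The record fields (`quality`, `sentiment`), the `build_packages` helper, and the rule of ranking images by a quality score are hypothetical choices made for illustration. The summary says the production pipelines run as PySpark jobs, so this is not Agoda's implementation.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical, already-tagged inputs (field names are assumptions).
images = [
    {"hotel_id": 1, "topic": "Pool", "url": "img_a.jpg", "quality": 0.9},
    {"hotel_id": 1, "topic": "Pool", "url": "img_b.jpg", "quality": 0.4},
    {"hotel_id": 1, "topic": "Breakfast", "url": "img_c.jpg", "quality": 0.7},
]
reviews = [
    {"hotel_id": 1, "topic": "Pool", "snippet": "Great rooftop pool", "sentiment": 0.8},
    {"hotel_id": 1, "topic": "Pool", "snippet": "Pool was crowded", "sentiment": -0.2},
    {"hotel_id": 1, "topic": "Breakfast", "snippet": "Tasty buffet", "sentiment": 0.6},
]


def build_packages(images, reviews, top_k=1):
    """Group images and review snippets by (hotel_id, topic)."""
    grouped = defaultdict(lambda: {"images": [], "snippets": [], "scores": []})
    for img in images:
        grouped[(img["hotel_id"], img["topic"])]["images"].append(img)
    for rev in reviews:
        key = (rev["hotel_id"], rev["topic"])
        grouped[key]["snippets"].append(rev["snippet"])
        grouped[key]["scores"].append(rev["sentiment"])

    packages = {}
    for key, group in grouped.items():
        # Assumed curation rule: keep the top_k images by quality score.
        best = sorted(group["images"], key=lambda i: i["quality"], reverse=True)[:top_k]
        packages[key] = {
            "images": [i["url"] for i in best],
            "snippets": group["snippets"],
            # Simple mean as a stand-in for the real sentiment aggregation.
            "sentiment": mean(group["scores"]) if group["scores"] else None,
        }
    return packages


packages = build_packages(images, reviews)
```

Here `packages[(1, "Pool")]` holds the highest-scoring pool image together with both pool snippets and their average sentiment, which is the kind of unified per-topic view the summary describes.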

8 min read · From medium.com
Table of contents
The Problem: Siloed Content
The User Journey: Manual Cross-Referencing
Scale of the Challenge
Technical Implementation
Image Quality Scoring
Topics: The Shared Backbone
Topic Mapping Architecture
Technical Deep-Dive: Topic-Based Image Curation
Technical Deep-Dive: Topic-Based Review Snippets
Technical Deep-Dive: Sentiment Aggregation
Putting It Together: The Multimodal Content Package
Delivering Multimodal Insights
