Learn how Discord scaled its search infrastructure to index trillions of messages and unlock new features.

Lobsters

Discord scaled its search infrastructure to index trillions of messages using Elasticsearch on Kubernetes. The team faced dropped messages, bulk indexing that was not fault tolerant, and performance problems caused by large clusters. Solutions included deploying Elasticsearch on Kubernetes, adopting a multi-cluster architecture with dedicated nodes, and using PubSub for message queuing. Key achievements include improved indexing throughput, reduced query latency, and seamless cluster upgrades.

How Discord Indexes Trillions of Messages

Multi-cluster “cell” architecture to run smaller Elasticsearch clusters

Batch messages by cluster and index before bulk indexing

Give “Big Freaking Guilds” dedicated Elasticsearch clusters, with multiple shards
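
The batching and dedicated-cluster points above can be sketched roughly as follows. This is an illustration under stated assumptions, not Discord's actual code: the `Message` type, the cluster names, the routing table, and the modulo-based placement are all hypothetical, since the summary does not say how guilds are mapped to clusters and indices. The idea shown is only that messages are grouped by their (cluster, index) destination first, so each bulk request touches a single index on a single cluster.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Message(NamedTuple):
    message_id: int
    guild_id: int
    content: str


# Hypothetical routing data: guilds listed here get their own cluster.
DEDICATED_CLUSTERS = {42: "bfg-cluster-42"}
# Hypothetical shared "cells" for every other guild.
SHARED_CLUSTERS = ["cell-a", "cell-b"]
INDICES_PER_CLUSTER = 4


def route(guild_id: int) -> Tuple[str, str]:
    """Return the (cluster, index) pair for a guild.

    The modulo placement is an assumption made for this sketch only.
    """
    if guild_id in DEDICATED_CLUSTERS:
        return DEDICATED_CLUSTERS[guild_id], f"guild-{guild_id}"
    cluster = SHARED_CLUSTERS[guild_id % len(SHARED_CLUSTERS)]
    index = f"messages-{guild_id % INDICES_PER_CLUSTER}"
    return cluster, index


def batch_by_destination(
    messages: Iterable[Message],
) -> Dict[Tuple[str, str], List[Message]]:
    """Group messages so each batch targets exactly one cluster and index."""
    batches: Dict[Tuple[str, str], List[Message]] = defaultdict(list)
    for msg in messages:
        batches[route(msg.guild_id)].append(msg)
    return dict(batches)
```

Each resulting batch could then be sent as one bulk request to its own cluster, so a failure on one destination only affects the messages headed there and only that batch needs a retry.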

That’s honestly crazy. The scale is mind-boggling!