Reddit calls itself the front page of the Internet. It hosts billions of posts, and many of them contain media such as images, videos, and GIFs. The media files themselves are usually kept in object storage, but the metadata that describes them has to be stored and served separately. For a video, for example, that metadata might include the thumbnail URL, playback URLs, bitrates, and the available resolutions.
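To make the idea of video metadata concrete, here is a minimal Python sketch of what one such record could look like. The class names, field names, and the `best_rendition_under` helper are hypothetical, chosen for illustration only; they are not Reddit's actual schema.

```python
from dataclasses import dataclass, field, asdict
from typing import List, Optional
import json


@dataclass
class Rendition:
    """One encoded variant of a video (hypothetical shape)."""
    width: int
    height: int
    bitrate_kbps: int
    playback_url: str


@dataclass
class VideoMetadata:
    """Metadata stored apart from the media bytes in object storage."""
    post_id: str
    thumbnail_url: str
    renditions: List[Rendition] = field(default_factory=list)

    def best_rendition_under(self, max_kbps: int) -> Optional[Rendition]:
        # Pick the highest-bitrate rendition that fits the bandwidth budget.
        candidates = [r for r in self.renditions if r.bitrate_kbps <= max_kbps]
        return max(candidates, key=lambda r: r.bitrate_kbps, default=None)


def to_json(meta: VideoMetadata) -> str:
    # asdict() recurses into nested dataclasses, so renditions serialize too.
    return json.dumps(asdict(meta))
```

A record like this is small compared with the video itself, which is why it can live in a relational store while the media bytes stay in object storage.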

System Design Codex

Reddit struggled with media metadata scattered across multiple systems. To address this, the team built a unified media metadata store on AWS Aurora Postgres that serves over 100K read requests per second at low latency. The migration relied on dual writes, a data backfill, and data validation driven by Kafka-based Change Data Capture (CDC). They also applied range-based partitioning to keep the store performant and scalable as data volume grows.
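The migration steps and the partitioning scheme can be sketched roughly as follows. This is an in-memory illustration under stated assumptions, not Reddit's implementation: plain dictionaries stand in for the legacy systems and for Aurora Postgres, a direct comparison stands in for the Kafka CDC validation, and partitioning on a numeric post ID is assumed. All names here are hypothetical.

```python
import bisect
from typing import Dict, List, Optional


class RangePartitionedStore:
    """Stand-in for a table range-partitioned on a numeric post ID (assumed key)."""

    def __init__(self, upper_bounds: List[int]):
        # upper_bounds[i] is the exclusive upper bound of partition i.
        self.upper_bounds = sorted(upper_bounds)
        self.partitions: List[Dict[int, dict]] = [{} for _ in self.upper_bounds]

    def partition_for(self, post_id: int) -> int:
        # Binary search for the first partition whose bound exceeds post_id.
        idx = bisect.bisect_right(self.upper_bounds, post_id)
        if idx == len(self.upper_bounds):
            raise KeyError(f"no partition covers post_id={post_id}")
        return idx

    def put(self, post_id: int, metadata: dict) -> None:
        self.partitions[self.partition_for(post_id)][post_id] = metadata

    def get(self, post_id: int) -> Optional[dict]:
        return self.partitions[self.partition_for(post_id)].get(post_id)


def dual_write(legacy: Dict[int, dict], new_store: RangePartitionedStore,
               post_id: int, metadata: dict) -> None:
    # During migration every write goes to both the old and the new store.
    legacy[post_id] = metadata
    new_store.put(post_id, dict(metadata))


def backfill(legacy: Dict[int, dict], new_store: RangePartitionedStore) -> int:
    # Copy rows written before dual writes began; skip rows already present.
    copied = 0
    for post_id, metadata in legacy.items():
        if new_store.get(post_id) is None:
            new_store.put(post_id, dict(metadata))
            copied += 1
    return copied


def validate(legacy: Dict[int, dict], new_store: RangePartitionedStore) -> List[int]:
    # Simplified check: return IDs whose rows differ between the two stores.
    return [pid for pid, md in legacy.items() if new_store.get(pid) != md]
```

In this sketch, a lookup touches only the one partition that covers the ID, which is the property that makes range partitioning attractive as the table grows. An empty result from `validate` would be the signal that reads can safely move to the new store.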

How Reddit Serves 100K Metadata Requests Per Second