OpenAI scaled a single-primary PostgreSQL instance to handle millions of queries per second for ChatGPT's 800 million users by deploying nearly 50 geo-distributed read replicas on Azure, optimizing query patterns, and offloading write-heavy workloads to sharded systems like Azure Cosmos DB. Key strategies included connection pooling with PgBouncer, reducing write pressure through application-level tuning, implementing cascading replication to reduce primary load, and isolating critical workloads to maintain low-latency performance under global traffic spikes.

3m read timeFrom infoq.com
Post cover image

Sort: