Figma's original data pipeline relied on a daily full-table sync from Amazon RDS PostgreSQL to Snowflake. As the company grew, this approach became untenable: the largest tables took days to sync, and the dedicated export replicas cost millions of dollars annually. Figma rebuilt the pipeline around Change Data Capture (CDC), streaming change events through Kafka and applying incremental merges into Snowflake on a configurable schedule (every 3 hours by default, down to 30 minutes for billing data). They built the system in-house rather than adopting a vendor product because of cost and scale concerns. A rigorous validation workflow bootstraps an independent copy of each table weekly and compares it cell by cell against the incrementally synced version to catch silent failures. The results: data freshness dropped from more than 30 hours to under 3 hours, tables 10x larger are handled reliably, and eliminating the dedicated replicas saved millions of dollars per year.

11m read time · From blog.bytebytego.com
