Kent C. Dodds shares how a long podcast episode caused his Fly.io primary server to hit 400-500% CPU load, prompting him to offload FFmpeg processing to Cloudflare Queues and Containers. In the new architecture, the app enqueues a job, a Cloudflare Worker forwards it to a Container that runs FFmpeg, and the Container uploads the results to R2 and POSTs an HMAC-SHA256-signed callback to the app. The post covers the before/after metrics (85% reduction in peak load), cost tradeoffs between Cloudflare and a dedicated Fly.io machine, and several implementation mistakes caught after the initial PR: a counterproductive local fallback, container lifecycle issues with sleepAfter, and a queue worker that blocked for the full transcode duration. The conclusion reinforces a 'start simple, iterate when reality demands it' philosophy.
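
The signed callback step can be illustrated with a minimal sketch. It assumes a Node.js-compatible runtime and a shared secret known to both the container and the app. The payload fields and the function names `signBody` and `verifyBody` are hypothetical, chosen for illustration rather than taken from the post.

```typescript
import { Buffer } from "node:buffer";
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical payload shape; the post's actual callback fields are not shown here.
export interface TranscodeCallback {
  jobId: string;
  outputKey: string; // assumed: R2 object key of the transcoded file
  status: "done" | "failed";
}

// Sender side: sign the exact string that will be sent as the request body.
export function signBody(body: string, secret: string): string {
  return createHmac("sha256", secret).update(body, "utf8").digest("hex");
}

// Receiver side: recompute the signature over the raw body and compare
// in constant time so the check does not leak how many bytes matched.
export function verifyBody(body: string, signatureHex: string, secret: string): boolean {
  const expected = Buffer.from(signBody(body, secret), "hex");
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws on unequal lengths, so reject those first.
  if (expected.length !== received.length) return false;
  return timingSafeEqual(expected, received);
}
```

In this sketch the receiver verifies the raw request body before parsing it as JSON, since re-serializing a parsed object may not reproduce the bytes that were signed.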

11m read time · From kentcdodds.com
Table of contents
The publish that finally broke things
In defense of the original design
Why the primary machine was the worst place for this
The new architecture
The before/after comparison
What this cost
What I missed on the first pass
Was it worth it?