EvenUp, an AI-powered legal tech company, migrated from a custom workflow orchestrator to Temporal to address scaling, observability, and engineering focus issues. Their custom system required constant firefighting, diverting engineers from building AI capabilities. After migrating, they implemented two scaling strategies using Temporal metrics: slot-based scaling with temporal_worker_task_slots_used and latency-based fallbacks using schedule-to-start latency metrics at the 95th percentile. They also built a Kubernetes liveness probe using SDK metrics to automatically restart stuck workers, eliminating periodic slowdowns and on-call pages. The migration allowed engineers to refocus on core product innovation rather than maintaining workflow infrastructure.

4m read timeFrom temporal.io
Post cover image
Table of contents
From fighting problems to building product #Scaling without the guesswork #Monitoring that eliminates firefighting #Refocusing engineering on what matters most #

Sort: