Request-based autoscaling on DigitalOcean App Platform is now generally available. Apps can scale automatically on live HTTP traffic signals (requests per second and P95 response latency), so your infrastructure reacts in real time.

DigitalOcean

DigitalOcean App Platform now supports request-based autoscaling as a generally available feature. CPU-based autoscaling relies on a lagging indicator: CPU usage climbs only after traffic has already arrived. Request-based autoscaling instead scales containers on live HTTP signals, namely requests per second (RPS) per instance and P95 response latency. It works on both shared and dedicated CPU plans, which removes the previous requirement of a dedicated CPU plan for autoscaling. On dedicated plans, request-based and CPU-based metrics can be combined. You configure it in the App Platform console or in the app spec YAML by setting minimum and maximum container counts and thresholds for RPS and latency. Scaling decisions use a 5-minute rate window, so a momentary spike does not trigger a scale-up on its own.
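To make the interaction between the thresholds, the min/max bounds, and the 5-minute window concrete, here is a minimal Python sketch of how a request-based scaling decision could be computed. It is an illustration under assumptions, not App Platform's actual algorithm: the `Policy` fields, the `desired_instances` function, and the proportional latency rule (borrowed from the formula the Kubernetes Horizontal Pod Autoscaler uses) are all hypothetical and do not correspond to real app spec keys.

```python
import math
from dataclasses import dataclass

WINDOW_SECONDS = 300  # the 5-minute rate window mentioned in the announcement


@dataclass
class Policy:
    """Hypothetical autoscaling settings; names are NOT real app spec keys."""
    min_instances: int
    max_instances: int
    target_rps_per_instance: float
    target_p95_ms: float


def desired_instances(policy: Policy, current_instances: int,
                      requests_in_window: int, p95_ms: float) -> int:
    """Return an instance count that satisfies both thresholds (assumed logic)."""
    # Average the request rate over the whole window so that a short burst
    # is diluted instead of triggering an immediate scale-up.
    total_rps = requests_in_window / WINDOW_SECONDS

    # Instances needed so that each one stays at or below the RPS threshold.
    by_rps = math.ceil(total_rps / policy.target_rps_per_instance)

    # Proportional rule (assumption): scale the current count by how far
    # the observed P95 latency is from its threshold.
    by_latency = math.ceil(current_instances * p95_ms / policy.target_p95_ms)

    # Take the more demanding signal, then clamp to the configured bounds.
    wanted = max(by_rps, by_latency)
    return max(policy.min_instances, min(policy.max_instances, wanted))


if __name__ == "__main__":
    policy = Policy(min_instances=1, max_instances=10,
                    target_rps_per_instance=50, target_p95_ms=200)
    # 60,000 requests in 5 minutes is 200 RPS overall.
    print(desired_instances(policy, current_instances=2,
                            requests_in_window=60_000, p95_ms=150))
```

In this sketch the RPS signal dominates whenever latency is under its threshold, and the latency signal takes over when responses slow down even at modest traffic. The result is always clamped to the configured minimum and maximum, which is why knowing your baseline traffic matters before choosing thresholds.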

Request-Based Autoscaling Is Now Generally Available on App Platform

Now Available for Shared and Dedicated CPU Instances

Know Your Baseline Before You Set Thresholds

How to Configure Request-Based Autoscaling

Get Started With Request-Based Autoscaling