DigitalOcean App Platform now supports request-based autoscaling as a generally available feature. Unlike CPU-based autoscaling (a lagging indicator), this new approach scales containers based on live HTTP signals: requests per second per instance and P95 response latency. It works on both shared and dedicated CPU plans, removing the previous restriction that required a dedicated CPU plan. Users can combine request-based and CPU-based metrics on dedicated plans. Configuration is available via the App Platform console or app spec YAML, with options to set min/max container counts and thresholds for RPS and latency. Scaling decisions use a 5-minute rate window to avoid reacting to momentary spikes.
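To illustrate why a 5-minute rate window dampens momentary spikes, here is a minimal Python sketch. It is not DigitalOcean's implementation: the `RateWindow` class, the `desired_instances` function, and the rule of dividing the windowed rate by a per-instance RPS target are all assumptions made for illustration.

```python
import math
from collections import deque


class RateWindow:
    """Hypothetical sliding window of (timestamp, request_count) samples."""

    def __init__(self, window_seconds: float = 300.0):
        # 300 seconds mirrors the 5-minute window described above.
        self.window_seconds = window_seconds
        self.samples = deque()

    def record(self, timestamp: float, request_count: int) -> None:
        """Add a sample and drop samples that fell out of the window."""
        self.samples.append((timestamp, request_count))
        cutoff = timestamp - self.window_seconds
        while self.samples and self.samples[0][0] <= cutoff:
            self.samples.popleft()

    def rate(self) -> float:
        """Average requests per second across the whole window."""
        total = sum(count for _, count in self.samples)
        return total / self.window_seconds


def desired_instances(total_rps: float, target_rps_per_instance: float,
                      min_count: int, max_count: int) -> int:
    """Assumed scaling rule: enough instances to keep each one at or
    below the per-instance RPS target, clamped to the min/max bounds."""
    wanted = math.ceil(total_rps / target_rps_per_instance)
    return max(min_count, min(max_count, wanted))


# One sample per second: 290 s of steady traffic, then a 10 s burst.
window = RateWindow()
for second in range(1, 301):
    window.record(second, 1000 if second > 290 else 100)

smoothed = desired_instances(window.rate(), 50, min_count=1, max_count=10)
instantaneous = desired_instances(1000, 50, min_count=1, max_count=10)
```

In this sketch the windowed rate is 130 requests per second, so `smoothed` is 3 containers, whereas reacting to the instantaneous burst of 1,000 requests per second would jump straight to the maximum of 10. The threshold of 50 RPS per instance and the container bounds are example values, not recommendations.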
Table of contents
Now Available for Shared and Dedicated CPU Instances
Know Your Baseline Before You Set Thresholds
How to Configure Request-Based Autoscaling
Get Started With Request-Based Autoscaling