DigitalOcean App Platform now supports request-based autoscaling as a generally available feature. Unlike CPU-based autoscaling (a lagging indicator), this new approach scales containers based on live HTTP signals: requests per second per instance and P95 response latency. It works on both shared and dedicated CPU plans, removing the previous restriction that required a dedicated CPU plan. Users can combine request-based and CPU-based metrics on dedicated plans. Configuration is available via the App Platform console or app spec YAML, with options to set min/max container counts and thresholds for RPS and latency. Scaling decisions use a 5-minute rate window to avoid reacting to momentary spikes.
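
The scaling behavior summarized above can be illustrated with a small sketch. This is not DigitalOcean's actual algorithm or API. The `AutoscalePolicy` class, its field names, and the `desired_instances` function are hypothetical, and the "add one instance when P95 latency is over threshold" rule is an assumption made for illustration. It only shows how averaging requests over a 5-minute window, comparing against per-instance RPS and P95 latency thresholds, and clamping to min/max container counts might fit together.

```python
import math
from dataclasses import dataclass


@dataclass
class AutoscalePolicy:
    """Hypothetical policy object; field names are illustrative only."""
    min_instances: int
    max_instances: int
    target_rps_per_instance: float  # RPS threshold per container
    max_p95_ms: float               # P95 latency threshold in milliseconds
    window_seconds: int = 300       # 5-minute rate window


def desired_instances(current: int, requests_in_window: int,
                      p95_ms: float, policy: AutoscalePolicy) -> int:
    """Return a container count for the observed window (illustrative logic)."""
    # Average request rate over the whole window, so short bursts are smoothed out.
    avg_rps = requests_in_window / policy.window_seconds
    # Containers needed to keep each one at or below the RPS threshold.
    by_rps = math.ceil(avg_rps / policy.target_rps_per_instance)
    # Assumed rule: if P95 latency is over threshold, ask for one more container.
    by_latency = current + 1 if p95_ms > policy.max_p95_ms else 0
    wanted = max(by_rps, by_latency)
    # Clamp to the configured min/max container counts.
    return max(policy.min_instances, min(policy.max_instances, wanted))


policy = AutoscalePolicy(min_instances=2, max_instances=10,
                         target_rps_per_instance=50.0, max_p95_ms=500.0)

# Sustained load: 60,000 requests in 300 s is 200 RPS, so 4 containers at 50 RPS each.
sustained = desired_instances(current=2, requests_in_window=60_000,
                              p95_ms=300.0, policy=policy)

# Momentary spike: 10 s at 500 RPS is 5,000 requests, which averages to about
# 16.7 RPS over the window, so the count stays at the configured minimum.
spike = desired_instances(current=2, requests_in_window=5_000,
                          p95_ms=300.0, policy=policy)
```

In this sketch the sustained case scales out while the short spike does not, which is the effect a rate window is meant to have. Real thresholds would come from a measured baseline for the app rather than the example numbers used here.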

4 min read · From digitalocean.com
Table of contents
Now Available for Shared and Dedicated CPU Instances
Know Your Baseline Before You Set Thresholds
How to Configure Request-Based Autoscaling
Get Started With Request-Based Autoscaling
