A hands-on walkthrough of scaling a Vespa Cloud application to maximize document feed throughput using the MS MARCO passages dataset (8.8M documents). Starting from a minimal single-node setup, the tutorial progressively scales container nodes with GPUs and content nodes, using the Vespa metrics dashboard to identify bottlenecks at each step. The final configuration uses 100 GPU container nodes and 40 content nodes, achieving ~7,358 documents/second and ingesting the full dataset with embeddings (E5, ColBERT) in just over 20 minutes.
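The scaled-out topology described above could be sketched in a Vespa `services.xml` along these lines. This is a hypothetical sketch, not the application package from the post: the node counts (100 GPU container nodes, 40 content nodes) come from the summary, while the resource sizes, the `default`/`msmarco` ids, and the `passage` document type are illustrative assumptions.

```xml
<!-- Hypothetical sketch: node counts from the summary above;
     resource sizes, ids, and document type are illustrative. -->
<services version="1.0">
  <container id="default" version="1.0">
    <document-api/>
    <!-- 100 container nodes, each with one GPU for embedding inference -->
    <nodes count="100">
      <resources vcpu="4.0" memory="16Gb" disk="125Gb">
        <gpu count="1" memory="16Gb"/>
      </resources>
    </nodes>
  </container>
  <content id="msmarco" version="1.0">
    <redundancy>1</redundancy>
    <documents>
      <document type="passage" mode="index"/>
    </documents>
    <!-- 40 content nodes to absorb the write load -->
    <nodes count="40"/>
  </content>
</services>
```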

12 min read · From blog.vespa.ai
Table of contents

- Creating the Vespa Application
- Deploying and Feeding
- Scaling
- Feeding Fast: 20 GPUs
- Feeding Furiously: 100 GPUs
- Conclusion
