A guide to using the Vespa Cloud metrics dashboard for production troubleshooting. Covers a three-question framework (system health, latency source, resource saturation) and walks through the Overview, Query, Feed, and Resources tabs. Also highlights recent additions: a Health Indicators row with five stat panels, new annotations for service restarts and core dumps, per-configuration container thread pool rows, and a JVM memory breakdown separating heap, direct, and native memory. Includes a practical incident workflow for moving from symptom to bottleneck quickly.

5m read time, from blog.vespa.ai
Table of contents
Start with three questions
What’s new in the latest revision
A simple workflow
Summary
