Best of Prometheus — 2024

1
Article
Community Picks·2y
DevOps Monitoring and Automation Tool using Jenkins, Prometheus, Grafana and Docker
A comprehensive guide on setting up a DevOps monitoring and automation tool using Jenkins, Prometheus, Grafana, and Docker. It covers the installation and configuration steps for each tool, including the setup of Jenkins pipelines for continuous integration and deployment, the use of Prometheus for metrics collection, and Grafana for visualization. The tutorial also provides Docker commands and scripts for a seamless setup.
173
2
2
Article
Last9·2y
PromQL Cheat Sheet: Must-Know PromQL Queries
PromQL can be challenging but highly effective for monitoring and troubleshooting system performance. This guide offers essential PromQL queries to help you analyze real-time data, detect trends, identify resource-intensive services, track SLOs/SLIs, manage high cardinality, plan capacity, and perform multi-cluster queries. These snippets aim to make your life easier when working with Prometheus dashboards.
49
3
Article
Prometheus·2y
Announcing Prometheus 3.0
Prometheus 3.0 is now available, marking the first major release in seven years. Key updates include a new UI, Remote Write 2.0, UTF-8 support, and enhanced interoperability with OpenTelemetry. Native histograms are included as an experimental feature. The release also includes some breaking changes, so users are encouraged to review the migration guide. The announcement also highlights performance improvements and upcoming features.
42
1
4
Article
AT&T Israel·2y
Memory Leak Profiling and Pinpointing for Node.js
We identified and fixed a memory leak in a Nest.js application running on Kubernetes, which was caused by open handles not being properly closed during proxy requests to Grafana. Using Prometheus and Grafana for monitoring, along with custom modifications to the wtf-node dependency, we pinpointed the issue and adjusted the middleware to prevent the memory leak. Key steps include integrating prom-client with Grafana and using wtf-node to diagnose active handle issues.
35
5
Video
YouTube·2y
Go (fiber) vs. Go (stdlib) vs. Go (gin): Performance Benchmark in Kubernetes
This video compares the performance of three Go HTTP stacks (Fiber, Gin, and the Go standard library) in a Kubernetes environment. The applications are deployed on an AWS cluster and measured for CPU and memory usage, client-side latency, and requests per second; Fiber performs best in terms of resource usage and latency. However, the standard library is recommended for general use due to its reliability and broad suitability for most applications.
35
6
Article
Grafana Labs·2y
Prometheus 3.0 and OpenTelemetry: a practical guide to storing and querying OTel data
Prometheus 3.0 aims to improve integration with OpenTelemetry by addressing challenges such as resource attributes, UTF-8 support, and temporalities. The release includes features like promoting resource attributes to metric labels, a new `info` PromQL function, and stable OTLP support for easier data ingestion and querying. Users can also use the delta-to-cumulative processor in the OTel Collector for better data handling. Future developments will focus on enhancing interoperability and scalability.
29
7
Article
Faun·2y
Learning Go by Instrumenting a Go Application for Prometheus Metrics
A beginner's guide to learning Go by instrumenting a Go application for Prometheus metrics. This tutorial covers building a Prometheus metrics exporter to consolidate analytics and metrics from Datadog's SLO product. Key steps include parsing the Datadog API response, creating necessary structs in Go, declaring Prometheus metrics, initializing the Datadog client, fetching SLO data, and pushing the metrics to Prometheus.
26
8
Article
Javarevisited·1y
Top 6 Courses to Learn Prometheus in 2025
Prometheus has become essential for monitoring and alerting in cloud-native environments. This post lists the top six Prometheus courses available in 2025 from Udemy, Pluralsight, and Coursera, catering to all experience levels. These courses cover installation, setup, mastering PromQL, and integration with other tools like Grafana, helping DevOps professionals and software engineers improve their observability skills and career prospects.
24
9
Article
Grafana Labs·2y
How to use Prometheus to efficiently detect anomalies at scale
Discover how an effective anomaly detection framework was built using Prometheus and PromQL. Learn about setting up average and standard deviation recording rules, tuning parameters like time windows and multipliers, and addressing challenges such as extreme outliers, low sensitivity, and seasonality. The reusable framework works for any metric and can integrate with your existing Prometheus setup to enhance incident investigation and root-cause analysis.
23
10
Article
Last9·2y
Prometheus Alertmanager: What You Need to Know
Prometheus Alertmanager helps manage alerts in a production environment by organizing, routing, and deduplicating them, thereby reducing alert fatigue. It supports features like alert grouping, silencing, inhibition, and high-availability setups. To use Alertmanager effectively, configure alert conditions properly, use grouping and inhibition to avoid notification spam, and implement security best practices such as authentication and TLS encryption. Periodically review and audit alerts to keep configurations relevant and to apply lessons learned from past incidents.
21
11
Article
Grafana Labs·2y
Monitoring Kubernetes: Why traditional techniques aren't enough
Kubernetes offers significant advantages for large-scale deployment and management of applications, but traditional monitoring techniques fall short. Observability now leverages out-of-the-box solutions like Prometheus, Grafana, and OpenCost to facilitate proactive monitoring, cost management, and better resource allocation. The Kubernetes ecosystem makes it easier for teams to support application performance, though engineers must still be vigilant about costs and reliability.
20
12
Article
Last9·2y
Optimizing Prometheus Remote Write Performance: Guide
Prometheus remote write is pivotal for storing and querying long-term metrics as infrastructure scales. Common performance bottlenecks include high CPU and memory usage, network bandwidth consumption, and delayed metric availability. Optimization strategies focus on queue configuration, reducing cardinality, effective relabeling, network optimization, and choosing appropriate remote storage. Best practices include starting with conservative settings, continuous monitoring, and adjusting configurations based on observed performance.
20
13
Article
AWS Tip·2y
Monitoring using Prometheus and Grafana on AWS
Learn how to set up monitoring using Prometheus and Grafana on AWS. The steps are: install Docker, Docker Compose, and the necessary containers; access Portainer for container management and Prometheus for monitoring; add Prometheus as a data source in Grafana; then load a dashboard in Grafana and check the graphs.
20
14
Article
Hacker News·2y
shizunge/endlessh-go: A golang implementation of endlessh exporting Prometheus metrics, visualized by a Grafana dashboard.
A Go implementation of endlessh that exports Prometheus metrics and visualizes them in a Grafana dashboard. It can block brute-force SSH attacks, trap attackers, and provide geolocation and other statistics about the sources of attacks. You can build and run it from source or use the provided Docker image. The dashboard requires Grafana 8.2 and can be imported using ID 15156.
18
15
Article
Prometheus·2y
Prometheus 3.0 Beta Released
Prometheus 3.0-beta is now available for testing, featuring a completely rewritten UI, enhancements to Remote Write 2.0, expanded OpenTelemetry support, and experimental Native Histograms. Users are encouraged to test the beta and report any issues for a more stable final release. Notable additions include support for UTF-8 characters in metric and label names, and new configurations for OTLP ingestion.
17
1
16
Article
Cast AI·2y
cAdvisor: How To Monitor Kubernetes Containers Efficiently
cAdvisor is an open-source container monitoring tool by Google, capable of gathering and exporting metrics like CPU, memory, filesystem, and network utilization from containers. It can be easily deployed in Kubernetes using a DaemonSet. Despite some limitations, such as the need for additional configuration for custom hardware and external tools for advanced analysis, cAdvisor supports various storage plugins and has a built-in web UI for real-time metrics display. Optimizing its configuration and ensuring sufficient resources can help mitigate performance issues. Security measures like authentication and port customization are also recommended.
17
1
17
Article
Community Picks·2y
derailed/popeye: 👀 A Kubernetes cluster resource sanitizer
Popeye is a utility that scans live Kubernetes clusters and reports potential issues with deployed resources and configurations. It aims to reduce the cognitive overload of operating a Kubernetes cluster in the wild. Popeye is a read-only tool that does not alter any of your Kubernetes resources.
17
18
Article
Community Picks·2y
castai/egressd: Kubernetes aware network traffic monitoring
castai/egressd is a Kubernetes-aware network traffic monitoring tool that runs a DaemonSet pod on each node to fetch conntrack entries for pods. It supports both Cilium eBPF maps and the Linux Netfilter conntrack module. The tool adds Kubernetes context to traffic records and can export logs to HTTP or Prometheus. Egressd operates as a privileged container to perform DNS tracing and conntrack entry fetching. The post includes a demo setup with Grafana and Prometheus, and additional instructions for exposing Grafana locally and running end-to-end tests.
16
19
Article
Last9·2y
kube-state-metrics: Your Complete Guide to Simplifying Kubernetes Observability
kube-state-metrics is an open-source add-on for Kubernetes that generates metrics about the state of various Kubernetes objects by listening to the Kubernetes API server. It complements other monitoring tools like metrics-server by providing insights into the health and status of Kubernetes resources such as pods, deployments, and nodes. Installation can be done using Helm, YAML manifests, or building from source. Integration with Prometheus allows for advanced querying and visualization using Grafana. Best practices include setting up appropriate RBAC permissions, enabling high availability, and leveraging custom resource metrics for enhanced observability.
16
20
Article
AWS in Plain English·2y
Ultimate Guide: Setting up Grafana with Docker Compose for Docker Container Monitoring
Learn how to set up Grafana with Docker Compose for effective monitoring of Docker containers. Follow a step-by-step guide to configure services, access Grafana, and create dashboards.
16
21
Article
Grafana Labs·2y
Grafana's Prometheus libraries: How we built libraries to create a truly vendor-neutral data source
Grafana Labs has decoupled its Prometheus data source from Grafana, enabling the creation of vendor-neutral data sources. This update includes a dedicated Amazon Managed Service for Prometheus plugin while maintaining core functionalities and stability. The change aims to foster better observability and reusability in the open-source community. Key principles like DRY and modularity were emphasized, and significant efforts ensured the code's independence and scalability. The project required extensive internal collaboration, POCs, and CI/CD enhancements to achieve this milestone.
15
22
Article
eBPF·2y
Monitoring Inter-Pod Traffic at the AZ Level with Retina (an eBPF based tool)
The post provides a comprehensive guide on analyzing inter-pod traffic in Kubernetes, with a focus on cross-AZ communication to reduce AWS costs. It leverages eBPF technology using tools like Retina, Prometheus, and Grafana. The article covers the setup of a sample environment, configuration of monitoring tools, and how to capture and analyze traffic metrics to optimize network performance across Kubernetes clusters.
14
23
Article
Faun·2y
A Beginner’s guide to Helm in Kubernetes.
A beginner's guide to Helm in Kubernetes, covering installation, terminology, using Helm to install Prometheus, and operations like uninstalling, updating, and rolling back. It concludes with pointers for further exploration on creating Helm charts.
14
24
Article
Medium·2y
Monitoring the Golang App with Prometheus, Grafana, New Relic and Sentry
Learn how to monitor a Golang app using Prometheus, Grafana, New Relic, and Sentry. Discover what New Relic and Sentry are and how they help with observability and error tracking.
14
25
Article
Grafana Labs·1y
How to use OpenTelemetry and Grafana Alloy to convert delta to cumulative at scale
Migrating to a Prometheus-based ecosystem with Grafana Alloy, which integrates OpenTelemetry, makes handling metrics easier, especially when converting delta metrics to cumulative ones. This post details the algorithm used for this conversion and outlines the load balancing and container setup necessary for scalable deployment. It also mentions recent updates to the OpenTelemetry Collector that make the process more efficient and scalable.
13
