Best of Monitoring — 2025

1
Article
Jeff Geerling·1y
Top 10 ways to monitor Linux in the console
Explore several modern and useful tools for monitoring Linux system performance through the console. From the basic 'top' utility to more advanced tools like 'htop', 'atop', 'iftop', 'iotop', 'nvtop', and 'btop', each tool offers unique features for CPU, network, disk, and GPU monitoring. Learn installation commands and get insights on when to use each tool for optimal system monitoring.
326
5
2
Article
Platformatic·1y
Watt Admin: Your Local Node.js Monitoring Solution
Watt Admin is an open-source monitoring and administration tool for Platformatic applications, providing real-time performance metrics, comprehensive logging, and full control over services. Key features include real-time memory and CPU usage monitoring, latency tracking, centralized log viewing and filtering, service management capabilities, and CLI integration. It is easy to set up with a simple command and is ideal for development diagnostics.
203
3
3
Article
Last9·48w
11 Best Log Monitoring Tools for Developers in 2025
A comprehensive comparison of 11 log monitoring tools for developers in 2025, covering solutions from simple centralized logging (Papertrail) to enterprise-scale platforms (Datadog, Dynatrace). The guide evaluates each tool's strengths, limitations, pricing, and ideal use cases, while providing practical advice on choosing the right solution based on team size, log volume, and technical requirements. Key tools covered include Last9, Better Stack, Grafana Loki, Elastic Stack, and others, with emphasis on real-world implementation considerations like structured logging, query performance, and cost optimization.
120
3
4
Article
FREEK.DEV·1y
Naming things is suddenly easier
Oh Dear is an all-in-one monitoring tool that provides uptime monitoring, SSL certificate checks, broken link detection, and more, with a user-friendly API and excellent documentation. The author shares programming tips, tutorials, and opinions through various platforms and a monthly newsletter, focusing on Laravel and modern PHP development.
111
8
5
Article
The Verdict·30w
ctop = htop for containers
ctop is a terminal-based monitoring tool for Docker containers, inspired by htop. It provides a clean visual interface to check container statistics, access logs, and customize display columns directly from the command line.
96
17
6
Article
freeCodeCamp·47w
Top Application Monitoring Tools for Developers
Application Performance Monitoring (APM) tools help developers detect issues before users report them. Five key tools are compared: New Relic offers comprehensive full-stack observability with real-time metrics and traces; Datadog excels in cloud-native environments with seamless integrations and powerful alerting; Prometheus + Grafana provides open-source flexibility with custom dashboards and PromQL querying; Sentry specializes in error tracking with detailed stack traces and breadcrumbs; PostHog combines product analytics with session recording and feature flags. For small teams, start with Sentry for errors and Prometheus for metrics, then consider unified solutions like Datadog or New Relic as you scale.
92
1
7
Article
A Java geek·26w
My first real Rust project
A developer shares their experience building their first production Rust project: a health monitoring component that polls sensors and sends email alerts. The post covers the technical rationale for choosing Rust over JVM languages (short-lived processes, cross-compilation, memory efficiency), selecting appropriate crates (reqwest, config, lettre), leveraging Rust's derive macros for trait implementation, and troubleshooting Windows compilation issues with the GNU toolchain.
80
1
8
Article
Hacker News·49w
CI/CD Observability with OpenTelemetry - A Step by Step Guide
OpenTelemetry can provide comprehensive observability for CI/CD pipelines by capturing traces and metrics from GitHub Actions workflows. The setup involves configuring the OpenTelemetry Collector with a GitHub receiver that ingests webhook events as traces and scrapes repository metrics via GitHub APIs. This approach enables end-to-end visibility, performance optimization, error detection, and dependency analysis for CI/CD pipelines, replacing traditional ad-hoc monitoring methods with a unified observability framework.
79
1
9
Article
Product Hunt·24w
Console.text(): SMS alerts for your code just like console.log()
Console.text() is a lightweight monitoring tool that sends SMS alerts when specific code paths execute. It installs as an npm package, needs just one line of code, and offers a simpler alternative to enterprise solutions like Sentry or PagerDuty. The service provides 50 free messages for testing, rate-limited to 10 unique messages per 5-minute window, and targets solo developers and small projects that need basic production alerts without complex setup.
73
12
10
Article
Grafana Labs·1y
Kubernetes Monitoring Helm chart 2.0: a simpler, more predictable experience
Version 2.0 of the Kubernetes Monitoring Helm chart improves ease of use and flexibility in collecting telemetry data from Kubernetes clusters. Key updates include user-focused feature design, multiple data destinations, built-in integrations for popular services, and compatibility with Fleet Management. Simplified migration from version 1.x is supported by a detailed guide and migration utility.
67
11
Article
Programming Digest·39w
How to Keep Services Running During Failures?
Graceful degradation is a design principle that allows systems to maintain essential functionality during failures by operating at reduced capacity rather than crashing completely. Key strategies include rate limiting to control traffic, request coalescing to reduce duplicate queries, load shedding to prioritize critical requests, retry mechanisms with jitter to prevent thundering herd problems, circuit breakers to isolate failing services, request timeouts to prevent resource exhaustion, and comprehensive monitoring with alerting for proactive issue detection.
56
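As a rough illustration of one strategy named in the summary above (retry mechanisms with jitter), here is a minimal Python sketch. It is not taken from the linked article: the function name `retry_with_jitter`, its parameters, and the default delays are all hypothetical, and it assumes the "full jitter" variant, where each wait is drawn uniformly between zero and an exponentially growing ceiling.

```python
import random
import time


def retry_with_jitter(call, attempts=5, base_delay=0.1, max_delay=2.0,
                      sleep=time.sleep, rng=random.uniform):
    """Call `call()` until it succeeds or `attempts` tries are used up.

    Between tries, wait a random time in [0, min(max_delay, base_delay * 2**n)]
    so that many clients retrying at once do not all hit the service together.
    All names and defaults here are illustrative assumptions.
    """
    for n in range(attempts):
        try:
            return call()
        except Exception:
            if n == attempts - 1:
                raise  # out of tries: surface the last error to the caller
            ceiling = min(max_delay, base_delay * 2 ** n)
            sleep(rng(0.0, ceiling))
```

The `sleep` and `rng` parameters exist only so the waiting behaviour can be replaced in tests. In production code the retry would normally be limited to errors known to be transient rather than every `Exception`.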
12
Article
Hacker News·31w
Comparing the power consumption of a 30 year old refrigerator to a brand new one
A comparison of power consumption between a 30-year-old UPO Jääkarhu refrigerator and a modern replacement using smart plug monitoring. The old unit consumed 2.6 kWh daily versus 0.7 kWh for the new one—a 3.7x difference. Monthly savings of approximately 57 kWh translate to a payback period of about 38 months at 17 cents per kWh. The analysis demonstrates practical IoT monitoring applications for home energy optimization and appliance replacement decisions.
51
5
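The figures in the refrigerator summary above can be checked with a few lines of arithmetic. This sketch uses only the numbers quoted there. The 30-day month is an assumption, and the purchase price is not stated in the summary, so the last value is merely the price implied by a 38-month payback.

```python
OLD_KWH_PER_DAY = 2.6   # 30-year-old unit, from the summary
NEW_KWH_PER_DAY = 0.7   # replacement unit, from the summary
PRICE_PER_KWH = 0.17    # "17 cents per kWh", currency not specified
PAYBACK_MONTHS = 38     # from the summary
DAYS_PER_MONTH = 30     # assumption: the summary does not say how a month is counted

# How many times more energy the old unit uses.
ratio = OLD_KWH_PER_DAY / NEW_KWH_PER_DAY

# Energy and money saved per month by switching.
monthly_savings_kwh = (OLD_KWH_PER_DAY - NEW_KWH_PER_DAY) * DAYS_PER_MONTH
monthly_savings_money = monthly_savings_kwh * PRICE_PER_KWH

# Purchase price implied by the quoted payback period (not given in the source).
implied_price = monthly_savings_money * PAYBACK_MONTHS

print(f"ratio: {ratio:.1f}x")
print(f"monthly savings: {monthly_savings_kwh:.0f} kWh, {monthly_savings_money:.2f} per month")
print(f"implied purchase price: {implied_price:.2f}")
```

Under these assumptions the ratio and the 57 kWh monthly saving match the summary, and the saving works out to a little under 10 currency units per month.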
13
Article
Community Picks·51w
Metlo
Metlo is an open-source API security tool that provides real-time protection against malicious attacks. It automatically discovers and inventories API endpoints, detects threats like SQL injection and XSS attacks with minimal false positives, and blocks malicious traffic in real time. The tool integrates with various programming languages and platforms, can be deployed in under 15 minutes, and processes traffic with less than 0.2ms latency increase while using minimal system resources.
44
5
14
Article
Milan Jovanović·48w
Monitoring .NET Applications with OpenTelemetry and Grafana
Learn how to implement comprehensive observability for .NET applications using OpenTelemetry and Grafana Cloud. The guide covers installing OpenTelemetry packages, configuring automatic instrumentation for ASP.NET Core, Entity Framework, and other libraries, setting up OTLP export to Grafana Cloud, and viewing traces and logs in unified dashboards. This setup provides distributed tracing, log correlation, and monitoring capabilities that scale from single services to complex microservice architectures.
40
15
Article
Microservices.io·42w
Microservices rules #9: Develop observable services
Part of a comprehensive microservices rules series, this installment focuses on developing observable services as a critical architectural principle. Observability enables fast flow in microservices environments by providing visibility into system behavior, performance, and issues. The rule emphasizes the importance of designing services with built-in observability capabilities from the ground up, rather than adding monitoring as an afterthought.
39
16
Article
Blueground Engineering·36w
A Software Engineer’s Guide to Observability
A comprehensive guide to observability for engineering teams, covering the three pillars (logging, tracing, metrics) and their practical applications. Explains why observability has become critical in the era of distributed systems and AI-generated code, where complexity is increasing while domain expertise is becoming more distributed. The guide focuses on understanding when and why to use different observability tools rather than just how to configure them.
38
17
Article
Platformatic·23w
Node.js CPU and Heap Profiling with Shareable Flame Graphs
Watt Admin 1.0.0 introduces Recording Mode for Node.js applications running on Platformatic Watt. The feature enables capturing complete performance sessions with CPU and heap profiling, generating interactive flame graphs, and packaging everything into a single offline HTML file. It collects comprehensive metrics including memory usage, CPU utilization, event loop health, HTTP performance, and connection pool statistics. Developers can record sessions during specific scenarios, analyze bottlenecks through flame graphs, and share the self-contained HTML bundles with team members without requiring any setup. The tool uses V8's built-in profilers and stores data in pprof format for industry-standard analysis.
35
18
Article
Charity·26w
From Cloudwashing to O11ywashing
The term 'observability' has been co-opted by vendors to mean traditional monitoring tools that only track system uptime, losing its original meaning of understanding service quality from each customer's perspective. This 'o11ywashing' mirrors the 'cloudwashing' phenomenon where vendors rebrand existing products with trendy terminology. True observability requires unified telemetry combining app, business, and system data to slice by customer ID and other dimensions, not just separate metrics, logs, and traces. Engineering executives need better education on this distinction to avoid investing in rebranded monitoring tools that can't solve their actual problems.
33
1
19
Article
The New Stack·1y
Best Practices for Monitoring Network Conditions in Mobile
Monitoring network conditions in mobile apps is crucial for maintaining performance. Network issues can significantly degrade user experience by increasing wait times, causing data synchronization issues, transaction failures, and higher battery consumption. Effective network monitoring should include testing under various conditions, instrumenting end-to-end user flows, critically assessing SDK usage, and correlating network performance with other data. A comprehensive approach helps ensure robust app performance despite connectivity challenges.
32
20
Article
Datadog·51w
Announcing Go tracer v2.0.0
Datadog releases Go tracer v2.0.0 with a simplified API, enhanced security through modular dependencies, and improved developer experience. Key changes include new import URLs using github.com instead of gopkg.in, config structs for better performance, and independent modules for integrations. A transitional v1.74.0 version allows gradual migration for large codebases, while a migration tool simplifies the upgrade process.
31
21
Article
Last9·1y
How to Effectively Monitor Nginx and Prevent Downtime
Monitoring Nginx is critical to ensuring its performance and stability. Key metrics include traffic, connection metrics, performance metrics, and system resource utilization. Tools like Prometheus, Grafana, and built-in Nginx modules help track these metrics. Effective monitoring minimizes downtime and improves incident response. Choosing the right tool and setting up alerts for critical events are essential steps toward maintaining a high-performing, reliable Nginx deployment.
31
1
22
Article
Grafana Labs·1y
The latest in Kubernetes Monitoring: new features to track persistent storage, simplify alerting, and more
Grafana Cloud has introduced several new features in Kubernetes Monitoring, including enhanced storage observability, easier alert creation, and improved Fleet Management. Key updates include tracking persistent volumes, seamless navigation, streamlined troubleshooting with historical data, and advanced cost management tools. Version 2.0 of the Kubernetes Monitoring Helm chart simplifies data collection and configuration, supporting multiple telemetry destinations and built-in integrations.
30
23
Article
selfhosted·23w
Monitoring a Docker Homelab with Open Source
A detailed guide for setting up observability in a Docker-based homelab using Coroot, an open-source monitoring solution. The tutorial covers installing ClickHouse as a local service for storing metrics, logs, traces, and profiles, then configuring Coroot with its node-agent and cluster-agent via Docker Compose. The setup uses eBPF for automatic metric collection, includes AI-powered root cause analysis, and provides configuration optimizations for memory usage and data retention (14 days). The guide offers an alternative to traditional Prometheus/Grafana stacks with less configuration overhead.
29
2
24
Article
Medium·50w
Your laptop can run a full devops stack here’s how I set mine up
A comprehensive guide to building a complete DevOps stack locally using Docker Compose, covering Git servers (Gitea), CI/CD tools (Jenkins/Drone), monitoring (Prometheus/Grafana), and container registries. The setup eliminates cloud costs while providing hands-on experience with real DevOps tools and workflows. Includes hardware requirements, common mistakes to avoid, and practical configuration examples for running everything on a laptop.
29
2
25
Article
Product Hunt·29w
Helicone AI: Open-source LLM Observability for Developers
Helicone is an open-source platform that provides observability and monitoring for AI applications using large language models. It offers a unified API gateway that consolidates access to 100+ models from multiple providers through a single API key, with zero markup fees. Key features include automatic failover, built-in caching, custom rate limits, real-time analytics, and OpenAI SDK compatibility. The platform addresses common challenges like provider outages, rate limiting, and managing multiple API integrations while providing full visibility into performance and costs.
28
