Best of Observability: November 2025

  1.
    Article
    Hacker News · 22w

    The Grafana trust problem

    An experienced engineer shares their journey with Grafana's observability stack, detailing how frequent architectural changes, deprecations, and increasing complexity have eroded trust. Starting with simple Loki/Prometheus setups, they've witnessed rapid product churn—Grafana Agent deprecated within 2-3 years, OnCall discontinued, and Mimir 3.0 now requiring Kafka. The constant restructuring, incompatibilities with Prometheus Operator standards, and career-driven development pace make it difficult to maintain stable monitoring infrastructure. While acknowledging the technical quality of Grafana products, the author questions their long-term viability and considers alternatives like the kube-prometheus-stack with Thanos.

  2.
    Article
    Grafana Labs · 22w

    A Star Wars dashboard deep dive: How to build your next visualization in less than 12 parsecs

    A detailed walkthrough of building a Star Wars-themed Grafana dashboard, covering practical techniques like using stat panels for custom text styling, TestData plugin for simulating dynamic data, canvas panels for creating custom visualizations with animations, and styling approaches for visual consistency. Demonstrates how to create gauges, charts, maps, and custom layouts while explaining the technical implementation behind each component.

  3.
    Article
    Grafana Labs · 22w

    Understand, diagnose, and optimize SQL queries: Introducing Grafana Cloud Database Observability

    Grafana Cloud Database Observability is now in public preview, offering developers, SREs, and DBAs tools to understand, diagnose, and optimize SQL queries. The solution addresses the visibility gap in database performance by providing query-level insights, execution plans, wait event analysis, and AI-powered optimization suggestions. It supports MySQL and PostgreSQL, integrates with Grafana Alloy for telemetry collection, and correlates database metrics with application and infrastructure data for comprehensive system-wide performance analysis.

  4.
    Article
    Grafana Labs · 23w

    Grafana Mimir 3.0 release: performance improvements, a new query engine, and more

    Grafana Mimir 3.0 introduces a redesigned architecture that separates read and write operations using Apache Kafka as an asynchronous buffer, eliminating performance bottlenecks between ingestion and queries. The release features the Mimir Query Engine (MQE), which processes queries in a streaming fashion rather than bulk loading, reducing peak memory usage by up to 92%. These improvements deliver 15% lower resource usage in large clusters while maintaining faster query execution and higher reliability. The new ingest storage component ensures query spikes won't slow down data ingestion and vice versa, enabling independent scaling of each path.
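
The difference between bulk loading and streaming evaluation can be illustrated with a toy aggregation. This is a minimal sketch, not Mimir's query engine: `sampleSource`, `sumBulk`, and `sumStreaming` are hypothetical names invented for the illustration, and "peak samples held" is only a rough stand-in for peak memory.

```go
package main

import "fmt"

// sampleSource is a hypothetical iterator over samples. It stands in for
// whatever storage layer a real query engine would read from.
type sampleSource struct {
	next, total int
}

// Next returns the next sample value, or false once the source is exhausted.
func (s *sampleSource) Next() (float64, bool) {
	if s.next >= s.total {
		return 0, false
	}
	v := float64(s.next)
	s.next++
	return v, true
}

// sumBulk loads every sample into memory before aggregating.
// It returns the sum and the peak number of samples held at once.
func sumBulk(src *sampleSource) (sum float64, peak int) {
	var all []float64
	for v, ok := src.Next(); ok; v, ok = src.Next() {
		all = append(all, v)
	}
	for _, v := range all {
		sum += v
	}
	return sum, len(all)
}

// sumStreaming folds each sample into the accumulator as it arrives,
// so at most one sample is held at a time.
func sumStreaming(src *sampleSource) (sum float64, peak int) {
	for v, ok := src.Next(); ok; v, ok = src.Next() {
		sum += v
		peak = 1
	}
	return sum, peak
}

// report formats one result line for printing.
func report(name string, sum float64, peak int) string {
	return fmt.Sprintf("%-9s sum=%.0f peak samples held=%d", name, sum, peak)
}

func main() {
	s, p := sumBulk(&sampleSource{total: 1000})
	fmt.Println(report("bulk", s, p))
	s, p = sumStreaming(&sampleSource{total: 1000})
	fmt.Println(report("streaming", s, p))
}
```

Both functions return the same sum; only the number of samples resident at once differs, which is the property a streaming design trades on.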

  5.
    Article
    InfluxData · 21w

    Introducing the New Cloud Dedicated Admin UI

    InfluxData has released a major update to the Cloud Dedicated Admin UI, introducing live cluster observability dashboards with CPU, memory, and request rate metrics. The update includes redesigned navigation for quick access to databases, tables, and tokens, plus enhanced table schema browsing with column type filtering. Users can now monitor cluster performance across different time periods and switch between multiple accounts and clusters directly from the interface.

  6.
    Article
    CNCF · 23w

    Announcing Vitess 23.0.0

    Vitess 23.0.0 introduces MySQL 8.4.6 as the default version, enhanced observability with new metrics for transaction routing and recovery tracking, and improved operational tooling for VTOrc. The release removes deprecated metrics and APIs, strengthens topology management with better Consul authentication requirements, and includes critical upgrade instructions for Operator users migrating from MySQL 8.0 to 8.4. Key improvements focus on production reliability, monitoring precision, and simplified deployment workflows for horizontally scaled MySQL workloads.

  7.
    Article
    Charity · 20w

    From Cloudwashing to O11ywashing

    The term 'observability' has been co-opted by vendors to mean traditional monitoring tools that only track system uptime, losing its original meaning of understanding service quality from each customer's perspective. This 'o11ywashing' mirrors the 'cloudwashing' phenomenon where vendors rebrand existing products with trendy terminology. True observability requires unified telemetry combining app, business, and system data to slice by customer ID and other dimensions, not just separate metrics, logs, and traces. Engineering executives need better education on this distinction to avoid investing in rebranded monitoring tools that can't solve their actual problems.
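
Slicing unified telemetry by customer ID can be sketched as below. This is an illustration under stated assumptions, not code from the article: the `event` fields, the customer names, and the sample data are all hypothetical. It shows how an aggregate that looks healthy can hide a customer for whom every request fails.

```go
package main

import (
	"fmt"
	"sort"
)

// event is a hypothetical "wide" event: one record per request that carries
// application, business, and system fields together.
type event struct {
	CustomerID string
	Route      string
	DurationMs float64
	Failed     bool
}

// stats accumulates request and failure counts for one group.
type stats struct {
	Requests int
	Failures int
}

// ErrorRate returns failures divided by requests, or 0 for an empty group.
func (s stats) ErrorRate() float64 {
	if s.Requests == 0 {
		return 0
	}
	return float64(s.Failures) / float64(s.Requests)
}

// sliceBy groups events by an arbitrary dimension chosen at query time.
func sliceBy(events []event, key func(event) string) map[string]stats {
	out := make(map[string]stats)
	for _, e := range events {
		k := key(e)
		s := out[k]
		s.Requests++
		if e.Failed {
			s.Failures++
		}
		out[k] = s
	}
	return out
}

// sampleEvents builds made-up data: 98 healthy requests from one customer
// and 2 failing requests from another, all on the same route.
func sampleEvents() []event {
	var events []event
	for i := 0; i < 98; i++ {
		events = append(events, event{CustomerID: "acme", Route: "/checkout", DurationMs: 40})
	}
	for i := 0; i < 2; i++ {
		events = append(events, event{CustomerID: "globex", Route: "/checkout", DurationMs: 900, Failed: true})
	}
	return events
}

// render returns one formatted line per group, sorted by group key.
func render(groups map[string]stats) []string {
	keys := make([]string, 0, len(groups))
	for k := range groups {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	lines := make([]string, 0, len(keys))
	for _, k := range keys {
		g := groups[k]
		lines = append(lines, fmt.Sprintf("%s: %d requests, %.0f%% errors", k, g.Requests, g.ErrorRate()*100))
	}
	return lines
}

func main() {
	events := sampleEvents()
	for _, l := range render(sliceBy(events, func(e event) string { return e.Route })) {
		fmt.Println(l)
	}
	for _, l := range render(sliceBy(events, func(e event) string { return e.CustomerID })) {
		fmt.Println(l)
	}
}
```

Grouped by route, the sample data shows a 2% error rate; grouped by customer, the same events show one customer failing on every request. The grouping key is an ordinary function, so any field on the event can serve as a dimension.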

  8.
    Article
    Product Hunt · 23w

    Helicone AI: Open-source LLM Observability for Developers

    Helicone is an open-source platform that provides observability and monitoring for AI applications using large language models. It offers a unified API gateway that consolidates access to 100+ models from multiple providers through a single API key, with zero markup fees. Key features include automatic failover, built-in caching, custom rate limits, real-time analytics, and OpenAI SDK compatibility. The platform addresses common challenges like provider outages, rate limiting, and managing multiple API integrations while providing full visibility into performance and costs.
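
Automatic failover across providers, in its simplest form, means trying each provider in order until one succeeds. The sketch below is a generic illustration of that pattern and does not use or describe Helicone's API: `provider`, `completeWithFailover`, `failing`, and `echo` are hypothetical names, and the providers are in-process stubs rather than real network clients.

```go
package main

import (
	"errors"
	"fmt"
)

// provider is a hypothetical model backend identified by name.
type provider struct {
	Name     string
	Complete func(prompt string) (string, error)
}

// completeWithFailover tries providers in order and returns the name and
// output of the first one that succeeds. If all fail, the returned error
// wraps every individual failure.
func completeWithFailover(providers []provider, prompt string) (string, string, error) {
	if len(providers) == 0 {
		return "", "", errors.New("no providers configured")
	}
	var errs []error
	for _, p := range providers {
		out, err := p.Complete(prompt)
		if err == nil {
			return p.Name, out, nil
		}
		errs = append(errs, fmt.Errorf("%s: %w", p.Name, err))
	}
	return "", "", errors.Join(errs...)
}

// failing returns a stub provider that always reports an outage.
func failing(name string) provider {
	return provider{Name: name, Complete: func(string) (string, error) {
		return "", errors.New("simulated outage")
	}}
}

// echo returns a stub provider that always succeeds.
func echo(name string) provider {
	return provider{Name: name, Complete: func(prompt string) (string, error) {
		return name + " says: " + prompt, nil
	}}
}

func main() {
	name, out, err := completeWithFailover([]provider{failing("primary"), echo("backup")}, "ping")
	fmt.Println(name, out, err)
}
```

A production gateway would also need timeouts, retry budgets, and health tracking; this sketch covers only the ordering logic.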

  9.
    Article
    Anton Zhiyanov · 20w

    Go proposal: Goroutine metrics

    Go 1.26 introduces new runtime metrics for goroutine monitoring, including per-state goroutine counts (waiting, runnable, running, not-in-go) and active thread counts. These metrics help identify production issues like lock contention, syscall bottlenecks, and CPU saturation by tracking goroutine behavior through the runtime/metrics package. The counters enable observability systems to detect scheduler problems and performance regressions without requiring full tracing.
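
Reading scheduler counters through `runtime/metrics` might look like the sketch below. It deliberately avoids hard-coding the names of the new per-state metrics, since the summary does not list them: instead it discovers whatever integer-valued metrics the running toolchain exposes under the `/sched/` prefix. The helper names `readUintMetrics` and `formatMetrics` are made up for this example.

```go
package main

import (
	"fmt"
	"runtime/metrics"
	"sort"
	"strings"
)

// readUintMetrics samples every uint64-valued runtime metric whose name
// starts with prefix, using only metrics the current toolchain supports.
func readUintMetrics(prefix string) map[string]uint64 {
	var samples []metrics.Sample
	for _, d := range metrics.All() {
		if strings.HasPrefix(d.Name, prefix) && d.Kind == metrics.KindUint64 {
			samples = append(samples, metrics.Sample{Name: d.Name})
		}
	}
	metrics.Read(samples)

	out := make(map[string]uint64, len(samples))
	for _, s := range samples {
		// Guard against a metric being reported as unsupported (KindBad).
		if s.Value.Kind() == metrics.KindUint64 {
			out[s.Name] = s.Value.Uint64()
		}
	}
	return out
}

// formatMetrics returns "name = value" lines sorted by metric name.
func formatMetrics(m map[string]uint64) []string {
	names := make([]string, 0, len(m))
	for n := range m {
		names = append(names, n)
	}
	sort.Strings(names)
	lines := make([]string, 0, len(names))
	for _, n := range names {
		lines = append(lines, fmt.Sprintf("%s = %d", n, m[n]))
	}
	return lines
}

func main() {
	for _, line := range formatMetrics(readUintMetrics("/sched/")) {
		fmt.Println(line)
	}
}
```

On older toolchains this prints existing counters such as `/sched/goroutines:goroutines`; on a toolchain that ships the per-state counts, those should appear in the same listing without code changes, assuming they are exposed under the same prefix.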

  10.
    Article
    Vercel · 22w

    Rollbar joins the Vercel Marketplace

    Rollbar is now available as a native integration on the Vercel Marketplace, enabling real-time error monitoring and observability for Vercel projects. The integration allows developers to automatically detect and track errors, connect issues to specific releases and commits, manage billing in one place, and maintain aligned environments with clean stack traces across both platforms.

  11.
    Article
    elastic · 22w

    Elastic Stack 9.2.1 released

    Elastic Stack version 9.2.1 has been released with bug fixes and updates. The team recommends upgrading from previous versions, particularly 9.2.0, to this latest release. Full details of fixes and changes are available in the official release notes.

  12.
    Article
    OpenTelemetry · 23w

    OpenTelemetry eBPF Instrumentation Marks the First Release

    OpenTelemetry eBPF Instrumentation (OBI) has reached its first alpha release after being donated by Grafana Labs. OBI provides zero-code, automatic instrumentation for applications across all programming languages by operating at the protocol level using eBPF technology. It captures metrics and traces without requiring code changes, restarts, or performance impact, supporting protocols like HTTP/HTTPS, gRPC, SQL, Redis, MongoDB, and Kafka. While excellent for getting started with observability and instrumenting compiled binaries, it works best when combined with traditional OpenTelemetry SDKs, particularly for complex distributed tracing scenarios in certain languages and frameworks.

  13.
    Article
    Grafana Labs · 21w

    Grafana Play updates: A redesigned homepage to celebrate our community

    Grafana Play, a free sandbox environment for exploring Grafana, has launched a redesigned homepage that emphasizes community contributions and makes features more discoverable. The update includes new sections like Featured Grafana Contributor, Dashboard of the Month, and Play Launchpad with Grafana Arcade games. The redesign uses existing visualization plugins and aims to create a more welcoming, playful experience while highlighting creative dashboards from the community. Future plans include adding a full demo of the LGTM Stack (Loki, Grafana, Tempo, Mimir).