Best of Observability — December 2025

1
Article
Hacker News·22w
Your Logs Are Lying To You
Traditional logging practices fail in modern distributed systems because they produce fragmented, context-poor log lines that are difficult to search and correlate. The solution is "wide events" (also called canonical log lines): emitting one comprehensive, structured event per request per service that contains all relevant context—user data, business metrics, infrastructure details, and error information. This approach transforms debugging from text searching into structured querying, enabling complex questions to be answered with simple SQL-like queries. Key implementation strategies include building events throughout the request lifecycle, using tail-based sampling to keep all errors while sampling successful requests, and deliberately instrumenting code with business context rather than relying on auto-instrumentation alone.
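As a rough illustration of the pattern summarized above, the sketch below builds one wide event across a request's lifecycle and applies a tail-based sampling decision once the outcome is known. The names `WideEvent`, `should_keep`, and `emit`, the field names, and the 5% success sample rate are hypothetical and not taken from the article.

```python
import json
import random
import time


class WideEvent:
    """Accumulates context for one request in one service (hypothetical helper)."""

    def __init__(self, service, request_id):
        self.fields = {"service": service, "request_id": request_id}
        self._start = time.monotonic()

    def add(self, **context):
        # Called anywhere in the request lifecycle to attach business or infra context.
        self.fields.update(context)

    def finish(self, status_code, error=None):
        # Called once at the end of the request; returns the complete event.
        self.fields["status_code"] = status_code
        self.fields["duration_ms"] = round((time.monotonic() - self._start) * 1000, 3)
        if error is not None:
            self.fields["error"] = str(error)
        return self.fields


def should_keep(event, success_sample_rate=0.05, rng=random.random):
    """Tail-based sampling: decide after the request has completed.

    Errors and 5xx responses are always kept; successes are sampled.
    The 5% default is an assumed value for illustration only.
    """
    if event.get("error") or event.get("status_code", 200) >= 500:
        return True
    return rng() < success_sample_rate


def emit(event, sink=print):
    # Emit the whole event as a single structured (JSON) line if it survives sampling.
    if should_keep(event):
        sink(json.dumps(event, sort_keys=True))
```

A handler would create the event at the start of the request, call `add(user_id=..., cart_total_cents=...)` as context becomes available, and call `emit(event.finish(status_code))` exactly once at the end, so each request produces at most one queryable record per service.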
154
3
2
Article
Charity·23w
Moving from WordPress to Substack
A developer announces their migration from WordPress to Substack after a decade of blogging. The move is motivated by frustration with WordPress and the desire to join the more vibrant tech writing community on Substack. The author is working on the second edition of "Observability Engineering" and plans to share insights from that process. Email subscribers are being migrated, but comments cannot be transferred, and the original site will remain accessible to preserve existing links.
60
8
3
Article
CNCF·24w
Building microservices the easy way with Dapr
Dapr is a CNCF graduated project that simplifies microservices development by providing a sidecar runtime that handles distributed system concerns like messaging, pub-sub, service communication, storage, and secrets management. Built with observability in mind, Dapr automatically propagates traces and metrics across asynchronous and synchronous systems without requiring manual instrumentation. Recent additions include workflow orchestration, AI/LLM integration through a Conversation API, and Dapr Agents for durable autonomous workflows. The project was open source from inception, joined CNCF as an incubating project in 2021, and graduated in October 2024 with thousands of contributors from hundreds of organizations.
50
4
Article
mlflow·21w
AI Observability for Every TypeScript LLM Stack
MLflow 3.6 introduces automatic tracing integrations for TypeScript and JavaScript LLM frameworks including Vercel AI SDK, LangChain.js, LangGraph.js, Mastra, Anthropic, and Gemini. These integrations use OpenTelemetry to send traces to MLflow's tracking server, capturing prompt/response payloads, token usage, tool results, and errors. Setup requires minimal configuration—typically just pointing an OTLP endpoint to your MLflow server and wrapping SDK clients. MLflow can be deployed via Docker Compose or managed cloud services, eliminating the need for a Python environment alongside JavaScript stacks.
35
5
Article
selfhosted·23w
Monitoring a Docker Homelab with Open Source
A detailed guide to setting up observability in a Docker-based homelab using Coroot, an open-source monitoring solution. The tutorial covers installing ClickHouse as a local service for storing metrics, logs, traces, and profiles, then configuring Coroot with its node-agent and cluster-agent via Docker Compose. The setup uses eBPF for automatic metric collection, includes AI-powered root cause analysis, and provides configuration optimizations for memory usage and data retention (14 days). The guide offers an alternative to traditional Prometheus/Grafana stacks with less configuration overhead.
29
2
6
Article
Last9·21w
Why High-Cardinality Metrics Break Everything
High-cardinality metrics promise granular per-request, per-user insights but quietly break production systems in four ways: costs become unpredictable and scale with runtime behavior rather than configuration; queries slow down during incidents, when speed matters most; engineers lose trust as sparse, short-lived series create flickering dashboards and inconsistent results; and teams over-instrument without intent, creating a multiplicative cardinality explosion. The core issue isn't that high cardinality is wrong, but that most observability systems don't surface their own limits around storage, indexing, query performance, and data ambiguity. Success requires treating high-cardinality metrics like APIs, with explicit ownership, guardrails, pre-deployment cardinality estimation, and systems designed for interactive exploration under pressure rather than brute-force scans.
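The multiplicative effect and the idea of a pre-deployment estimate can be sketched as follows. This is a worst-case upper bound that assumes every combination of label values actually occurs; the function names, label names, value counts, and the 10,000-series budget are all hypothetical.

```python
from math import prod


def estimate_series(label_cardinalities):
    """Worst-case series count for one metric: the product of distinct values per label."""
    return prod(label_cardinalities.values())


def check_budget(label_cardinalities, budget):
    """Return (within_budget, estimate) so a review or CI step can flag risky metrics."""
    estimate = estimate_series(label_cardinalities)
    return estimate <= budget, estimate


# Assumed label cardinalities for an illustrative request counter.
labels = {"endpoint": 50, "status_code": 5, "region": 4}
print(check_budget(labels, budget=10_000))  # (True, 1000)

# Adding one unbounded label multiplies the total instead of adding to it.
labels_with_user = dict(labels, user_id=100_000)
print(check_budget(labels_with_user, budget=10_000))  # (False, 100000000)
```

Real series counts are usually lower than this bound because many combinations never occur, but the bound shows why a single per-user label can dominate everything else.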
12
1
7
Article
Foojay.io·24w
OpenTelemetry Guide
Spring Boot 4 introduces native OpenTelemetry support through a single starter dependency, simplifying observability implementation. The guide covers configuring metrics, traces, and logs for export over OTLP, including step-by-step setup with Micrometer integration, Logback appender configuration, and Docker Compose testing with Grafana. The starter removes the need for the multiple dependencies and Java agents required in Spring Boot 3, while providing seamless integration with GraalVM and AOT compilation.
10
