Best of Observability: February 2026

  1.
    Article
    Product Hunt · 8w

    claude-devtools: See everything Claude Code hides from your terminal

    claude-devtools is an open-source tool that visualizes hidden Claude Code activity by parsing local session logs. It reconstructs file operations, tool calls, diffs, and token usage into a visual timeline with context attribution, compaction visualization, subagent execution trees, and custom notifications. Unlike GUI wrappers, it doesn't modify Claude Code; it only reads existing logs to reveal what the CLI doesn't show.
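
As a rough illustration of the log-reading approach described above, the sketch below parses JSON Lines text into a time-ordered list of events and totals token usage per event type. The field names (`ts`, `type`, `tool`, `tokens`) and the sample records are hypothetical and chosen only for this example; they are not the actual Claude Code log schema, and this is not how claude-devtools itself is implemented.

```python
import json
from collections import Counter

# Hypothetical log lines. The field names ("ts", "type", "tool", "tokens")
# are illustrative assumptions, NOT the real Claude Code log schema.
SAMPLE_LOG = """\
{"ts": "2026-02-01T10:00:02Z", "type": "tool_call", "tool": "Read", "tokens": 120}
{"ts": "2026-02-01T10:00:01Z", "type": "message", "tokens": 45}
{"ts": "2026-02-01T10:00:05Z", "type": "tool_call", "tool": "Edit", "tokens": 300}
"""


def build_timeline(raw: str) -> list[dict]:
    """Parse JSON Lines text and return the events ordered by timestamp."""
    events = [json.loads(line) for line in raw.splitlines() if line.strip()]
    # ISO 8601 timestamps in the same format and zone sort correctly as strings.
    return sorted(events, key=lambda e: e["ts"])


def tokens_by_type(events: list[dict]) -> Counter:
    """Sum the (assumed) per-event token counts for each event type."""
    totals: Counter = Counter()
    for event in events:
        totals[event["type"]] += event.get("tokens", 0)
    return totals


timeline = build_timeline(SAMPLE_LOG)
totals = tokens_by_type(timeline)

for event in timeline:
    print(event["ts"], event["type"], event.get("tool", "-"))
print(dict(totals))
```

Because the sketch only reads text that already exists, it mirrors the read-only design choice in the summary: nothing about the process that produced the log has to change.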

  2.
    Article
    LangChain · 7w

    Agent Observability Powers Agent Evaluation

    Agent observability differs fundamentally from traditional software observability because agents are non-deterministic: you can't predict behavior until runtime. This post explains why debugging agents means debugging reasoning rather than code, introduces three core observability primitives (runs, traces, threads), and shows how these primitives map directly to three levels of agent evaluation: single-step (unit tests for decisions), full-turn (end-to-end trajectory), and multi-turn (context persistence across sessions). Production traces serve triple duty: manual debugging, building offline evaluation datasets from real failures, and powering continuous online evaluation. The key insight is that observability and evaluation are inseparable for agents, because traces are the only source of truth for what an agent actually did.
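
One way to picture the mapping from the three primitives to the three evaluation levels is the minimal sketch below. It assumes a run is a single step, a trace is the ordered runs of one turn, and a thread is the ordered traces of one conversation; that reading, along with every class, field, and function name here, is an illustrative assumption and not the API of LangChain or any other library.

```python
from dataclasses import dataclass, field


@dataclass
class Run:
    """One step (assumed: a single model or tool decision)."""
    name: str
    outputs: dict


@dataclass
class Trace:
    """All runs for one turn, in execution order (assumed definition)."""
    runs: list[Run] = field(default_factory=list)


@dataclass
class Thread:
    """All traces for one multi-turn conversation (assumed definition)."""
    traces: list[Trace] = field(default_factory=list)


def eval_single_step(run: Run, expected_tool: str) -> bool:
    """Single-step: did this one decision pick the expected tool?"""
    return run.outputs.get("tool") == expected_tool


def eval_full_turn(trace: Trace, expected_tools: list[str]) -> bool:
    """Full-turn: did the whole trajectory call tools in the expected order?"""
    return [r.outputs.get("tool") for r in trace.runs] == expected_tools


def eval_multi_turn(thread: Thread, remembered: str) -> bool:
    """Multi-turn: does the final answer of the last turn still mention
    something established in an earlier turn?"""
    last_run = thread.traces[-1].runs[-1]
    return remembered in last_run.outputs.get("text", "")


turn1 = Trace([Run("plan", {"tool": "search"}),
               Run("answer", {"tool": None, "text": "Your name is Ada."})])
turn2 = Trace([Run("answer", {"tool": None, "text": "Hello again, Ada."})])
thread = Thread([turn1, turn2])

print(eval_single_step(turn1.runs[0], "search"))
print(eval_full_turn(turn1, ["search", None]))
print(eval_multi_turn(thread, "Ada"))
```

Each evaluator takes exactly one primitive as input, which is the sense in which the same recorded data can serve both debugging and evaluation.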

  3.
    Article
    ClickHouse · 9w

    Is it over for metrics?

    Traditional metrics are shifting from the center of observability stacks to an optimization layer. While metrics remain useful for known failure modes and system-level signals like CPU and memory, they struggle with high-cardinality debugging and require pre-defining what to measure. Modern columnar databases like ClickHouse enable efficient rollups over rich, structured event data, allowing engineers to store high-fidelity logs and traces that can be aggregated on demand. This approach moves curation from development time to investigation time, making metrics a performance optimization rather than the primary interface for understanding production systems.
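
The idea of moving curation to investigation time can be sketched in a few lines: keep the raw events, and choose the grouping dimensions only when a question comes up. The example below uses plain Python in place of a columnar database, and the event fields (`service`, `user_id`, `status`, `latency_ms`) and the `rollup` helper are hypothetical names invented for this illustration.

```python
from collections import defaultdict

# Hypothetical raw events; in the approach described above these would live
# in a columnar store rather than in an in-memory list.
EVENTS = [
    {"service": "api", "user_id": "u1", "status": 200, "latency_ms": 40},
    {"service": "api", "user_id": "u2", "status": 500, "latency_ms": 900},
    {"service": "api", "user_id": "u2", "status": 500, "latency_ms": 1100},
    {"service": "web", "user_id": "u1", "status": 200, "latency_ms": 60},
]


def rollup(events: list[dict], group_by: list[str], value: str) -> dict:
    """Aggregate `value` over whatever dimensions the caller picks at query time."""
    groups: dict[tuple, list] = defaultdict(list)
    for event in events:
        key = tuple(event[dim] for dim in group_by)
        groups[key].append(event[value])
    return {
        key: {"count": len(vals), "avg": sum(vals) / len(vals)}
        for key, vals in groups.items()
    }


# A coarse view, similar to what a pre-defined metric would have captured.
by_service = rollup(EVENTS, ["service"], "latency_ms")

# A high-cardinality view decided during the investigation, not beforehand.
by_user_status = rollup(EVENTS, ["user_id", "status"], "latency_ms")

print(by_service)
print(by_user_status)
```

A pre-aggregated per-service latency metric could answer the first query but not the second, since the `user_id` dimension would already have been discarded when the metric was emitted.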

  4.
    Article
    Confluent Blog · 8w

    Apache Kafka 4.2.0 Released: Share Groups, Streams & More

    Apache Kafka 4.2.0 is now available, bringing several major improvements. Share Groups (Kafka Queues) are now production-ready, featuring a new RENEW acknowledgement type for extended processing, adaptive batching for share coordinators, configurable fetch record limits, and comprehensive lag metrics. Kafka Streams gains GA status for its server-side rebalance protocol, dead letter queue support in exception handlers, anchored wall-clock punctuation for deterministic scheduling, and explicit control over leave-group behavior on shutdown. Observability is improved with standardized CLI arguments, corrected metric naming following the kafka.COMPONENT convention, and new idle ratio metrics for controllers and MetadataLoader. Security enhancements include an allowlist connector client configuration override policy and thread-safety fixes to RecordHeader. Additional changes cover external schema support in JsonConverter, dynamic remote log manager thread pool configuration, adaptive batching in group coordinators, and rack ID exposure in the Admin API.