Everything You Need To Know About Agent Observability — Danny Gollapalli and Ben Hylak, Raindrop
A talk and workshop on agent observability, covering why traditional eval-based testing is insufficient for production AI agents. Key topics include implicit signals (classifiers for user frustration, refusals, task failure) vs explicit signals (error rates, latency, cost), using regex patterns as cheap monitoring signals, self-diagnostics via a report tool that encourages agents to surface their own anomalous behavior, and A/B experimentation using semantic signals to measure the impact of prompt or model changes. The Raindrop platform is demonstrated as a production monitoring tool that provides trajectory visualization, alerting, clustering of user intents, and a triage agent that automatically investigates signal spikes.
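To make the "regex patterns as cheap monitoring signals" idea concrete, here is a minimal illustrative sketch. The pattern names and phrasings are assumptions for demonstration, not Raindrop's actual classifiers; a real deployment would tune patterns against its own transcripts and pair them with the semantic classifiers described in the talk.

```python
import re

# Hypothetical implicit-signal patterns; cheap regex checks that can run on
# every agent transcript before any heavier LLM-based classification.
SIGNAL_PATTERNS = {
    "refusal": re.compile(
        r"\bI (can't|cannot|am unable to) (help|assist|do)", re.IGNORECASE
    ),
    "user_frustration": re.compile(
        r"\b(not what I asked|this is (wrong|useless)|try again)\b", re.IGNORECASE
    ),
    "task_failure": re.compile(
        r"\b(error|failed|exception|traceback)\b", re.IGNORECASE
    ),
}


def detect_signals(message: str) -> list[str]:
    """Return the names of all signals whose pattern matches the message."""
    return [
        name
        for name, pattern in SIGNAL_PATTERNS.items()
        if pattern.search(message)
    ]
```

Matched signal names could then be counted and alerted on, giving a low-cost first line of monitoring alongside explicit signals like error rates and latency.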