Dark Factory: OpenClaw Ships Faster Than You Can Read the Diff — Vincent Koc, Comet ML
In this conference talk, Vincent Koc of Comet ML argues that traditional static AI benchmarks and evaluations are insufficient for modern agentic AI systems. He traces the evolution from prompt engineering to context engineering to 'intent engineering,' where AI systems self-optimize based on user intent. His core argument is that evals need to become adaptive, living systems rather than static datasets: agent traces, always-on evaluation, and telemetry-in-the-loop should continuously update test suites as applications and user behavior evolve. He introduces the term 'eval calcification' for the growing gap between static benchmarks and dynamic agentic behavior, and advocates treating evaluations as self-optimizing agents rather than fixed data points.
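To make the idea of a living eval suite more concrete, here is a minimal Python sketch. It is not taken from the talk or from any Comet ML product. The `Trace` schema, the `LivingEvalSuite` class, the promotion rule (add a case when an intent is new or the agent failed), and the `coverage_gap` metric are all hypothetical, chosen only to illustrate telemetry feeding back into a test suite.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Trace:
    """One recorded agent interaction (hypothetical schema, for illustration)."""
    intent: str        # what the user was trying to do
    user_input: str    # the raw request seen in production
    succeeded: bool    # outcome signal taken from telemetry


@dataclass
class LivingEvalSuite:
    """Eval set that grows from production traces instead of staying fixed."""
    cases: dict[str, list[str]] = field(default_factory=dict)

    def ingest(self, trace: Trace) -> bool:
        """Promote a trace to a test case if it covers a new intent or a failure.

        Returns True when the suite changed.
        """
        known_inputs = self.cases.get(trace.intent, [])
        if trace.user_input in known_inputs:
            return False  # already covered, nothing to add
        if not known_inputs or not trace.succeeded:
            self.cases.setdefault(trace.intent, []).append(trace.user_input)
            return True
        return False

    def coverage_gap(self, traces: list[Trace]) -> float:
        """Fraction of observed intents with no test case.

        An assumed, rough proxy for 'eval calcification'.
        """
        observed = {t.intent for t in traces}
        if not observed:
            return 0.0
        missing = {i for i in observed if not self.cases.get(i)}
        return len(missing) / len(observed)


# Start from a static suite that only knows about one intent.
suite = LivingEvalSuite(cases={"refund": ["I want my money back"]})
traces = [
    Trace("refund", "refund order 123", succeeded=True),
    Trace("refund", "refund to a different card", succeeded=False),
    Trace("cancel_subscription", "stop billing me", succeeded=True),
]

gap_before = suite.coverage_gap(traces)
changed = [suite.ingest(t) for t in traces]
gap_after = suite.coverage_gap(traces)
```

In this toy run the static suite misses one of the two observed intents, and ingesting the traces closes that gap by adding the failed refund request and the previously unseen cancellation intent. A real system would need deduplication by meaning rather than exact string match, plus a way to label expected outputs; both are left out here.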