The Math That’s Killing Your AI Agent


An 85%-accurate AI agent succeeds on a 10-step task only ~20% of the time due to compound probability, a calculation most engineering teams never run before shipping. Drawing on real incidents (Replit's deleted production database, OpenAI Operator's unauthorized purchase), the post applies Lusser's Law to LLM agents, exposes the gap between benchmark scores and real-world performance (79% on SWE-bench Verified vs. 17.8% on SWE-bench Pro), and provides a four-check pre-deployment framework: run the compound calculation, classify task reversibility, discount benchmark numbers by 30–75%, and test for error recovery rather than just task completion.
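The compound calculation the summary refers to is just Lusser's Law: a serial system succeeds only if every step succeeds, so overall reliability is the product of per-step reliabilities. A minimal sketch (the function name is mine, not from the post):

```python
def compound_success(per_step_accuracy: float, steps: int) -> float:
    """Probability that all `steps` sequential steps succeed (Lusser's Law)."""
    return per_step_accuracy ** steps

# An 85%-accurate agent on a 10-step task:
print(f"{compound_success(0.85, 10):.1%}")  # ~19.7%, the post's "~20%"
```

Note how quickly reliability decays: at 20 steps the same agent would succeed under 4% of the time.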

12 min read · From towardsdatascience.com
Table of contents
The Calculation Vendors Skip
When the Math Meets Production
Benchmarks Were Designed for This
The Pre-Deployment Reliability Checklist
What Actually Changes
References
