MLflow 3.10 introduces multi-turn evaluation and conversation simulation for chatbots and AI agents. The release adds built-in session-level scorers like ConversationCompleteness and UserFrustration that assess entire conversations rather than individual responses. A ConversationSimulator lets developers define persona-based
Table of contents
What is User Simulation for Multi-turn Conversations?
The Setup
Scoring Existing Sessions
Scaling Multi-turn Agent Evaluation with Simulation
What's Next
Resources and References