MLflow introduces three new capabilities for evaluating AI agents: Tunable Judges, for creating custom LLM evaluators from natural language instructions; Agent-as-a-Judge, for automatically identifying relevant trace data without manual parsing; and Judge Builder, for visual judge management with domain expert feedback.

6m read time. From databricks.com