Build High-Quality, Domain-Specific Agents at 95% Lower Cost
Databricks introduces token-based pricing for MLflow GenAI evaluation, reducing costs by up to 95% compared to fixed-block pricing. The platform now supports custom judges using any LLM provider (OpenAI, Anthropic, or fine-tuned models) and open-sources production-tested evaluation prompts validated across finance, healthcare, and technical documentation domains. Teams can evaluate agents across metrics like correctness, faithfulness, relevance, and safety while maintaining full control over evaluation logic and scaling to production workloads.
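To make the "any LLM provider" idea concrete, here is a minimal sketch of a provider-agnostic judge in plain Python. It does not use the MLflow API: `build_judge`, `Verdict`, `LLMCall`, and `stub_llm` are hypothetical names introduced for illustration, and the yes/no reply format is an assumption. It is not one of the open-sourced prompts.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical type: any function that sends a prompt to an LLM and returns its text reply.
LLMCall = Callable[[str], str]


@dataclass
class Verdict:
    metric: str
    passed: bool
    rationale: str


def build_judge(metric: str, instructions: str, llm: LLMCall) -> Callable[[str, str], Verdict]:
    """Build a judge for one metric, backed by whichever provider `llm` wraps."""

    def judge(question: str, answer: str) -> Verdict:
        # Assumed reply format: 'yes' or 'no' on the first line, rationale on the next.
        prompt = (
            f"{instructions}\n\n"
            f"Question: {question}\nAnswer: {answer}\n\n"
            "Reply with 'yes' or 'no' on the first line, then a one-line rationale."
        )
        reply = llm(prompt).strip()
        first_line, _, rest = reply.partition("\n")
        passed = first_line.strip().lower().startswith("yes")
        return Verdict(metric, passed, rest.strip())

    return judge


def stub_llm(prompt: str) -> str:
    """Stand-in for a real provider client (OpenAI, Anthropic, a fine-tuned model)."""
    return "yes\nThe answer addresses the question."


relevance = build_judge("relevance", "Does the answer address the question?", stub_llm)
verdict = relevance("What is MLflow?", "MLflow is an open-source ML lifecycle platform.")
print(verdict)
```

Because the judge only depends on a `str -> str` callable, swapping providers means swapping that one function while the evaluation logic stays under the team's control.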
Table of contents
New Token-Based Pricing Model for Predefined Judges
Open-Sourcing Battle-Tested Evaluation Prompts
Beyond Built-in Judges: Bring Your Own Model