Databricks

MemAlign is a new framework that aligns LLM judges with human feedback using a dual-memory system with semantic and episodic components. Unlike traditional approaches, which require hundreds of labeled examples or expensive fine-tuning, MemAlign learns from just 2–10 natural-language feedback examples. Its quality is competitive with or better than that of state-of-the-art prompt optimizers (MIPROv2, SIMBA, GEPA), and it is 10–100× faster and cheaper. The system also demonstrates "memory scaling": quality improves as feedback accumulates, with no re-optimization step. MemAlign is now available in open-source MLflow and on Databricks, where it enables rapid alignment of LLM judges to domain-specific standards through interactive feedback loops.

MemAlign: Building Better LLM Judges From Human Feedback With Scalable Memory

The Problem: LLM Judges Don’t Think Like Domain Experts

Introducing MemAlign: Alignment Through Memory, Not Weight Updates

Performance: MemAlign vs. Prompt Optimizers

Under The Hood: What Makes MemAlign Work?