MemAlign is a new framework for aligning LLM judges with human feedback using a dual-memory system (semantic and episodic). It learns from small amounts of natural language feedback rather than requiring hundreds of labeled examples. Benchmarks show it achieves competitive or better quality than state-of-the-art prompt optimizers.
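To make the dual-memory idea concrete, here is a minimal sketch of how a judge could combine the two memories. All class and method names (`DualMemoryJudge`, `learn`, `build_prompt`) are hypothetical, not the actual MemAlign API; in particular, a real system would use an LLM to distill feedback into guidelines, which this sketch stubs out.

```python
# Hypothetical sketch of a dual-memory judge aligner.
# Names are illustrative only, not the MemAlign implementation.
from dataclasses import dataclass, field

@dataclass
class DualMemoryJudge:
    # Semantic memory: general guidelines distilled from feedback.
    semantic: list[str] = field(default_factory=list)
    # Episodic memory: concrete (example, verdict, feedback) records.
    episodic: list[tuple[str, str, str]] = field(default_factory=list)

    def learn(self, example: str, verdict: str, feedback: str) -> None:
        """Store the raw episode and keep a distilled guideline."""
        self.episodic.append((example, verdict, feedback))
        # A real system would have an LLM distill feedback into a rule;
        # here we simply record the feedback text as the guideline.
        self.semantic.append(feedback)

    def build_prompt(self, candidate: str) -> str:
        """Assemble a judging prompt from both memories; no weight updates."""
        rules = "\n".join(f"- {g}" for g in self.semantic)
        shots = "\n".join(
            f"Example: {e} -> {v} ({f})" for e, v, f in self.episodic
        )
        return (
            f"Guidelines:\n{rules}\n\n"
            f"Past cases:\n{shots}\n\n"
            f"Now judge: {candidate}"
        )

judge = DualMemoryJudge()
judge.learn(
    "Answer omits units", "fail",
    "Require explicit units in numeric answers",
)
prompt = judge.build_prompt("The distance is 5.")
```

The key design point the sketch illustrates: alignment lives entirely in the prompt assembled from memory, so new feedback takes effect immediately without any fine-tuning.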
Table of contents
- The Problem: LLM Judges Don't Think Like Domain Experts
- Introducing MemAlign: Alignment Through Memory, Not Weight Updates
- Performance: MemAlign vs. Prompt Optimizers
- Under The Hood: What Makes MemAlign Work?
- Takeaways