System prompt learning is an iterative approach to improving AI coding agents that uses plain-English feedback from evaluations instead of scalar rewards. The technique was tested on the Claude Code and Cline coding agents against SWE-bench, where an LLM-as-judge evaluated failures and generated explanations. These explanations were fed into a meta-prompt that refined the agents' system prompts and rules. With only 150 training examples, the approach yielded a 5% improvement in GitHub issue resolution for Claude Code and 15% for Cline. It proved more sample-efficient than traditional reinforcement learning and required fewer iterations than DSPy's MIPROv2 optimizer, though success depended heavily on well-engineered evaluation prompts.
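To make the loop concrete, here is a minimal sketch of how such a system-prompt-learning cycle could be wired together. All names (`run_agent`, `call_llm`, the prompt templates) are illustrative assumptions, not the authors' actual harness; the two stubs must be connected to a real agent and LLM provider.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Task:
    issue: str  # e.g. one SWE-bench GitHub issue


def run_agent(system_prompt: str, task: Task) -> str:
    """Run the coding agent on one task; return its transcript. (Stub.)"""
    raise NotImplementedError("wire up to your agent harness")


def call_llm(prompt: str) -> str:
    """Call any LLM and return its text completion. (Stub.)"""
    raise NotImplementedError("wire up to your LLM provider")


# Hypothetical judge prompt: asks for an English explanation of a failure.
JUDGE_PROMPT = (
    "The coding agent failed this task. Explain in plain English what went "
    "wrong and what general rule would have prevented it.\n\n"
    "Task: {issue}\n\nTranscript: {transcript}"
)

# Hypothetical meta-prompt: folds the judge's explanations back into the
# system prompt as refined instructions/rules.
META_PROMPT = (
    "Current system prompt:\n{system_prompt}\n\n"
    "Judge explanations of recent failures:\n{feedback}\n\n"
    "Rewrite the system prompt (or append rules) so the agent avoids these "
    "failure modes. Return only the new prompt."
)


def system_prompt_learning(
    system_prompt: str,
    train: list[Task],
    evaluate: Callable[[str, Task], bool],
    iterations: int = 3,
) -> str:
    """English-feedback loop: judge failures, fold explanations into the prompt."""
    for _ in range(iterations):
        feedback = []
        for task in train:
            transcript = run_agent(system_prompt, task)
            if not evaluate(transcript, task):  # task failed the evaluation
                feedback.append(
                    call_llm(JUDGE_PROMPT.format(issue=task.issue,
                                                 transcript=transcript))
                )
        if not feedback:
            break  # no failures left to learn from on this training set
        system_prompt = call_llm(
            META_PROMPT.format(system_prompt=system_prompt,
                               feedback="\n---\n".join(feedback))
        )
    return system_prompt
```

The key design point, as described above, is that the gradient signal is natural language: the judge's explanations, not a scalar reward, drive each revision of the prompt.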