System prompt learning is an iterative approach to improving AI coding agents that uses plain-English feedback from evaluations instead of scalar rewards. The technique was tested on the Claude Code and Cline coding agents against SWE-bench, where an LLM-as-judge evaluated failures and generated explanations. These explanations were fed into a meta-prompt that refined the agents' system prompts and rules. With only 150 training examples, the approach yielded a 5% improvement in GitHub issue resolution for Claude Code and 15% for Cline. It proved more sample-efficient than traditional reinforcement learning and required fewer iterations than DSPy's MIPROv2 optimizer, though success depended heavily on well-engineered evaluation prompts.
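To make the loop concrete, here is a minimal sketch of how such a system-prompt-learning cycle could be wired together. All names (`run_agent`, `call_llm`, the prompt templates) are illustrative assumptions, not the authors' actual harness; the two stubs must be connected to a real agent and LLM provider.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Task:
    issue: str  # e.g. one SWE-bench GitHub issue


def run_agent(system_prompt: str, task: Task) -> str:
    """Run the coding agent on one task; return its transcript. (Stub.)"""
    raise NotImplementedError("wire up to your agent harness")


def call_llm(prompt: str) -> str:
    """Call any LLM and return its text completion. (Stub.)"""
    raise NotImplementedError("wire up to your LLM provider")


# Hypothetical judge prompt: asks for an English explanation of a failure.
JUDGE_PROMPT = (
    "The coding agent failed this task. Explain in plain English what went "
    "wrong and what general rule would have prevented it.\n\n"
    "Task: {issue}\n\nTranscript: {transcript}"
)

# Hypothetical meta-prompt: folds the judge's explanations back into the
# system prompt as refined instructions/rules.
META_PROMPT = (
    "Current system prompt:\n{system_prompt}\n\n"
    "Judge explanations of recent failures:\n{feedback}\n\n"
    "Rewrite the system prompt (or append rules) so the agent avoids these "
    "failure modes. Return only the new prompt."
)


def system_prompt_learning(
    system_prompt: str,
    train: list[Task],
    evaluate: Callable[[str, Task], bool],
    iterations: int = 3,
) -> str:
    """English-feedback loop: judge failures, fold explanations into the prompt."""
    for _ in range(iterations):
        feedback = []
        for task in train:
            transcript = run_agent(system_prompt, task)
            if not evaluate(transcript, task):  # task failed the evaluation
                feedback.append(
                    call_llm(JUDGE_PROMPT.format(issue=task.issue,
                                                 transcript=transcript))
                )
        if not feedback:
            break  # no failures left to learn from on this training set
        system_prompt = call_llm(
            META_PROMPT.format(system_prompt=system_prompt,
                               feedback="\n---\n".join(feedback))
        )
    return system_prompt
```

The key design point, as described above, is that the gradient signal is natural language: the judge's explanations, not a scalar reward, drive each revision of the prompt.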