BaNEL (Bayesian Negative Evidence Learning) is an algorithm that trains generative models using only failed attempts, targeting hard problems such as theorem proving and drug discovery, where rewards are extremely sparse. By fitting a separate generative model to the failures and learning the patterns they share, BaNEL achieves up to a 278x improvement in success rate while minimizing expensive reward evaluations. The approach spends additional compute to reduce the number of reward evaluations it needs, and it shows that a model can learn from negative evidence alone, much as human scientists generalize from past mistakes.
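
To make the idea of learning from failures alone concrete, here is a minimal toy sketch in Python. It is not the BaNEL implementation. It assumes a small categorical space of candidates, a smoothed count model standing in for the separate generative model of failures, and a thresholded likelihood ratio as the rule for discarding candidates that look like past failures. The names `fit_failure_model` and `filter_prior`, the smoothing constant `alpha`, and the `threshold` value are all hypothetical choices made for illustration.

```python
from collections import Counter


def fit_failure_model(failures, support, alpha=1.0):
    """Fit a smoothed categorical model to observed failures (toy stand-in
    for a learned generative model of failures; an assumption of this sketch)."""
    counts = Counter(failures)
    total = len(failures) + alpha * len(support)
    return {x: (counts[x] + alpha) / total for x in support}


def filter_prior(prior, failure_model, threshold=1.0):
    """Drop candidates whose failure-to-prior likelihood ratio exceeds the
    threshold, then renormalize what is left (hypothetical filtering rule)."""
    kept = {x: p for x, p in prior.items() if failure_model[x] / p <= threshold}
    if not kept:
        raise ValueError("threshold rejected every candidate")
    z = sum(kept.values())
    return {x: p / z for x, p in kept.items()}


# Toy setup: four candidates, a uniform prior, and six failed attempts.
support = ["a", "b", "c", "d"]
prior = {x: 0.25 for x in support}
failures = ["a", "a", "b", "a", "b", "a"]  # no successes are ever observed

failure_model = fit_failure_model(failures, support)
posterior = filter_prior(prior, failure_model)
print(posterior)  # mass moves to the candidates that have not failed yet
```

In this toy space the failure model can only memorize which candidates failed. A learned generative model is useful precisely because it can generalize to unseen candidates that resemble past failures, which a count table cannot do.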

8 min read · From blog.ml.cmu.edu
Table of contents
Tackling Very Hard Problems
Learning from Negative Rewards
Learning a Generative Model of Failures
Online Recursive Update
Experiment: Adversarial Attack On Toy Language Model
Experiment: Language Model Reasoning
Closing Remarks
