Robert Youssef @rryssf_
Meta FAIR just solved the "cold start" problem in LLM training.

when a model scores 0/128 on hard math problems, standard RL training collapses. no gradient signal. no learning. nothing.

their new framework SOAR escapes this trap without any human-curated data.

here's how: https://t.co/9De68P5eVC
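
A minimal sketch of why 0/128 means no gradient signal. The post does not say which RL algorithm is involved, so this assumes a group-normalized policy-gradient setup with binary rewards, where each rollout's advantage is its reward minus the group mean, divided by the group standard deviation. The function name `group_relative_advantages` and the `eps` constant are hypothetical, chosen for illustration only.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Center and scale rewards within one group of rollouts for the same prompt.

    Assumption: a group-normalized advantage, not a method confirmed by the post.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# 128 rollouts on one hard problem, all wrong (reward 0 each).
all_fail = [0.0] * 128
# Same problem, but a single rollout happens to be correct.
one_success = [1.0] + [0.0] * 127

adv_fail = group_relative_advantages(all_fail)
adv_mixed = group_relative_advantages(one_success)

# Every advantage is exactly zero, so the policy-gradient update is zero.
print(any(a != 0.0 for a in adv_fail))
# One success is enough to produce nonzero advantages.
print(any(a != 0.0 for a in adv_mixed))
```

Under these assumptions, the policy-gradient update weights each rollout's log-probability gradient by its advantage, so a group where every advantage is zero contributes nothing to the update, while even one correct rollout restores a signal.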