A research paper introduces Simple Self-Distillation (SSD), a technique that improves LLM code generation without requiring a verifier, teacher model, or reinforcement learning. The model samples its own solutions at specific temperature and truncation settings, then fine-tunes on those samples with standard supervised fine-tuning (SFT).
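The paper's exact hyperparameters are not given here, but "temperature and truncation settings" conventionally means temperature scaling plus nucleus (top-p) truncation of the sampling distribution. The sketch below is a minimal, stdlib-only illustration of that sampler and of the self-distillation data-collection step; `sample_solution`, the prompts, and all default values are hypothetical placeholders, not the paper's settings.

```python
import math
import random

def sample_token(logits, temperature=0.8, top_p=0.9, rng=None):
    """Temperature + nucleus (top-p) sampling over a list of logits."""
    rng = rng or random.Random(0)
    # Temperature scaling: lower T sharpens the softmax distribution.
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Nucleus truncation: keep the smallest prefix of tokens, in descending
    # probability order, whose cumulative mass reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: -probs[i])
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    # Sample from the renormalized truncated distribution.
    r = rng.random() * mass
    acc = 0.0
    for i in kept:
        acc += probs[i]
        if r <= acc:
            return i
    return kept[-1]

def self_distill_corpus(sample_solution, prompts, k=4):
    # Draw k candidate solutions per prompt; every (prompt, solution) pair
    # goes straight into the SFT corpus -- no verifier, teacher, or RL filter.
    return [(p, sample_solution(p)) for p in prompts for _ in range(k)]
```

With an aggressive nucleus (say `top_p=0.5` on a sharply peaked distribution), low-probability tokens are cut off entirely, which is one plausible reason truncation settings matter for the quality of the self-generated training set.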