PewDiePie documents his months-long journey fine-tuning the Qwen 32B coding model, starting with no prior ML knowledge. He covers data collection (mining MIT-licensed GitHub repos and using OSS-Instruct and Evol-Instruct to generate synthetic data), data contamination, hardware failures from overloaded home GPU rigs, and repeated failures on the Aider Polyglot benchmark. After numerous setbacks, including training on the wrong model version and benchmark contamination issues, he ultimately achieves a 39.1% score on the benchmark, surpassing ChatGPT-4o and Gemini Pro. The video is in a reaction/commentary format, with streamers reacting to PewDiePie's original video.

43m watch time
