The development of modern Large Language Models (LLMs) has moved from a single pre-training stage to a pipeline that combines pre-training with post-training. Four recent state-of-the-art models exemplify different approaches to these stages: Alibaba’s Qwen 2, Apple’s Apple Intelligence Foundation Models (AFM), Google’s Gemma 2, and Meta AI’s Llama 3.1. They pair multi-stage pre-training with distinct post-training strategies such as supervised instruction fine-tuning, Direct Preference Optimization (DPO), and reinforcement learning from human feedback (RLHF). Each model emphasizes different techniques, and each focuses on data quality and the use of synthetic data to improve performance.

22 min read time. From sebastianraschka.com
Table of contents
1.1 Qwen 2 Overview
1.2 Qwen 2 Pre-training
1.3 Qwen 2 Post-training
1.4 Conclusion
2.1 AFM Overview
2.2 AFM Pre-training
2.3 AFM Post-training
2.4 Conclusion
3.1 Gemma 2 Overview
3.2 Gemma 2 Pre-training
3.3 Gemma 2 Post-training
3.4 Conclusion
4.1 Llama 3.1 Overview
4.2 Llama 3.1 Pre-training
4.3 Llama 3.1 Post-training
4.4 Conclusion
