I'm Sebastian: a machine learning & AI researcher, programmer, and author. As a Staff Research Engineer at Lightning AI, I focus on the intersection of AI research, software development, and large language models (LLMs).

Sebastian Raschka's Blog offers insights, tutorials, and research updates on machine learning, deep learning, and artificial intelligence. Covering topics such as neural networks, data science, and Python programming, it provides resources for students, researchers, and practitioners in the field of AI. Developers can learn about algorithms, research methodologies, and practical applications of machine learning through the blog posts and publications.

Sebastian Raschka

The development of modern large language models (LLMs) has evolved from pre-training alone to pipelines that combine pre-training and post-training. Four recent state-of-the-art models (Alibaba’s Qwen 2, Apple’s Apple Intelligence Foundation Models, Google’s Gemma 2, and Meta AI’s Llama 3.1) exemplify various approaches to these training paradigms. They employ multi-stage pre-training methods and differing post-training strategies such as supervised instruction fine-tuning, Direct Preference Optimization (DPO), and reinforcement learning from human feedback (RLHF). Each model emphasizes different techniques, and each focuses on data quality and the use of synthetic data to improve performance.