Google DeepMind has released ATLAS, a framework of scaling laws for multilingual language models derived from 774 training runs across models ranging from 10M to 8B parameters. ATLAS captures how model size, training-data volume, and language mixture interact, introducing a cross-lingual transfer matrix that measures how training on one language transfers to performance in others.

From infoq.com
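
To make the idea concrete, here is a minimal, purely illustrative sketch of what a multilingual scaling law with a cross-lingual transfer matrix could look like. It assumes a Chinchilla-style parametric loss L(N, D) = E + A/N^α + B/D_eff^β and reuses the Hoffmann et al. fitted constants as placeholders; the transfer matrix T, the language set, and the pooling rule for effective tokens are all invented for illustration and are not taken from the ATLAS paper.

```python
import numpy as np

# Illustrative sketch (NOT the ATLAS formulation): a Chinchilla-style loss
#   L(N, D_eff) = E + A / N**alpha + B / D_eff**beta
# where the effective token count D_eff for a target language pools tokens
# from every training language, weighted by a cross-lingual transfer matrix T.

# Placeholder constants from the Chinchilla fit (Hoffmann et al., 2022);
# ATLAS fits its own parameters per language, which will differ.
E, A, B = 1.69, 406.4, 410.7
alpha, beta = 0.34, 0.28

# Hypothetical transfer matrix: T[i, j] = value to target language i of one
# token of training language j, relative to a native token (diagonal = 1.0).
languages = ["en", "de", "sw"]
T = np.array([
    [1.0, 0.4, 0.1],   # en <- (en, de, sw)
    [0.5, 1.0, 0.1],   # de <- (en, de, sw)
    [0.2, 0.2, 1.0],   # sw <- (en, de, sw)
])

def predicted_loss(n_params: float, tokens_per_lang: np.ndarray, target: int) -> float:
    """Predicted loss on `target` given model size and per-language token counts."""
    d_eff = float(T[target] @ tokens_per_lang)  # transfer-weighted effective tokens
    return E + A / n_params**alpha + B / d_eff**beta

# Example: an 8B-parameter model trained on 100B en / 20B de / 5B sw tokens.
tokens = np.array([100e9, 20e9, 5e9])
print(f"predicted sw loss: {predicted_loss(8e9, tokens, target=2):.3f}")
```

Under a formulation like this, off-diagonal entries of T determine how much a low-resource language (here the hypothetical "sw") benefits from high-resource data in the mixture, which is the kind of interaction the blurb says ATLAS quantifies.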