They’re the mysterious numbers that make your favorite AI models tick. What are they and what do they do?

Technology Review, published by MIT, is a leading source of news, analysis, and commentary on emerging technologies, scientific advancements, and their societal impacts. Covering topics such as artificial intelligence, biotechnology, renewable energy, and more, Technology Review provides reporting and articles on the intersection of technology and society. Readers can stay informed about the latest breakthroughs and trends shaping the future of technology.

MIT Technology Review

Parameters are the numerical values that control how large language models behave. They come in three types: embeddings (mathematical representations of words in high-dimensional space), weights (the strengths of the connections between different parts of a model), and biases (adjustments to the thresholds at which those parts activate). During training, a model's billions of parameters are each updated tens of thousands of times in an iterative process that nudges their values to reduce the model's errors. GPT-3 has 175 billion parameters, while newer models may have trillions. The article explains how these parameters work together through layers of neurons to process text and generate responses, and why smaller models trained on more data can sometimes outperform larger ones.
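To make the three types concrete, here is a toy sketch in Python. It is an illustration under loud assumptions, not how any real LLM is built: the three-word vocabulary, the four-dimensional embeddings, the single neuron, the target value of 1.0, and the helper names `neuron`, `count_parameters`, and `train_step` are all made up for this example. It counts the parameters of the toy model and then repeatedly nudges them to shrink the error, which is the basic idea behind training.

```python
import random

random.seed(0)  # fixed seed so the starting values are repeatable

VOCAB = ["the", "cat", "sat"]  # hypothetical three-word vocabulary
EMBED_DIM = 4                  # real models use thousands of dimensions

# Embeddings: one list of numbers per word in the vocabulary.
embeddings = {w: [random.uniform(-1, 1) for _ in range(EMBED_DIM)] for w in VOCAB}
# Weights: one connection strength per embedding dimension, feeding one neuron.
weights = [random.uniform(-1, 1) for _ in range(EMBED_DIM)]
# Bias: a single offset added to the neuron's output.
bias = 0.0


def neuron(word):
    """Weighted sum of the word's embedding, plus the bias."""
    x = embeddings[word]
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def count_parameters():
    """Every number the training process is allowed to change."""
    return len(VOCAB) * EMBED_DIM + len(weights) + 1


def train_step(word, target, lr=0.05):
    """One gradient-descent update; returns the squared error before the update."""
    global bias
    x_old = embeddings[word][:]  # copies, so both updates use pre-update values
    w_old = weights[:]
    error = neuron(word) - target
    for i in range(EMBED_DIM):
        weights[i] -= lr * error * x_old[i]           # nudge each weight
        embeddings[word][i] -= lr * error * w_old[i]  # nudge the word's embedding
    bias -= lr * error                                # nudge the bias
    return error ** 2


print("parameters:", count_parameters())
losses = [train_step("cat", 1.0) for _ in range(200)]
print("first error:", losses[0], "last error:", losses[-1])
```

This toy has 17 parameters (12 embedding values, 4 weights, and 1 bias) and 200 update steps. An LLM applies the same kind of nudge to billions of parameters, tens of thousands of times.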

LLMs contain a LOT of parameters. But what’s a parameter?

Oof. What are all these parameters for, exactly?