LLMs contain a LOT of parameters. But what’s a parameter?


Parameters are the numerical values that control how a large language model behaves. They come in three types: embeddings (mathematical representations of words in high-dimensional space), weights (connection strengths between parts of the model), and biases (threshold adjustments). During training, billions of these parameters are each updated tens of thousands of times through an iterative process that tweaks their values to minimize errors. GPT-3 contains 175 billion parameters, while newer models may have trillions. The article explains how these parameters work together through layers of neurons to process text and generate responses, and why smaller models trained on more data can sometimes outperform larger ones.
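To make the three parameter types and the training loop concrete, here is a minimal sketch in plain Python. It is a toy, not how any real LLM is built: the function names `count_parameters` and `train_step`, the layer sizes, and the learning rate are all assumptions chosen for illustration.

```python
# Toy illustration only: names and sizes are hypothetical, not taken from any real LLM.

def count_parameters(vocab_size, embed_dim, hidden_dim):
    """Count parameters in an imagined model with one embedding table and one dense layer."""
    embeddings = vocab_size * embed_dim  # one vector of embed_dim numbers per token
    weights = embed_dim * hidden_dim     # one connection strength per input/output pair
    biases = hidden_dim                  # one offset per output neuron
    return {
        "embeddings": embeddings,
        "weights": weights,
        "biases": biases,
        "total": embeddings + weights + biases,
    }


def train_step(weight, bias, x, target, learning_rate=0.05):
    """One gradient-descent update for a single neuron: prediction = weight * x + bias."""
    prediction = weight * x + bias
    error = prediction - target
    # Gradients of the squared error (error ** 2) with respect to each parameter
    grad_weight = 2 * error * x
    grad_bias = 2 * error
    # Nudge each parameter a small step in the direction that reduces the error
    return weight - learning_rate * grad_weight, bias - learning_rate * grad_bias


counts = count_parameters(vocab_size=1000, embed_dim=16, hidden_dim=32)
print(counts)

# Repeating the small update many times drives the error toward zero.
w, b = 0.0, 0.0
for _ in range(50):
    w, b = train_step(w, b, x=2.0, target=4.0)
print(w * 2.0 + b)
```

Even in this toy, the embedding table dominates the count, and each training step only nudges the two neuron parameters slightly; real training applies the same idea to billions of values at once.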

13m read time
From technologyreview.com
Table of contents
What is a parameter?
How are they assigned their values?
Sounds straightforward …
Oof. What are all these parameters for, exactly?
Okay! So, what are embeddings?
