The post explains large language models (LLMs): how they work and the complexity behind their training. LLMs predict the next word in a sequence from probabilities learned over vast amounts of text data. The introduction of the transformer architecture in 2017 enabled text to be processed in parallel, making training far more computationally efficient. Pre-training is then supplemented by reinforcement learning from human feedback (RLHF) to refine the model's predictions. The scale of data and computation involved is formidable and relies on specialized hardware such as GPUs.
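To make the next-word-prediction idea concrete, here is a minimal sketch in Python (not code from the post): a toy "model" assigns scores to candidate next tokens, converts them to probabilities with a softmax, and samples one. The vocabulary, scores, and `sample_next_token` helper are illustrative assumptions, not a real trained model.

```python
import math
import random

def softmax(scores):
    """Convert raw scores (logits) into a probability distribution."""
    exps = [math.exp(s - max(scores)) for s in scores]  # subtract max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def sample_next_token(candidates, scores):
    """Sample one candidate token in proportion to its probability."""
    probs = softmax(scores)
    return random.choices(candidates, weights=probs, k=1)[0]

# Toy example: hypothetical scores for continuations of "The cat sat on the ..."
candidates = ["mat", "roof", "keyboard", "moon"]
scores = [3.2, 1.5, 0.7, -1.0]

print(sample_next_token(candidates, scores))  # most often prints "mat"
```

A real LLM does the same thing at a vastly larger scale: the transformer produces a score for every token in its vocabulary at each step, and generation repeats this predict-and-sample loop one token at a time.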