A visual, interactive walkthrough of Andrej Karpathy's 200-line Python GPT implementation trained on 32,000 human names. Covers every core concept step by step: character-level tokenization, the sliding-window prediction task, softmax and cross-entropy loss, backpropagation through a scalar computation graph, token and positional embeddings, attention, and sampling.

10 min read · From growingswe.com
Table of contents

The dataset
Numbers, not letters
The prediction game
From scores to probabilities
Measuring surprise
Tracking every calculation
From IDs to meaning
How tokens talk to each other
The full picture
Learning
Making things up
Everything else is efficiency
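As a taste of the first two steps the walkthrough covers (character-level tokenization and the sliding-window prediction task), here is a minimal sketch. The three names and the context size of 3 are illustrative stand-ins, not values from the article's code:

```python
# Minimal sketch (not the article's code): character-level tokenization
# and sliding-window next-character prediction pairs over a list of names.
names = ["emma", "olivia", "ava"]  # stand-ins for the 32,000-name dataset

# Vocabulary: every distinct character, plus '.' as an end-of-name marker.
chars = sorted(set("".join(names)))
stoi = {ch: i + 1 for i, ch in enumerate(chars)}  # char -> integer ID
stoi["."] = 0
itos = {i: ch for ch, i in stoi.items()}          # integer ID -> char

# Sliding window: each window of `context_size` IDs predicts the next char.
context_size = 3
xs, ys = [], []
for name in names:
    ids = [0] * context_size          # pad the start with '.' tokens
    for ch in name + ".":
        target = stoi[ch]
        xs.append(list(ids))          # input window
        ys.append(target)             # character to predict
        ids = ids[1:] + [target]      # slide the window one step

print(xs[0], "->", itos[ys[0]])       # [0, 0, 0] -> e
```

Every (window, next-character) pair becomes one training example, which is why a few thousand short names yield tens of thousands of prediction tasks.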
