A visual, interactive walkthrough of Andrej Karpathy's 200-line Python GPT implementation trained on 32,000 human names. It covers every core concept step by step: character-level tokenization, the sliding-window prediction task, softmax and cross-entropy loss, backpropagation through a scalar computation graph, token and positional embeddings, attention, and sampling new names.
10 min read · growingswe.com
Table of contents
- The dataset
- Numbers, not letters
- The prediction game
- From scores to probabilities
- Measuring surprise
- Tracking every calculation
- From IDs to meaning
- How tokens talk to each other
- The full picture
- Learning
- Making things up
- Everything else is efficiency
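As a taste of the first two steps above, here is a minimal sketch (not the article's actual code) of character-level tokenization and the sliding-window prediction task on a tiny list of names. The names, the `block_size` of 3, and the use of `0` as a start/end marker are illustrative assumptions.

```python
# Hypothetical sketch: character-level tokenization plus
# sliding-window (context, target) pairs for a tiny name list.
names = ["emma", "olivia", "ava"]

# Vocabulary: every character seen, with id 0 reserved for the
# '.' start/end marker (an assumed convention, not the article's).
chars = sorted(set("".join(names)))
stoi = {ch: i + 1 for i, ch in enumerate(chars)}  # char -> integer id
stoi["."] = 0
itos = {i: ch for ch, i in stoi.items()}          # id -> char

def encode(name):
    """Map a name to a list of integer token ids."""
    return [stoi[ch] for ch in name]

block_size = 3  # how many previous characters the model sees

def sliding_windows(name):
    """Yield (context, target) training pairs for one name."""
    ids = [0] * block_size          # pad the left edge with markers
    for tok in encode(name) + [0]:  # 0 also marks the end of the name
        yield ids[-block_size:], tok
        ids = ids + [tok]

pairs = list(sliding_windows("ava"))
# Each pair is (list of block_size context ids, next-token id);
# "ava" yields 4 pairs, the last one predicting the end marker 0.
```

Every name of length *n* thus contributes *n* + 1 prediction examples, which is why a few thousand names give the model plenty of training signal.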