A minimal, dependency-free Python implementation of GPT training and inference in a single file. The code demonstrates the complete algorithm from scratch, including custom autograd, tokenization, multi-head attention, and the Adam optimizer. It trains on a names dataset and generates new samples, serving as an educational resource.
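To illustrate the kind of from-scratch, dependency-free style described above, here is a minimal sketch of single-head scaled dot-product attention using only the Python standard library. The function names (`softmax`, `attention`) are illustrative and not taken from the file itself:

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(q, ks, vs):
    # q: query vector; ks, vs: lists of key/value vectors, one per position.
    d = len(q)
    # Dot-product scores, scaled by sqrt(d) as in standard attention.
    scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in ks]
    weights = softmax(scores)
    # Output is the attention-weighted sum of the value vectors.
    return [sum(w * v[i] for w, v in zip(weights, vs))
            for i in range(len(vs[0]))]

q = [1.0, 0.0]
ks = [[1.0, 0.0], [0.0, 1.0]]
vs = [[1.0, 2.0], [3.0, 4.0]]
out = attention(q, ks, vs)
```

Multi-head attention repeats this computation with several independent projections of the queries, keys, and values, then concatenates the per-head outputs.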