This post discusses bigram language modeling, how to implement it, and how it compares with a neural network model. It also gives an overview of the training loop and the inference process.