Trip Venturella released Mr. Chatterbox, a 340M-parameter language model trained from scratch on 28,035 Victorian-era British Library texts (2.93B tokens, all pre-1900). Simon Willison explores the model, noting it performs poorly — more like a Markov chain than a modern LLM — likely because the training data is less than half