The course teaches how to train a large language model (LLM) from scratch using WhatsApp or Telegram chat data. It covers the entire process, from data extraction and cleaning through tokenization to implementing the transformer architecture. The goal is to help users build models that mimic someone's unique communication style, or models for underrepresented languages. The course has two parts: the first uses small datasets to introduce the basic concepts, and the second scales up to larger datasets. Topics include exporting and cleaning the chat data, encoding text with byte pair encoding, building a transformer model, and fine-tuning it. Resources, slides, notebooks, and code are provided to make the concepts easier to understand and apply.
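
To give a feel for the byte pair encoding step mentioned above, here is a minimal sketch. It is not the course's own code: the function names (`most_frequent_pair`, `merge_pair`, `train_bpe`, `decode`) and the sample chat line are hypothetical, and it assumes a byte-level variant that starts from the 256 UTF-8 byte values and assigns new token ids from 256 upward.

```python
from collections import Counter


def most_frequent_pair(ids):
    """Return the most common adjacent pair of token ids, or None if there is none."""
    pairs = Counter(zip(ids, ids[1:]))
    return pairs.most_common(1)[0][0] if pairs else None


def merge_pair(ids, pair, new_id):
    """Replace each non-overlapping occurrence of `pair` (left to right) with `new_id`."""
    out, i = [], 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out


def train_bpe(text, num_merges):
    """Learn up to `num_merges` merges over the UTF-8 bytes of `text`."""
    ids = list(text.encode("utf-8"))
    merges = {}  # (left_id, right_id) -> new_id, kept in merge order
    for step in range(num_merges):
        pair = most_frequent_pair(ids)
        if pair is None:  # fewer than two tokens left, nothing to merge
            break
        new_id = 256 + step  # assumption: ids 0..255 are reserved for raw bytes
        ids = merge_pair(ids, pair, new_id)
        merges[pair] = new_id
    return ids, merges


def decode(ids, merges):
    """Turn token ids back into text by expanding merges into their bytes."""
    vocab = {i: bytes([i]) for i in range(256)}
    for (left, right), new_id in merges.items():  # dicts preserve merge order
        vocab[new_id] = vocab[left] + vocab[right]
    return b"".join(vocab[i] for i in ids).decode("utf-8")


if __name__ == "__main__":
    sample = "hey hey hey, how are you?"  # hypothetical chat line
    ids, merges = train_bpe(sample, num_merges=5)
    print(len(sample.encode("utf-8")), "bytes ->", len(ids), "tokens")
    print(decode(ids, merges) == sample)
```

Starting from raw bytes means any language or emoji in a chat export can be encoded without an unknown-token fallback, which is one reason byte-level variants are a common choice; the course may use a different setup.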

3h 29m watch time
