If you’ve played around with recent models on HuggingFace, chances are you’ve encountered a causal language model. When you pull up the documentation for a model family, you’ll get a page with “tasks”…

Towards Data Science

This post explains what CausalLM is and how to train a CausalLM model using HuggingFace. It discusses the difference between encoder-only and decoder-only models and provides a worked example of the training process.

Training CausalLM Models Part 1: What Actually Is CausalLM?

The first part of a practical guide to using HuggingFace’s CausalLM class