MobileLLM is a sub-billion-parameter language model optimized for on-device use cases, presented at ICML 2024. It combines SwiGLU activation functions, a deep-and-thin architecture, embedding sharing, and grouped-query attention, and reports accuracy improvements over prior state-of-the-art models of comparable size. This repository includes the training code, along with instructions for dataset preparation and multi-node setup.
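To make two of the techniques named above concrete, here is a minimal sketch of how embedding sharing and grouped-query attention reduce parameter counts. The helper functions and the sizes used (vocabulary, hidden dimension, head counts) are illustrative assumptions, not values taken from this repository.

```python
def embedding_params(vocab_size: int, dim: int, shared: bool) -> int:
    """Count parameters in the input embedding plus the output projection."""
    table = vocab_size * dim
    # With sharing, the output projection reuses the input embedding matrix.
    return table if shared else 2 * table


def kv_projection_params(dim: int, n_heads: int, n_kv_heads: int) -> int:
    """Count weights in the key and value projections of one attention layer."""
    head_dim = dim // n_heads
    # Keys and values each project dim -> n_kv_heads * head_dim.
    return 2 * dim * n_kv_heads * head_dim


# Illustrative sizes only (assumptions, not read from the repository).
VOCAB, DIM, HEADS, KV_HEADS = 32_000, 576, 9, 3

saved_by_sharing = (
    embedding_params(VOCAB, DIM, shared=False)
    - embedding_params(VOCAB, DIM, shared=True)
)
mha_kv = kv_projection_params(DIM, HEADS, HEADS)     # one KV head per query head
gqa_kv = kv_projection_params(DIM, HEADS, KV_HEADS)  # query heads share KV heads

print(saved_by_sharing)
print(mha_kv, gqa_kv)
```

Under these assumed sizes, sharing removes one full vocabulary-by-dimension matrix, and using a third as many key/value heads cuts the key/value projection weights per layer to a third. In a small model the embedding table is a large fraction of the total, which is why such savings matter more at this scale.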
Table of contents: Citation, Run, Results on Zero-shot Common Sense Reasoning tasks, Acknowledgement, Contact, License