Olmo 3 is the Allen Institute for AI's (Ai2) fully open-source large language model, available in 7B and 32B parameter versions. The release includes complete access to the models, training datasets (Dolma 3, with 9.3 trillion tokens), code, and training logs. The model uses a three-stage training pipeline: pretraining on the Dolma 3 Mix, followed by mid-training and long-context extension on dedicated Dolma 3 mixes.
Table of contents

- Introduction
- Prerequisites
- Key Takeaways
- Model Architecture
- Data Curation
- OlmoTrace
- Olmo 3 on DigitalOcean
- References and Additional Resources
- FAQ
- Final Thoughts