IBM releases Granite 4.1, a family of dense decoder-only LLMs (3B, 8B, 30B) trained on ~15 trillion tokens with a five-phase pre-training pipeline. Key highlights include long-context extension up to 512K tokens, SFT on ~4.1M samples curated via LLM-as-Judge, and a four-stage RL pipeline using on-policy GRPO with the DAPO loss. Notably, the 8B instruct model matches or surpasses the previous Granite 4.0-H-Small, a 32B MoE with 9B active parameters. FP8-quantized variants are available for vLLM inference, and all models are released under Apache 2.0.
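
Since the FP8 checkpoints target vLLM, here is a minimal inference sketch. The repository ID below is an assumption for illustration; check the ibm-granite organization on Hugging Face for the exact FP8 repo names.

```python
# Minimal vLLM inference sketch for an FP8 Granite 4.1 checkpoint.
# The model ID is hypothetical; substitute the actual FP8 repo name.
from vllm import LLM, SamplingParams

llm = LLM(model="ibm-granite/granite-4.1-8b-instruct-FP8")  # hypothetical ID
params = SamplingParams(temperature=0.0, max_tokens=256)

# vLLM auto-detects the FP8 quantization config baked into the checkpoint.
outputs = llm.generate(["Summarize the Granite 4.1 training pipeline."], params)
print(outputs[0].outputs[0].text)
```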

14-minute read, from huggingface.co
Table of contents

- Overview
- Model Architecture
- Pre-Training
- SFT: Data Preparation & Quality Control
- Reinforcement Learning: Multi-Stage RL Pipeline
- Results
- FP8 Quantization
- Infrastructure
- Getting Started
