OpenAI's gpt-oss models use MXFP4, a 4-bit floating-point data type that cuts inference memory costs by roughly 75% compared to traditional BF16 models. MXFP4 uses micro-scaling blocks, small groups of values that share a common scale factor, to maintain precision while dramatically reducing memory and compute requirements. This allows a 120-billion-parameter model to run in 80 GB of VRAM instead of the roughly 240 GB its weights would occupy in BF16.
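A minimal sketch of what micro-scaling block quantization looks like, following the OCP Microscaling (MX) convention that MXFP4 is based on: blocks of 32 values share one power-of-two scale, and each element is stored as a 4-bit FP4 (E2M1) code. The function names and the toy data are illustrative, not OpenAI's actual implementation.

```python
import math

# Representable magnitudes of the FP4 E2M1 element format (a sign bit covers negatives)
E2M1_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
BLOCK = 32  # MX block size: 32 elements share one scale

def quantize_block(values):
    """Quantize one block to MXFP4: a shared power-of-two scale plus E2M1 codes."""
    assert len(values) == BLOCK
    amax = max(abs(v) for v in values)
    if amax == 0.0:
        scale = 1.0
    else:
        # Shared scale exponent per the MX spec: floor(log2(amax)) minus the
        # element format's max exponent (2 for E2M1, whose largest value is 6.0)
        scale = 2.0 ** (math.floor(math.log2(amax)) - 2)
    codes = []
    for v in values:
        # Snap each scaled element to the nearest representable E2M1 magnitude
        mag = min(E2M1_GRID, key=lambda g: abs(abs(v) / scale - g))
        codes.append(math.copysign(mag, v))
    return scale, codes

def dequantize_block(scale, codes):
    return [scale * c for c in codes]

# Round-trip a toy block of 32 values
vals = [(-1) ** i * (i / 7.0) for i in range(BLOCK)]
scale, codes = quantize_block(vals)
recon = dequantize_block(scale, codes)
err = max(abs(a - b) for a, b in zip(vals, recon))

# Storage cost: 32 four-bit codes plus one 8-bit shared scale
bits_per_weight = (BLOCK * 4 + 8) / BLOCK  # 4.25 bits, vs 16 bits for BF16
```

Because the scale is shared across only 32 values rather than a whole tensor, each block adapts to its local dynamic range, which is how the format keeps usable precision at 4 bits per weight.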

6 min read. From go.theregister.com
Table of contents:
- What the heck is MXFP4?
- Why MXFP4 matters
- OpenAI is setting the tone
