NVIDIA introduces NVFP4, a 4-bit floating-point format for Blackwell GPUs that targets ultra-low-precision inference while maintaining model accuracy. NVFP4 combines micro-block scaling with FP8 (E4M3) scale factors, reducing memory footprint by 3.5x compared to FP16 and 1.8x compared to FP8. The format delivers up to 50x
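The memory figures quoted above can be sanity-checked with a little arithmetic. The sketch below is only an illustration: it assumes one 8-bit E4M3 scale shared by each block of 16 4-bit values (the block size is not stated in this summary), and it ignores any per-tensor scale and other metadata. The names `bits_per_value`, `ELEMENT_BITS`, `SCALE_BITS`, and `BLOCK_SIZE` are made up for the example.

```python
# Rough storage cost per value under block scaling.
# Assumed layout (not stated in the summary): 16 FP4 values share one FP8 scale.
ELEMENT_BITS = 4   # one 4-bit value
SCALE_BITS = 8     # one FP8 (E4M3) scale factor per block
BLOCK_SIZE = 16    # assumption: number of values sharing a scale


def bits_per_value(element_bits=ELEMENT_BITS,
                   scale_bits=SCALE_BITS,
                   block_size=BLOCK_SIZE):
    """Average bits stored per value, amortizing the block scale over the block."""
    return element_bits + scale_bits / block_size


nvfp4_bits = bits_per_value()       # 4 + 8/16 = 4.5 bits per value
ratio_vs_fp16 = 16 / nvfp4_bits     # compression relative to 16-bit storage
ratio_vs_fp8 = 8 / nvfp4_bits       # compression relative to 8-bit storage

print(f"bits per value: {nvfp4_bits}")
print(f"vs FP16: {ratio_vs_fp16:.2f}x, vs FP8: {ratio_vs_fp8:.2f}x")
```

Under these assumptions the ratios come out near 3.56x and 1.78x, in line with the roughly 3.5x and 1.8x quoted in the summary; real model footprints also depend on which layers are actually quantized.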

10m read time
From developer.nvidia.com
Table of contents
What is NVFP4?
High-precision scaling: Encoding more signal, less error
Micro-block scaling for efficient model compression
NVFP4 versus FP8: Model performance and memory efficiency
FP4 energy efficiency
Get started with NVFP4
