NVIDIA CompileIQ is an AI-powered compiler auto-tuning framework released with CUDA 13.3 that uses evolutionary and genetic algorithms to find optimal compiler configurations for specific GPU workloads. Instead of relying on generic default heuristics, CompileIQ explores internal compiler parameters (register allocation, instruction scheduling, loop transformations) and produces an Advanced Controls File (ACF) that can yield up to 15% performance gains on already-optimized kernels. It supports multi-objective optimization across runtime, compile time, and power consumption, computing a Pareto frontier of trade-offs. The Python package is installable via pip, requires users to define a benchmark objective function, and works with both PTXAS and NVCC compilers. ACFs are reproducible, portable, and safe to commit to version control.
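
To make the Pareto frontier idea concrete, here is a minimal sketch in plain Python. It does not use the CompileIQ package or its API; the `Candidate` type, the configuration names, and all measurements are hypothetical values made up for illustration. All three objectives (runtime, compile time, power) are assumed to be minimized. A configuration is on the frontier if no other configuration is at least as good on every objective and strictly better on at least one.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """One hypothetical compiler configuration with its measured objectives."""
    name: str
    runtime_ms: float   # kernel runtime, lower is better
    compile_s: float    # compile time, lower is better
    power_w: float      # power draw, lower is better


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if a is no worse than b everywhere and strictly better somewhere."""
    a_vals, b_vals = a[1:], b[1:]  # skip the name field
    no_worse = all(x <= y for x, y in zip(a_vals, b_vals))
    strictly_better = any(x < y for x, y in zip(a_vals, b_vals))
    return no_worse and strictly_better


def pareto_frontier(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Illustrative, made-up measurements (not real benchmark results).
candidates = [
    Candidate("cfg_a", 10.0, 30.0, 200.0),
    Candidate("cfg_b", 9.0, 45.0, 210.0),
    Candidate("cfg_c", 11.0, 35.0, 205.0),  # worse than cfg_a on all three
    Candidate("cfg_d", 10.5, 25.0, 190.0),
]

frontier = pareto_frontier(candidates)
for c in frontier:
    print(c.name, c.runtime_ms, c.compile_s, c.power_w)
```

In this toy data, `cfg_c` drops out because `cfg_a` beats it on every objective, while the remaining three each win on at least one objective and so represent genuine trade-offs.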

12 min read time | From developer.nvidia.com
Table of contents
The 90% problem and the opportunity
Introducing CompileIQ
Getting started in 4 steps
Examples
Multi-objective optimization and IP protection
Results and production adoption
Your turn
