NVIDIA Warp is a Python framework for GPU-accelerated simulation that JIT-compiles Python functions into CUDA kernels. This deep-dive walks through building a 2D Navier-Stokes solver using Warp's SIMT kernel model and an FFT-based Poisson solver built with tile primitives. It then demonstrates differentiating through the solver with Warp's reverse-mode automatic differentiation (wp.Tape) to solve an optimal perturbation problem in turbulent flow. Key constraints for differentiable solvers include avoiding in-place array modifications and pre-allocating intermediate arrays for every timestep; gradient checkpointing is available to manage the resulting memory cost. The industrial case studies report Warp delivering 8x throughput over JAX for Autodesk's CFD solver, up to 475x speedup over MuJoCo MJX for Google DeepMind's multibody dynamics, and 669x speedup over CPU baselines for C-Infinity's CAD spatial intelligence engine.
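
To see why a differentiable solver keeps a separate intermediate array for each timestep, here is a minimal sketch in plain Python. It does not use Warp or wp.Tape; the toy equation du/dt = -k * u**2 and the names `step`, `step_grad`, `forward`, and `loss_and_grad` are assumptions made for illustration. The backward pass needs the state from every forward step, so each state is written once into its own pre-allocated slot and never overwritten.

```python
def step(u, dt, k):
    # One explicit Euler step of du/dt = -k * u**2 (toy stand-in for a solver step).
    return u - dt * k * u * u


def step_grad(u, dt, k):
    # Derivative of step() with respect to its input state u.
    return 1.0 - 2.0 * dt * k * u


def forward(u0, n_steps, dt, k):
    # Pre-allocate one slot per timestep; each slot is written exactly once.
    states = [0.0] * (n_steps + 1)
    states[0] = u0
    for i in range(n_steps):
        states[i + 1] = step(states[i], dt, k)
    return states


def loss_and_grad(u0, n_steps=50, dt=0.01, k=2.0):
    # Forward pass: keep every intermediate state.
    states = forward(u0, n_steps, dt, k)
    loss = 0.5 * states[-1] ** 2

    # Reverse pass: walk the steps backwards, applying the chain rule
    # with the state that was stored at each step.
    adjoint = states[-1]  # dL/du_N for L = 0.5 * u_N**2
    for i in reversed(range(n_steps)):
        adjoint *= step_grad(states[i], dt, k)
    return loss, adjoint  # adjoint is now dL/du_0


if __name__ == "__main__":
    loss, grad = loss_and_grad(1.0)
    print("loss:", loss, "dL/du0:", grad)
```

If the forward loop reused a single variable for the state, `step_grad` would have nothing correct to evaluate on the way back. Storing every state costs memory proportional to the number of steps, which is the cost that gradient checkpointing trades against recomputation.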

15m read time | From developer.nvidia.com
Table of contents
How to write a 2D Navier-Stokes solver using Warp
Differentiating through the solver
Warp in practice: Case studies of AI-driven industrial workflows
Get started with Warp for computational physics applications
