Running `pip install vllm` looks simple, but building vLLM to work across multiple hardware accelerators (NVIDIA, AMD, Intel Gaudi, Google TPU) involves enormous build engineering complexity. The post details the challenges: HIPification of CUDA kernels for ROCm, tight version coupling between PyTorch/Triton/AOTriton, custom …
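The HIPification the teaser mentions is, at its core, a mechanical source-to-source translation of CUDA API names into their HIP equivalents. The sketch below is a deliberately minimal illustration of that idea; the real tools (hipify-perl, hipify-clang) handle vastly more cases, including header rewrites, type renames, and kernel-launch syntax.

```python
# Minimal sketch of what HIPification does: mechanically mapping CUDA API
# names to their HIP counterparts. Illustrative only -- real hipify tools
# cover hundreds of APIs and do proper parsing, not string replacement.
CUDA_TO_HIP = {
    "cuda_runtime.h": "hip/hip_runtime.h",
    "cudaMalloc": "hipMalloc",
    "cudaMemcpy": "hipMemcpy",
    "cudaFree": "hipFree",
    "cudaDeviceSynchronize": "hipDeviceSynchronize",
}

def hipify(source: str) -> str:
    """Translate CUDA runtime calls in `source` to HIP equivalents."""
    for cuda_name, hip_name in CUDA_TO_HIP.items():
        source = source.replace(cuda_name, hip_name)
    return source

cuda_src = "#include <cuda_runtime.h>\nfloat *d_x;\ncudaMalloc(&d_x, n * sizeof(float));"
print(hipify(cuda_src))
```

The point of the illustration: the kernel bodies themselves often carry over unchanged, which is why a translation step (rather than a full rewrite) is viable — but it also means the build must run this translation for every CUDA source file before the ROCm compiler ever sees it.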

7m read time · From developers.redhat.com
Table of contents
- The current landscape
- The challenge
- Deep dive: Building for ROCm
- How we solve it
- Why all this matters
- What's next
