Python's wheel format can only express the Python version, ABI, and platform (OS and CPU architecture); it has no way to declare GPU requirements or CPU instruction sets like AVX2. This forces packages like PyTorch to ship ~900MB fat binaries bundling multiple CUDA versions, and users must configure special index URLs just to install GPU-accelerated
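
To make the limitation concrete, here is a minimal sketch that splits a wheel filename into the fields the format actually carries. The helper name `parse_wheel_filename` and the package `example_pkg` are hypothetical, made up for illustration; the field layout (name, version, optional build tag, Python tag, ABI tag, platform tag) follows the standard wheel naming convention. Note that none of the fields has room for a GPU or instruction-set requirement.

```python
# Hypothetical helper (not part of any library): split a wheel filename
# into the fields defined by the wheel naming convention:
#   {name}-{version}(-{build})?-{python tag}-{abi tag}-{platform tag}.whl
def parse_wheel_filename(filename: str) -> dict:
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected number of fields in {filename!r}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "build": parts[2] if len(parts) == 6 else None,
        "python": python_tag,      # e.g. cp312 = CPython 3.12
        "abi": abi_tag,            # e.g. cp312
        "platform": platform_tag,  # OS + CPU architecture only
    }


# Illustrative filename for a made-up package.
tags = parse_wheel_filename(
    "example_pkg-1.0.0-cp312-cp312-manylinux_2_28_x86_64.whl"
)
for field, value in tags.items():
    print(f"{field}: {value}")
```

The platform tag here, `manylinux_2_28_x86_64`, identifies a Linux baseline and the x86-64 architecture, but says nothing about whether the machine has a GPU or supports AVX2.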