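Seeing the "libtorch_cuda.so: undefined symbol: ncclRedOpDestroy" error? It typically means the NCCL library loaded at runtime does not export ncclRedOpDestroy, usually because it is older than the version PyTorch was built against. Follow these steps to resolve it:
1. Build the newest NCCL: clone the repository and run make.
2. Link the NCCL libraries in site-packages/nvidia: create a symbolic link there that points to the nccl/build folder, so the freshly built libraries are the ones that get loaded.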
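To check whether the fix took effect, it may help to test whether a given NCCL shared library actually exports the missing symbol. The following is a minimal sketch using Python's standard ctypes module. The helper name exports_symbol is hypothetical, and the library path you pass in is an assumption that depends on where your build or your site-packages/nvidia directory places the file.

```python
import ctypes


def exports_symbol(lib_path: str, symbol: str) -> bool:
    """Return True if the shared library at lib_path loads and resolves symbol.

    Hypothetical helper for illustration; not part of PyTorch or NCCL.
    """
    try:
        lib = ctypes.CDLL(lib_path)  # load the shared library
    except OSError:
        # Missing file, wrong architecture, or unresolved dependencies.
        return False
    try:
        getattr(lib, symbol)  # ctypes looks the name up in the loaded library
    except AttributeError:
        # The library loaded, but the symbol could not be found.
        return False
    return True
```

For example, calling exports_symbol("/path/to/nccl/build/lib/libnccl.so.2", "ncclRedOpDestroy") with the path adjusted to your own build (the path shown is a placeholder) should return True for a sufficiently recent NCCL. Running the same check against the library under site-packages/nvidia before and after relinking shows whether the link points at the new build.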