Amazon SageMaker HyperPod now supports continuous provisioning for Slurm-orchestrated clusters. Previously, if any instance group failed to provision, the entire cluster creation or scaling operation rolled back. With continuous provisioning, HyperPod automatically provisions remaining capacity in the background while training