Amazon SageMaker HyperPod now supports continuous provisioning for Slurm-orchestrated clusters. Previously, if any instance group failed to provision, the entire cluster creation or scaling operation rolled back. With continuous provisioning, HyperPod automatically provisions remaining capacity in the background while training

From aws.amazon.com