Pinterest's ML platform team shares a detailed account of upgrading PyTorch from 2.1 to 2.6 in production with zero downtime. The post covers the full cross-stack effort: migrating GPU hosts from Ubuntu 20 to Ubuntu 24 DLAMI with CUDA 12.6 and Nvidia driver 570, bridging breaking LibTorch C++ API changes using a compile-time
Table of contents
Introduction
Challenges
Journey to PyTorch 2.6
Bridging Breaking APIs and Deprecated Caffe2
Production Aftercare
Uncovering a Cgroup Driver Gotcha
Wrap Up
Acknowledgement