Sander Dieleman, research scientist at Google DeepMind, gives a behind-the-scenes overview of training large-scale generative image and video models. The talk covers eight key areas: data curation (often underrated but critical), latent representations via autoencoders to compress pixel data, the mechanics of diffusion models
40m watch time