A conference talk surveying key AI trends, from historical milestones (the 2012 ImageNet breakthrough, the 2017 Transformer architecture, and ChatGPT) through to current and emerging developments. Topics include scaling laws (pre-training, post-training, and test-time scaling for reasoning), agentic AI, multimodal generative AI, and physical AI for robotics. The speaker also covers model optimization and serving techniques, including distillation, speculative decoding, quantization, and disaggregated serving (separating the prefill and decode phases). NVIDIA tools such as Nemotron, the Cosmos models, TensorRT LLM, and Dynamo are highlighted as open-source implementations of these concepts.
•43m watch time