John Schulman, co-founder of OpenAI and now at Thinking Machines, reflects on early OpenAI's ragtag origins, failed projects like Universe, and what it would have taken to build ChatGPT earlier with full hindsight. He discusses why value functions are currently unpopular in RL, the future of continual learning, co-training generators and verifiers, and multi-agent game-based training. He shares his personal AI workflow using Cursor, Claude Code, and GPT-5 Pro for literature search and idea iteration. He also introduces Tinker, a low-level fine-tuning API from Thinking Machines aimed at ML researchers who want to run post-training algorithms without managing GPU infrastructure. The conversation covers research management styles, how the field's talent distribution has shifted toward engineering over research taste, AGI timeline uncertainty, and the challenges of coordinating between major AI labs.

51m watch time
