Karpathy's autoresearch project lets a coding agent autonomously improve a neural network training script by running experiments in a loop. This post scales that setup by giving Claude Code access to 16 GPUs (H100s and H200s) on a Kubernetes cluster via SkyPilot. Over 8 hours, the agent ran ~910 experiments in parallel waves.
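To make the setup concrete, here is a minimal sketch of what a SkyPilot task for one experiment wave could look like. This is an illustrative assumption, not the post's actual config: the script name `train.py`, the cluster name, and the hyperparameter flag are hypothetical, and the `H100:1` accelerator spec simply requests one GPU per task.

```yaml
# Hypothetical SkyPilot task definition (task.yaml) — a sketch, not the
# actual config used in this post.
resources:
  accelerators: H100:1   # one GPU per experiment; H200 works the same way
  cloud: kubernetes      # target the existing Kubernetes cluster

workdir: .               # sync the local training script to the remote pod

run: |
  python train.py --lr $LR   # LR injected per-experiment via --env
```

The agent could then fan out a wave by launching many such tasks with different environment overrides, e.g. `sky launch -c exp-01 task.yaml --env LR=3e-4`, one cluster name per experiment.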
Table of contents

- How autoresearch works
- The bottleneck: one GPU, one experiment
- Giving the agent cloud GPUs
- Results: ~910 experiments, ~8 hours, 16 GPUs
- How parallelism changed the agent's research strategy
- Emergent research strategies: exploiting heterogeneous hardware
- Cost
- Scale Autoresearch on your own GPU cluster