Best of GPU — March 2026

1
Article
TechCrunch·12w
Jensen Huang says Nvidia is pulling back from OpenAI and Anthropic, but his explanation raises more questions than it answers
Nvidia CEO Jensen Huang stated at the Morgan Stanley TMT conference that his company's investments in OpenAI and Anthropic will likely be its last, citing IPO windows closing as the reason. However, the explanation is questioned given several complicating factors: circular investment logic (Nvidia investing in companies that buy its chips), Anthropic CEO Dario Amodei's public criticism of Nvidia's chip export practices, the Trump administration blacklisting Anthropic from federal use, and OpenAI's subsequent Pentagon deal. Nvidia's original $100 billion OpenAI pledge was ultimately reduced to $30 billion. The piece suggests Nvidia may be quietly exiting a politically and commercially entangled situation rather than simply following investment strategy.
84
13
2
Video
DevOps Toolkit·12w
Why Self-Hosting AI Models Is a Bad Idea
A cost analysis arguing against self-hosting large language models. Running frontier open-weight models like Kimi K2.5 requires 4-16 Nvidia H100 GPUs, costing $8,000-$35,000/month in cloud rentals or $150,000-$300,000+ in the first year for owned hardware. By contrast, API access to the same models costs $300-$800/month — 10 to 30 times cheaper. Even smaller models on consumer hardware take years to recoup API savings. The piece also warns that 'open weight' is not 'open source': licenses like Llama's have real restrictions and can change at any time. The recommendation is to use cheap vendor APIs while AI companies are subsidizing costs with VC and government money, avoid lock-in by staying provider-agnostic, and only consider self-hosting in special cases like air-gapped environments or massive existing GPU infrastructure.
29
5
3
Article
SkyPilot·10w
Scaling Karpathy's Autoresearch: What Happens When the Agent Gets a GPU Cluster
Karpathy's autoresearch project lets a coding agent autonomously improve a neural network training script by running experiments in a loop. This post scales that setup by giving Claude Code access to 16 GPUs (H100s and H200s) on a Kubernetes cluster via SkyPilot. Over 8 hours, the agent ran ~910 experiments in parallel waves of 10-13, achieving a 9x throughput increase over single-GPU sequential search. Key findings: parallelism enabled factorial grid search instead of greedy hill-climbing, allowing the agent to discover that scaling model width (aspect ratio 96) outperformed all hyperparameter tuning combined. The agent also autonomously developed a two-tier hardware strategy — screening ideas on cheaper H100s and validating winners on H200s — without being prompted. Total cost was under $300 in GPU compute plus ~$9 in Claude API fees. The full setup is available as an open-source example in the SkyPilot repo.
26
2
4
Article
freeCodeCamp·12w
How to Use Docker Compose for Production Workloads — with Profiles, Watch Mode, and GPU Support
Docker Compose has evolved significantly in 2024-2025 with features that make it viable for complex deployment scenarios beyond local development. Key improvements covered include: profiles for managing multiple environments from a single file, watch mode for instant file syncing without rebuilds, GPU passthrough for ML inference workloads, proper health checks with dependency conditions to eliminate startup race conditions, and Docker Bake integration for production image builds. The guide provides practical configuration examples for each feature, a week-by-week adoption path, and an honest assessment of where Compose still falls short compared to Kubernetes or full orchestration platforms.
18

See all GPU archives