Best of GPUMarch 2026

  1. 1
    Article
    Avatar of tcTechCrunch·12w

    Jensen Huang says Nvidia is pulling back from OpenAI and Anthropic, but his explanation raises more questions than it answers

    Nvidia CEO Jensen Huang stated at the Morgan Stanley TMT conference that his company's investments in OpenAI and Anthropic will likely be its last, citing IPO windows closing as the reason. However, the explanation is questioned given several complicating factors: circular investment logic (Nvidia investing in companies that buy its chips), Anthropic CEO Dario Amodei's public criticism of Nvidia's chip export practices, the Trump administration blacklisting Anthropic from federal use, and OpenAI's subsequent Pentagon deal. Nvidia's original $100 billion OpenAI pledge was ultimately reduced to $30 billion. The piece suggests Nvidia may be quietly exiting a politically and commercially entangled situation rather than simply following investment strategy.

  2. 2
    Video
    Avatar of devops-toolkitDevOps Toolkit·12w

    Why Self-Hosting AI Models Is a Bad Idea

    A cost analysis arguing against self-hosting large language models. Running frontier open-weight models like Kimi K2.5 requires 4-16 Nvidia H100 GPUs, costing $8,000-$35,000/month in cloud rentals or $150,000-$300,000+ in the first year for owned hardware. By contrast, API access to the same models costs $300-$800/month — 10 to 30 times cheaper. Even smaller models on consumer hardware take years to recoup API savings. The piece also warns that 'open weight' is not 'open source': licenses like Llama's have real restrictions and can change at any time. The recommendation is to use cheap vendor APIs while AI companies are subsidizing costs with VC and government money, avoid lock-in by staying provider-agnostic, and only consider self-hosting in special cases like air-gapped environments or massive existing GPU infrastructure.

  3. 3
    Article
    Avatar of skypilotSkyPilot·10w

    Scaling Karpathy's Autoresearch: What Happens When the Agent Gets a GPU Cluster

    Karpathy's autoresearch project lets a coding agent autonomously improve a neural network training script by running experiments in a loop. This post scales that setup by giving Claude Code access to 16 GPUs (H100s and H200s) on a Kubernetes cluster via SkyPilot. Over 8 hours, the agent ran ~910 experiments in parallel waves of 10-13, achieving a 9x throughput increase over single-GPU sequential search. Key findings: parallelism enabled factorial grid search instead of greedy hill-climbing, allowing the agent to discover that scaling model width (aspect ratio 96) outperformed all hyperparameter tuning combined. The agent also autonomously developed a two-tier hardware strategy — screening ideas on cheaper H100s and validating winners on H200s — without being prompted. Total cost was under $300 in GPU compute plus ~$9 in Claude API fees. The full setup is available as an open-source example in the SkyPilot repo.

  4. 4
    Article
    Avatar of freecodecampfreeCodeCamp·12w

    How to Use Docker Compose for Production Workloads — with Profiles, Watch Mode, and GPU Support

    Docker Compose has evolved significantly in 2024-2025 with features that make it viable for complex deployment scenarios beyond local development. Key improvements covered include: profiles for managing multiple environments from a single file, watch mode for instant file syncing without rebuilds, GPU passthrough for ML inference workloads, proper health checks with dependency conditions to eliminate startup race conditions, and Docker Bake integration for production image builds. The guide provides practical configuration examples for each feature, a week-by-week adoption path, and an honest assessment of where Compose still falls short compared to Kubernetes or full orchestration platforms.