Best of GPU: November 2025

  1. Article
    InfoWorld · 29w

    Perplexity’s open-source tool to run trillion-parameter models without costly upgrades

    Perplexity AI released TransferEngine, an open-source tool that lets trillion-parameter language models run at full speed across different cloud providers' GPU hardware. It addresses vendor lock-in by providing a common interface for GPU-to-GPU communication that works on both Nvidia ConnectX and AWS EFA networking hardware. Companies can therefore run massive models like DeepSeek V3 and Kimi K2 on older H100 and H200 systems instead of purchasing expensive next-generation hardware. TransferEngine achieves 400 Gbps throughput using RDMA and already powers Perplexity's production AI search engine, where it handles disaggregated inference, reinforcement learning, and Mixture-of-Experts routing.

  2. Article
    Noted · 28w

    The 10MB Discord Limit Drove Me to Build a Self-Hosted GPU Video Compressor

    A developer built 8mb.local, a self-hosted video compression tool for working around Discord's 10 MB file size limit. The single Docker container includes a SvelteKit UI, a FastAPI backend, and a Celery worker queue, with automatic detection of NVIDIA, Intel, and AMD GPUs for hardware-accelerated encoding. It features target-size-first compression with automatic retry logic, real-time progress streaming via Server-Sent Events, and seamless CPU fallback. Installation requires choosing the docker-compose configuration that matches your hardware, with special attention to NVIDIA driver capabilities and to reverse proxy buffering settings, which must allow SSE streaming to work properly.
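
The target-size-first approach with retries can be illustrated with a minimal sketch. This is not 8mb.local's actual code: the function names, the 96 kbit/s audio allowance, the 3% container overhead, the 0.9 safety factor, and the reading of 10 MB as 10,000,000 bytes are all assumptions made for illustration. The `encode` callable stands in for a real encoder run and is expected to return the output size in bytes.

```python
def target_video_kbps(target_mb, duration_s, audio_kbps=96.0, overhead=0.03):
    """Estimate the video bitrate (kbit/s) that fits a clip into target_mb.

    Assumes decimal megabytes and a fixed audio bitrate; both are
    illustrative choices, not values taken from 8mb.local.
    """
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    total_kbits = target_mb * 1_000_000 * 8 / 1000   # size budget in kbit
    usable_kbits = total_kbits * (1.0 - overhead)     # reserve container overhead
    video_kbps = usable_kbits / duration_s - audio_kbps
    if video_kbps <= 0:
        raise ValueError("clip is too long for the requested target size")
    return video_kbps


def compress_to_target(encode, target_mb, duration_s, max_attempts=4, safety=0.9):
    """Call encode(video_kbps) until the output fits, lowering the bitrate each time.

    encode is a hypothetical callable returning the encoded size in bytes.
    Returns (video_kbps, size_bytes, attempts_used).
    """
    limit_bytes = target_mb * 1_000_000
    kbps = target_video_kbps(target_mb, duration_s)
    for attempt in range(1, max_attempts + 1):
        size_bytes = encode(kbps)
        if size_bytes <= limit_bytes:
            return kbps, size_bytes, attempt
        # Scale the bitrate by the overshoot ratio, plus a small safety margin.
        kbps *= (limit_bytes / size_bytes) * safety
    raise RuntimeError("could not reach the target size within max_attempts")
```

Scaling by the overshoot ratio, rather than by a fixed step, usually converges in one or two retries because encoded size is roughly proportional to bitrate.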

  3. Article
    NVIDIA Developer · 28w

    Release v1.10.0 · NVIDIA/warp

    NVIDIA Warp v1.10.0 introduces experimental JAX automatic differentiation support and multi-device compatibility with jax.pmap(). The release enhances tile programming with axis-specific reductions and component-level indexing, while delivering significant performance improvements including up to 70× faster built-in function calls from Python and in-place BVH rebuilding with CUDA graph support. New features include negative array indexing, atomic bitwise operations, and error functions. The warp.sim module has been removed after deprecation, with users directed to migrate to the Newton physics engine.

  4. Article
    Where's Your Ed At · 27w

    The Hater's Guide To NVIDIA

    NVIDIA dominates the AI hardware market by selling increasingly expensive GPUs (from $10,000 A100s to $30,000+ B200s) that power large language models. The company's success depends on customers—primarily Microsoft, Google, Meta, and Amazon—continuously purchasing new GPU generations, often funded through massive debt. Building a small 25MW AI data center costs over $1 billion, with $600 million for GPUs alone, plus 20 acres of land and 6-18 months of construction. Despite NVIDIA's $50+ billion quarterly revenue and 8% weight in the S&P 500, the underlying economics appear unsustainable: AI companies generate only ~$61 billion in revenue annually while spending hundreds of billions on infrastructure, with no clear path to profitability.

  5. Article
    TechCentral · 29w

    China’s DeepSeek warns of social upheaval from AI

    DeepSeek senior researcher Chen Deli made a rare public appearance at China's World Internet Conference, expressing concerns about AI's long-term societal impact. While optimistic about the technology itself, Chen warned that AI could cause widespread job displacement within 5-10 years and create massive social challenges in 10-20 years. DeepSeek gained global attention in January for releasing a low-cost AI model that outperformed leading US models. The company upgraded its V3 model in September and has become central to China's efforts to build a domestic AI ecosystem, with Chinese chip makers like Cambricon and Huawei developing hardware compatible with DeepSeek's models.