Best of GPU — November 2025

1
Article
InfoWorld·29w
Perplexity’s open-source tool to run trillion-parameter models without costly upgrades
Perplexity AI released TransferEngine, an open-source tool that lets trillion-parameter language models run at full speed across different cloud providers' GPU hardware. The software addresses vendor lock-in by providing a universal interface for GPU-to-GPU communication that works with both Nvidia ConnectX and AWS EFA network adapters. This allows companies to run massive models like DeepSeek V3 and Kimi K2 on older H100 and H200 systems instead of purchasing expensive next-generation hardware. TransferEngine achieves 400 Gbps throughput using RDMA and already powers Perplexity's production AI search engine, where it handles disaggregated inference, reinforcement learning, and Mixture-of-Experts routing.
2
Article
Noted·28w
The 10MB Discord Limit Drove Me to Build a Self-Hosted GPU Video Compressor
A developer built 8mb.local, a self-hosted video compression tool that works around Discord's 10MB file size limit. The single Docker container includes a SvelteKit UI, a FastAPI backend, and a Celery worker queue, with automatic GPU detection for NVIDIA, Intel, and AMD hardware acceleration. It features target-size-first compression with automatic retry logic, real-time progress streaming via Server-Sent Events (SSE), and seamless CPU fallback. Installation requires choosing the docker-compose configuration that matches your hardware, paying particular attention to NVIDIA driver capabilities and to reverse proxy buffering settings, which must allow SSE streaming to work properly.
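As a rough illustration of what "target-size-first compression with automatic retry" can mean, the sketch below derives a video bitrate from a size budget and retries at a lower bitrate when the output overshoots. It is not taken from 8mb.local: the function names, the `encode` callback, the 128 kbit/s audio reservation, the 2% container overhead, and the decimal-megabyte convention are all assumptions made for the example.

```python
def target_bitrate_kbps(target_mb, duration_s, audio_kbps=128, overhead=0.02):
    """Video bitrate (kbit/s) expected to fit within target_mb.

    Assumes 1 MB = 1,000,000 bytes and reserves a fraction of the budget
    for container overhead plus a fixed audio bitrate (both hypothetical).
    """
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    total_kbps = target_mb * 8_000_000 * (1 - overhead) / duration_s / 1000
    video_kbps = total_kbps - audio_kbps
    if video_kbps <= 0:
        raise ValueError("target size too small for this duration")
    return video_kbps


def compress_to_target(encode, target_mb, duration_s, max_attempts=4, shrink=0.9):
    """Call encode(video_kbps) until the result fits the size budget.

    `encode` is a hypothetical callback that runs the encoder and returns
    the output size in bytes. Returns (video_kbps, size_bytes, attempt).
    """
    target_bytes = target_mb * 1_000_000
    kbps = target_bitrate_kbps(target_mb, duration_s)
    for attempt in range(1, max_attempts + 1):
        size = encode(kbps)
        if size <= target_bytes:
            return kbps, size, attempt
        # Overshot: scale down by the observed ratio, and by at least `shrink`.
        kbps *= min(shrink, target_bytes / size)
    raise RuntimeError("could not reach target size within max_attempts")
```

In a real tool the `encode` callback would wrap the actual encoder invocation; here it is left abstract so the size-budget arithmetic and the retry loop stay visible.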
3
Article
NVIDIA Developer·28w
Release v1.10.0 · NVIDIA/warp
NVIDIA Warp v1.10.0 introduces experimental JAX automatic differentiation support and multi-device compatibility with jax.pmap(). The release enhances tile programming with axis-specific reductions and component-level indexing, while delivering significant performance improvements including up to 70× faster built-in function calls from Python and in-place BVH rebuilding with CUDA graph support. New features include negative array indexing, atomic bitwise operations, and error functions. The warp.sim module has been removed after deprecation, with users directed to migrate to the Newton physics engine.
4
Article
Where's Your Ed At·27w
The Hater's Guide To NVIDIA
NVIDIA dominates the AI hardware market by selling increasingly expensive GPUs (from $10,000 A100s to $30,000+ B200s) that power large language models. The company's success depends on customers, primarily Microsoft, Google, Meta, and Amazon, continuously purchasing new GPU generations, often funded through massive debt. Building even a small 25MW AI data center costs over $1 billion, including $600 million for GPUs alone, and requires 20 acres of land and 6-18 months of construction. Despite NVIDIA's $50+ billion quarterly revenue and 8% weight in the S&P 500, the underlying economics appear unsustainable: AI companies generate only ~$61 billion in revenue annually while spending hundreds of billions on infrastructure, with no clear path to profitability.
5
Article
TechCentral·29w
China’s DeepSeek warns of social upheaval from AI
DeepSeek senior researcher Chen Deli made a rare public appearance at China's World Internet Conference, expressing concerns about AI's long-term societal impact. While optimistic about the technology itself, Chen warned that AI could cause widespread job displacement within 5-10 years and create massive social challenges in 10-20 years. DeepSeek gained global attention in January for releasing a low-cost AI model that outperformed leading US models. The company upgraded its V3 model in September and has become central to China's efforts to build a domestic AI ecosystem, with Chinese chip makers like Cambricon and Huawei developing hardware compatible with DeepSeek's models.
