Best of GPU: February 2026

  1.
    Article
    HelixML · 15w

    GPU Virtualization Architecture for Multi-Desktop Containers

    A deep technical dive into building GPU-accelerated multi-desktop virtualization on Apple Silicon. It covers the full stack from the virtio-gpu driver through QEMU to Metal, focusing on deadlock bugs that emerge when scaling from one or two concurrent desktops to four or more. Key issues include a global renderer_blocked semaphore that causes cross-scanout freezes, FIFO command queue blocking, broken fence polling timers, and contention on the DRM mode_config.mutex. The fixes involve per-context isolation, thread-based fence polling workarounds, and removing synchronous operations from critical paths.
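
The cross-scanout freeze described in this summary can be illustrated with a toy model. The sketch below is not QEMU or virtio-gpu code: the class names, the `stalled_scanouts` helper, and the use of a plain counter in place of the semaphore are all assumptions made for illustration. It only shows why one piece of global blocking state stalls every desktop, while per-context state stalls just the busy one.

```python
from collections import defaultdict


class GlobalBlockRenderer:
    """Toy model (hypothetical): a single blocked counter shared by every scanout."""

    def __init__(self):
        self.blocked = 0

    def block(self, scanout_id):
        # scanout_id is ignored on purpose: the blocking state is global.
        self.blocked += 1

    def unblock(self, scanout_id):
        self.blocked -= 1

    def can_process(self, scanout_id):
        return self.blocked == 0


class PerContextRenderer:
    """Toy model (hypothetical): one blocked counter per scanout."""

    def __init__(self):
        self.blocked = defaultdict(int)

    def block(self, scanout_id):
        self.blocked[scanout_id] += 1

    def unblock(self, scanout_id):
        self.blocked[scanout_id] -= 1

    def can_process(self, scanout_id):
        return self.blocked[scanout_id] == 0


def stalled_scanouts(renderer, busy_id, count=4):
    """Block one scanout and list the other scanouts that cannot make progress."""
    renderer.block(busy_id)
    stalled = [
        s for s in range(count)
        if s != busy_id and not renderer.can_process(s)
    ]
    renderer.unblock(busy_id)
    return stalled


if __name__ == "__main__":
    print(stalled_scanouts(GlobalBlockRenderer(), busy_id=0))  # [1, 2, 3]
    print(stalled_scanouts(PerContextRenderer(), busy_id=0))   # []
```

With four desktops, blocking scanout 0 stalls the other three in the global model and none in the per-context model.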

  2.
    Article
    Chrome Developers · 15w

    What's New in WebGPU (Chrome 145)

    Chrome 145 introduces the WGSL subgroup_uniformity extension, which improves uniformity analysis at the subgroup level, allowing more values to be considered subgroup-uniform. An experimental synchronous buffer mapping feature (mapSync()) is now available in workers to reduce friction between WebGPU and application code. Dawn updates include feature replacements (TextureFormatTier1 superseding R8UnormStorage), nightly binary releases on GitHub, and ExternalTexture support in Emdawnwebgpu.

  3.
    Video
    Two Minute Papers · 15w

    This Is Now 66x Faster

    A new cloth physics simulation algorithm achieves a 66x speedup over previous methods by using domain decomposition on CPUs instead of traditional GPU parallelization. The technique splits complex simulations into smaller chunks that CPU cores solve independently, then stitches the boundaries together, which avoids the iterative communication overhead of GPU approaches. This CPU-based method handles millions of degrees of freedom and complex self-collisions while outperforming even GPU implementations by 2.6x.
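
The split, solve, and stitch idea can be sketched on a much simpler problem. The example below is not the algorithm from the video. It swaps in classic non-overlapping domain decomposition (a Schur-complement interface solve) on a 1D chain of unit springs, the system tridiag(-1, 2, -1) x = rhs. The function names and the single interface node `m` are assumptions made for illustration.

```python
def solve_chain(rhs):
    """Solve tridiag(-1, 2, -1) x = rhs directly with the Thomas algorithm."""
    n = len(rhs)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = -0.5
    d[0] = rhs[0] / 2.0
    for i in range(1, n):
        denom = 2.0 + c[i - 1]
        c[i] = -1.0 / denom
        d[i] = (rhs[i] + d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def solve_decomposed(rhs, m):
    """Solve the same system by splitting it at interface node m."""
    if not 1 <= m <= len(rhs) - 2:
        raise ValueError("interface node must have neighbours on both sides")
    left, right = rhs[:m], rhs[m + 1:]

    # Independent subdomain solves: neither call needs data from the other,
    # so in principle each could run on its own CPU core.
    y_left = solve_chain(left)
    y_right = solve_chain(right)

    # Response of each subdomain to a unit coupling from the interface node.
    unit_left = [0.0] * len(left)
    unit_left[-1] = 1.0
    unit_right = [0.0] * len(right)
    unit_right[0] = 1.0
    z_left = solve_chain(unit_left)
    z_right = solve_chain(unit_right)

    # Stitch: a single scalar equation (the Schur complement) for node m.
    s = 2.0 - z_left[-1] - z_right[0]
    g = rhs[m] + y_left[-1] + y_right[0]
    x_m = g / s

    # Back-substitute the interface value into each subdomain.
    x_left = [y + z * x_m for y, z in zip(y_left, z_left)]
    x_right = [y + z * x_m for y, z in zip(y_right, z_right)]
    return x_left + [x_m] + x_right


if __name__ == "__main__":
    rhs = [1.0, 0.0, 2.0, -1.0, 3.0, 0.5, 0.0, 1.0, 4.0]
    direct = solve_chain(rhs)
    split = solve_decomposed(rhs, m=4)
    print(max(abs(a - b) for a, b in zip(direct, split)))
```

The decomposed solve agrees with the direct solve up to floating-point rounding. The subdomains exchange data only once, through the interface equation, instead of on every iteration.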