Making SLH-DSA 10x-100x Faster
A deep dive into optimizing SLH-DSA (formerly SPHINCS+), the NIST post-quantum signature algorithm, achieving 10x-100x speedups over naive implementations. The author benchmarks several optimization strategies: SHA-256 midstate caching (2x speedup), XMSS root tree caching (~14% reduction), SHA-NI hardware acceleration (2.5x), AVX2 SIMD vectorization (4x for signing), CPU multithreading (~40% reduction), and finally Vulkan compute shaders running on both CPU and GPU. The Vulkan GPU approach achieves keygen in ~0.16ms and signing in ~2.6ms, approaching ECDSA verification latency. A custom Rust+C implementation called slhvk is publicly released and supports bulk keygen and bulk verification for maximum throughput.
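To give a feel for the midstate caching idea mentioned above, here is a minimal Python sketch. It relies on the fact that in the SHA-256 instantiation of the tweakable hash F, the input starts with PK.seed zero-padded to one full 64-byte block, so that block is the same for every call under a given key and its compression can be done once. The function names (`make_midstate`, `f_naive`, `f_cached`) and the example values are hypothetical, chosen for illustration; this is not the slhvk code, and `hashlib`'s `copy()` only stands in for saving the raw SHA-256 state as an optimized implementation would.

```python
import hashlib

N = 16           # output length n in bytes (assumption: a 128-bit parameter set)
BLOCK_SIZE = 64  # SHA-256 block size in bytes


def make_midstate(pk_seed: bytes):
    """Absorb PK.seed padded to one SHA-256 block and return the hash object."""
    h = hashlib.sha256()
    h.update(pk_seed + bytes(BLOCK_SIZE - len(pk_seed)))
    return h


def f_naive(pk_seed: bytes, adrs_c: bytes, msg: bytes) -> bytes:
    """Hash the whole input from scratch on every call."""
    padded_seed = pk_seed + bytes(BLOCK_SIZE - len(pk_seed))
    return hashlib.sha256(padded_seed + adrs_c + msg).digest()[:N]


def f_cached(midstate, adrs_c: bytes, msg: bytes) -> bytes:
    """Resume from the cached state, so the first block is never re-hashed."""
    h = midstate.copy()  # leaves the cached midstate untouched
    h.update(adrs_c + msg)
    return h.digest()[:N]


# Example values (hypothetical): a 16-byte seed, a 22-byte compressed address,
# and a 16-byte message.
pk_seed = bytes(range(16))
adrs_c = bytes(22)
msg = b"\x01" * 16

midstate = make_midstate(pk_seed)
digest_cached = f_cached(midstate, adrs_c, msg)
digest_naive = f_naive(pk_seed, adrs_c, msg)
```

With these sizes the naive path feeds 102 bytes to SHA-256, which takes two compression calls, while the cached path only processes the trailing 38 bytes, which takes one. That is consistent with the roughly 2x figure quoted in the summary.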
Table of contents
SLH-DSA Design
Parameter Sets
Experimental Implementation
Performance Profile
Midstate Caching
XMSS Tree Caching
Impact
Hardware Acceleration
Impact
Vectorized Hashing
Impact
Multithreading
Vulkan Is Awful
Vulkan for SLH-DSA
slhvk
GPUs
Caveats
Visualization
Related Works
Further Research