A deep technical walkthrough of accelerating Random Forest ML inference on a GPU for Rastair, a bioinformatics variant and methylation caller. The post covers why Random Forests are used for ambiguous genomic position classification, how the tree data structure was flattened into a GPU-friendly BFS array layout with a 16-byte node struct, and how two WGSL compute shaders (traverse and reduce) handle inference. It also details multithreading with Rayon, pipelined GPU submission across three models, f32 precision verification, and a zero-copy unified memory optimization for Apple Silicon via wgpu's MAPPABLE_PRIMARY_BUFFERS feature.
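As a rough illustration of the flattened layout mentioned above, here is a minimal CPU-side sketch in Rust. Everything in it is an assumption made for illustration and is not taken from Rastair: the field names, the way the 16 bytes are split, the `LEAF` sentinel, and the rule that the right child sits directly after the left one (which breadth-first ordering makes possible, since siblings end up adjacent). The post's actual struct and its WGSL shaders may look quite different.

```rust
/// Hypothetical 16-byte node (four 4-byte fields). The real layout may differ.
#[repr(C)]
#[derive(Clone, Copy, Debug)]
struct Node {
    feature: u32,   // index of the feature to test (unused for leaves)
    threshold: f32, // go left when x[feature] <= threshold
    left: u32,      // index of the left child; the right child is left + 1
    value: f32,     // prediction stored in a leaf (unused for inner nodes)
}

/// Assumed sentinel marking a node as a leaf.
const LEAF: u32 = u32::MAX;

/// Walk one flattened tree from the root (index 0) down to a leaf.
fn predict_tree(nodes: &[Node], x: &[f32]) -> f32 {
    let mut i = 0usize;
    loop {
        let n = nodes[i];
        if n.left == LEAF {
            return n.value;
        }
        // Siblings are adjacent in BFS order, so one child index is enough.
        i = if x[n.feature as usize] <= n.threshold {
            n.left as usize
        } else {
            n.left as usize + 1
        };
    }
}

/// Average the per-tree predictions: a "traverse" step followed by a "reduce" step.
fn predict_forest(trees: &[Vec<Node>], x: &[f32]) -> f32 {
    let sum: f32 = trees.iter().map(|t| predict_tree(t, x)).sum();
    sum / trees.len() as f32
}

/// Tiny example tree: one split on feature 0 at 0.5, with two leaves.
fn example_tree() -> Vec<Node> {
    vec![
        Node { feature: 0, threshold: 0.5, left: 1, value: 0.0 },
        Node { feature: 0, threshold: 0.0, left: LEAF, value: 0.0 },
        Node { feature: 0, threshold: 0.0, left: LEAF, value: 1.0 },
    ]
}
```

A fixed-size, pointer-free node like this can be copied into a GPU storage buffer as a plain array, which is the main appeal of such a layout. On the GPU, the loop in `predict_tree` would run once per (sample, tree) pair and the averaging would move into a separate reduction pass.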

11m read time · From deterministic.space
Table of contents: Context, Using ML, Using a Compute Shader, What’s next
