The post discusses applying block sparsity to the MLP modules of vision transformers, showing promising speedups with minimal accuracy drop. It explains the training and inference steps, provides microbenchmarking results, and reports the speedup and accuracy achieved on a specific ViT model. Future steps and potential improvements are also outlined.
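To illustrate the core idea (this is a hedged sketch, not the post's actual code): block sparsity stores only the nonzero blocks of a weight matrix, so compute over zero blocks is skipped entirely during a matrix-vector product, which is where the speedup comes from. The block size, matrix values, and helper names below are all assumptions for illustration.

```python
# Illustrative block-sparse matrix-vector product: store only nonzero
# blocks and never visit zero blocks (the source of the claimed speedup).
B = 2  # assumed block size; real deployments use larger blocks

# Dense 4x4 matrix whose bottom-left 2x2 block is entirely zero.
dense = [
    [1, 2, 0, 1],
    [3, 4, 1, 0],
    [0, 0, 5, 6],
    [0, 0, 7, 8],
]

def to_block_sparse(mat, b):
    """Keep only blocks containing a nonzero entry, keyed by (bi, bj)."""
    n = len(mat)
    blocks = {}
    for bi in range(n // b):
        for bj in range(n // b):
            blk = [row[bj*b:(bj+1)*b] for row in mat[bi*b:(bi+1)*b]]
            if any(any(v != 0 for v in row) for row in blk):
                blocks[(bi, bj)] = blk
    return blocks

def block_sparse_matvec(blocks, x, b, n):
    """Multiply the block-sparse matrix by vector x, skipping zero blocks."""
    y = [0.0] * n
    for (bi, bj), blk in blocks.items():  # zero blocks never appear here
        for i in range(b):
            for j in range(b):
                y[bi*b + i] += blk[i][j] * x[bj*b + j]
    return y

blocks = to_block_sparse(dense, B)
print(len(blocks))                               # 3 of 4 blocks stored
print(block_sparse_matvec(blocks, [1, 1, 1, 1], B, 4))  # [4.0, 8.0, 11.0, 15.0]
```

In a ViT MLP, the same idea applies to the two large linear layers, where pruning whole blocks (rather than individual weights) lets hardware-friendly block-sparse kernels deliver real wall-clock gains.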
6-minute read · From pytorch.org