Researchers from China Propose Vision Mamba (Vim): A New Generic Vision Backbone With Bidirectional Mamba Blocks

Researchers propose Vision Mamba (Vim), a new generic vision backbone built from bidirectional Mamba blocks. Vim combines position embeddings, which provide location-aware visual recognition, with bidirectional state space models (SSMs), which provide data-dependent global visual context modeling. According to the authors, it matches the modeling power of ViT without using attention and outperforms DeiT.
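
To make the idea of a bidirectional scan concrete, here is a minimal toy sketch in plain Python. It is not the authors' implementation and not the real Mamba layer: it assumes scalar tokens, a fixed linear recurrence with hypothetical parameters `a`, `b`, `c` (real Mamba makes these input-dependent), and it simply sums the forward and backward outputs. The function names `ssm_scan` and `bidirectional_block` are invented for illustration.

```python
def ssm_scan(xs, a=0.5, b=1.0, c=1.0):
    """Toy scalar state space scan: h_t = a*h_{t-1} + b*x_t, y_t = c*h_t.

    Assumption: a, b, c are fixed constants. In Mamba they depend on the input.
    """
    h = 0.0
    ys = []
    for x in xs:
        h = a * h + b * x
        ys.append(c * h)
    return ys


def bidirectional_block(tokens, pos_embed):
    """Add position embeddings, scan in both directions, and sum the results."""
    xs = [t + p for t, p in zip(tokens, pos_embed)]
    forward = ssm_scan(xs)
    # Scan right-to-left, then flip back so outputs line up with token order.
    backward = ssm_scan(xs[::-1])[::-1]
    return [f + g for f, g in zip(forward, backward)]


# A single "active" token in the middle of a 5-token sequence (hypothetical data).
tokens = [0.0, 0.0, 1.0, 0.0, 0.0]
pos_embed = [0.0] * 5  # zeros here so the effect of the scan is easy to read
out = bidirectional_block(tokens, pos_embed)
print(out)  # [0.25, 0.5, 2.0, 0.5, 0.25]
```

In this toy, a forward-only scan would pass the middle token's signal only to later positions. Adding the backward scan lets every position receive context from both sides, which is the motivation for making the blocks bidirectional on image patch sequences.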