CompreSSM, an algorithm from MIT CSAIL, strips unnecessary complexity from AI models during training, making them faster while they are still learning. It lets the model find its own efficient structure while cutting compute costs.

MIT News

MIT CSAIL researchers developed CompreSSM, a technique that compresses AI state-space models during training rather than afterward. Using Hankel singular values, a control-theory measure of how much each internal state contributes to a system's input-output behavior, the method identifies unimportant state dimensions after just 10% of training and removes them, so the remaining 90% runs at the speed of a smaller model. On Mamba, it achieves ~4x training speedups while maintaining competitive accuracy. CompreSSM outperforms both post-training pruning and knowledge distillation, and it is 40x faster than spectral regularization alternatives. The approach is theoretically grounded in Weyl's theorem and targets multi-input, multi-output state-space architectures, with planned extensions toward linear attention and transformer-adjacent models.
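To make the Hankel singular value idea concrete, here is a minimal NumPy sketch for a small discrete-time linear state-space system. It is a textbook illustration, not the CompreSSM implementation: the function names, the toy matrices, and the 99% cumulative-energy rule for choosing how many states to keep are all assumptions made for this example.

```python
import numpy as np


def discrete_gramian(A, M):
    """Solve X = A X A^T + M (discrete Lyapunov equation) by vectorization.

    Suitable only for small state sizes, since it builds an n^2 x n^2 system.
    """
    n = A.shape[0]
    lhs = np.eye(n * n) - np.kron(A, A)
    return np.linalg.solve(lhs, M.reshape(-1)).reshape(n, n)


def hankel_singular_values(A, B, C):
    """Hankel singular values of x[k+1] = A x[k] + B u[k], y[k] = C x[k].

    Assumes A is stable (all eigenvalues strictly inside the unit circle).
    """
    P = discrete_gramian(A, B @ B.T)    # controllability Gramian
    Q = discrete_gramian(A.T, C.T @ C)  # observability Gramian
    eigvals = np.linalg.eigvals(P @ Q).real
    # Clip tiny negative values caused by round-off before taking the root.
    return np.sort(np.sqrt(np.clip(eigvals, 0.0, None)))[::-1]


def num_states_to_keep(hsv, energy=0.99):
    """Hypothetical selection rule: the smallest number of states whose
    Hankel singular values hold at least `energy` of the total sum."""
    cumulative = np.cumsum(hsv) / np.sum(hsv)
    return min(int(np.searchsorted(cumulative, energy)) + 1, len(hsv))


# Toy system: the second state is only weakly coupled to input and output.
A = np.diag([0.9, 0.5])
B = np.array([[1.0], [1e-2]])
C = np.array([[1.0, 1e-2]])

hsv = hankel_singular_values(A, B, C)
keep = num_states_to_keep(hsv)
print("Hankel singular values:", hsv)
print("States to keep:", keep)
```

In this toy system the second state barely affects the output, so its Hankel singular value is several orders of magnitude smaller than the first and the selection rule keeps a single state. A full reduction would also need a change of coordinates (balanced truncation) before dropping states, which this sketch omits.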

New technique makes AI models leaner and faster while they’re still learning