A new technique transforms any computer vision model into one that can explain its predictions using a set of concepts a human could understand. The method generates more appropriate concepts that boost the accuracy of the model.

MIT News

MIT researchers developed a new method to improve concept bottleneck models (CBMs) for AI explainability in computer vision. Instead of relying on human-defined or LLM-generated concepts, the technique uses a sparse autoencoder to extract concepts the model already learned during training, then translates them into plain language using a multimodal LLM. The approach converts any pretrained computer vision model into one that explains its predictions using up to five human-understandable concepts. Tested on bird species classification and skin lesion identification, the method outperformed existing CBMs in both accuracy and explanation quality, though a gap remains compared to non-interpretable black-box models.
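
The pipeline described above can be illustrated with a toy sketch. This is not the researchers' implementation: the weights are random rather than trained, the dimensions are arbitrary, and every name (`encode_concepts`, `keep_top_k`, `predict_with_explanation`, the `concept_N` labels) is hypothetical. It only shows the general shape of the idea: a frozen model's features pass through a sparse encoder, at most five concept activations are kept, and a linear layer predicts the class from those concepts alone, so the surviving concepts double as the explanation.

```python
import random

random.seed(0)

FEATURE_DIM = 8    # size of the frozen backbone's feature vector (assumed)
NUM_CONCEPTS = 12  # size of the sparse autoencoder dictionary (assumed)
NUM_CLASSES = 3    # number of output classes (assumed)
TOP_K = 5          # the article says explanations use up to five concepts


def random_matrix(rows, cols):
    """Untrained stand-in for learned weights; a real system would train these."""
    return [[random.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


W_enc = random_matrix(NUM_CONCEPTS, FEATURE_DIM)  # hypothetical encoder weights
W_cls = random_matrix(NUM_CLASSES, NUM_CONCEPTS)  # hypothetical classifier weights

# Placeholder labels; in the article a multimodal LLM writes plain-language ones.
concept_names = [f"concept_{i}" for i in range(NUM_CONCEPTS)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def encode_concepts(features):
    """Sparse-autoencoder-style encoder: linear map followed by ReLU."""
    return [max(a, 0.0) for a in matvec(W_enc, features)]


def keep_top_k(activations, k=TOP_K):
    """Zero out everything except the k largest activations."""
    order = sorted(range(len(activations)), key=lambda i: activations[i], reverse=True)
    keep = set(order[:k])
    return [a if i in keep else 0.0 for i, a in enumerate(activations)]


def predict_with_explanation(features):
    """Classify from at most TOP_K concepts and report which ones were used."""
    concepts = keep_top_k(encode_concepts(features))
    logits = matvec(W_cls, concepts)
    label = max(range(NUM_CLASSES), key=lambda c: logits[c])
    used = [i for i, a in enumerate(concepts) if a > 0.0]
    used.sort(key=lambda i: concepts[i], reverse=True)  # strongest concept first
    return label, [(concept_names[i], concepts[i]) for i in used]


# Stand-in for an embedding produced by a pretrained vision model.
features = [random.gauss(0.0, 1.0) for _ in range(FEATURE_DIM)]
label, explanation = predict_with_explanation(features)
print("predicted class:", label)
for name, strength in explanation:
    print(f"  {name}: {strength:.2f}")
```

Because the classifier sees only the concepts that survive `keep_top_k`, the printed list is a complete account of what drove the prediction in this toy setup. How the actual method trains the autoencoder, selects concepts, and names them is not specified in this summary.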

Improving AI models’ ability to explain their predictions