This blog post discusses the architecture and findings of Apple's MM1 paper on Multimodal Large Language Models. It explores how input is abstracted for Large Language Models, covering image encoders, vision-language connectors, and the results of different ablations over model design and pre-training data. The post highlights the impact of image resolution on model performance.

6 min read · towardsdatascience.com
Table of contents
Image Encoder Ablations
VL Connection Ablations
Pre-Training Data Ablations
Results
Closing Thoughts
