Salesforce AI Releases BLIP3-o: A Fully Open-Source Unified Multimodal Model Built with CLIP Embeddings and Flow Matching for Image Understanding and Generation

Salesforce Research and academic partners have introduced BLIP3-o, a fully open-source unified multimodal model that combines CLIP image embeddings with Flow Matching for image understanding and generation. Training is sequential: the model is trained first on image understanding and then on image generation, which preserves its understanding ability while the generation ability is added. Two versions, which differ in parameter count and training data sources, are reported to outperform existing models on benchmarks for both tasks.
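For readers unfamiliar with Flow Matching, the toy sketch below illustrates the general idea on a plain list of floats that stands in for an image embedding. It is not BLIP3-o's code: it assumes the common straight-line (rectified-flow style) formulation, and every name in it (`interpolate`, `target_velocity`, `flow_matching_loss`, `euler_sample`, `predict_velocity`) is hypothetical. In a real system `predict_velocity` would be a learned network; here it is just a callable.

```python
import random


def interpolate(x0, x1, t):
    """Point on the straight path from noise x0 (t=0) to target x1 (t=1)."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]


def target_velocity(x0, x1):
    """Velocity of the straight path, constant in t: d/dt x_t = x1 - x0."""
    return [b - a for a, b in zip(x0, x1)]


def flow_matching_loss(predict_velocity, x1, rng):
    """One-sample flow matching loss for a target feature vector x1.

    predict_velocity(x_t, t) is a hypothetical stand-in for the learned model.
    """
    x0 = [rng.gauss(0.0, 1.0) for _ in x1]  # Gaussian noise sample
    t = rng.random()                         # time drawn from [0, 1)
    x_t = interpolate(x0, x1, t)
    v_target = target_velocity(x0, x1)
    v_pred = predict_velocity(x_t, t)
    # Mean squared error between predicted and target velocity
    return sum((p - q) ** 2 for p, q in zip(v_pred, v_target)) / len(x1)


def euler_sample(predict_velocity, x0, steps=10):
    """Generate a sample by integrating dx/dt = v(x, t) from t=0 to t=1."""
    x = list(x0)
    dt = 1.0 / steps
    for i in range(steps):
        v = predict_velocity(x, i * dt)
        x = [a + dt * b for a, b in zip(x, v)]
    return x
```

Training minimizes `flow_matching_loss` over many targets, noise samples, and times; generation then starts from noise and calls `euler_sample` with the trained model. In a unified multimodal model the velocity predictor would also take conditioning (for example, a text prompt), which this sketch omits.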