SAM 3 just dropped, and it's a big deal
Meta released SAM 3, an open-source computer vision model that enables text-based object segmentation in images and videos. The model supports multiple input methods including text prompts, clicks, and bounding boxes, and can track objects across video frames. Trained on over 4 million unique concepts, it reportedly delivers double the accuracy of competing systems on open-vocabulary segmentation tasks. The model is available on GitHub with weights and starter notebooks.