Researchers from the University of Maryland and Meta AI propose a method for mapping text to concept vectors in off-the-shelf vision encoders that were trained without text supervision. The method aligns a vision model's representation space with that of a CLIP model by learning a mapping between the two spaces, so that this text-to-concept capability carries over to off-the-shelf models.
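A mapping between two representation spaces, as described above, can be illustrated with a small sketch. This is not the authors' implementation: it assumes the mapping is a linear map with a bias fit by ordinary least squares, which the summary does not specify. The features are synthetic stand-ins for real encoder outputs, and the names `fit_linear_aligner` and `cosine_scores` are hypothetical.

```python
import numpy as np

def fit_linear_aligner(source_feats, target_feats):
    """Fit W, b minimizing ||source_feats @ W + b - target_feats||^2.

    Assumption: the mapping is linear; the source text does not say so.
    """
    n = source_feats.shape[0]
    # Append a column of ones so the bias is learned jointly with the weights.
    augmented = np.hstack([source_feats, np.ones((n, 1))])
    solution, *_ = np.linalg.lstsq(augmented, target_feats, rcond=None)
    return solution[:-1], solution[-1]

def cosine_scores(aligned_feats, concept_vector):
    """Cosine similarity between each aligned feature row and a concept vector."""
    a = aligned_feats / np.linalg.norm(aligned_feats, axis=1, keepdims=True)
    c = concept_vector / np.linalg.norm(concept_vector)
    return a @ c

# Synthetic stand-ins: in practice these would be features of the same images
# from a vision-only encoder (16-d here) and from CLIP's image encoder (8-d here).
rng = np.random.default_rng(0)
vision_feats = rng.normal(size=(200, 16))
true_map = rng.normal(size=(16, 8))
clip_feats = vision_feats @ true_map + 0.01 * rng.normal(size=(200, 8))

W, b = fit_linear_aligner(vision_feats, clip_feats)
aligned = vision_feats @ W + b  # vision features expressed in the CLIP-like space

# Placeholder for a text embedding: a real pipeline would use the output of
# CLIP's text encoder for a prompt; here one target row stands in for it.
text_concept = clip_feats[0]
scores = cosine_scores(aligned, text_concept)
```

Once the vision model's features are expressed in the CLIP-like space, any vector living in that space can be compared against them directly, which is what makes a text-derived concept vector usable with an encoder that never saw text during training.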

From marktechpost.com (4 min read)