Best of Computer Vision · June 2024

  1.
    Article
    Machine Learning News·2y

    Meet Tsinghua University’s GLM-4-9B-Chat-1M: An Outstanding Language Model Challenging GPT 4V, Gemini Pro (on vision), Mistral and Llama 3 8B

    Tsinghua University's GLM-4-9B is an open-source language model presented as challenging GPT-4V and Gemini Pro on vision tasks, as well as Mistral and Llama 3 8B. It supports multi-round dialogue, code execution, web browsing, and more. The model is described as having a versatile architecture, strong performance on vision tasks, and higher overall accuracy than existing models. It has potential applications in natural language processing, computer vision, and code generation. The release is presented as a milestone for language models and a new benchmark for open-source models.

  2.
    Article
    Planet Python·2y

    Computer Vision and Data Science with Python

    Kyle Stratis from Real Python discusses various topics including computer vision, data science, artificial intelligence, and Python packaging.

  3.
    Article
    MIT News·2y

    Understanding the visual knowledge of language models

    Language models trained purely on text have a solid understanding of the visual world. They can generate complex scenes and refine their images. MIT researchers tested the visual knowledge of these models and trained a computer vision system without directly using any visual data. The models show creativity by drawing a concept differently each time, and their visual knowledge can be combined with other AI tools for improved results.