Best of Computer Vision — June 2024

1
Article
Machine Learning News·2y
Meet Tsinghua University’s GLM-4-9B-Chat-1M: An Outstanding Language Model Challenging GPT 4V, Gemini Pro (on vision), Mistral and Llama 3 8B
Tsinghua University's GLM-4 9B is an open-source language model reported to challenge GPT-4V and Gemini Pro on vision tasks, as well as Mistral and Llama 3 8B. It supports multi-round dialogue, code execution, and web browsing, among other capabilities. It is described as having a versatile architecture, excelling in vision tasks, and surpassing existing models in overall accuracy, with applications in natural language processing, computer vision, and code generation. Its release is presented as a new benchmark for open-source models.
15
2
Article
Planet Python·2y
Computer Vision and Data Science with Python
Kyle Stratis from Real Python discusses computer vision, data science, artificial intelligence, and Python packaging.
13
3
Article
MIT News·2y
Understanding the visual knowledge of language models
Language models trained purely on text have a solid understanding of the visual world. They can generate complex scenes by writing image-rendering code and then refine the resulting images. Researchers from MIT tested the visual knowledge of these models and trained a computer vision system without using any visual data directly. The models draw the same concept differently each time, and their visual knowledge can be combined with other AI tools for improved results.
10
