ScreenAI is a vision-language model for user interfaces and infographics that achieves state-of-the-art results on UI and infographics-based tasks. It improves upon the PaLI architecture and is trained on a mixture of datasets. Three new datasets, including Screen Annotation, are also being released.