ScreenAI is a vision-language model for user interfaces and infographics that achieves state-of-the-art results on UI- and infographics-based tasks. It builds on the PaLI architecture and is trained on a mixture of datasets. Three new datasets, including Screen Annotation, are being released alongside it.

From research.google