Occiglot is a large-scale research collective for the open-source development of language models in Europe. It aims to address the lack of linguistic diversity and cultural richness in existing language models. Its first model release, v0.1, focuses on the five largest European languages. Occiglot combines continual pre-training with instruction tuning, and its models are evaluated on their ability to support diverse linguistic tasks. The long-term goal is a cohesive language modeling approach covering all official languages of the European Union. hessian.AI provides computing resources to support scalability and sustainability.

3m read time, from marktechpost.com