facebookresearch/spidr: This repository contains the training code from the paper "SpidR: Learning Fast and Stable Linguistic Units for Spoken Language Models Without Supervision".
SpidR is a self-supervised speech representation model that learns linguistic units from unlabeled audio using masked prediction, self-distillation, and online clustering. The model can be pretrained in 15-24 hours on 16 GPUs and outperforms previous methods on language modeling tasks. The repository provides pretrained models.
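To make the three training ingredients concrete, here is a minimal toy sketch of how masked prediction, self-distillation (an EMA teacher), and online clustering typically fit together. All names, shapes, and helper functions below are illustrative assumptions, not the repository's actual API or the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

def ema_update(teacher, student, decay=0.999):
    # Self-distillation: the teacher's weights track an exponential
    # moving average of the student's weights (hypothetical helper).
    return decay * teacher + (1 - decay) * student

def online_cluster_assign(features, centroids):
    # Online clustering: map each frame to its nearest centroid,
    # yielding a discrete "linguistic unit" per frame.
    d = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    return d.argmin(axis=1)

# Toy setup: 10 audio frames of 4-dim features, 3 cluster centroids.
frames = rng.normal(size=(10, 4))
centroids = rng.normal(size=(3, 4))
mask = rng.random(10) < 0.5            # which frames are masked out

# The teacher sees the clean frames; its cluster assignments become
# the discrete targets the student must predict.
targets = online_cluster_assign(frames, centroids)

# Masked prediction: the student only sees the masked input
# (masked frames zeroed here for simplicity) and is trained to
# recover the teacher's unit assignments at the masked positions.
student_in = np.where(mask[:, None], 0.0, frames)

# After each student update, the teacher is refreshed by EMA.
teacher_w = rng.normal(size=4)
student_w = rng.normal(size=4)
teacher_w = ema_update(teacher_w, student_w)
```

In this sketch the loss would be a cross-entropy between the student's predictions at masked positions and `targets`; the key design point is that the targets are produced online by the slowly moving teacher, so no labels are needed.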