RecurrentGemma is an open-weights Language Model by Google DeepMind based on the novel Griffin architecture. It achieves fast inference when generating long sequences by using a mixture of local attention and linear recurrences. The Flax implementation is recommended for most users. The repository includes the model
Sort: