Arcee AI released Trinity Large, a 400B-parameter sparse Mixture-of-Experts (MoE) model with 13B active parameters per token (256 experts, 4 active per token). The model was trained on 17T tokens across 2,048 Nvidia B300 GPUs in 33 days, at a total cost of $20M. Three variants are available: Preview (lightly post-trained, chat-ready),
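The sparse-MoE numbers above (256 experts, 4 active per token) can be illustrated with a toy top-k router. This is a hypothetical sketch of the general technique, not Arcee's implementation; all names, the hidden size, and the routing details are illustrative assumptions.

```python
import numpy as np

# Toy sketch of sparse MoE top-k routing: 256 experts, 4 active per token.
# All names and shapes are illustrative, not Arcee's actual implementation.
NUM_EXPERTS = 256
TOP_K = 4
HIDDEN = 8  # toy hidden size

rng = np.random.default_rng(0)
token = rng.standard_normal(HIDDEN)                  # one token's hidden state
router_w = rng.standard_normal((HIDDEN, NUM_EXPERTS))

logits = token @ router_w                 # router score for every expert
top_k = np.argsort(logits)[-TOP_K:]       # indices of the 4 chosen experts
weights = np.exp(logits[top_k])
weights /= weights.sum()                  # softmax over only the selected experts

# Only these 4 experts run for this token, which is why ~13B of the
# 400B total parameters are active per token.
print(sorted(top_k.tolist()), float(weights.sum()))
```

Because only the top-4 experts execute per token, per-token compute scales with the active-parameter count (13B) rather than the full 400B.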
8 min read · From arcee.ai