Databricks has announced the official integration of MegaBlocks, the open-source library used to train DBRX, into its training stack, LLMFoundry. MegaBlocks is an efficient implementation of Mixture-of-Experts (MoE) models that improves both training and inference efficiency. Databricks also offers optimized training methods for large models, including custom kernels and linear scaling.
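To make the MoE idea concrete, below is a minimal sketch of a top-k token-routed MoE layer in plain PyTorch. Everything here (the `TopKMoE` class, its hyperparameters, the naive per-expert loop) is an illustrative assumption, not MegaBlocks' API: MegaBlocks' main contribution is replacing this kind of loop with block-sparse kernels ("dropless" MoE) so tokens are neither dropped nor padded to fixed expert capacities.

```python
# Conceptual sketch of a top-k token-routed MoE layer in plain PyTorch.
# Illustrative only; not the MegaBlocks implementation or API.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    def __init__(self, d_model: int, d_ff: int, num_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        # Router scores every token against every expert.
        self.router = nn.Linear(d_model, num_experts, bias=False)
        # Each expert is an ordinary feed-forward block.
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model) -> flatten to (tokens, d_model)
        tokens = x.reshape(-1, x.shape[-1])
        logits = self.router(tokens)                       # (tokens, num_experts)
        weights, expert_ids = logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)               # normalize over chosen experts
        out = torch.zeros_like(tokens)
        # Only the selected experts run for each token: parameters grow with
        # num_experts, but per-token compute grows only with top_k.
        for e, expert in enumerate(self.experts):
            mask = expert_ids == e                         # (tokens, top_k)
            if mask.any():
                token_idx, slot_idx = mask.nonzero(as_tuple=True)
                out[token_idx] += weights[token_idx, slot_idx, None] * expert(tokens[token_idx])
        return out.reshape_as(x)
```

Usage is shape-preserving: `TopKMoE(512, 2048)(torch.randn(2, 16, 512))` returns a `(2, 16, 512)` tensor. The efficiency claim in the announcement follows from this design: each token activates only `top_k` of the `num_experts` feed-forward blocks, so model capacity grows without a matching increase in per-token FLOPs.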

From databricks.com
Table of contents
What is a Mixture of Experts Model?
Taking Ownership of MegaBlocks
Open Sourcing LLMFoundry Integration
Optimized Training at Databricks
