Databricks announces the official integration of MegaBlocks, the open-source Mixture-of-Experts (MoE) library used to train DBRX, into its training stack, LLM Foundry. MegaBlocks provides an efficient implementation of MoE models that improves both training and inference efficiency. Databricks also offers optimized methods for training large models, including custom kernels and linear scaling.
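To make the efficiency claim concrete: an MoE layer replaces a single dense feed-forward block with many expert blocks, and a learned router activates only a few of them per token, so compute per token stays small even as total parameters grow. Below is a minimal PyTorch sketch of that idea, not MegaBlocks' block-sparse implementation; the class name `ToyMoELayer` and all hyperparameters are illustrative and not part of the MegaBlocks or LLM Foundry APIs.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMoELayer(nn.Module):
    """Toy Mixture-of-Experts layer: a router sends each token to its
    top-k experts, so only a fraction of parameters runs per token."""

    def __init__(self, d_model: int, d_hidden: int, num_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, num_experts)  # scores every expert for each token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU(), nn.Linear(d_hidden, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_tokens, d_model)
        weights = F.softmax(self.router(x), dim=-1)           # (tokens, experts)
        topk_w, topk_idx = weights.topk(self.top_k, dim=-1)   # keep only the top-k experts
        topk_w = topk_w / topk_w.sum(dim=-1, keepdim=True)    # renormalize gate weights
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = topk_idx == e                               # which tokens routed to expert e
            token_ids, slot = mask.nonzero(as_tuple=True)
            if token_ids.numel():
                out[token_ids] += topk_w[token_ids, slot].unsqueeze(-1) * expert(x[token_ids])
        return out

# Example: 16 tokens of width 64; only 2 of the 8 expert FFNs run per token.
layer = ToyMoELayer(d_model=64, d_hidden=256)
print(layer(torch.randn(16, 64)).shape)  # torch.Size([16, 64])
```

The per-expert loop above is the naive formulation; the point of MegaBlocks is to replace exactly this gather-and-loop pattern with block-sparse kernels that process all experts in one efficient operation.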
Table of contents
- What is a Mixture of Experts Model?
- Taking Ownership of MegaBlocks
- Open Sourcing LLM Foundry Integration
- Optimized Training at Databricks