DeepSeek-AI introduces DeepSeek-V2, a language model that reduces computational costs while improving performance. It combines a Mixture-of-Experts architecture with a Multi-head Latent Attention mechanism. DeepSeek-V2 significantly reduces training costs and Key-Value cache size, and it increases generation throughput.