DeepSeek-AI introduces DeepSeek-V2, a Mixture-of-Experts (MoE) language model designed for economical training and efficient inference. It has 236B total parameters, of which 21B are activated per token, and uses Multi-head Latent Attention (MLA), which compresses the Key-Value (KV) cache into a latent vector. Compared with DeepSeek 67B, DeepSeek-V2 reduces training costs by 42.5%, shrinks the KV cache by 93.3%, and raises maximum generation throughput to 5.76 times. It achieves top-tier benchmark performance among open-source models.
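
To give a rough sense of why a latent attention scheme can shrink the Key-Value cache, the sketch below compares per-token, per-layer cache sizes under two simplified assumptions: standard multi-head attention stores one key and one value vector per head, while a latent scheme stores a single compressed vector plus a small positional key. The function names and all dimensions are hypothetical and chosen for illustration; they are not taken from the summary above, and the resulting ratio is not the model's reported figure.

```python
def mha_cache_per_token(n_heads: int, d_head: int) -> int:
    """Elements cached per token per layer by standard multi-head attention."""
    # One key vector and one value vector are kept for every head.
    return 2 * n_heads * d_head


def latent_cache_per_token(d_latent: int, d_rope: int) -> int:
    """Elements cached per token per layer when only a compressed latent
    vector and a small positional key are stored (simplifying assumption)."""
    return d_latent + d_rope


# Illustrative dimensions only (assumptions, not values from the source).
n_heads, d_head = 32, 128
d_latent, d_rope = 512, 64

full = mha_cache_per_token(n_heads, d_head)
compressed = latent_cache_per_token(d_latent, d_rope)
reduction = 1 - compressed / full  # fraction of cache elements saved

print(f"full={full} compressed={compressed} reduction={reduction:.1%}")
```

Under these assumed dimensions the compressed cache no longer grows with the number of heads, which is where most of the saving comes from; a smaller cache per token in turn allows larger batches and longer contexts at generation time.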