DeepSeek Cracked The O(L²) Attention Bottleneck

DeepSeek V3.2 introduces DeepSeek Sparse Attention (DSA), reducing attention complexity from O(L²) to O(Lk) by using a Lightning Indexer to select only the top 2,048 relevant tokens per query, regardless of context length. This achieves a 2-3x cost reduction for long-context inference (128K tokens) while maintaining or improving model quality on benchmarks.
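
To make the idea concrete, here is a minimal sketch of top-k sparse attention in PyTorch. It is not DeepSeek's implementation: the names (sparse_attention, indexer_scores) are illustrative assumptions, causal masking and batching are omitted, and the indexer scores stand in for the learned Lightning Indexer.

```python
# Minimal sketch of top-k sparse attention, assuming a precomputed
# (L, L) matrix of cheap indexer scores. Illustrative only; not
# DeepSeek's actual DSA code.
import torch
import torch.nn.functional as F

def sparse_attention(q, k, v, indexer_scores, top_k=2048):
    """q, k, v: (L, d). indexer_scores: (L, L) relevance scores.

    Each query attends only to its top_k highest-scoring keys, so
    the expensive softmax attention costs O(L * top_k), not O(L^2).
    """
    L, d = q.shape
    top_k = min(top_k, L)
    # The lightweight indexer decides which keys each query may see.
    idx = indexer_scores.topk(top_k, dim=-1).indices   # (L, top_k)
    k_sel = k[idx]                                     # (L, top_k, d)
    v_sel = v[idx]                                     # (L, top_k, d)
    # Dense attention restricted to the selected keys.
    attn = torch.einsum('ld,lkd->lk', q, k_sel) / d ** 0.5
    w = F.softmax(attn, dim=-1)                        # (L, top_k)
    return torch.einsum('lk,lkd->ld', w, v_sel)        # (L, d)

L, d = 8192, 64
q, k, v = (torch.randn(L, d) for _ in range(3))
scores = q @ k.T  # toy stand-in for the Lightning Indexer's scores
out = sparse_attention(q, k, v, scores)  # out: (L, d)
```

Note that scoring every query-key pair is still an O(L²) pass; the savings come from that pass being far cheaper than full attention (in DSA, a small separate indexer module), while the heavy softmax attention drops to O(Lk) with k fixed at 2,048.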

From blog.dailydoseofds.com