Best of MLOps — January 2026

1
Article
Daily Dose of Data Science | Avi Chawla | Substack·16w
Phases of ML Modeling
ML systems should evolve through four distinct phases rather than jumping straight to complex models. Start with simple heuristics and rules (Phase 1), then move to basic ML models like logistic regression (Phase 2), optimize through feature engineering and hyperparameter tuning (Phase 3), and only adopt complex models like deep neural networks when simpler approaches are exhausted (Phase 4). This staged approach reduces risk, improves debuggability, and ensures each phase's best model becomes the baseline for the next, encouraging incremental progress and evidence-driven decision-making.
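The "best model becomes the next baseline" rule can be sketched in a few lines of Python. This is an illustration only, not code from the article: the `promote` helper, the `min_gain` margin, the one-feature threshold model standing in for Phase 2, and the toy data are all assumptions made for the example.

```python
from typing import Callable, Sequence, Tuple

Model = Callable[[float], int]  # maps one feature value to a 0/1 label


def accuracy(model: Model, xs: Sequence[float], ys: Sequence[int]) -> float:
    """Fraction of examples the model labels correctly."""
    return sum(model(x) == y for x, y in zip(xs, ys)) / len(ys)


def promote(baseline: Model, candidate: Model,
            xs: Sequence[float], ys: Sequence[int],
            min_gain: float = 0.01) -> Tuple[Model, float]:
    """Keep the candidate only if it beats the baseline by min_gain (hypothetical rule)."""
    base_acc = accuracy(baseline, xs, ys)
    cand_acc = accuracy(candidate, xs, ys)
    if cand_acc - base_acc >= min_gain:
        return candidate, cand_acc
    return baseline, base_acc


def heuristic(x: float) -> int:
    """Phase 1: a hand-written rule."""
    return 1 if x > 0.5 else 0


def fit_threshold(xs: Sequence[float], ys: Sequence[int]) -> Model:
    """Phase 2 stand-in: pick the cut-off that best separates the training data."""
    best_t = max(sorted(xs), key=lambda t: accuracy(lambda x: int(x >= t), xs, ys))
    return lambda x: int(x >= best_t)


# Toy data, invented for the example.
train_x = [0.1, 0.2, 0.35, 0.4, 0.45, 0.6, 0.7, 0.9]
train_y = [0, 0, 0, 1, 1, 1, 1, 1]
val_x = [0.15, 0.3, 0.42, 0.48, 0.55, 0.8]
val_y = [0, 0, 1, 1, 1, 1]

learned = fit_threshold(train_x, train_y)
# Compare on held-out data; the winner is the baseline for the next phase.
best_model, best_acc = promote(heuristic, learned, val_x, val_y)
```

Later phases would call `promote` again with `best_model` as the baseline, so a more complex model is adopted only when the held-out numbers justify it.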
2
Article
roadmap.sh·16w
MLOps Roadmap has been updated!
The roadmap.sh MLOps roadmap has been updated for 2026, providing a step-by-step guide for learning and mastering MLOps practices. The updated resource offers a structured learning path for those looking to develop skills in machine learning operations.
3
Article
CNCF·16w
Introducing Kthena: LLM inference for the cloud native era
Kthena is a new open-source sub-project of Volcano designed for LLM inference orchestration on Kubernetes. It addresses production challenges like low GPU/NPU utilization, latency-throughput tradeoffs, and multi-model management through intelligent routing, KV cache-aware scheduling, and prefill-decode disaggregation. The system includes a high-performance router and a controller manager that support topology-aware scheduling, gang scheduling, autoscaling, and multiple inference engines (vLLM, SGLang, Triton). Benchmarks show a 2.73x throughput improvement and a 73.5% reduction in time to first token (TTFT) compared to random routing. The project is backed by Huawei Cloud, China Telecom, DaoCloud, and other industry partners.
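To give a rough idea of why cache-aware routing can beat random routing, here is a toy Python sketch. It is not Kthena's implementation or API: the `Replica` class, the `route` function, the block size, and the tie-breaking rule are all assumptions for illustration. The idea is to send a request to the replica that already holds the longest matching prompt prefix, falling back to the least-loaded replica when nothing matches.

```python
from dataclasses import dataclass, field
from typing import List, Sequence, Set, Tuple

BLOCK = 4  # tokens per cache block (illustrative value)


def prefix_keys(tokens: Sequence[int]) -> List[Tuple[int, ...]]:
    """One key per full block, each covering the whole prefix up to that block."""
    return [tuple(tokens[:i]) for i in range(BLOCK, len(tokens) + 1, BLOCK)]


@dataclass
class Replica:
    """Hypothetical view of one inference server as seen by the router."""
    name: str
    active: int = 0  # requests routed here so far (never decremented in this toy)
    cached: Set[Tuple[int, ...]] = field(default_factory=set)

    def prefix_hits(self, tokens: Sequence[int]) -> int:
        """Number of leading blocks of this prompt already in the cache."""
        hits = 0
        for key in prefix_keys(tokens):
            if key not in self.cached:
                break
            hits += 1
        return hits


def route(replicas: List[Replica], tokens: Sequence[int]) -> Replica:
    """Prefer the longest cached prefix, then the lowest load."""
    best = max(replicas, key=lambda r: (r.prefix_hits(tokens), -r.active))
    best.cached.update(prefix_keys(tokens))  # the prompt is now cached there
    best.active += 1
    return best


replicas = [Replica("a"), Replica("b")]
system_prompt = list(range(8))  # shared prefix, two full blocks

first = route(replicas, system_prompt + [100, 101, 102, 103])
second = route(replicas, system_prompt + [200, 201, 202, 203])
third = route(replicas, [900, 901, 902, 903])
```

In this run the second request follows the first to replica `a` because the shared system prompt is already cached there, while the unrelated third request goes to the idle replica `b`. A random router would reuse the cached prefix only by chance.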
4
Article
Daily Dose of Data Science | Avi Chawla | Substack·18w
Foundations of AI Engineering and LLMOps
Part 3 of an LLMOps course is now available, covering attention mechanisms, transformer architectures, mixture-of-experts, and the fundamentals of pretraining and fine-tuning with hands-on code demos. LLMOps extends traditional MLOps principles to address the unique engineering challenges of managing large language models like Llama, GPT, and Claude in production, focusing on reliability, accuracy, security, and cost-effectiveness. The course aims to provide systems-level thinking for building production-ready LLM applications with clear explanations, examples, diagrams, and implementations.
