Best of MLOps · January 2026

  1. Article
    Daily Dose of Data Science | Avi Chawla | Substack · 16w

    Phases of ML Modeling

    ML systems should evolve through four distinct phases rather than jumping straight to complex models. Start with simple heuristics and rules (Phase 1), then move to basic ML models like logistic regression (Phase 2), optimize through feature engineering and hyperparameter tuning (Phase 3), and only adopt complex models like deep neural networks when simpler approaches are exhausted (Phase 4). This staged approach reduces risk, improves debuggability, and ensures each phase's best model becomes the baseline for the next, encouraging incremental progress and evidence-driven decision-making.
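    The "best model of each phase becomes the next baseline" rule can be sketched as a small promotion gate. This is a minimal illustration, not the article's code: the `Candidate` type, the `min_gain` threshold, the toy dataset, and the stand-in predictors (fixed thresholds instead of fitted models) are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple

# Toy (feature, label) pairs; purely illustrative, not real data.
Example = Tuple[float, int]


@dataclass
class Candidate:
    phase: int                       # 1 = heuristic, 2 = simple ML, 3 = tuned, 4 = complex
    name: str
    predict: Callable[[float], int]  # hypothetical stand-in for a trained model


def accuracy(predict: Callable[[float], int], data: Sequence[Example]) -> float:
    """Fraction of examples the predictor labels correctly."""
    return sum(predict(x) == y for x, y in data) / len(data)


def run_phases(
    candidates: Sequence[Candidate],
    data: Sequence[Example],
    min_gain: float = 0.01,
) -> Tuple[Optional[Candidate], float]:
    """Walk the phases in order. A later candidate replaces the current
    baseline only if it beats the baseline's score by at least min_gain."""
    baseline: Optional[Candidate] = None
    baseline_score = 0.0
    for cand in sorted(candidates, key=lambda c: c.phase):
        score = accuracy(cand.predict, data)
        if baseline is None or score >= baseline_score + min_gain:
            baseline, baseline_score = cand, score
    return baseline, baseline_score


data = [(0.1, 0), (0.2, 0), (0.25, 0), (0.35, 1),
        (0.4, 1), (0.45, 1), (0.6, 1), (0.9, 1)]

candidates = [
    Candidate(1, "rule: x > 0.5", lambda x: int(x > 0.5)),
    # Stands in for a fitted simple model such as logistic regression.
    Candidate(2, "simple model", lambda x: int(x > 0.3)),
    # Stands in for a complex model that gives no measurable gain here.
    Candidate(4, "complex model", lambda x: int(x > 0.3)),
]

best, best_score = run_phases(candidates, data)
print(best.name, best_score)
```

    In this toy run the complex candidate matches the simple model's accuracy but does not exceed it by `min_gain`, so the simpler model stays as the baseline, which is the evidence-driven behavior the summary describes.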

  2. Article
    roadmap.sh · 16w

    MLOps Roadmap has been updated!

    The roadmap.sh MLOps roadmap has been updated for 2026, providing a step-by-step guide for learning and mastering MLOps practices. The updated resource offers a structured learning path for those looking to develop skills in machine learning operations.

  3. Article
    CNCF · 16w

    Introducing Kthena: LLM inference for the cloud native era

    Kthena is a new open-source sub-project of Volcano designed for LLM inference orchestration on Kubernetes. It addresses production challenges such as low GPU/NPU utilization, latency-throughput tradeoffs, and multi-model management through intelligent routing, KV cache-aware scheduling, and prefill-decode disaggregation. The system includes a high-performance router and a controller manager that support topology-aware scheduling, gang scheduling, autoscaling, and multiple inference engines (vLLM, SGLang, Triton). The reported benchmarks show a 2.73x throughput improvement and a 73.5% reduction in time to first token (TTFT) compared to random routing. The project is backed by Huawei Cloud, China Telecom, DaoCloud, and other industry partners.
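    To make the contrast with random routing concrete, here is a rough sketch of the general idea behind cache-aware routing: prefer the replica that already holds the longest matching prompt prefix, and fall back to the least-loaded replica on a tie. This is not Kthena's router or API; the `Replica` type, the integer token IDs, and the scoring rule are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Sequence, Tuple


@dataclass
class Replica:
    """Hypothetical view of one inference server, as a router might track it."""
    name: str
    active_requests: int = 0
    # Token-ID prefixes whose KV cache is assumed to still be resident.
    cached_prefixes: List[Tuple[int, ...]] = field(default_factory=list)


def shared_prefix_len(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of leading tokens that a and b have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def pick_replica(replicas: Sequence[Replica], prompt: Sequence[int]) -> Replica:
    """Choose the replica with the longest cached prefix match;
    break ties by the fewest active requests."""
    def score(r: Replica) -> Tuple[int, int]:
        best_match = max(
            (shared_prefix_len(p, prompt) for p in r.cached_prefixes),
            default=0,
        )
        return (best_match, -r.active_requests)

    return max(replicas, key=score)


replicas = [
    Replica("a", active_requests=3, cached_prefixes=[(1, 2, 3, 4)]),
    Replica("b", active_requests=1, cached_prefixes=[(1, 2, 9)]),
    Replica("c", active_requests=0),
]

# Shares three leading tokens with replica "a", so "a" wins despite its load.
print(pick_replica(replicas, (1, 2, 3, 7)).name)
# No replica has a matching prefix, so the idle replica "c" is chosen.
print(pick_replica(replicas, (5, 6)).name)
```

    A production router would also have to weigh load against cache hits rather than ranking strictly by prefix length, and track cache eviction; the sketch ignores both.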

  4. Article
    Daily Dose of Data Science | Avi Chawla | Substack · 18w

    Foundations of AI Engineering and LLMOps

    Part 3 of an LLMOps course is now available, covering attention mechanisms, transformer architectures, mixture-of-experts, and the fundamentals of pretraining and fine-tuning with hands-on code demos. LLMOps extends traditional MLOps principles to address the unique engineering challenges of managing large language models like Llama, GPT, and Claude in production, focusing on reliability, accuracy, security, and cost-effectiveness. The course aims to provide systems-level thinking for building production-ready LLM applications with clear explanations, examples, diagrams, and implementations.