Best of Data Science — December 2025

1
Article
Simple Thread·25w
Getting Back to Basics
A hands-on exploration of building machine learning models from scratch, starting with a trading algorithm using regression trees that achieved 220% returns on historical stock data. The author then tackles energy demand forecasting by implementing a feed-forward neural network with backpropagation before upgrading to LSTM networks to handle temporal patterns. Key challenges include addressing gradient explosion through data scaling, switching from ReLU to tanh activation functions, and implementing the Adam optimizer. The final LSTM model with 50 neurons successfully predicts hourly energy interconnection flows without overfitting, demonstrating that foundational ML techniques remain powerful tools for practical time-series forecasting problems.
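The summary names data scaling and the Adam optimizer without showing either. As a rough sketch, and not the article's own code, min-max scaling and a single Adam update for one scalar parameter might look like this in plain Python. The function names `min_max_scale` and `adam_step` are hypothetical, and the default hyperparameters are the commonly used values, assumed here rather than taken from the article.

```python
import math


def min_max_scale(values):
    """Rescale values into [0, 1] so large raw inputs do not inflate gradients."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant input: nothing to rescale, map everything to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a single scalar parameter (t is the 1-based step count)."""
    m = beta1 * m + (1 - beta1) * grad          # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad * grad   # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)                # bias correction for early steps
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


# Illustrative data only: three made-up readings and the toy loss f(x) = x^2.
scaled = min_max_scale([10.0, 20.0, 30.0])
x, m, v = 5.0, 0.0, 0.0
x, m, v = adam_step(x, 2 * x, m, v, t=1, lr=0.1)  # gradient of x^2 is 2x
```

On the first step the bias-corrected ratio is close to the sign of the gradient, so the parameter moves by roughly `lr` regardless of how large the raw gradient is. That normalization is one reason Adam is often less sensitive to gradient scale than plain gradient descent.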
2
Article
Towards Data Science·23w
6 Technical Skills That Make You a Senior Data Scientist
Senior data scientists distinguish themselves through a structured six-stage workflow for building data products: mapping the business ecosystem, defining product constraints as operators, designing systems end-to-end before coding, starting with simple models and adding complexity only when justified, rigorously evaluating outputs through manual review and appropriate metrics, and tailoring communication to different audiences (product managers, engineers, other data scientists). The emphasis is on understanding context, making design-level trade-offs, and delivering production-ready solutions rather than just technical coding ability.
3
Article
Daily Dose of Data Science | Avi Chawla | Substack·22w
[Hands-on] Deploy and Run LLMs on your Phone!
Fine-tune and deploy LLMs directly on iOS and Android devices using UnslothAI, TorchAO, and ExecuTorch. The tutorial walks through loading Qwen3-0.6B, preparing reasoning and chat datasets, training with quantization-aware methods, exporting to mobile-ready .pte format, and running the model locally on iPhone at ~25 tokens/second. The resulting model is ~470MB and runs 100% on-device without requiring cloud connectivity.
4
Article
Last9·21w
Why High-Cardinality Metrics Break Everything
High-cardinality metrics promise granular per-request, per-user insights but quietly break production systems in four ways: costs become unpredictable and scale with runtime behavior rather than configuration; queries slow down during incidents, when speed matters most; engineers lose trust as sparse, short-lived series create flickering dashboards and inconsistent results; and teams over-instrument without intent, creating a multiplicative cardinality explosion. The core issue isn't that high cardinality is wrong, but that most observability systems don't surface their own limits around storage, indexing, query performance, and data ambiguity. Success requires treating high-cardinality metrics like APIs, with explicit ownership, guardrails, and pre-deployment cardinality estimation, and using systems designed for interactive exploration under pressure rather than brute-force scans.
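To make the "multiplicative" point concrete, here is a minimal sketch of a pre-deployment cardinality estimate. It assumes the worst case, where every combination of label values can occur, so the series count for one metric is the product of the distinct values per label. The function names, label names, counts, and budget are all hypothetical and not taken from the article or from any particular observability tool.

```python
from math import prod


def estimate_series(label_cardinalities):
    """Worst-case series count for one metric: product of distinct values per label."""
    return prod(label_cardinalities.values())


def check_budget(label_cardinalities, budget):
    """Return (within_budget, estimate) so a review step can block oversized metrics."""
    estimate = estimate_series(label_cardinalities)
    return estimate <= budget, estimate


# Hypothetical labels for a request counter.
labels = {"endpoint": 50, "status_code": 5, "region": 4}
ok_before, series_before = check_budget(labels, budget=10_000)

# Adding a single per-user label multiplies the total instead of adding to it.
labels["user_id"] = 100_000
ok_after, series_after = check_budget(labels, budget=10_000)

print(series_before, ok_before)
print(series_after, ok_after)
```

Real label combinations are usually sparser than the full product, so this gives an upper bound rather than a forecast. An upper bound is still useful as a guardrail, because it flags the labels that could cause an explosion before they ship.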
