Daily Dose of DS offers a daily dose of inspiration, education, and motivation for data scientists and aspiring data professionals. Through bite-sized articles, tutorials, and curated resources, readers embark on a journey to master the art and science of data analysis, machine learning, and artificial intelligence. By staying updated with the latest trends, techniques, and tools in data science, readers can hone their skills and stay ahead in this rapidly evolving field.

Daily Dose of Data Science | Avi Chawla | Substack

A practical guide to using train, validation, and test sets correctly in machine learning. Covers how repeated tuning against the same validation set leads to overfitting that set, and recommends k-fold cross-validation and nested cross-validation as solutions. Explains that the test set should be used only once, for a final unbiased evaluation, and never for model selection or hyperparameter tuning. Also addresses common pitfalls and their remedies: data leakage during preprocessing, chronological splits for temporal data, stratification for imbalanced datasets, and group-based splits that keep the model from being rewarded for memorizing group-specific patterns.
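The workflow described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the article's own code: the helper names (`holdout_split`, `k_fold`, `fit_scaler`, `apply_scaler`) and the toy feature column are hypothetical, and standardization stands in for whatever preprocessing a real pipeline would use.

```python
import random
from statistics import mean, pstdev


def holdout_split(n_samples, test_fraction=0.2, seed=0):
    """Shuffle indices once and set aside a test set that is never used for tuning."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_samples * test_fraction))
    return indices[n_test:], indices[:n_test]  # (development, test)


def k_fold(indices, k=5):
    """Yield (train, validation) index lists; each index is validated exactly once."""
    folds = [indices[i::k] for i in range(k)]
    for i, val in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, val


def fit_scaler(values):
    """Compute scaling statistics from the given (training) values only."""
    mu = mean(values)
    sigma = pstdev(values) or 1.0  # guard against a constant column
    return mu, sigma


def apply_scaler(values, mu, sigma):
    """Standardize values with statistics that were fitted elsewhere."""
    return [(v - mu) / sigma for v in values]


# Toy feature column (hypothetical data, for illustration only).
x = [float(i) for i in range(50)]

# 1. Lock the test set away before any tuning happens.
dev_idx, test_idx = holdout_split(len(x), test_fraction=0.2, seed=0)

# 2. Tune with k-fold cross-validation on the development data only.
fold_sizes = []
for train_idx, val_idx in k_fold(dev_idx, k=5):
    # Fit preprocessing on the training fold only, then apply it to the
    # validation fold; fitting on all rows would leak validation statistics.
    mu, sigma = fit_scaler([x[i] for i in train_idx])
    x_train = apply_scaler([x[i] for i in train_idx], mu, sigma)
    x_val = apply_scaler([x[i] for i in val_idx], mu, sigma)
    fold_sizes.append((len(x_train), len(x_val)))
    # ... fit a model on x_train and score it on x_val here ...

# 3. Only after model selection is finished: refit on all of dev_idx and
#    evaluate on test_idx a single time.
```

The random shuffle here assumes independent, unordered samples. For temporal, imbalanced, or grouped data, the split itself has to change; scikit-learn's `TimeSeriesSplit`, `StratifiedKFold`, and `GroupKFold` in `sklearn.model_selection` cover those cases.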

How to Actually Use Train, Validation, and Test Sets in ML
