Euclidean distance ignores the data's distribution and the correlations between features, which makes it unreliable for tasks like outlier detection. Mahalanobis distance fixes this by transforming the data into uncorrelated variables with unit variance before measuring distance, so the distance reflects the underlying distribution. The post also covers cost-complexity pruning (CCP) in decision trees, which prevents overfitting by balancing classification cost against tree complexity, and explains how bagging reduces variance through bootstrap sampling and model aggregation.
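As a minimal sketch of the contrast described above, the snippet below (my own illustration, not code from the post, using synthetic correlated data) compares two points that are roughly equidistant from the mean in Euclidean terms: the point lying against the correlation direction is a much stronger outlier under Mahalanobis distance.

```python
import numpy as np

# Synthetic 2-D data with strongly correlated features (illustration only)
rng = np.random.default_rng(0)
X = rng.multivariate_normal(mean=[0.0, 0.0],
                            cov=[[2.0, 1.5], [1.5, 2.0]],
                            size=500)

mu = X.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(X, rowvar=False))

def mahalanobis(x, mu, cov_inv):
    """Distance of x from the distribution described by (mu, cov)."""
    d = x - mu
    return float(np.sqrt(d @ cov_inv @ d))

a = np.array([2.0, 2.0])   # lies along the correlation direction
b = np.array([2.0, -2.0])  # lies against the correlation direction

# Nearly identical Euclidean distances from the mean...
print(np.linalg.norm(a - mu), np.linalg.norm(b - mu))
# ...but very different Mahalanobis distances: b is the real outlier.
print(mahalanobis(a, mu, cov_inv), mahalanobis(b, mu, cov_inv))
```

This is exactly the failure mode the paragraph describes: Euclidean distance treats `a` and `b` the same, while Mahalanobis distance flags `b` because it sits far outside the correlated cloud.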
Table of contents
- From Models to Metal Mayhem @AWS re:Invent
- Euclidean Distance vs. Mahalanobis Distance
- Cost complexity pruning in decision trees