Best of Data Science: May 2025

  1.
    Article
    Towards Data Science · 1y

    Why I stopped Using Cursor and Reverted to VSCode

    The author explains why they switched back from Cursor to VSCode as their primary IDE, citing recent GitHub Copilot updates, lower cost, and familiarity from prior use. Key considerations include better compatibility with Jupyter Notebooks and the new availability of advanced LLMs in VSCode. The author stresses GitHub Copilot's rapid development pace and the resources Microsoft can put into it, which are closing the gap with competitors like Cursor.

  2.
    Article
    freeCodeCamp · 1y

    Master Database Management Systems

    Learn the essentials of Database Management Systems with an in-depth course from freeCodeCamp.org. The course covers foundational concepts, SQL, database design, and transaction processing using practical examples. It is suitable for exam and technical interview preparation, and helps students and professionals handle data efficiently across a range of applications.

  3.
    Article
    Daily Dose of Data Science | Avi Chawla | Substack · 1y

    Build an MCP Server in 3 Steps

    This post describes a simple three-step process to build an MCP server using tools like Gitingest and Google AI Studio, enabling the transformation of FastMCP repository data into LLM-readable text. It also highlights the capabilities of the Firecrawl framework, which converts websites into structured formats for AI applications.

  4.
    Article
    Machine Learning Mastery · 1y

    Roadmap to Python in 2025

    Python remains a cornerstone for data science and machine learning in 2025. The post provides a roadmap for learning Python, from basics to advanced machine learning applications, tailored to different proficiency levels. It emphasizes the importance of mastering modern Python features, foundational data science libraries such as NumPy and Pandas, and machine learning frameworks like TensorFlow and PyTorch. The roadmap also highlights specialized tracks for data engineering, AI, web development, and emerging technologies. Staying updated with Python's evolution and leveraging AI tools can further enhance development efficiency and effectiveness.

  5.
    Article
    Machine Learning Mastery · 1y

    10 Underrated Books for Mastering Machine Learning

    Explore ten underrated books that delve deeper into machine learning theory and practice. They range from mathematical foundations to practical applications and can deepen your understanding of Bayesian methods, statistical learning, and deep learning frameworks.

  6.
    Article
    Learn Python · 1y

    Roadmap to Master Python Programming 🔰

    This roadmap provides a comprehensive guide for mastering Python programming, covering fundamentals, intermediate concepts, data structures and algorithms, libraries and tools, web development frameworks, advanced topics, projects, and interview preparation. It offers a structured approach to learning Python, from syntax and OOP to web development with frameworks like Flask and Django, and highlights real-world applications and job preparation strategies.

  7.
    Article
    C# Corner · 52w

    Understanding the Mathematics Behind Machine Learning

    Machine learning leverages key mathematical concepts like linear algebra, multivariate calculus, and dimensionality reduction techniques such as PCA to optimize models and analyze data. These disciplines facilitate data representation through vectors and matrices, improve model performance using gradients and derivatives, and simplify complex datasets using eigenvalues and eigenvectors. Understanding these concepts allows for more efficient algorithms and improved machine learning practices.
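
    As a rough illustration of how eigenvalues and eigenvectors relate to PCA, the sketch below computes a covariance matrix and estimates its leading eigenpair by power iteration. It is not taken from the linked article; the function names (`covariance_matrix`, `top_eigenpair`) and the starting vector are assumptions made for this example.

```python
import math

def covariance_matrix(rows):
    """Return (covariance matrix, mean-centered rows) for a list of numeric tuples."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    return cov, centered

def top_eigenpair(matrix, iters=200):
    """Estimate the largest eigenvalue and its eigenvector by power iteration.

    Assumption: the starting vector is not orthogonal to the leading
    eigenvector and the matrix is not all zeros.
    """
    d = len(matrix)
    v = [1.0 / (i + 1) for i in range(d)]  # arbitrary, deliberately non-symmetric start
    for _ in range(iters):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v^T A v gives the eigenvalue for a unit vector v.
    eigval = sum(
        v[i] * sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)
    )
    return eigval, v

# Hypothetical data: points lying along the direction (1, 2).
data = [(1, 2), (2, 4), (3, 6), (4, 8)]
cov, centered = covariance_matrix(data)
variance, direction = top_eigenpair(cov)
# Projecting onto the first principal component reduces 2-D points to 1-D scores.
scores = [sum(c * v for c, v in zip(row, direction)) for row in centered]
```

    The leading eigenvector is the direction of greatest variance, and the eigenvalue is the variance captured along it. In practice a library routine for eigendecomposition or SVD would normally replace the hand-written power iteration.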

  8.
    Article
    Daily Dose of Data Science | Avi Chawla | Substack · 52w

    An Animated Guide to KMeans

    The post provides an animated guide to understanding the KMeans clustering algorithm, explaining data point assignment, centroid reassignment, and algorithm convergence. It highlights steps involved in KMeans, including the random selection of centroids, and offers insights into building intuition around the algorithm's operation.
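
    The steps named in this summary (random centroid selection, point assignment, centroid reassignment, convergence) can be sketched in plain Python. This is a minimal illustration rather than the code from the linked post; the `kmeans` function, its parameters, and the sample points are assumptions for the example.

```python
import random

def kmeans(points, k, max_iters=100, seed=0):
    """Minimal KMeans (Lloyd's algorithm) on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    # Random selection: pick k distinct data points as the initial centroids.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iters):
        # Assignment: each point joins the cluster of its nearest centroid
        # (squared Euclidean distance; ties go to the lower cluster index).
        new_labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Convergence: stop once no point changes cluster.
        if new_labels == labels:
            break
        labels = new_labels
        # Reassignment: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, labels

# Hypothetical sample data: two well-separated groups of three points.
sample = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, assignment = kmeans(sample, k=2)
```

    Because the initial centroids are random, results on less clearly separated data can differ between seeds, which is why library implementations typically rerun the algorithm from several initializations.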

  9.
    Article
    Towards Data Science · 51w

    May Must-Reads: Math for Machine Learning Engineers, LLMs, Agent Protocols, and More

    A monthly roundup of popular machine learning and data science articles covering essential math skills for ML engineers, beginner guides to LLMs and RAG, software engineering concepts like inheritance, agent communication protocols, Model Context Protocol, PyTorch applications, healthcare ML projects, and time series forecasting techniques. The collection also introduces new authors contributing to the data science community.

  10.
    Article
    Google Developers · 1y

    Fully Reimagined: AI-First Google Colab

    Google Colab has been reimagined with an AI-first approach, introduced at Google I/O, featuring an agentic collaborator powered by Gemini 2.5 Flash. This enhancement allows for deeper integration into coding workflows, enabling users to solve complex coding problems more efficiently. The update includes improved functionalities and unified AI experiences across notebooks, aiming to lower barriers for data insights and accelerate coding processes.

  11.
    Article
    Towards Data Science · 1y

    Agentic AI 101: Starting Your Journey Building AI Agents

    Explore the fundamentals of creating AI agents using large language models (LLMs). The post introduces various tools, including Python packages like Agno, for interacting with models such as Gemini. It progresses from simple agents to more advanced ones that add reasoning, tools, memory, and knowledge integration. The guide aims to offer a pathway to developing AI agents efficiently, using APIs and various toolsets for interaction and automation.

  12.
    Article
    Machine Learning Mastery · 52w

    10 Python Libraries That Speed Up Model Development

    Python offers numerous libraries that streamline machine learning model development by automating complex tasks and enhancing workflows. The post highlights ten key libraries, including Scikit-learn for rapid prototyping, Pandas for data manipulation, XGBoost and LightGBM for fast model training, and TensorFlow with Keras or PyTorch for deep learning. These tools enable faster innovation through easier data management, visualization, and model tracking.

  13.
    Article
    DuckDB · 1y

    Machine Learning Prototyping with DuckDB and scikit-learn

    This post explores how DuckDB, an efficient data management system, complements scikit-learn, a popular machine learning library, in developing a species prediction model using the Palmer Penguins dataset. Key steps include data preprocessing with DuckDB, model training using a Random Forest classifier, and three ways of running inference: with Pandas, with a DuckDB UDF row by row, and with a DuckDB UDF in batch style. The post also discusses the performance implications of UDFs, which remain useful despite executing more slowly than the Pandas approach.