Best of Pandas 2025

  1. Article
    Daily Dose of Data Science | Avi Chawla | Substack · 1y

    Pandas Mind Map

    A detailed mind map of various Pandas methods categorized by their operation types, including I/O methods, DataFrame creation, statistical information, renaming, plotting, time-series, grouping, pivot, and categorical data methods. Additional ML resources and techniques are also provided for developing industry-relevant skills.

  2. Video
    Tech With Tim · 1y

    How To Automate Your Finances with Python - Full Tutorial (Pandas, Streamlit, Plotly & More)

    The video is a step-by-step tutorial for building a personal finance automation tool in Python with Pandas, Streamlit, and Plotly. The tool accepts bank statements uploaded as CSV files, categorizes transactions, and summarizes expenses with visualizations. It also explains how to convert bank statements into a format suited to analysis, and recommends structured, project-based learning resources such as DataCamp for Python and finance fundamentals.
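The categorize-and-summarize step described above can be sketched in a few lines of pandas. This is only an illustration, not the tutorial's code: the column names (`Date`, `Details`, `Amount`), the sample rows, and the `KEYWORDS` mapping are all assumptions, and a real bank export will differ.

```python
import io

import pandas as pd

# Hypothetical bank export; real statements will use different columns.
CSV_TEXT = """Date,Details,Amount
2025-01-03,COFFEE HOUSE 123,4.50
2025-01-04,CITY SUPERMARKET,62.10
2025-01-05,COFFEE HOUSE 123,3.75
2025-01-06,STREAMING SERVICE,12.99
"""

# Hypothetical keyword -> category mapping (assumed for illustration).
KEYWORDS = {"coffee": "Eating out", "supermarket": "Groceries"}


def categorize(details: str) -> str:
    """Return the first category whose keyword appears in the description."""
    lowered = details.lower()
    for keyword, category in KEYWORDS.items():
        if keyword in lowered:
            return category
    return "Uncategorized"


def summarize(csv_text: str) -> pd.DataFrame:
    """Load the CSV, tag each transaction, and total spending per category."""
    df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"])
    df["Category"] = df["Details"].map(categorize)
    totals = df.groupby("Category", as_index=False)["Amount"].sum()
    return totals.sort_values("Amount", ascending=False).reset_index(drop=True)


summary = summarize(CSV_TEXT)
print(summary)
```

A table like `summary` is the kind of input a charting library could plot per category; the dashboard layer itself is left out of this sketch.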

  3. Article
    Towards Data Science · 48w

    Building A Modern Dashboard with Python and Taipy

    Taipy is a Python web framework designed for data scientists and engineers to build production-ready dashboards without web development expertise. The tutorial demonstrates creating an interactive sales dashboard with filtering capabilities, key metrics display, multiple chart types, and raw data tables using 100,000 synthetic sales records from a CSV file. The article argues that Taipy is a better fit than Streamlit or Gradio for complex, high-performance, enterprise-grade applications that require scalability and maintainability.

  4. Article
    freeCodeCamp · 46w

    How to Transform JSON Data to Match Any Schema

    Learn how to transform JSON data to match specific schemas using two approaches: pure Python and pandas. The tutorial covers loading JSON files, defining target schemas, cleaning and renaming fields, and validating the output. It demonstrates transforming customer records by removing unwanted fields and renaming others, while comparing performance between pure Python (faster for simple tasks) and pandas (better for complex datasets with built-in data cleaning methods).
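As a rough illustration of the pure-Python approach summarized above, the sketch below renames fields, drops anything not in the target schema, and validates the result. The field names, the `FIELD_MAP` mapping, and the sample records are hypothetical and not taken from the tutorial.

```python
import json

# Hypothetical customer records; "internal_notes" is an unwanted field.
RAW_JSON = """
[
  {"id": 1, "full_name": "Ada Lovelace", "email_address": "ada@example.com", "internal_notes": "vip"},
  {"id": 2, "full_name": "Alan Turing", "internal_notes": ""}
]
"""

# Assumed target schema, expressed as source field -> target field.
FIELD_MAP = {"id": "customer_id", "full_name": "name", "email_address": "email"}


def transform(record: dict) -> dict:
    """Keep only mapped fields and rename them to the target names."""
    return {new: record[old] for old, new in FIELD_MAP.items() if old in record}


def is_valid(record: dict) -> bool:
    """A record is valid when it has exactly the target fields."""
    return set(record) == set(FIELD_MAP.values())


records = [transform(r) for r in json.loads(RAW_JSON)]
valid = [r for r in records if is_valid(r)]
rejected = [r for r in records if not is_valid(r)]

print(json.dumps(valid, indent=2))
print(f"{len(rejected)} record(s) failed validation")
```

The same mapping could be applied in pandas with `DataFrame.rename(columns=FIELD_MAP)` followed by a column selection, which tends to pay off once cleaning steps go beyond simple renames.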

  5. Article
    Towards Data Science · 44w

    I Analysed 25,000 Hotel Names and Found Four Surprising Truths

    A data scientist analyzed 25,000 hotel names worldwide using the Hotel Data API to uncover why hotels are named after cities they're not located in. The study revealed that Paris is the most borrowed city name (1,100+ hotels), followed by Vienna and Rome. Three main reasons emerged: proximity for search visibility, branding to evoke luxury and sophistication, and historical tradition dating back to 18th-century aristocratic travel patterns. The analysis used Python, pandas, and geographic distance calculations to map naming patterns across countries.
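The summary does not say how the distances were computed, so the following is only one plausible sketch: a haversine great-circle distance used to flag a hotel whose name contains a city it is far from. The `CITIES` table, the 50 km threshold, and the `borrows_city_name` helper are assumptions made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

# Approximate city-centre coordinates (latitude, longitude) in degrees.
CITIES = {"Paris": (48.8566, 2.3522), "Vienna": (48.2082, 16.3738)}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def borrows_city_name(hotel_name, hotel_lat, hotel_lon, threshold_km=50.0):
    """Return a city named in hotel_name that lies beyond threshold_km, else None."""
    for city, (lat, lon) in CITIES.items():
        named_after = city.lower() in hotel_name.lower()
        if named_after and haversine_km(hotel_lat, hotel_lon, lat, lon) > threshold_km:
            return city
    return None


# A hypothetical "Hotel Paris" located in central Vienna.
print(borrows_city_name("Hotel Paris", 48.2082, 16.3738))
```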

  6. Article
    Towards Data Science · 51w

    Building a Modern Dashboard with Python and Gradio

    A comprehensive guide to building an interactive sales performance dashboard using Gradio, a Python library for creating web applications. The tutorial covers setting up Gradio, processing CSV data with Pandas, implementing filtering capabilities, generating visualizations with Matplotlib, and creating a responsive interface with key metrics, charts, and data tables. The dashboard allows users to filter by date ranges and product categories while displaying revenue trends, top products, and raw data dynamically.

  7. Article
    Daily Dose of Data Science | Avi Chawla | Substack · 1y

    FireDucks vs. Pandas vs. DuckDB vs. Polars

    FireDucks is an optimized alternative to Pandas that exposes the same API, so adopting it requires only replacing the import statement. The post reports an average speed-up of 125x over Pandas in its benchmarks on large datasets. Unlike Pandas, which executes each operation immediately, FireDucks uses lazy execution to build and optimize a logical execution plan. It can be used in IPython, in Jupyter Notebooks, or within existing Pandas pipelines. Detailed benchmarks and usage examples are provided.

  8. Video
    Tech With Tim · 42w

    Learn Pandas in 30 Minutes - Python Pandas Tutorial

    A comprehensive beginner tutorial covering pandas fundamentals including dataframe creation, data loading from CSV files, basic operations like head/tail/info, column and row indexing with iloc/loc, filtering data with conditions, updating and deleting entries, data cleaning methods, and basic analysis functions like groupby and value_counts. The tutorial demonstrates both regular Python files and Jupyter notebooks for data manipulation workflows.
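For orientation, here is a compact sketch of the operations listed above on a small made-up DataFrame. The data and column names are invented for illustration and do not come from the video.

```python
import pandas as pd

# Small invented dataset.
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Cho", "Dev"],
    "team": ["red", "blue", "red", "blue"],
    "score": [82, 67, 91, 74],
})

first_two = df.head(2)                                     # first rows
second_name = df.iloc[1]["name"]                           # position-based indexing
cho_score = df.loc[df["name"] == "Cho", "score"].iloc[0]   # label/boolean indexing
high_scores = df[df["score"] > 75]                         # filtering with a condition

df.loc[df["name"] == "Ben", "score"] = 70                  # update an entry
df = df.drop(index=3)                                      # delete the row labelled 3 (Dev)

mean_by_team = df.groupby("team")["score"].mean()          # basic aggregation
team_counts = df["team"].value_counts()                    # frequency of each team

print(mean_by_team)
print(team_counts)
```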

  9. Article
    JetBrains · 1y

    Data Cleaning in Data Science

    Data cleaning is essential for transforming real-world, messy datasets into reliable sources for analysis or machine learning. This involves removing duplicates, dealing with implausible values, addressing formatting issues, outliers, and missing values. Proper data cleaning ensures that conclusions drawn from the data can be generalized to a defined population. Best practices include defining your population boundaries, ensuring reproducibility, and keeping methods well-documented.
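Several of the steps named above (formatting issues, duplicates, implausible values, missing values) can be sketched with pandas as follows. The sample data, the 0 to 120 plausibility range for `age`, and the choice of median imputation are assumptions for illustration; the right rules depend on the population being studied.

```python
import pandas as pd

# Invented messy data: inconsistent name formatting, a duplicate,
# an implausible age, and a missing age.
raw = pd.DataFrame({
    "name": [" Alice", "alice ", "Bob", "Cara", "Dan"],
    "age": [34, 34, -5, None, 41],
})


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Return a cleaned copy; the input frame is left untouched for reproducibility."""
    out = df.copy()
    out["name"] = out["name"].str.strip().str.title()        # fix formatting
    out = out.drop_duplicates()                               # remove exact duplicates
    out.loc[~out["age"].between(0, 120), "age"] = float("nan")  # implausible -> missing
    out["age"] = out["age"].fillna(out["age"].median())       # impute missing values
    return out.reset_index(drop=True)


cleaned = clean(raw)
print(cleaned)
```

Keeping the cleaning logic in a single function that returns a new frame is one simple way to make the process repeatable and easy to document.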

  10. Article
    Machine Learning Mastery · 45w

    7 Pandas Tricks That Cut Your Data Prep Time in Half

    Seven practical pandas techniques to accelerate data preparation workflows: chaining transformations with assign(), filling missing values using dictionaries in fillna(), flattening list columns with explode(), readable filtering with query(), named aggregations with groupby().agg(), date parsing with pd.to_datetime(), and building modular workflows with pipe(). These methods help reduce boilerplate code, improve readability, and streamline the data cleaning process.
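The seven methods listed above can be combined in one short pipeline. The sketch below uses invented order data and a hypothetical `add_month` helper; it shows the calls working together and is not the article's own code.

```python
import pandas as pd

# Invented order data: string dates, a missing region, list-valued items.
orders = pd.DataFrame({
    "order_date": ["2025-01-05", "2025-01-06", "2025-02-01"],
    "region": ["north", None, "south"],
    "items": [["pen", "ink"], ["pad"], ["pen"]],
    "unit_price": [10.0, None, 4.0],
})


def add_month(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical reusable step, plugged into the chain with pipe()."""
    return df.assign(month=df["order_date"].dt.month)


prepared = (
    orders
    .assign(order_date=lambda d: pd.to_datetime(d["order_date"]))  # assign() + to_datetime()
    .fillna({"region": "unknown", "unit_price": 0.0})              # per-column fill values
    .explode("items", ignore_index=True)                           # one row per list element
    .pipe(add_month)                                               # modular step
    .query("unit_price > 0")                                       # readable filtering
)

# Named aggregations: output column = (input column, function).
summary = prepared.groupby("month").agg(
    total=("unit_price", "sum"),
    n_items=("items", "count"),
)

print(summary)
```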

  11. Article
    Learn Python · 47w

    🏋️ How I Built a Gym & Diet Progress Tracker Using Python

    A developer built a personal fitness tracking application using Python to monitor workouts, diet, and progress without relying on commercial apps. The project uses pandas for data management, matplotlib/seaborn for visualization, CSV files for storage, and optionally tkinter for GUI. It tracks daily workouts, nutrition data, body weight, and generates visual progress charts. Key challenges included creating flexible input formats, preventing data overwrites, and clear long-term progress visualization.

  12. Article
    Machine Learning News · 1y

    Complete Guide: Working with CSV/Excel Files and EDA in Python

    This tutorial provides a comprehensive guide to working with CSV/Excel files and performing exploratory data analysis (EDA) using Python. It covers importing, cleaning, and preprocessing data, exploring data through statistics and visualization, and deriving insights from business data using libraries such as pandas, NumPy, matplotlib, and seaborn. The guide uses a realistic e-commerce dataset to demonstrate the entire workflow, including merging datasets and handling data quality issues.
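The merging and data-quality checks mentioned above might look like the sketch below. The two tables and their columns are invented stand-ins for an e-commerce dataset, not the guide's actual data.

```python
import pandas as pd

# Invented e-commerce tables.
orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "customer_id": [10, 11, 10, 99],
    "amount": [20.0, 35.5, None, 12.0],
})
customers = pd.DataFrame({
    "customer_id": [10, 11, 12],
    "segment": ["retail", "wholesale", "retail"],
})

# A left join keeps every order; indicator=True adds a "_merge" column
# showing whether a matching customer was found.
merged = orders.merge(customers, on="customer_id", how="left", indicator=True)

unmatched = merged[merged["_merge"] == "left_only"]          # orders with no known customer
missing_per_column = merged.isna().sum()                     # data quality overview
amount_by_segment = merged.groupby("segment")["amount"].sum()  # simple business insight

print(unmatched[["order_id", "customer_id"]])
print(missing_per_column)
print(amount_by_segment)
```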