Towards Data Science is a community-powered publication that showcases work in data science, machine learning and artificial intelligence. Every day newcomers, seasoned researchers and industry practitioners publish tutorials, research notes and real-world case studies that help the field move forward.

A walkthrough of exploratory data analysis (EDA) on a Kaggle credit scoring dataset with 32,581 observations and 12 variables. Each variable, including borrower age, income, employment length, home ownership, loan grade, interest rate, and prior default history, is analyzed for its distribution and its relationship to default risk. Continuous variables are discretized into quartile-based intervals. Key findings: younger and lower-income borrowers default more often, prior default history is a strong predictor of default, and higher-quality loan grades are associated with lower default rates. Python code using pandas automates the summary tables and exports them to Excel.
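To make the quartile-based summary concrete, here is a minimal sketch of how such a table could be built with pandas. It is not the article's own code: the helper names `default_rate_by_quartile` and `export_tables`, the column names `person_age` and `loan_status`, and the toy data are all assumptions chosen for illustration, with the target assumed to be coded 1 for default and 0 otherwise.

```python
import pandas as pd


def default_rate_by_quartile(df, feature, target="loan_status"):
    """Bin a continuous feature into quartile-based intervals and
    summarise the default rate in each interval.

    Assumes `target` is coded 1 = default, 0 = no default.
    """
    # duplicates="drop" merges bins when quartile edges coincide,
    # which can happen with heavily tied values.
    bins = pd.qcut(df[feature], q=4, duplicates="drop")
    summary = (
        df.groupby(bins, observed=True)[target]
        .agg(n_obs="count", n_defaults="sum", default_rate="mean")
        .reset_index()
    )
    # Share of all rows in each bin (rows with a missing feature value
    # are not assigned to any bin).
    summary["share_of_obs"] = summary["n_obs"] / len(df)
    # Store intervals as text so they export cleanly to a spreadsheet.
    summary[feature] = summary[feature].astype(str)
    return summary


def export_tables(tables, path):
    """Write each summary table to its own sheet of one Excel workbook.

    Requires an Excel engine such as openpyxl to be installed.
    """
    with pd.ExcelWriter(path) as writer:
        for name, table in tables.items():
            # Excel limits sheet names to 31 characters.
            table.to_excel(writer, sheet_name=name[:31], index=False)


# Toy data, made up for illustration only (not taken from the Kaggle dataset).
toy = pd.DataFrame(
    {
        "person_age": [21, 22, 23, 25, 28, 31, 40, 55],
        "loan_status": [1, 1, 0, 1, 0, 0, 0, 0],
    }
)
table = default_rate_by_quartile(toy, "person_age")
print(table)
```

Calling `export_tables({"person_age": table}, "eda_summary.xlsx")` would then write the table to a workbook; the file name is again only a placeholder. Using `pd.qcut` rather than fixed-width bins keeps roughly the same number of observations in each interval, which makes the default rates comparable across intervals.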

Exploratory Data Analysis for Credit Scoring with Python

Descriptive Statistics of the Modeling Dataset