A practical guide to analyzing variable relationships in credit scoring models using Python. Covers graphical tools (boxplots, KDE, ECDF, contingency tables) and statistical tests for three relationship types: continuous vs. binary target (Kruskal-Wallis), categorical vs. binary target (chi-square + Cramér's V), and multicollinearity between predictors (Spearman correlation and Cramér's V matrix). Includes full Python code for each method and applies them to a real lending dataset, ultimately pre-selecting 9 variables by removing redundant ones.

25m read time · From towardsdatascience.com
Table of contents
1.1 Evaluation of Predictive Power
1.2 Multicollinearity Between Variables
1.3 Application in the real data
References
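
The three relationship types named in the summary can be illustrated with a minimal sketch. This is not the article's own code: the data is synthetic, the column names (`default`, `income`, `debt`, `home`) and the helper `cramers_v` are hypothetical, and the Cramér's V shown here is the plain version without bias correction.

```python
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency, kruskal, spearmanr


def cramers_v(x, y):
    """Cramér's V between two categorical series (no bias correction)."""
    table = pd.crosstab(x, y)
    chi2 = chi2_contingency(table, correction=False)[0]
    n = table.to_numpy().sum()
    k = min(table.shape) - 1  # smaller table dimension minus one
    return float(np.sqrt(chi2 / (n * k))) if k > 0 else 0.0


# Synthetic data for illustration only (not the lending dataset).
rng = np.random.default_rng(0)
n = 500
default = rng.integers(0, 2, size=n)                  # binary target
income = rng.normal(50, 10, size=n) - 5 * default     # continuous predictor
debt = 0.5 * income + rng.normal(0, 5, size=n)        # correlated predictor
home = np.where(rng.random(n) < 0.3 + 0.4 * default, "rent", "own")
df = pd.DataFrame(
    {"default": default, "income": income, "debt": debt, "home": home}
)

# 1) Continuous vs. binary target: Kruskal-Wallis across the target groups.
groups = [g["income"].to_numpy() for _, g in df.groupby("default")]
h_stat, p_kw = kruskal(*groups)

# 2) Categorical vs. binary target: chi-square test plus Cramér's V.
table = pd.crosstab(df["home"], df["default"])
chi2, p_chi2, dof, expected = chi2_contingency(table)
v = cramers_v(df["home"], df["default"])

# 3) Multicollinearity between continuous predictors: Spearman correlation.
rho, p_rho = spearmanr(df["income"], df["debt"])

print(f"Kruskal-Wallis: H={h_stat:.2f}, p={p_kw:.4f}")
print(f"Chi-square: chi2={chi2:.2f}, p={p_chi2:.4f}, Cramér's V={v:.3f}")
print(f"Spearman: rho={rho:.3f}, p={p_rho:.4f}")
```

In a pre-selection step like the one the summary describes, a small p-value in (1) or (2) suggests the variable carries some signal about the target, while a high absolute `rho` or `v` between two predictors flags one of them as a candidate for removal. Any cutoff applied to these values is a modeling choice and is not specified here.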
