This post discusses a subtle bias that can affect decision trees and random forests. The bias stems from the choice of conditioning operator used in the splits, and it can be eliminated by integrating out the effect of that choice. It is more likely to occur when a feature's domain contains highly probable equidistant values and when relatively deep trees are built.
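
To make the idea concrete, here is a minimal sketch of a single split, assuming the threshold is placed at the midpoint between two neighbouring training values. With equidistant values such as 1 and 3, the midpoint 2 is itself a value the feature can take, so a test point can land exactly on the threshold and the choice between `<=` and `<` decides its prediction. The names `stump_predict` and `averaged_stump_predict` are hypothetical and only illustrate the mechanism; averaging the two operators is one simple way to integrate out the choice, not necessarily the exact procedure used in the post.

```python
import operator


def stump_predict(x, threshold, left_value, right_value, op=operator.le):
    """Return left_value when op(x, threshold) holds, otherwise right_value."""
    return left_value if op(x, threshold) else right_value


def averaged_stump_predict(x, threshold, left_value, right_value):
    """Average the predictions made with `<=` and `<` (hypothetical helper)."""
    with_le = stump_predict(x, threshold, left_value, right_value, operator.le)
    with_lt = stump_predict(x, threshold, left_value, right_value, operator.lt)
    return 0.5 * (with_le + with_lt)


# Assumed setup: training values 1 and 3, so the midpoint threshold is 2.
threshold = (1.0 + 3.0) / 2.0

for x in (1.0, 2.0, 3.0):
    le = stump_predict(x, threshold, 0.0, 1.0, operator.le)
    lt = stump_predict(x, threshold, 0.0, 1.0, operator.lt)
    avg = averaged_stump_predict(x, threshold, 0.0, 1.0)
    print(f"x={x}: '<=' gives {le}, '<' gives {lt}, average gives {avg}")
```

The two operators disagree only for the point that sits exactly on the threshold; everywhere else the predictions are identical, which is why the effect is easy to miss.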

10 min read, from towardsdatascience.com
Table of contents
- Introduction
- A motivating example
- Binary decision tree induction and inference
- The conditioning and the threshold
- The relation of conditioning and mirroring
- When can this happen?
- It is a bias, model selection cannot help!
- Mitigating the bias in random forests
- Conclusion
- Further reading
