An unconstrained decision tree tends to overfit: it keeps splitting until it classifies every training instance perfectly, which leads to poor generalization. Random Forest mitigates this by injecting randomness in two places: each tree is trained on a bootstrapped sample of the dataset, and only a random subset of features is considered as candidates at each node split. The ExTra Trees (extremely randomized trees) algorithm adds a further layer of randomness by also choosing split thresholds at random, which reduces model variance further. When using ExTra Trees in sklearn, note that the `bootstrap` flag defaults to `False`, so set it to `True` if you do not want every tree trained on the full dataset.
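As a rough illustration of the `bootstrap` point, here is a minimal sketch using sklearn's `ExtraTreesClassifier`. The synthetic dataset, the variable names, and the hyperparameter values (`n_estimators=100`, the 75/25 split) are assumptions chosen for the example and do not come from the original post.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import train_test_split

# Synthetic data, used only for illustration (an assumption, not from the post).
X, y = make_classification(
    n_samples=500, n_features=20, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# With default settings, bootstrap is False: every tree sees the full training set.
default_model = ExtraTreesClassifier(n_estimators=100, random_state=0)
print("default bootstrap:", default_model.bootstrap)

# Opt in to bootstrapping so each tree is fit on a resampled training set.
model = ExtraTreesClassifier(n_estimators=100, bootstrap=True, random_state=0)
model.fit(X_train, y_train)

test_accuracy = model.score(X_test, y_test)
print("test accuracy:", test_accuracy)
```

Whether bootstrapping actually helps depends on the dataset, so it is worth comparing both settings with cross-validation rather than assuming one is better.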

4m read time. From blog.dailydoseofds.com