A technique using bagging (bootstrap aggregating) as a regularizer is explored, where instead of training a single Gradient Boosted Decision Tree, 100 smaller GBDTs are trained on heavily subsampled data. The spread of predictions across these models approximates the uncertainty of each prediction. By penalizing uncertain estimates (e.g., using the 20th percentile of predictions), false positives can be reduced — particularly useful in recommendation systems where conservative, high-confidence predictions are preferred over risky ones.
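Below is a minimal sketch of the idea, not the author's exact implementation: the dataset, model sizes, subsample fraction, and percentile are illustrative assumptions. Each small GBDT is fit on a heavily subsampled bootstrap draw, and the 20th percentile of the per-model predictions serves as the conservative, uncertainty-penalized score.

```python
# Hypothetical sketch: bagging many small GBDTs as a regularizer / uncertainty estimate.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
X, y = make_regression(n_samples=5000, n_features=20, noise=10.0, random_state=0)

n_models = 100        # number of small GBDTs in the bag (assumed)
subsample_frac = 0.1  # "heavily subsampled": each model sees ~10% of the data (assumed)

models = []
for i in range(n_models):
    # Bootstrap a small subsample for this model
    idx = rng.choice(len(X), size=int(subsample_frac * len(X)), replace=True)
    gbdt = GradientBoostingRegressor(n_estimators=50, max_depth=3, random_state=i)
    gbdt.fit(X[idx], y[idx])
    models.append(gbdt)

# Predictions from every model: shape (n_models, n_samples)
preds = np.stack([m.predict(X) for m in models])

mean_pred = preds.mean(axis=0)                        # plain bagged estimate
conservative_pred = np.percentile(preds, 20, axis=0)  # penalizes high-variance items
uncertainty = preds.std(axis=0)                       # spread ~ prediction uncertainty
```

Items on which the models disagree get pulled down the most by the percentile score, which is what makes the final ranking conservative rather than risk-seeking.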