Top Data Leaders Brief Business Analytics Visualization Content for Week of Dec 27

Stochastic Gradient Boosting: Choosing the Best Number of Iterations

Data Science and Beyond

DECEMBER 28, 2014

In my summary of the Kaggle bulldozer price forecasting competition, I mentioned that part of my solution was based on stochastic gradient boosting. To reduce runtime, the number of boosting iterations was set by minimising the loss on the out-of-bag (OOB) samples, where each sample's prediction skips the trees for which that sample was in-bag. This approach was motivated by a bug in scikit-learn, where the OOB loss estimate was calculated on the in-bag samples, meaning that it always improved (and thus was useless for the purpose of
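The idea in the excerpt above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the author's code or scikit-learn's implementation: the function names, the squared-error loss, and the input layout (one list of per-sample updates and one in-bag mask per boosting stage) are all hypothetical choices made for the example.

```python
def oob_loss_curve(y, init_pred, stage_updates, in_bag):
    """Mean squared OOB error after each boosting stage.

    Hypothetical input layout (an assumption for this sketch):
      y              -- list of target values, one per sample
      init_pred      -- constant initial prediction of the ensemble
      stage_updates  -- stage_updates[t][i] is the (already shrunk) output
                        of tree t for sample i
      in_bag         -- in_bag[t][i] is True if sample i was used to fit tree t
    """
    n = len(y)
    oob_pred = [init_pred] * n
    curve = []
    for updates, bag in zip(stage_updates, in_bag):
        for i in range(n):
            # Skip trees that saw this sample during fitting, so the
            # estimate is never computed on in-bag data.
            if not bag[i]:
                oob_pred[i] += updates[i]
        curve.append(sum((y[i] - oob_pred[i]) ** 2 for i in range(n)) / n)
    return curve


def best_n_iterations(curve):
    """1-based number of stages with the lowest OOB loss (first one on ties)."""
    return min(range(len(curve)), key=curve.__getitem__) + 1


# Toy data: two samples, three boosting stages.
y = [1.0, 2.0]
stage_updates = [[1.0, 1.0], [0.0, 1.0], [1.0, 2.0]]
in_bag = [[False, True], [True, False], [False, False]]

curve = oob_loss_curve(y, 0.0, stage_updates, in_bag)
print(curve, best_n_iterations(curve))
```

Because each sample only accumulates updates from trees that never saw it, the resulting loss curve can rise again once the model starts to overfit, which is what makes its minimum usable as a stopping point.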


Data Leaders Brief

Sat.Dec 27, 2014 - Fri.Jan 02, 2015
