ML internals: Synthetic Minority Oversampling (SMOTE) Technique

Domino Data Lab

The problem with this approach is that in highly imbalanced sets it can easily lead to a situation where most of the data has to be discarded, and it has been firmly established that, when it comes to machine learning, data should not be thrown out lightly (Banko and Brill, 2001; Halevy et al., … Their tests are performed using C4.5-generated …

Data Science at The New York Times

Domino Data Lab

When he retired in 2009, he had some time on his hands. In 2001, Bill Cleveland writes this article saying, “You are doing it wrong.” … Suddenly we had a full-on backend team that was going to hit our API and then turn it into an ad strategy, and now you can go sell premium ads based on how an article makes you feel.