ML internals: Synthetic Minority Oversampling (SMOTE) Technique

Domino Data Lab

In their 2002 paper, Chawla et al. propose a different strategy where the minority class is over-sampled by generating synthetic examples. Their tests are performed using C4.5-generated [...] note that this variant “performs worse than plain under-sampling based on AUC” when tested on the Adult dataset (Dua & Graff, 2017).
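
To make the idea of generating synthetic minority examples concrete, here is a minimal sketch in plain Python. It assumes the commonly described SMOTE recipe: pick a minority point, pick one of its k nearest minority neighbours, and place a new point at a random position on the line segment between them. The function name `smote_sketch` and its parameters (`minority`, `n_new`, `k`, `seed`) are hypothetical names chosen for illustration; this is not the authors' reference implementation.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic points by interpolating between minority neighbours.

    Illustrative sketch only; assumes numeric features and Euclidean distance.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")

    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other points
    synthetic = []

    for _ in range(n_new):
        # Pick a random minority sample as the starting point.
        base = rng.choice(minority)

        # Find its k nearest minority neighbours (excluding the sample itself).
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]

        # Interpolate a random fraction of the way towards one neighbour.
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # uniform in [0, 1)
        synthetic.append([b + gap * (q - b) for b, q in zip(base, neighbour)])

    return synthetic


# Usage: four minority points on the corners of the unit square.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote_sketch(minority, n_new=10, k=2)
print(len(new_points))  # 10
```

Because every synthetic point lies on a segment between two existing minority points, the new samples stay inside the region the minority class already occupies instead of duplicating existing rows exactly.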