Remove 2001 Remove Knowledge Discovery Remove Machine Learning Remove Testing
article thumbnail

ML internals: Synthetic Minority Oversampling (SMOTE) Technique

Domino Data Lab

Machine Learning algorithms often need to handle highly-imbalanced datasets. Their tests are performed using C4.5-generated note that this variant “performs worse than plain under-sampling based on AUC” when tested on the Adult dataset (Dua & Graff, 2017). Machine Learning, 57–78. Chawla et al.,