ML internals: Synthetic Minority Oversampling Technique (SMOTE)
Domino Data Lab
MAY 20, 2021
Working with highly imbalanced data can be problematic in several respects. Distorted performance metrics: in a highly imbalanced dataset, say a binary dataset with a class ratio of 98:2, an algorithm that always predicts the majority class and completely ignores the minority class will still achieve 98% accuracy. In their 2002 paper Chawla et al.
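The accuracy distortion described above for a 98:2 class ratio can be checked with a few lines of plain Python. This is a minimal sketch: the dataset size of 1,000 samples, the label encoding (0 for majority, 1 for minority), and all variable names are assumptions made for illustration and do not come from the original article.

```python
# Hypothetical 98:2 binary dataset: 0 = majority class, 1 = minority class.
labels = [0] * 980 + [1] * 20

# A trivial "classifier" that always predicts the majority class.
predictions = [0] * len(labels)

# Accuracy: the share of predictions that match the true label.
correct = sum(p == y for p, y in zip(predictions, labels))
accuracy = correct / len(labels)

# Minority recall: the share of true minority samples that were detected.
true_positives = sum(p == 1 and y == 1 for p, y in zip(predictions, labels))
minority_recall = true_positives / labels.count(1)

print(f"accuracy: {accuracy:.2f}")
print(f"minority recall: {minority_recall:.2f}")
```

Accuracy comes out at 0.98 while minority recall is 0.00, so the headline metric looks strong even though not a single minority sample is identified. Reporting per-class metrics such as recall alongside accuracy is one way to expose this.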