ML internals: Synthetic Minority Oversampling (SMOTE) Technique

Domino Data Lab

Insufficient training data in the minority class: In domains where data collection is expensive, a dataset containing 10,000 examples is typically considered to be fairly large.