Data Collection, Knowledge Discovery, Testing and Visualization

ML internals: Synthetic Minority Oversampling (SMOTE) Technique

Domino Data Lab

MAY 20, 2021

Insufficient training data in the minority class: in domains where data collection is expensive, a dataset containing 10,000 examples is typically considered to be fairly large. … Figure 3 shows a visual explanation of how SMOTE generates synthetic observations in this case. … Their tests are performed using C4.5-generated …
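As a rough illustration of the idea mentioned in the excerpt, the sketch below generates synthetic minority points by interpolating between a minority example and one of its nearest minority neighbours. It is a minimal sketch under stated assumptions, not the article's code: the function name `smote_sketch`, its parameters, and the toy data are all hypothetical, and it uses only the Python standard library.

```python
import math
import random


def smote_sketch(minority, n_synthetic, k=5, seed=0):
    """Hypothetical helper: create n_synthetic points on the line segments
    joining minority examples to their nearest minority neighbours."""
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other points
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(b + gap * (q - b) for b, q in zip(base, neighbour))
        )
    return synthetic


# Toy minority class (assumed data, for illustration only)
minority = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0), (2.0, 2.0)]
synthetic = smote_sketch(minority, n_synthetic=6, k=2)
```

Because each synthetic point lies between two existing minority points, it always falls inside the region the minority class already occupies rather than being an exact duplicate of an existing example.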

Machine Learning

Machine Learning, Metrics, Data Mining, Knowledge Discovery

Explaining black-box models using attribute importance, PDPs, and LIME

Domino Data Lab

AUGUST 1, 2021

Skater provides a wide range of algorithms that can be used for visual interpretation (e.g. … After forming the X and y variables, we split the data into training and test sets. Looking at the target vector in the training subset, we notice that our training data is highly imbalanced.

1    570
0    230
Name: credit, dtype: int64
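The class counts quoted in the excerpt can be turned into a quick imbalance check. The sketch below rebuilds a label vector with those counts (570 of class 1, 230 of class 0) and computes the minority share. The variable names are hypothetical, and it uses the standard library rather than whatever the original article used (the output format suggests pandas, but that is an assumption).

```python
from collections import Counter

# Hypothetical training labels, rebuilt from the counts shown in the excerpt
y_train = [1] * 570 + [0] * 230

counts = Counter(y_train)
minority_label, minority_count = min(counts.items(), key=lambda kv: kv[1])
minority_share = minority_count / len(y_train)

print(counts)
print(f"minority class {minority_label}: {minority_share:.2%} of training rows")
# prints: minority class 0: 28.75% of training rows
```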

Modeling

Modeling, Deep Learning, Machine Learning, Knowledge Discovery
