2009, Data Collection and Knowledge Discovery

ML internals: Synthetic Minority Oversampling (SMOTE) Technique

Domino Data Lab

MAY 20, 2021

Insufficient training data in the minority class: in domains where data collection is expensive, a dataset containing 10,000 examples is typically considered fairly large.

Machine Learning

Machine Learning, Metrics, Data mining, Knowledge Discovery

Explaining black-box models using attribute importance, PDPs, and LIME

Domino Data Lab

AUGUST 1, 2021

PDPs for the bicycle count prediction model (Molnar, 2009). The surrogate model is often a simple linear model or a decision tree, both of which are innately interpretable, so the data collected from the perturbations and the corresponding class outputs can give a good indication of what influences the model’s decision.
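The surrogate idea in the excerpt above can be sketched in a few lines. This is a minimal illustration, not the article's code and not the `lime` library's API: `black_box` is a hypothetical stand-in for an opaque classifier, `local_surrogate` is a hypothetical helper, and the Gaussian perturbations, RBF proximity weights, and plain weighted least squares are all simplifying assumptions.

```python
import numpy as np

def black_box(X):
    """Hypothetical opaque classifier: returns P(class 1) for each row of X."""
    return 1.0 / (1.0 + np.exp(-(3.0 * X[:, 0] - 0.5 * X[:, 1])))

def local_surrogate(predict, x, n_samples=2000, scale=0.5, seed=0):
    """Fit a proximity-weighted linear model around x and return its slopes."""
    rng = np.random.default_rng(seed)
    # Perturb the instance of interest with Gaussian noise (assumed scheme).
    Z = x + rng.normal(0.0, scale, size=(n_samples, x.size))
    # Query the black box on the perturbed points.
    y = predict(Z)
    # Weight each perturbed point by its closeness to x (RBF kernel).
    d2 = np.sum((Z - x) ** 2, axis=1)
    w = np.exp(-d2 / (2.0 * scale ** 2))
    # Weighted least squares: intercept column plus centered features.
    A = np.hstack([np.ones((n_samples, 1)), Z - x])
    sw = np.sqrt(w)[:, None]
    coef, *_ = np.linalg.lstsq(A * sw, y[:, None] * sw, rcond=None)
    # Drop the intercept; the remaining slopes are the local explanation.
    return coef.ravel()[1:]

x0 = np.array([0.2, -0.1])
weights = local_surrogate(black_box, x0)
print(weights)
```

The slopes in `weights` act as the local explanation for `x0`: for this toy classifier, the first feature should receive the larger positive weight and the second a smaller negative one, mirroring how the black box was constructed.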

Modeling

Modeling, Deep Learning, Machine Learning, Knowledge Discovery
