
ML internals: Synthetic Minority Oversampling (SMOTE) Technique

Domino Data Lab

Severe class imbalance renders measures like classification accuracy meaningless, and naive oversampling carries the risk of performing worse than simpler approaches like majority under-sampling. Indeed, in the original paper Chawla et al. address this by generating synthetic minority examples rather than simply duplicating existing ones.
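
To make the interpolation idea concrete, here is a minimal from-scratch sketch of SMOTE-style oversampling, not the article's own code: each synthetic minority point is placed between a real minority sample and one of its k nearest minority neighbours. The helper name smote_sketch and the toy data are illustrative assumptions; the snippet assumes numpy and scikit-learn are available.

    import numpy as np
    from sklearn.neighbors import NearestNeighbors

    def smote_sketch(X_minority, n_synthetic, k=5, seed=0):
        """Return n_synthetic samples interpolated between minority neighbours."""
        rng = np.random.default_rng(seed)
        # k + 1 neighbours because every point is its own nearest neighbour
        nn = NearestNeighbors(n_neighbors=k + 1).fit(X_minority)
        _, neighbours = nn.kneighbors(X_minority)

        synthetic = np.empty((n_synthetic, X_minority.shape[1]))
        for i in range(n_synthetic):
            j = rng.integers(len(X_minority))       # pick a random minority sample
            nb = rng.choice(neighbours[j][1:])      # one of its k neighbours (skip itself)
            gap = rng.random()                      # interpolation factor in [0, 1)
            synthetic[i] = X_minority[j] + gap * (X_minority[nb] - X_minority[j])
        return synthetic

    # Example: add 100 synthetic points to a toy 2-D minority class.
    X_min = np.random.default_rng(1).normal(size=(20, 2))
    X_new = smote_sketch(X_min, n_synthetic=100)

In practice a maintained implementation such as imbalanced-learn's SMOTE would typically be preferred over a hand-rolled loop like this.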


Explaining black-box models using attribute importance, PDPs, and LIME

Domino Data Lab

The dataset classifies customers, based on a set of attributes, into two credit risk groups: good or bad. The two classes are not equally represented, which is to be expected, as there is no reason for a perfect 50:50 separation of good vs. bad credit risk. Attribute importance can be computed in different ways, but it generally relies on measuring the entropy in the change of predictions given a perturbation of a feature.
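
As a rough illustration of the perturbation idea, here is a sketch of one common perturbation-based scheme, plain permutation importance: permute one feature at a time and record how much the model's score drops. This is not necessarily the exact entropy-based measure the article uses; the function name permutation_importance_sketch and the toy classification data are illustrative assumptions, and the snippet assumes numpy and scikit-learn.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier

    def permutation_importance_sketch(model, X, y, n_repeats=10, seed=0):
        """Mean drop in model score when each feature is permuted in turn."""
        rng = np.random.default_rng(seed)
        baseline = model.score(X, y)
        importances = np.zeros(X.shape[1])
        for j in range(X.shape[1]):
            drops = []
            for _ in range(n_repeats):
                X_perturbed = X.copy()
                # perturb feature j only, leaving the rest of the data intact
                X_perturbed[:, j] = rng.permutation(X_perturbed[:, j])
                drops.append(baseline - model.score(X_perturbed, y))
            importances[j] = np.mean(drops)
        return importances

    # Example on a synthetic two-class problem standing in for good/bad credit risk.
    X, y = make_classification(n_samples=500, n_features=6, n_informative=3, random_state=0)
    model = RandomForestClassifier(random_state=0).fit(X, y)
    print(permutation_importance_sketch(model, X, y))

scikit-learn also ships a built-in sklearn.inspection.permutation_importance, which would normally be used instead of a hand-written loop.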
