
ML internals: Synthetic Minority Oversampling (SMOTE) Technique

Domino Data Lab

Further, imbalanced data exacerbates problems arising from the curse of dimensionality often found in such biological data. Here is a simplified version of the SMOTE algorithm:

    import random
    import pandas as pd
    import numpy as np

    def get_neigbours(M, k):
        nn = NearestNeighbors(n_neighbors=k+1, metric="euclidean").fit(M)
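The excerpt above cuts off before the oversampling step, so here is a minimal sketch of the general SMOTE idea: pick a minority sample, pick one of its k nearest minority neighbours, and create a new point somewhere on the line segment between them. This is not the article's code. The names `get_neighbours` and `smote` and their parameters are hypothetical, and a brute-force standard-library neighbour search stands in for scikit-learn's `NearestNeighbors`.

```python
import math
import random


def get_neighbours(points, k):
    """Return, for each point, the indices of its k nearest other points (Euclidean)."""
    neighbours = []
    for i, p in enumerate(points):
        # Sort every other point by its distance to p and keep the k closest.
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        neighbours.append(others[:k])
    return neighbours


def smote(minority, n_synthetic, k=5, seed=None):
    """Create n_synthetic points by interpolating between minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    neighbours = get_neighbours(minority, k)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))   # random minority sample
        j = rng.choice(neighbours[i])      # one of its nearest neighbours
        gap = rng.random()                 # position along the segment, in [0, 1)
        synthetic.append(
            [a + gap * (b - a) for a, b in zip(minority[i], minority[j])]
        )
    return synthetic
```

Calling `smote(minority_rows, n_synthetic=100, k=5, seed=0)` on a list of numeric feature vectors returns 100 new vectors. Because each one is a convex combination of two existing samples, every synthetic point stays inside the bounding box of the minority class. The brute-force search is quadratic in the number of samples, so a tree-based neighbour index would be the natural replacement on larger data.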


Explaining black-box models using attribute importance, PDPs, and LIME

Domino Data Lab

Because of their architecture, intrinsically explainable ANNs can be optimised not just on their prediction performance, but also on their explainability metric. For this demo we’ll use the freely available Statlog (German Credit Data) Data Set, which can be downloaded from Kaggle.

    1    570
    0    570
    Name: credit, dtype: int64
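One common, model-agnostic way to estimate the attribute importance named in the title is permutation importance: shuffle one feature column at a time and measure how much the model's accuracy drops. The sketch below is an illustration under that assumption, not necessarily the method the article uses. The function `permutation_importance` and its parameters are hypothetical, and `predict` stands for any black-box function that maps one row to a label.

```python
import random


def permutation_importance(predict, X, y, n_repeats=5, seed=None):
    """Mean drop in accuracy when each feature column is shuffled.

    X is a list of rows (each row a list of feature values), y the true labels.
    """
    rng = random.Random(seed)

    def accuracy(rows):
        return sum(predict(r) == t for r, t in zip(rows, y)) / len(y)

    baseline = accuracy(X)
    importances = []
    for col in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle only this column, breaking its link to the labels.
            shuffled = [row[col] for row in X]
            rng.shuffle(shuffled)
            permuted = [
                row[:col] + [v] + row[col + 1:] for row, v in zip(X, shuffled)
            ]
            drops.append(baseline - accuracy(permuted))
        importances.append(sum(drops) / n_repeats)
    return importances
```

A feature the model never reads gets an importance of exactly zero, because shuffling it cannot change any prediction. A feature the model relies on gets a positive score. Correlated features can share or mask importance, which is one reason to read these scores alongside PDPs and LIME explanations.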
