
On the Hunt for Patterns: from Hippocrates to Supercomputers

Ontotext

These are the so-called supercomputers, led by a smart legion of researchers and practitioners in the fields of data-driven knowledge discovery. As of 2017, the fastest computers have reached a speed of 93 PetaFLOPS, which is 93×10^15, or 93,000,000,000,000,000, operations per second. Certainly not!


ML internals: Synthetic Minority Oversampling (SMOTE) Technique

Domino Data Lab

Figure 3 shows a visual explanation of how SMOTE generates synthetic observations in this case. Note that this variant “performs worse than plain under-sampling based on AUC” when tested on the Adult dataset (Dua & Graff, 2017). Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining, 73–79.


Explaining black-box models using attribute importance, PDPs, and LIME

Domino Data Lab

Skater provides a wide range of algorithms that can be used for visual interpretation (e.g. …). The Partial Dependence Plot is another visual method that is model-agnostic and can be used to gain insight into the inner workings of a black-box model such as a deep ANN. Conference on Knowledge Discovery and Data Mining, pp.
