On the Hunt for Patterns: from Hippocrates to Supercomputers

Ontotext

These are the so-called supercomputers, led by a smart legion of researchers and practitioners in the field of data-driven knowledge discovery. Thanks to their might, scientists and practitioners can now develop innovative ways of collecting, storing, processing, and, ultimately, finding patterns in data.

Explaining black-box models using attribute importance, PDPs, and LIME

Domino Data Lab

Skater provides a wide range of algorithms that can be used for visual interpretation (e.g. [...]) [...] but it generally relies on measuring the entropy in the change of predictions given a perturbation of a feature.
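The idea of scoring a feature by how much predictions change when that feature is perturbed can be illustrated with a minimal sketch. This is not Skater's API or its exact scoring rule: the toy model `predict`, the helper `perturbation_importance`, and the use of mean absolute change (rather than an entropy-based measure) are all assumptions made for illustration.

```python
import math
import random


def predict(row):
    # Hypothetical toy model: the output depends only on feature 0;
    # feature 1 is ignored entirely.
    return 1.0 / (1.0 + math.exp(-3.0 * row[0]))


def perturbation_importance(rows, predict_fn, seed=0):
    """Score each feature by the mean absolute change in predictions
    after shuffling that feature's column (a simple stand-in for an
    entropy-based measure)."""
    rng = random.Random(seed)
    baseline = [predict_fn(r) for r in rows]
    scores = []
    for j in range(len(rows[0])):
        # Perturb feature j by shuffling its values across rows.
        column = [r[j] for r in rows]
        rng.shuffle(column)
        perturbed = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
        changes = [abs(predict_fn(p) - b) for p, b in zip(perturbed, baseline)]
        scores.append(sum(changes) / len(changes))
    return scores


# Synthetic data: 200 rows, two features drawn uniformly from [-1, 1].
data_rng = random.Random(42)
rows = [[data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)] for _ in range(200)]

scores = perturbation_importance(rows, predict)
print(scores)  # feature 0 gets a positive score; feature 1 scores 0.0
```

Because the toy model never reads feature 1, shuffling it leaves every prediction unchanged, so its score is exactly zero, while feature 0 receives a positive score.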

ML internals: Synthetic Minority Oversampling (SMOTE) Technique

Domino Data Lab

Working with highly imbalanced data can be problematic in several respects. Distorted performance metrics: in a highly imbalanced dataset, say a binary dataset with a class ratio of 98:2, an algorithm that always predicts the majority class and completely ignores the minority class will still be 98% accurate.
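The 98:2 arithmetic can be checked with a small sketch. The labels below are synthetic, and the helper names `accuracy` and `recall` are illustrative rather than taken from the article; the point is only that accuracy stays high while recall on the minority class drops to zero.

```python
def accuracy(y_true, y_pred):
    # Fraction of predictions that match the true label.
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    # Fraction of actual positives that were predicted as positive.
    true_pos = sum(1 for t, p in zip(y_true, y_pred)
                   if t == positive and p == positive)
    actual_pos = sum(1 for t in y_true if t == positive)
    return true_pos / actual_pos


# Synthetic binary dataset with a 98:2 class ratio (0 = majority, 1 = minority).
y_true = [0] * 98 + [1] * 2

# A "classifier" that always predicts the majority class.
y_pred = [0] * len(y_true)

acc = accuracy(y_true, y_pred)
minority_recall = recall(y_true, y_pred, positive=1)

print(acc)              # 0.98
print(minority_recall)  # 0.0
```

Reporting recall (or precision, or F1) for the minority class alongside accuracy makes this failure visible.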