article thumbnail

On the Hunt for Patterns: from Hippocrates to Supercomputers

Ontotext

These are the so-called supercomputers, led by a smart legion of researchers and practitioners in the fields of data-driven knowledge discovery. The capacity and performance of supercomputers is measured with the so-called FLOPS (floating point operations per second). What are supercomputers and why do we need them?

article thumbnail

ML internals: Synthetic Minority Oversampling (SMOTE) Technique

Domino Data Lab

This renders measures like classification accuracy meaningless. note that this variant “performs worse than plain under-sampling based on AUC” when tested on the Adult dataset (Dua & Graff, 2017). The use of multiple measurements in taxonomic problems. Indeed, in the original paper Chawla et al. Machine Learning, 57–78.

Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Trending Sources

article thumbnail

Explaining black-box models using attribute importance, PDPs, and LIME

Domino Data Lab

but it generally relies on measuring the entropy in the change of predictions given a perturbation of a feature. In IJCAI 2017 Workshop on Explainable Artificial Intelligence (XAI), pages 24–30, Melbourne, Australia, 2017. Conference on Knowledge Discovery and Data Mining, pp. See Wei et al. References.

Modeling 139
article thumbnail

Changing assignment weights with time-based confounders

The Unofficial Google Data Science Blog

For this reason we don’t report uncertainty measures or statistical significance in the results of the simulation. Ramp-up solution: measure epoch and condition on its effect If one wants to do full traffic ramp-up and use data from all epochs, they must use an adjusted estimator to get an unbiased estimate of the average reward in each arm.