Themes and Conferences per Pacoid, Episode 9

Domino Data Lab

The lens of reductionism and an overemphasis on engineering become an Achilles heel for data science work. Instead, consider a “full stack” tracing from the point of data collection all the way out through inference. Finale Doshi-Velez, Been Kim (2017-02-28); see also the Domino blog article about TCAV. 2018-06-21).

Explaining black-box models using attribute importance, PDPs, and LIME

Domino Data Lab

Skater provides a wide range of algorithms that can be used for visual interpretation (e.g. but it generally relies on measuring the entropy in the change of predictions given a perturbation of a feature. In IJCAI 2017 Workshop on Explainable Artificial Intelligence (XAI), pages 24–30, Melbourne, Australia, 2017.

Trending Sources

On the Hunt for Patterns: from Hippocrates to Supercomputers

Ontotext

The capacity and performance of supercomputers are measured in FLOPS (floating point operations per second). As of 2017, the fastest computers have reached a speed of 93 PetaFLOPS, which is 93×10^15, or 93,000,000,000,000,000 operations per second. There are four types of data sources that the team will work with.

ML internals: Synthetic Minority Oversampling (SMOTE) Technique

Domino Data Lab

Working with highly imbalanced data can be problematic in several aspects: Distorted performance metrics — In a highly imbalanced dataset, say a binary dataset with a class ratio of 98:2, an algorithm that always predicts the majority class and completely ignores the minority class will still be 98% correct. Machine Learning, 57–78.
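
The 98:2 example in the excerpt above can be checked with a few lines of plain Python. This is only an illustrative sketch: the toy labels and the helper names `accuracy` and `recall` are assumptions made for this example and do not come from the article.

```python
# Toy binary dataset with a 98:2 class ratio (hypothetical data).
y_true = [0] * 98 + [1] * 2

# A "classifier" that always predicts the majority class (0).
y_pred = [0] * len(y_true)


def accuracy(truth, pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(truth, pred)) / len(truth)


def recall(truth, pred, positive=1):
    """Fraction of actual positives that were predicted as positive."""
    actual = sum(1 for t in truth if t == positive)
    hits = sum(1 for t, p in zip(truth, pred) if t == positive and p == positive)
    return hits / actual if actual else 0.0


majority_accuracy = accuracy(y_true, y_pred)
minority_recall = recall(y_true, y_pred)

print(f"accuracy: {majority_accuracy:.2f}")
print(f"minority-class recall: {minority_recall:.2f}")
```

Accuracy comes out high even though no minority example is ever detected, which is why a per-class metric such as recall is usually reported alongside accuracy on imbalanced data.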

What Is Data Intelligence?

Alation

Data intelligence first emerged to support search & discovery, largely in service of analyst productivity. For years, analysts in enterprises had struggled to find the data they needed to build reports. This problem was only exacerbated by explosive growth in data collection and volume. HBR Review May/June 2017.

Techniques for Collecting, Prepping, and Plotting Data: Predicting Social Media-Influence in the NBA

Domino Data Lab

As a result, there has been a recent explosion in individual statistics that try to measure a player’s impact. The first step to collecting all of the data is to figure out which data source to collect first, and where to get it. R has yet one more way to visualize these relationships in multiple dimensions.

Themes and Conferences per Pacoid, Episode 6

Domino Data Lab

As people who attended JupyterCon 2017–2018 can attest, an “industry poster session” includes an open bar, catered hors d’oeuvres, lots of mingling … to paraphrase feedback from JupyterCon, “As a tech person, would I get up extra early to meet strangers for coffee at 8:00 am? The ability to measure results (risk-reducing evidence).