Misleading Statistics Examples – Discover The Potential For Misuse of Statistics & Data In The Digital Age

datapine

A 2009 investigative survey by Dr. Daniele Fanelli from The University of Edinburgh found that 33.7% [...] Drinking tea increases diabetes by 50%, and baldness raises the cardiovascular disease risk up to 70%! In the image above, we can see a graph showing 77% of Christian Americans in 2009, a number that decreased to 65% in 2019.

How Data Lineage Improves Data Compliance

Octopai

Banks didn’t accurately assess their credit and operational risk and hold enough capital reserves, leading to the Great Recession of 2008-2009. Data lineage and financial risk data compliance. All these models need to be informed by data, with operational risk assessment mandating loss data that goes back 10 years.

Trending Sources

Explaining black-box models using attribute importance, PDPs, and LIME

Domino Data Lab

Skater provides a wide range of algorithms that can be used for visual interpretation (e.g. [...]). This dataset classifies customers based on a set of attributes into two credit risk groups – good or bad. This is to be expected, as there is no reason for a perfect 50:50 separation of the good vs. bad credit risk.

Data Management Ensures Basel III and IV Compliance

Octopai

Do you ever feel like taking risks? If you’re a bank, however, taking risks doesn’t just have implications for you, but for all your customers and (if you’re big enough) for the economy as a whole. The Basel III framework, as well as Basel IV, calls for regulation changes in multiple areas, including: Credit risk.

Themes and Conferences per Pacoid, Episode 9

Domino Data Lab

That’s a risk in case, say, legislators – who don’t understand the nuances of machine learning – attempt to define a single meaning of the word interpret. On the other hand, as Lipton emphasized, while the tooling produces interesting visualizations, visualizations do not imply interpretation. St Paul’s from Madison London.

ML internals: Synthetic Minority Oversampling (SMOTE) Technique

Domino Data Lab

Figure 3 shows a visual explanation of how SMOTE generates synthetic observations in this case. This carries the risk of this modification performing worse than simpler approaches like majority under-sampling. Indeed, in the original paper Chawla et al. [...]
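
The excerpt above refers to how SMOTE generates synthetic observations. As a rough illustration only, the core interpolation step might be sketched as follows. The function name `smote_sketch`, its parameters, and the choice to pick the base sample at random (instead of looping over every minority sample) are assumptions made for this example, not code from the linked article.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points from a list of minority-class points.

    Hypothetical helper for illustration: each synthetic point lies on the
    line segment between a minority sample and one of its k nearest
    minority-class neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        # Pick a base minority sample (simplification: chosen at random).
        base = rng.choice(minority)
        # Rank the remaining minority samples by Euclidean distance to base.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        # Choose one neighbour and a random position along the segment.
        target = rng.choice(neighbours)
        gap = rng.random()  # uniform in [0, 1)
        synthetic.append(
            tuple(b + gap * (t - b) for b, t in zip(base, target))
        )
    return synthetic


# Usage: oversample four minority points on the unit square.
minority_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_sketch(minority_points, n_new=10, k=2)
```

Because every synthetic point is a convex combination of two existing minority points, the new observations stay inside the region already spanned by the minority class.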

Adding Common Sense to Machine Learning with TensorFlow Lattice

The Unofficial Google Data Science Blog

If we observe label vector $y$ and feature vectors $x_1, \cdots, x_d$ we can write the differentiable empirical risk minimization problem with a squared loss as $$ \min_\theta \left\| y - \sum_{j=1}^d c_j(x_j) \right\|^2 $$ Note that we use squared loss for the simplicity of presentation; one can use any differentiable loss in their application.
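
To make the squared-loss objective in the excerpt concrete, here is a minimal gradient-descent sketch. It assumes, purely for illustration, that each $c_j$ is a single-parameter linear function $c_j(x_j) = \theta_j x_j$; the excerpt does not specify the form of $c_j$, and the function name `fit_additive`, the learning rate, and the step count are likewise hypothetical.

```python
def fit_additive(X, y, lr=0.05, steps=2000):
    """Minimise the squared loss of an additive model by gradient descent.

    Assumption for this sketch: c_j(x_j) = theta_j * x_j, so the prediction
    for a row x is sum_j theta_j * x_j. The loss is averaged over the n rows,
    which rescales the objective but does not change its minimiser.
    """
    n = len(X)
    d = len(X[0])
    theta = [0.0] * d
    for _ in range(steps):
        # Residuals y_i - sum_j c_j(x_ij) under the current parameters.
        residuals = [
            yi - sum(t * xj for t, xj in zip(theta, xi))
            for xi, yi in zip(X, y)
        ]
        # Gradient of the mean squared residual with respect to each theta_j.
        grad = [
            -2.0 * sum(r * xi[j] for r, xi in zip(residuals, X)) / n
            for j in range(d)
        ]
        theta = [t - lr * g for t, g in zip(theta, grad)]
    return theta


# Usage: labels generated exactly by y = 3 * x_1 - 2 * x_2.
X = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
y = [3.0, -2.0, 1.0, 4.0]
theta = fit_additive(X, y)
```

Any other differentiable loss could be swapped in by changing how the residual-based gradient is computed; the update loop itself stays the same.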