Why you should care about debugging machine learning models

O'Reilly on Data

Not least is the broadening realization that ML models can fail. And that’s why model debugging, the art and science of understanding and fixing problems in ML models, is so critical to the future of ML. Because all ML models make mistakes, everyone who cares about ML should also care about model debugging. [1]

How Insurance Companies Use Data To Measure Risk And Choose Rates

Smart Data Collective

Statistics show that married people have fewer car accidents than singletons. Insurance companies have access to crime statistics and can track the number of car thefts and break-ins per neighborhood. They also have access to stats on which makes and models of car are stolen more often or involved in more crashes.

The AIgent: Using Google’s BERT Language Model to Connect Writers & Representation

Insight

In 2013, Robert Galbraith, an aspiring author, finished ... The AIgent was built with BERT, Google’s state-of-the-art language model. In this article, I will discuss the construction of the AIgent, from data collection to model assembly. More relevant to the AIgent is Google’s BERT model, a task-agnostic (i.e. ...

Data Drift Detection for Image Classifiers

Domino Data Lab

This article covers how to detect data drift for models that ingest image data as their input in order to prevent their silent degradation in production. Introduction: preventing silent model degradation in production. This article explores an approach that can be used to detect data drift for models that classify/score image data.

Themes and Conferences per Pacoid, Episode 5

Domino Data Lab

I’ve been teaching data science since 2008 privately for employers – exec staff, investors, IT teams, and the data teams I’ve led – and since 2013, for industry professionals in general. Also, clearly there’s no “one size fits all” educational model for data science. The Berkeley model addresses large university needs in the US.

Towards optimal experimentation in online systems

The Unofficial Google Data Science Blog

If $Y$ at that point is (statistically and practically) significantly better than our current operating point, and that point is deemed acceptable, we update the system parameters to this better value. Figure 2: Spreading measurements out makes estimates of the model (slope of the line) more accurate. (And sometimes even if it is not[1].)
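
The claim in the Figure 2 caption can be checked with a small worked example. For a straight-line fit with independent noise of standard deviation sigma, the standard error of the ordinary least squares slope is sigma / sqrt(sum((x_i - mean(x))^2)), so placing measurements over a wider range shrinks it. The sketch below is only an illustration and not the article's own method; the function name `slope_standard_error`, the two designs, and the noise level are assumptions chosen for the example.

```python
import math


def slope_standard_error(xs, sigma):
    """Standard error of the OLS slope for design points xs and noise sd sigma."""
    mean_x = sum(xs) / len(xs)
    # Sum of squared deviations of the design points from their mean.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all design points are identical; slope is not identifiable")
    return sigma / math.sqrt(sxx)


# Hypothetical designs: same number of measurements, different spread.
clustered = [0.9, 0.95, 1.0, 1.05, 1.1]
spread = [0.0, 0.5, 1.0, 1.5, 2.0]
sigma = 1.0  # assumed noise standard deviation

se_clustered = slope_standard_error(clustered, sigma)
se_spread = slope_standard_error(spread, sigma)
print(f"clustered design: slope SE = {se_clustered:.3f}")
print(f"spread design:    slope SE = {se_spread:.3f}")
```

Because the spread design covers a range ten times wider with the same number of points, its slope standard error is ten times smaller in this toy setup.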

Credit Card Fraud Detection using XGBoost, SMOTE, and threshold moving

Domino Data Lab

We’ll use a gradient boosting technique via XGBoost to create a model and I’ll walk you through steps you can take to avoid overfitting and build a model that is fit for purpose and ready for production. Let’s also look at the basic descriptive statistics for all attributes. from sklearn import metrics.
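
Threshold moving, named in the title, means choosing the probability cutoff for the positive (fraud) class from validation data instead of relying on the default 0.5, which often matters when classes are heavily imbalanced. The sketch below is a minimal pure-Python illustration under stated assumptions: the helper names `f1_at_threshold` and `best_threshold`, the toy labels and scores, and the choice of F1 as the selection metric are all hypothetical, and the scores merely stand in for probabilities that a trained model such as an XGBoost classifier might produce.

```python
def f1_at_threshold(y_true, scores, threshold):
    """F1 score when predicting positive for scores >= threshold."""
    tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
    fn = sum(1 for y, s in zip(y_true, scores) if s < threshold and y == 1)
    if tp == 0:
        # No true positives: precision or recall is zero, so F1 is zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_threshold(y_true, scores):
    """Try each distinct score as a cutoff; return (threshold, f1) with the highest F1."""
    candidates = sorted(set(scores))
    return max(
        ((t, f1_at_threshold(y_true, scores, t)) for t in candidates),
        key=lambda pair: pair[1],
    )


# Toy validation data (hypothetical): 1 = fraud, 0 = legitimate.
y_true = [0, 0, 0, 0, 0, 0, 0, 1, 0, 1]
scores = [0.02, 0.05, 0.08, 0.10, 0.12, 0.15, 0.20, 0.30, 0.35, 0.45]

threshold, f1 = best_threshold(y_true, scores)
default_f1 = f1_at_threshold(y_true, scores, 0.5)
print(f"default 0.5 cutoff: F1 = {default_f1:.2f}")
print(f"moved cutoff {threshold:.2f}: F1 = {f1:.2f}")
```

In this toy data no score reaches 0.5, so the default cutoff flags nothing, while a lower cutoff recovers both fraud cases at the cost of one false positive. In practice the cutoff should be selected on held-out validation data, not on the test set.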