Addressing Irreproducibility in the Wild

Domino Data Lab

This Domino Data Science Field Note provides highlights and excerpted slides from Chloe Mawer’s “The Ingredients of a Reproducible Machine Learning Model” talk at a recent WiMLDS meetup. Mawer is a Principal Data Scientist at Lineage Logistics as well as an Adjunct Lecturer at Northwestern University.

The Role of Containers on MLOps and Model Production

Domino Data Lab

Container technology has changed the way data science gets done. The original container use case for data science focused on what I call “environment management.” Configuring software environments is a constant chore, especially in the open source software space, where most data scientists work.

Getting Started with Machine Learning

Cloudera

However, to understand what Ethical AI is, we need at least a basic understanding of ML, ML models, and the data science lifecycle, and of how they are related. This blog post hopes to provide this foundational understanding. Instead, they are learned by training a model on data. What is Machine Learning?

Of Muffins and Machine Learning Models

Cloudera

Model Reproducibility. In the case of CDP Public Cloud, this includes virtual networking constructs and the data lake as provided by a combination of a Cloudera Shared Data Experience (SDX) and the underlying cloud storage. Within the CML data service, model lineage is managed and tracked at a project level by the SDX.

MNIST Expanded: 50,000 New Samples Added

Domino Data Lab

Many data scientists and researchers have used the MNIST test set of 10,000 samples for training and testing models for over 20 years. Sharing the process increases the likelihood of reproducibility and building off of existing work within the industry as a whole. Can we trust any new conclusion drawn from this data set?

The Role of Model Governance in Machine Learning and Artificial Intelligence

Domino Data Lab

This includes: model lineage, from data acquisition to model building; model versions in production, as they are updated based on new data; model health in production, with model monitoring principles; model usage and basic functionality in production; and model costs. First is the data the model is using. How Model Governance Works.

Open Data Science and Machine Learning for Business with Cloudera Data Science Workbench on HDP

Cloudera

It’s official: Cloudera and Hortonworks have merged, and today I’m excited to announce the availability of Cloudera Data Science Workbench (CDSW) for Hortonworks Data Platform (HDP). Trusted by large data science teams across hundreds of enterprises. Sound familiar? What is CDSW?