A Practitioner’s Guide to Deep Learning with Ludwig

Domino Data Lab

For example, several promising tools created recently have Python APIs, are built on top of TensorFlow or PyTorch, and encapsulate deep learning best practices to allow data scientists to speed up research. He summarized the magic of Ludwig into three things: The data type extraction.

How to supercharge data exploration with Pandas Profiling

Domino Data Lab

Producing insights from raw data is a time-consuming process. The Importance of Exploratory Analytics in the Data Science Lifecycle. Exploratory analysis is a critical component of the data science lifecycle. Predictive modeling efforts rely on dataset profiles, whether consisting of summary statistics or descriptive charts.

Bringing ML to Agriculture: Transforming a Millennia-old Industry

Domino Data Lab

Guest post by Jeff Melching, Distinguished Engineer / Chief Architect Data & Analytics. We’ve developed a model-driven software platform, called Climate FieldView, that captures, visualizes, and analyzes a vast array of data for farmers and provides new insight and personalized recommendations to maximize crop yield.

ML internals: Synthetic Minority Oversampling (SMOTE) Technique

Domino Data Lab

Further, imbalanced data exacerbates problems arising from the curse of dimensionality often found in such biological data. Insufficient training data in the minority class: in domains where data collection is expensive, a dataset containing 10,000 examples is typically considered to be fairly large.

Density-Based Clustering

Domino Data Lab

Cluster Analysis is an important problem in data analysis. Data scientists use clustering to identify malfunctioning servers, group genes with similar expression patterns, and perform various other applications. There are many families of data clustering algorithms, and you may be familiar with the most popular one: k-means.

Explaining black-box models using attribute importance, PDPs, and LIME

Domino Data Lab

For example, Article 22 of the General Data Protection Regulation (GDPR) introduces the right of explanation – the power of an individual to demand an explanation of the reasons behind a model-based decision and to challenge the decision if it leads to a negative impact for the individual. According to Fox et al.,

Fitting Support Vector Machines via Quadratic Programming

Domino Data Lab

In this blog post we take a deep dive into the internals of Support Vector Machines. Figure 1 – There are infinitely many lines separating the two classes, but a good generalisation is achieved by the one that has the largest distance to the nearest data point of any class. 1999) and more.