Remove Data Collection Remove Data mining Remove Knowledge Discovery Remove Modeling
article thumbnail

Fundamentals of Data Mining

Data Science 101

This data alone does not make any sense unless it’s identified to be related in some pattern. Data mining is the process of discovering these patterns among the data and is therefore also known as Knowledge Discovery from Data (KDD). Machine learning provides the technical basis for data mining.

article thumbnail

ML internals: Synthetic Minority Oversampling (SMOTE) Technique

Domino Data Lab

In this article we discuss why fitting models on imbalanced datasets is problematic, and how class imbalance is typically addressed. Insufficient training data in the minority class — In domains where data collection is expensive, a dataset containing 10,000 examples is typically considered to be fairly large.

article thumbnail

Explaining black-box models using attribute importance, PDPs, and LIME

Domino Data Lab

In this article we cover explainability for black-box models and show how to use different methods from the Skater framework to provide insights into the inner workings of a simple credit scoring neural network model. The interest in interpretation of machine learning has been rapidly accelerating in the last decade. See Ribeiro et al.

Modeling 139