Fundamentals of Data Mining

Data Science 101

Data alone makes little sense until it is found to be related by some pattern. Data mining is the process of discovering these patterns in the data, which is why it is also known as Knowledge Discovery from Data (KDD). Machine learning provides the technical basis for data mining.

ML internals: Synthetic Minority Oversampling (SMOTE) Technique

Domino Data Lab

Insufficient training data in the minority class: in domains where data collection is expensive, a dataset containing 10,000 examples is typically considered fairly large. Figure 3 shows a visual explanation of how SMOTE generates synthetic observations in this case.

Explaining black-box models using attribute importance, PDPs, and LIME

Domino Data Lab

Skater provides a wide range of algorithms that can be used for visual interpretation (e.g. …). The Partial Dependence Plot is another visual method; it is model-agnostic and can be used to gain insight into the inner workings of a black-box model such as a deep ANN.
