article thumbnail

Density-Based Clustering

Domino Data Lab

Due to its importance in both theory and applications, this algorithm is one of three algorithms awarded the Test of Time Award at the KDD conference in 2014. Thanks to Scikit-Learn’s easy-to-use API, we can implement DBSCAN in only a couple lines of code: from sklearn.cluster import DBSCAN. Application.

Metrics 116
article thumbnail

Using Empirical Bayes to approximate posteriors for large "black box" estimators

The Unofficial Google Data Science Blog

by OMKAR MURALIDHARAN Many machine learning applications have some kind of regression at their core, so understanding large-scale regression systems is important. But most common machine learning methods don’t give posteriors, and many don’t have explicit probability models. Figure 4 shows the results of such a test.

KDD 40