article thumbnail

Estimating the prevalence of rare events — theory and practice

The Unofficial Google Data Science Blog

There are many strategies we can use to estimate this quantity, and we will discuss each option in detail. When training a classifier with few positives in the population, one common strategy is to over sample items with positive labels, and/or down sample items with negative labels. This is straightforward and easy to implement.

Metrics 98