2005, Data Quality, Modeling and Slice and Dice

Measuring Validity and Reliability of Human Ratings

The Unofficial Google Data Science Blog

JULY 18, 2023

If two raters each roll two dice and apply a label only when the dice sum to 12, they will agree about 95% of the time, purely by chance. Under certain conditions, the non-parametric and parametric measurements should be the same, and disagreements between the approaches should help illustrate whether our assumptions about the data are correct.
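The chance-agreement arithmetic in the dice example can be checked directly. The following is a minimal sketch, assuming the two raters act independently and each applies the label with the same probability; the function names `label_probability` and `chance_agreement` are illustrative and do not come from the original post.

```python
from fractions import Fraction
from itertools import product


def label_probability(target_sum=12, sides=6):
    """Probability that two fair dice sum to target_sum."""
    outcomes = list(product(range(1, sides + 1), repeat=2))
    hits = sum(1 for a, b in outcomes if a + b == target_sum)
    return Fraction(hits, len(outcomes))


def chance_agreement(p):
    """Probability that two independent raters agree when each
    applies the label with probability p (assumed equal for both)."""
    # They agree when both apply the label or both withhold it.
    return p * p + (1 - p) * (1 - p)


p = label_probability()          # 1/36 for a sum of 12
agreement = chance_agreement(p)  # exact fraction
print(p, agreement, round(float(agreement), 3))
```

Under these assumptions the agreement rate is 613/648, roughly 0.946, even though neither rater looked at the items. For comparison, `chance_agreement(Fraction(1, 2))` gives 1/2, the coin-flip case. This is why raw agreement is usually adjusted for chance before it is read as a measure of reliability.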

