How to Measure Dataset Similarity: Understanding the Impact of Drift on ML Models
Dataiku
JULY 5, 2022
Measuring the similarity between two datasets is critical in many areas of ML, such as detecting dataset shift and evaluating its impact on a model’s performance. This article describes various dataset similarity measures and how they can be leveraged to detect distribution shift and to evaluate the resulting drop in model performance.