To Balance or Not to Balance?

The Unofficial Google Data Science Blog

In an ideal world, experimentation through randomization of the treatment assignment allows the identification and consistent estimation of causal effects. Other estimators, such as those based on matching and subclassification, may benefit from the balancing property, but the discussion of those estimators is postponed to a later post.

Unintentional data

The Unofficial Google Data Science Blog

We data scientists now have access to tools that allow us to run a large number of experiments, and then to slice experimental populations by any combination of dimensions collected. Make experimentation cheap and understand the cost of bad decisions. This leads to the proliferation of post hoc hypotheses.