article thumbnail

Experiment design and modeling for long-term studies in ads

The Unofficial Google Data Science Blog

by HENNING HOHNHOLD, DEIRDRE O'BRIEN, and DIANE TANG In this post we discuss the challenges in measuring and modeling the long-term effect of ads on user behavior. Nevertheless, A/B testing has challenges and blind spots, such as: the difficulty of identifying suitable metrics that give "works well" a measurable meaning.

article thumbnail

Changing assignment weights with time-based confounders

The Unofficial Google Data Science Blog

For this reason we don’t report uncertainty measures or statistical significance in the results of the simulation. Ramp-up solution: measure epoch and condition on its effect If one wants to do full traffic ramp-up and use data from all epochs, they must use an adjusted estimator to get an unbiased estimate of the average reward in each arm.

Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

article thumbnail

Using Empirical Bayes to approximate posteriors for large "black box" estimators

The Unofficial Google Data Science Blog

Posteriors are useful to understand the system, measure accuracy, and make better decisions. Methods like the Poisson bootstrap can help us measure the variability of $t$, but don’t give us posteriors either, particularly since good high-dimensional estimators aren’t unbiased.

KDD 40
article thumbnail

Explaining black-box models using attribute importance, PDPs, and LIME

Domino Data Lab

but it generally relies on measuring the entropy in the change of predictions given a perturbation of a feature. 2015) for additional details. Conference on Knowledge Discovery and Data Mining, pp. Neural machine translation by jointly learning to align and translate , ICLR, 2015. See Wei et al. Bahdanau, D.,

Modeling 139