Towards optimal experimentation in online systems

The Unofficial Google Data Science Blog

If $Y$ at that point is (statistically and practically) significantly better than our current operating point, and that point is deemed acceptable, we update the system parameters to this better value. Crucially, it takes into account the uncertainty inherent in our experiments.

Our quest for robust time series forecasting at scale

The Unofficial Google Data Science Blog

Quantification of forecast uncertainty via simulation-based prediction intervals. In the first plot, the raw weekly actuals (in red) are adjusted for a level change in September 2011 and an anomalous spike near October 2012. Prediction Intervals: A statistical forecasting system should not lack uncertainty quantification.

Data Science, Past & Future

Domino Data Lab

He was saying this doesn’t belong just in statistics. It involved a lot of work with applied math, some depth in statistics and visualization, and also a lot of communication skills. I went to a meeting at Starbucks with the founder of Alation right before they launched in 2012, drawing on the proverbial back-of-the-napkin.

Estimating the prevalence of rare events — theory and practice

The Unofficial Google Data Science Blog

But importance sampling in statistics is a variance reduction technique to improve the inference of the rate of rare events, and it seems natural to apply it to our prevalence estimation problem.

Fitting Bayesian structural time series with the bsts R package

The Unofficial Google Data Science Blog

by STEVEN L. SCOTT. Time series data are everywhere, but time series modeling is a fairly specialized area within statistics and data science. They may contain parameters in the statistical sense, but often they simply contain strategically placed 0's and 1's indicating which bits of $\alpha_t$ are relevant for a particular computation.

Estimating causal effects using geo experiments

The Unofficial Google Data Science Blog

Statistical power is traditionally given in terms of a probability function, but often a more intuitive way of describing power is by stating the expected precision of our estimates. This is a quantity that is easily interpretable and summarizes nicely the statistical power of the experiment.

Using random effects models in prediction problems

The Unofficial Google Data Science Blog

We often use statistical models to summarize the variation in our data, and random effects models are well suited for this — they are a form of ANOVA after all. In the context of prediction problems, another benefit is that the models produce an estimate of the uncertainty in their predictions: the predictive posterior distribution.