Remove 2001 Remove Statistics Remove Strategy Remove Testing
article thumbnail

To Balance or Not to Balance?

The Unofficial Google Data Science Blog

A naïve way to solve this problem would be to compare the proportion of buyers between the exposed and unexposed groups, using a simple test for equality of means. Identification We now discuss formally the statistical problem of causal inference. We start by describing the problem using standard statistical notation.

article thumbnail

Themes and Conferences per Pacoid, Episode 12

Domino Data Lab

Consider the following timeline: 2001 – Physics grad students are getting hired in quantity by hedge funds to work on Wall St. Putting discussions about security aside, the statistics competency required to confront fairness and bias issues for machine learning models in production set quite a high bar. machine learning?

article thumbnail

Data Science at The New York Times

Domino Data Lab

A “data scientist” might build a multistage processing pipeline in Python, design a hypothesis test, perform a regression analysis over data samples with R, design and implement an algorithm in Hadoop, or communicate the results of our analyses to other members of the organization in a clear and concise fashion.