Speed up queries with the cost-based optimizer in Amazon Athena

AWS Big Data

Amazon Athena is a serverless, interactive analytics service built on open source frameworks, supporting open-table and file formats. Doing it before risks unnecessary aggregation overhead because each value is likely unique anyway and that step will not result in an earlier reduction in the amount of data transferred between intermediate stages.

Reclaiming the stories that algorithms tell

O'Reilly on Data

In 2001, just as the Lexile system was rolling out state-wide, a professor of education named Stephen Krashen took to the pages of the California School Library Journal to raise an alarm. The report has pages of careful caveats, but in the end it treats these risk-adjusted ratios as a good measure of a surgeon’s performance.

Themes and Conferences per Pacoid, Episode 8

Domino Data Lab

Also, while surveying the literature, two key drivers stood out: Risk management is the thin-edge-of-the-wedge for ... My read of that narrative arc is that some truly weird tensions showed up circa 2001: Arguably, it’s the heyday of DW+BI. A very big mess since circa 2001, and now becoming quite a dangerous mess. ... a second priority at

Data Science, Past & Future

Domino Data Lab

By virtue of that, if you take those log files of customer interactions, you aggregate them, then you take that aggregated data, run machine learning models on them, you can produce data products that you feed back into your web apps, and then you get this kind of effect in business. I can point to the year 2001. All righty.

Themes and Conferences per Pacoid, Episode 12

Domino Data Lab

Consider the following timeline: 2001 – Physics grad students are getting hired in quantity by hedge funds to work on Wall St. The probabilistic nature changes the risks and process required. We face problems—crises—regarding risks involved with data and machine learning in production. To wit: data science is a team sport.

Data Science at The New York Times

Domino Data Lab

In 2001, Bill Cleveland writes this article saying, “You are doing it wrong.” Here is a picture of The New York Times on its birthday in 1851, and for the vast majority of its lifespan this is pretty much what the user experience of interacting with The New York Times looks like. Editors can interact with this bot.

Themes and Conferences per Pacoid, Episode 5

Domino Data Lab

What are the projected risks for companies that fall behind for internal training in data science? In terms of teaching and learning data science, Project Jupyter is probably the biggest news over the past decade – even though Jupyter’s origins go back to 2001! In business terms, why does this matter?