Build efficient, cross-Regional, I/O-intensive workloads with Dask on AWS

AWS Big Data

Table columns: Dataset, Variables, Disk Size, Xarray Dataset Size, Region.
ERA5 2011–2020 (120 NetCDF files): 53.5GB, 364.1.
ERA5 ( historic_temp_regridded ), us-east-1: 1512, 711, 427, 202.
Difference ( propogated pool ), us-west-2 and us-east-1: 1527, 906, 469, 251.
The following graph visualizes the performance and scale.

Themes and Conferences per Pacoid, Episode 8

Domino Data Lab

So much data is flowing through the other parts, but that’s not the concern of DG solutions. Fun fact: in 2011 Google bought remnants of what had previously been Motorola. WhereHows is a DG project from LinkedIn, focused on big data. Apache Atlas is a Hadoop-ish native reference implementation for Egeria.