Remove Data Governance Remove Data Lake Remove Data Processing Remove Machine Learning
article thumbnail

Build a serverless transactional data lake with Apache Iceberg, Amazon EMR Serverless, and Amazon Athena

AWS Big Data

Since the deluge of big data over a decade ago, many organizations have learned to build applications to process and analyze petabytes of data. Data lakes have served as a central repository to store structured and unstructured data at any scale and in various formats.

Data Lake 107
article thumbnail

Announcing the 2021 Data Impact Awards

Cloudera

2020 saw us hosting our first ever fully digital Data Impact Awards ceremony, and it certainly was one of the highlights of our year. We saw a record number of entries and incredible examples of how customers were using Cloudera’s platform and services to unlock the power of data. DATA FOR ENTERPRISE AI.

Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Trending Sources

article thumbnail

Themes and Conferences per Pacoid, Episode 8

Domino Data Lab

Paco Nathan ‘s latest column dives into data governance. This month’s article features updates from one of the early data conferences of the year, Strata Data Conference – which was held just last week in San Francisco. In particular, here’s my Strata SF talk “Overview of Data Governance” presented in article form.

article thumbnail

Governing data in relational databases using Amazon DataZone

AWS Big Data

Data governance is a key enabler for teams adopting a data-driven culture and operational model to drive innovation with data. Amazon DataZone allows you to simply and securely govern end-to-end data assets stored in your Amazon Redshift data warehouses or data lakes cataloged with the AWS Glue data catalog.

article thumbnail

How Novo Nordisk built distributed data governance and control at scale

AWS Big Data

The first post of this series describes the overall architecture and how Novo Nordisk built a decentralized data mesh architecture, including Amazon Athena as the data query engine. The third post will show how end-users can consume data from their tool of choice, without compromising data governance.

article thumbnail

AWS Glue crawlers support cross-account crawling to support data mesh architecture

AWS Big Data

Data lakes have come a long way, and there’s been tremendous innovation in this space. Today’s modern data lakes are cloud native, work with multiple data types, and make this data easily available to diverse stakeholders across the business.

article thumbnail

Announcing the 2020 Data Impact Award Winners

Cloudera

The technological linchpin of its digital transformation has been its Enterprise Data Architecture & Governance platform. It hosts over 150 big data analytics sandboxes across the region with over 200 users utilizing the sandbox for data discovery. times more effective than traditional mass marketing.