How BMO improved data security with Amazon Redshift and AWS Lake Formation

AWS Big Data

One of the bank’s key challenges, driven by strict cybersecurity requirements, is implementing field-level encryption for personally identifiable information (PII), Payment Card Industry (PCI) data, and data classified as high privacy risk (HPR). Only users with the required permissions are allowed to access the data in clear text.

Accelerate HiveQL with Oozie to Spark SQL migration on Amazon EMR

AWS Big Data

Instead, we can use automation to speed up the migration and reduce heavy lifting, costs, and risks. We split the solution into two primary components: generating Spark job metadata and running the SQL on Amazon EMR. Generate Spark SQL metadata: Our batch job consists of Hive steps scheduled to run sequentially.

Design a data mesh on AWS that reflects the envisioned organization

AWS Big Data

Data as a product: Treating data as a product entails three key components: the data itself, the metadata, and the associated code and infrastructure. In this approach, teams responsible for generating data are referred to as producers.

Themes and Conferences per Pacoid, Episode 8

Domino Data Lab

Data governance shows up as the fourth-most-popular kind of solution that enterprise teams were adopting or evaluating during 2019. That’s a lot of priorities, especially when you group together closely related items such as data lineage and metadata management, which rank nearby. We keep feeding the monster data.

Themes and Conferences per Pacoid, Episode 12

Domino Data Lab

I mention this here because there was a lot of overlap between current industry data governance needs and what the scientific community is working toward for scholarly infrastructure. The gist is, leveraging metadata about research datasets, projects, publications, etc. [...] The probabilistic nature changes the risks and process required.

How Novo Nordisk built distributed data governance and control at scale

AWS Big Data

In this example, the analytics tool accesses the data lake on Amazon Simple Storage Service (Amazon S3) through Athena queries. As the data mesh pattern expands across domains covering more downstream services, we need a mechanism to keep IdPs and IAM role trusts continuously updated.