2012, Data Warehouse and Metadata

Use Apache Iceberg in your data lake with Amazon S3, AWS Glue, and Snowflake

AWS Big Data

APRIL 3, 2024

Data engineers use Apache Iceberg because it’s fast, efficient, and reliable at any scale and keeps records of how datasets change over time. Apache Iceberg offers integrations with popular data processing frameworks such as Apache Spark, Apache Flink, Apache Hive, Presto, and more.

Data Lake

Data Lake Snapshot Metadata Data Architecture

Orchestrate an end-to-end ETL pipeline using Amazon S3, AWS Glue, and Amazon Redshift Serverless with Amazon MWAA

AWS Big Data

APRIL 25, 2024

As the queries finish running, an UNLOAD operation is invoked from the Redshift data warehouse to the S3 bucket in Account A. The pipeline then starts running stored procedures and SQL commands on Redshift Serverless.

Metadata

Metadata Data Processing Management Testing

Webinars

How to Optimize the Developer Experience for Monumental Impact

Generative AI Deep Dive: Advancing from Proof of Concept to Production

Understanding User Needs and Satisfying Them

Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You Need to Know

Leading the Development of Profitable and Sustainable Products

Trending Sources

Accelerate HiveQL with Oozie to Spark SQL migration on Amazon EMR

AWS Big Data

APRIL 19, 2023

Many customers run big data workloads such as extract, transform, and load (ETL) on Apache Hive to create a data warehouse on Hadoop. We split the solution into two primary components: generating Spark job metadata and running the SQL on Amazon EMR. The script generates a metadata JSON file for each step.

Metadata

Metadata Testing Data Lake Consulting

10 Years Later: Who’s the GOAT of Data Catalogs?

Alation

DECEMBER 15, 2022

December 2012: Alation forms and goes to work creating the first enterprise data catalog. Later, in its inaugural report on data catalogs, Forrester Research recognizes that “Alation started the MLDC trend.” August 2017: Alation debuts as a leader in the Gartner MQ for Metadata Management Solutions.

Metadata

Metadata Data Governance Data Quality Marketing

Petabyte-scale log analytics with Amazon S3, Amazon OpenSearch Service, and Amazon OpenSearch Ingestion

AWS Big Data

MARCH 7, 2024

At the same time, they need to optimize operational costs to unlock the value of this data for timely insights and do so with a consistent performance. With this massive data growth, data proliferation across your data stores, data warehouse, and data lakes can become equally challenging.

Data Lake

Data Lake Analytics Dashboards Metrics

Convergent Evolution

Peter James Thomas

AUGUST 18, 2018

That was the Science, here comes the Technology… A Brief Hydrology of Data Lakes. Even back then, these were used for activities such as Analytics, Dashboards, Statistical Modelling, Data Mining and Advanced Visualisation. This required additional investments in metadata. This is the essence of Convergent Evolution.

Data Lake

Data Lake Data Warehouse Data mining Statistics

How SumUp made digital analytics more accessible using AWS Glue

AWS Big Data

JUNE 6, 2023

Founded in 2012, SumUp is the financial partner for more than 4 million small merchants in over 35 markets worldwide, helping them start, run and grow their business. Unless, of course, the rest of their data also resides in the Google Cloud. This is a guest blog post by Mira Daniels and Sean Whitfield from SumUp.

Analytics

Analytics Data Lake Testing Optimization

Themes and Conferences per Pacoid, Episode 8

Domino Data Lab

APRIL 3, 2019

Data governance shows up as the fourth-most-popular kind of solution that enterprise teams were adopting or evaluating during 2019. That’s a lot of priorities – especially when you group together closely related items such as data lineage and metadata management which rank nearby. We keep feeding the monster data.

Data Governance

Data Governance Machine Learning Metadata Big Data

Best practices to implement near-real-time analytics using Amazon Redshift Streaming Ingestion with Amazon MSK

AWS Big Data

MARCH 11, 2024

Amazon Redshift is a fully managed, scalable cloud data warehouse that accelerates your time to insights with fast, straightforward, and secure analytics at scale. Tens of thousands of customers rely on Amazon Redshift to analyze exabytes of data and run complex analytical queries, making it the most widely used cloud data warehouse.

Analytics

Analytics Data Warehouse Optimization Metrics

How BMO improved data security with Amazon Redshift and AWS Lake Formation

AWS Big Data

MARCH 1, 2024

One of the bank’s key challenges related to strict cybersecurity requirements is to implement field level encryption for personally identifiable information (PII), Payment Card Industry (PCI), and data that is classified as high privacy risk (HPR). Only users with required permissions are allowed to access data in clear text.

Data Lake

Data Lake Data Warehouse Management Risk

Data Science, Past & Future

Domino Data Lab

JULY 22, 2019

The data governance, however, is still pretty much over on the data warehouse. Toward the end of the 2000s is when you first started getting teams and industry, as Josh Willis was showing really brilliantly last night, you first started getting some teams identified as “data science” teams.

Data Science

Data Science Machine Learning Data Governance Modeling

Integrate Okta with Amazon Redshift Query Editor V2 using AWS IAM Identity Center for seamless Single Sign-On

AWS Big Data

NOVEMBER 30, 2023

This integration simplifies the authentication and authorization process for Amazon Redshift users using Query Editor V2 or Amazon QuickSight, making it easier for them to securely access your data warehouse. Note: Your organization’s IdC instance must be in the same region as the Amazon Redshift data warehouse you’re connecting to.

Data Warehouse

Data Warehouse Finance Sales Management

Single sign-on with Amazon Redshift Serverless with Okta using Amazon Redshift Query Editor v2 and third-party SQL clients

AWS Big Data

MAY 4, 2023

Amazon Redshift Serverless makes it easy to run and scale analytics in seconds without the need to set up and manage data warehouse clusters. Customers use their preferred SQL clients to analyze their data in Redshift Serverless. A Redshift Serverless data warehouse. If you don’t have one, you can sign up for one.

Finance

Finance Data Warehouse Sales Metadata

Data Leaders Brief
