article thumbnail

AWS Lake Formation 2022 year in review

AWS Big Data

In this post, we are excited to summarize the features that the AWS Glue Data Catalog, AWS Glue crawler, and Lake Formation teams delivered in 2022. Whether you are a data platform builder, data engineer, data scientist, or any technology leader interested in data lake solutions, this post is for you.

article thumbnail

What is a data architect? Skills, salaries, and how to become a data framework master

CIO Business Intelligence

Data architect Armando Vázquez identifies eight common types of data architects: Enterprise data architect: These data architects oversee an organization’s overall data architecture, defining data architecture strategy and designing and implementing architectures. Are data architects in demand?

Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Trending Sources

article thumbnail

Near-real-time analytics using Amazon Redshift streaming ingestion with Amazon Kinesis Data Streams and Amazon DynamoDB

AWS Big Data

Amazon Redshift is a fully managed, scalable cloud data warehouse that accelerates your time to insights with fast, easy, and secure analytics at scale. Tens of thousands of customers rely on Amazon Redshift to analyze exabytes of data and run complex analytical queries, making it the widely used cloud data warehouse.

article thumbnail

Please vote before May 11! 2022 DBTA Reader’s Choice Awards

erwin

Please help us keep our #1 position in 2022. In data warehousing, the data is extracted and transported from production database(s) into a data warehouse for reporting and analysis. Best Data Modeling Solution (erwin Data Modeler). Read more about erwin® Data Modeler by Quest.

article thumbnail

Use Apache Iceberg in your data lake with Amazon S3, AWS Glue, and Snowflake

AWS Big Data

Data engineers use Apache Iceberg because it’s fast, efficient, and reliable at any scale and keeps records of how datasets change over time. Apache Iceberg offers integrations with popular data processing frameworks such as Apache Spark, Apache Flink, Apache Hive, Presto, and more.

article thumbnail

Accelerate HiveQL with Oozie to Spark SQL migration on Amazon EMR

AWS Big Data

Many customers run big data workloads such as extract, transform, and load (ETL) on Apache Hive to create a data warehouse on Hadoop. We split the solution into two primary components: generating Spark job metadata and running the SQL on Amazon EMR. The script generates a metadata JSON file for each step.

article thumbnail

10 Years Later: Who’s the GOAT of Data Catalogs?

Alation

June 2017: Dresner Advisory Services names Alation the #1 data catalog in its inaugural Data Catalog End-User Market Study. August 2017: Alation debuts as a leader in the Gartner MQ for Metadata Management Solutions. August 2018: Gartner names Alation a 2X Leader in the MQ for Metadata Management Solutions.