Remove Consulting Remove Data Lake Remove Data Transformation Remove Workshop
article thumbnail

Build a data lake with Apache Flink on Amazon EMR

AWS Big Data

The Amazon EMR Flink CDC connector reads the binlog data and processes the data. Transformed data can be stored in Amazon S3. We use the AWS Glue Data Catalog to store the metadata such as table schema and table location. Verify all table metadata is stored in the AWS Glue Data Catalog.