Power enterprise-grade Data Vaults with Amazon Redshift – Part 2

AWS Big Data

This post discusses the most pressing needs when designing an enterprise-grade Data Vault and how those needs are addressed by Amazon Redshift in particular and AWS cloud in general. The first post in this two-part series discusses best practices for designing enterprise-grade data vaults of varying scale using Amazon Redshift.

Modernize your ETL platform with AWS Glue Studio: A case study from BMS

AWS Big Data

For the past 5 years, BMS has used a custom framework called Enterprise Data Lake Services (EDLS) to create ETL jobs for business users. BMS’s EDLS platform hosts over 5,000 jobs and is growing at 15% YoY (year over year). It retrieves the specified files and available metadata to show on the UI.

Migrate an existing data lake to a transactional data lake using Apache Iceberg

AWS Big Data

Over the years, data lakes on Amazon Simple Storage Service (Amazon S3) have become the default repository for enterprise data and are a common choice for a large set of users who query data for a variety of analytics and machine learning use cases. Launch the notebooks hosted under this link and unzip them on a local workstation.

Orchestrate an end-to-end ETL pipeline using Amazon S3, AWS Glue, and Amazon Redshift Serverless with Amazon MWAA

AWS Big Data

This approach promotes efficiency, flexibility, and scalability, enabling large enterprises to meet their evolving needs and achieve their goals. In the second account, Amazon MWAA is hosted in one VPC and Redshift Serverless in a different VPC, which are connected through VPC peering.

Providing fine-grained, trusted access to enterprise datasets with Okera and Domino

Domino Data Lab

Additionally, Okera connects to a company’s existing technical and business metadata catalogs (such as Collibra), making it easy for data scientists to discover, access and utilize new, approved sources of information. For the compliance team, the combination of Okera and Domino Data Lab is extremely powerful.

Gartner Data & Analytics Summit 2022 in London: 3 Key Takeaways

Alation

Active metadata gives you crucial context around what data you have and how to use it wisely. Active metadata provides the who, what, where, and when of a given asset, showing you where it flows through your pipeline, how that data is used, and who uses it most often. So how are leading enterprises walking that line?

KGF 2023: Bikes To The Moon, Datastrophies, Abstract Art And A Knowledge Graph Forum To Embrace Them All

Ontotext

Content and data management solutions based on knowledge graphs are becoming increasingly important across enterprises. “With new business lines, leading to new tools, a lot of diverse and siloed data inevitably enters enterprise systems. The question is not how to avoid complexity but how to embrace it and take advantage of it.”