Consulting, Data Analytics, Data Lake and Metadata

Consulting

Data Analytics

Data Lake

Metadata

Accelerate HiveQL with Oozie to Spark SQL migration on Amazon EMR

AWS Big Data

APRIL 19, 2023

We split the solution into two primary components: generating Spark job metadata and running the SQL on Amazon EMR. The first component (metadata setup) consumes existing Hive job configurations and generates metadata such as number of parameters, number of actions (steps), and file formats. sql_path SQL file name.

Metadata

Metadata Testing Data Lake Consulting

How HR&A uses Amazon Redshift spatial analytics on Amazon Redshift Serverless to measure digital equity in states across the US

AWS Big Data

DECEMBER 5, 2023

HR&A Advisors —a multi-disciplinary consultancy with extensive work in the broadband and digital equity space is helping its state, county, and municipal clients deliver affordable internet access by analyzing locally specific digital inclusion needs and building tailored digital equity plans. Solutions Architect at Amazon Web Services.

Measurement

Measurement Dashboards Data Warehouse Analytics

Join 52,000+

Insiders

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Webinars

How to Optimize the Developer Experience for Monumental Impact

Generative AI Deep Dive: Advancing from Proof of Concept to Production

Understanding User Needs and Satisfying Them

Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You Need to Know

Leading the Development of Profitable and Sustainable Products

MORE WEBINARS

Trending Sources

The Madness of Data (and analytics) Governance

Andrew White

DECEMBER 9, 2019

The client had recently engaged with a well-known consulting company that had recommended a large data catalog effort to collect all enterprise metadata to help identify all data and business issues. Modern data (and analytics) governance does not necessarily need: Wall-to-wall discovery of your data and metadata.

Analytics

Analytics Data Lake Data Governance Metadata

Webinars

How to Optimize the Developer Experience for Monumental Impact

Generative AI Deep Dive: Advancing from Proof of Concept to Production

Understanding User Needs and Satisfying Them

Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You Need to Know

Leading the Development of Profitable and Sustainable Products

MORE WEBINARS

Exploring real-time streaming for generative AI Applications

AWS Big Data

MARCH 25, 2024

Streaming jobs constantly ingest new data to synchronize across systems and can perform enrichment, transformations, joins, and aggregations across windows of time more efficiently. With a file system sink connector, Apache Flink jobs can deliver data to Amazon S3 in open format (such as JSON, Avro, Parquet, and more) files as data objects.

Data Lake

Data Lake Unstructured Data Management Modeling

Choosing an open table format for your transactional data lake on AWS

AWS Big Data

JUNE 9, 2023

A modern data architecture enables companies to ingest virtually any type of data through automated pipelines into a data lake, which provides highly durable and cost-effective object storage at petabyte or exabyte scale.

Data Lake

Data Lake Metadata Optimization Statistics

Habib Bank manages data at scale with Cloudera Data Platform

Cloudera

NOVEMBER 17, 2022

HBL started their data journey in 2019 when data lake initiative was started to consolidate complex data sources and enable the bank to use single version of truth for decision making. CDP Private Cloud’s new approach to data management and analytics would allow HBL to access powerful self-service analytics.

Management

Management Data Lake Consulting Unstructured Data

Tackling AI’s data challenges with IBM databases on AWS

IBM Big Data Hub

MARCH 14, 2024

The solution: IBM databases on AWS To solve for these challenges, IBM’s portfolio of SaaS database solutions on Amazon Web Services (AWS), enables enterprises to scale applications, analytics and AI across the hybrid cloud landscape.

Cost-Benefit

Cost-Benefit Metadata Optimization Management

Introducing watsonx: The future of AI for business

IBM Big Data Hub

MAY 9, 2023

A data store built on open lakehouse architecture, it runs both on premises and across multi-cloud environments. Optimized for all data, analytics and AI workloads, watsonx.data combines the flexibility of a data lake with the performance of a data warehouse, helping businesses to scale data analytics and AI anywhere their data resides.

Data Warehouse

Data Warehouse Cost-Benefit Machine Learning Modeling

Themes and Conferences per Pacoid, Episode 12

Domino Data Lab

AUGUST 8, 2019

I mention this here because there was a lot of overlap between current industry data governance needs and what the scientific community is working toward for scholarly infrastructure. The gist is, leveraging metadata about research datasets, projects, publications, etc., Data governance, for the win! Nothing Spreads Like Fear”.

Data Science

Data Science Machine Learning Data Governance Statistics

The Gartner 2021 Leadership Vision for Data & Analytics Leaders Webinar Q&A

Andrew White

JANUARY 11, 2021

As with any good consulting response, “it depends.” Do you recommend a consulting approach strategy rather than a CDO strategy? Note: Delivery of data, analytics solutions and the sustainment of technology, data and services is a question. Data lakes don’t offer this nor should they. It really does.

Data Analytics

Data Analytics Analytics Data-driven Finance

How smava makes loans transparent and affordable using Amazon Redshift Serverless

AWS Big Data

DECEMBER 21, 2023

To bring their customers the best deals and user experience, smava follows the modern data architecture principles with a data lake as a scalable, durable data store and purpose-built data stores for analytical processing and data consumption.

Data Lake

Data Lake Data Warehouse Data-driven B2B

Data Leaders Brief

Accelerate HiveQL with Oozie to Spark SQL migration on Amazon EMR

How HR&A uses Amazon Redshift spatial analytics on Amazon Redshift Serverless to measure digital equity in states across the US

Webinars

Trending Sources

The Madness of Data (and analytics) Governance

Webinars

Exploring real-time streaming for generative AI Applications

Choosing an open table format for your transactional data lake on AWS

Habib Bank manages data at scale with Cloudera Data Platform

Tackling AI’s data challenges with IBM databases on AWS

Introducing watsonx: The future of AI for business

Themes and Conferences per Pacoid, Episode 12

The Gartner 2021 Leadership Vision for Data & Analytics Leaders Webinar Q&A

How smava makes loans transparent and affordable using Amazon Redshift Serverless

Stay Connected