Use Amazon OpenSearch Ingestion to migrate to Amazon OpenSearch Serverless

AWS Big Data

Migration of metadata such as security roles and dashboard objects will be covered in a subsequent post. Update the following information for the source: Uncomment hosts and specify the endpoint of the existing OpenSearch Service domain. For now, you can leave the default minimum as 1 and maximum as 4.

Amazon OpenSearch Service search enhancements: 2023 roundup

AWS Big Data

Users now seek methods that return more relevant results through semantic understanding, or that search by visual similarity between images instead of by text search over metadata. Lexical search: In lexical search, the search engine compares the words in the search query to the words in the documents, matching word for word.
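
The word-for-word matching described in this excerpt can be illustrated with a minimal sketch. This is not how OpenSearch scores documents; it is a toy overlap count, and the names `tokenize` and `lexical_search` as well as the sample documents are hypothetical.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def lexical_search(query: str, documents: dict[str, str]) -> list[tuple[str, int]]:
    """Return (doc_id, shared_word_count) pairs for documents sharing a word with the query.

    Results are ordered by descending overlap, then by doc_id for stable output.
    """
    query_terms = tokenize(query)
    scored = []
    for doc_id, text in documents.items():
        overlap = query_terms & tokenize(text)  # exact word matches only
        if overlap:
            scored.append((doc_id, len(overlap)))
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored


# Hypothetical sample documents, for illustration only.
docs = {
    "a": "red running shoes",
    "b": "blue sneakers for running",
    "c": "crimson trainers",
}
print(lexical_search("red shoes", docs))
```

In this sketch the query "red shoes" matches only document "a". Document "c" describes a similar item in different words and is missed, which is the gap that semantic search is meant to close.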

Upgrade Hortonworks Data Platform (HDP) to Cloudera Data Platform (CDP) Private Cloud Base

Cloudera

Before proceeding with the upgrade, review the CDP Private Cloud Base prerequisites as specified in the documentation. Finally, we also recommend that you take a full backup of your cluster configurations, metadata, other supporting details, and backend databases. The end-to-end process is relatively straightforward and well documented.

Configure cross-Region table access with the AWS Glue Catalog and AWS Lake Formation

AWS Big Data

This feature lets users query AWS Glue databases and tables in one Region from another Region using resource links, without copying the metadata in the Data Catalog or the data in Amazon Simple Storage Service (Amazon S3). See the API documentation for GetTable() and GetDatabase() for additional details.

Data governance beyond SDX: Adding third party assets to Apache Atlas

Cloudera

In this blog, we’ll highlight the key CDP aspects that provide data governance and lineage and show how they can be extended to incorporate metadata for non-CDP systems from across the enterprise. Atlas provides open metadata management and governance capabilities to build a catalog of all assets, and also classify and govern these assets.

A Reference Architecture for the Cloudera Private Cloud Base Data Platform

Cloudera

Further information and documentation: [link]. All three will be quorums of ZooKeeper and HDFS JournalNodes to track changes to the HDFS metadata stored on the NameNodes. CDP is particularly sensitive to host name resolution, so it’s vital that the DNS servers are properly configured and hostnames are fully qualified.
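
As a rough illustration of the fully qualified hostname requirement, the following sketch checks whether a name has the shape of a fully qualified domain name. It is only a syntactic check under common DNS label rules, not a Cloudera tool, and it does not confirm that DNS actually resolves the name. The function name `looks_fully_qualified` and the sample hostnames are hypothetical.

```python
import re

# One DNS label: 1 to 63 letters, digits, or hyphens, not starting or ending with a hyphen.
_LABEL = re.compile(r"^(?!-)[A-Za-z0-9-]{1,63}(?<!-)$")


def looks_fully_qualified(hostname: str) -> bool:
    """Return True if hostname is syntactically a host label plus at least one domain label."""
    # Allow a single trailing dot, which marks an absolute DNS name.
    name = hostname[:-1] if hostname.endswith(".") else hostname
    if not name or len(name) > 253:
        return False
    labels = name.split(".")
    if len(labels) < 2:
        return False  # a bare short name such as "worker01"
    return all(_LABEL.match(label) for label in labels)


# Hypothetical hostnames, for illustration only.
for candidate in ["worker01.example.com", "worker01"]:
    print(candidate, looks_fully_qualified(candidate))
```

On a real host, the value returned by Python's `socket.getfqdn()` could be passed to this check, though forward and reverse DNS lookups would still need to be verified separately.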

The Top Three Entangled Trends in Data Architectures: Data Mesh, Data Fabric, and Hybrid Architectures

Cloudera

Instead of having a central team that manages all the data for a company, the thinking is that the responsibility of generating, curating, documenting, updating, and managing data should be distributed across the company based on whichever team is best suited to produce and own that data. Data mesh conceptual hierarchy. Miro: [link].