AWS Big Data Blog

Generate security insights from Amazon Security Lake data using Amazon OpenSearch Ingestion

Amazon Security Lake centralizes access and management of your security data by aggregating security event logs from AWS environments, other cloud providers, on-premises infrastructure, and software as a service (SaaS) solutions. By converting logs and events to the Open Cybersecurity Schema Framework (OCSF), an open standard for storing security events in a common and shareable format, Security Lake optimizes and normalizes your security data for analysis using your preferred analytics tool.

Amazon OpenSearch Service continues to be a tool of choice for many enterprises searching and analyzing large volumes of security data. In this post, we show you how to ingest and query Amazon Security Lake data with Amazon OpenSearch Ingestion, a serverless, fully managed data collector with configurable ingestion pipelines. By using OpenSearch Ingestion to ingest data into your OpenSearch Service cluster, you can derive insights more quickly for time-sensitive security investigations and respond swiftly to security incidents, helping you protect your business-critical data and systems.

Solution overview

The following architecture outlines the flow of data from Security Lake to OpenSearch Service.

The workflow contains the following steps:

  1. Security Lake persists OCSF schema-normalized data in an Amazon Simple Storage Service (Amazon S3) bucket determined by the administrator.
  2. Security Lake notifies subscribers through the chosen subscription method, in this case Amazon Simple Queue Service (Amazon SQS).
  3. OpenSearch Ingestion registers as a subscriber to get the necessary context information.
  4. OpenSearch Ingestion reads Parquet-formatted security data from the Security Lake managed Amazon S3 bucket and transforms the security logs into JSON documents.
  5. OpenSearch Ingestion ingests this OCSF-compliant data into OpenSearch Service.
  6. You download and import the provided dashboards to analyze and gain quick insights into the security data.

OpenSearch Ingestion provides a serverless ingestion framework to easily ingest Security Lake data into OpenSearch Service with just a few clicks.

Prerequisites

Complete the following prerequisite steps:

  1. Create an Amazon OpenSearch Service domain. For instructions, refer to Creating and managing Amazon OpenSearch Service domains.
  2. Make sure you have access to the AWS account in which you want to set up this solution.

Set up Amazon Security Lake

In this section, we present the steps to set up Amazon Security Lake, which includes enabling the service and creating a subscriber.

Enable Amazon Security Lake

Identify the account in which you want to activate Amazon Security Lake. Note that for accounts that are part of an organization, you have to designate a delegated Security Lake administrator from your management account. For instructions, refer to Managing multiple accounts with AWS Organizations.

  1. Sign in to the AWS Management Console using the credentials of the delegated account.
  2. On the Amazon Security Lake console, choose your preferred Region, then choose Get started.

Amazon Security Lake collects log and event data from a variety of sources and across your AWS accounts and Regions.

Now you’re ready to enable Amazon Security Lake.

  1. You can either select All log and event sources or choose specific logs by selecting Specific log and event sources.
  2. Choose the Regions from which data is ingested. We recommend selecting All supported Regions so that activities are logged even for accounts that you might not frequently use; however, you also have the option to select Specific Regions.
  3. For Select accounts, you can select the accounts in which you want Amazon Security Lake enabled. For this post, we select All accounts.

  4. You’re prompted to either create a new AWS Identity and Access Management (IAM) role or use an existing IAM role. This role gives Amazon Security Lake the permissions it requires to collect logs and events. Choose the option appropriate for your situation.
  5. Choose Next.
  6. Optionally, specify the Amazon S3 storage class for the data in Amazon Security Lake. For more information, refer to Lifecycle management in Security Lake.
  7. Choose Next.
  8. Review the details and create the data lake.

Create an Amazon Security Lake subscriber

To access and consume data in your Security Lake managed Amazon S3 buckets, you must set up a subscriber.

Complete the following steps to create your subscriber:

  1. On the Amazon Security Lake console, choose Summary in the navigation pane.

Here, you can see the number of Regions selected.

  2. Choose Create subscriber.

A subscriber consumes logs and events from Amazon Security Lake. In this case, the subscriber is OpenSearch Ingestion, which consumes security data and ingests it into OpenSearch Service.

  3. For Subscriber name, enter OpenSearchIngestion.
  4. Enter a description.
  5. Region is automatically populated based on the currently selected Region.
  6. For Log and event sources, select whether the subscriber is authorized to consume all log and event sources or specific log and event sources.
  7. For Data access method, select S3.
  8. For Subscriber credentials, enter the subscriber’s AWS account ID and an external ID (for this post, OpenSearchIngestion-<AWS account ID>).
  9. For Notification details, select SQS queue.

This prompts Amazon Security Lake to create an SQS queue that the subscriber can poll for object notifications.

  10. Choose Create.

Install templates and dashboards for Amazon Security Lake data

Your subscriber for OpenSearch Ingestion is now ready. Before you configure OpenSearch Ingestion to process the security data, let’s prepare the OpenSearch sink (the destination to write data to) by installing index templates and dashboards.

Index templates are predefined mappings for security data that select the correct OpenSearch field types for the corresponding OCSF schema definitions. In addition, index templates contain index-specific settings for particular index patterns. OCSF classifies security data into different categories such as system activity, findings, identity and access management, network activity, application activity, and discovery.

Amazon Security Lake publishes events from four different AWS sources: AWS CloudTrail (with subsets for AWS Lambda and Amazon S3), Amazon Virtual Private Cloud (Amazon VPC) Flow Logs, Amazon Route 53, and AWS Security Hub. The following table details the event sources and their corresponding OCSF class IDs and OpenSearch index patterns.

Amazon Security Lake Source                     OCSF Class ID   OpenSearch Index Pattern
CloudTrail (Lambda and Amazon S3 API subsets)   3005            ocsf-3005*
VPC Flow Logs                                   4001            ocsf-4001*
Route 53                                        4003            ocsf-4003*
Security Hub                                    2001            ocsf-2001*

To easily identify OpenSearch indices containing Security Lake data, we recommend following a structured index naming pattern that includes the log category and its OCSF-defined class in the name of the index. An example is provided below; the ${...} placeholders are resolved from the fields of each event, and the %{yyyy.MM.dd} suffix creates a new index each day:

ocsf-cuid-${/class_uid}-${/metadata/product/name}-${/class_name}-%{yyyy.MM.dd}

Complete the following steps to install the index templates and dashboards for your data:

  1. Download the component_templates.zip and index_templates.zip files and unzip them on your local device.

Component templates are composable modules with settings, mappings, and aliases that can be shared and used by index templates.
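To illustrate how the two fit together, the following is a minimal sketch of a component template holding shared OCSF metadata mappings and an index template for the DNS Activity (ocsf-4003*) pattern that composes it. This is illustrative only: the mappings are heavily simplified and the template names are hypothetical, so don't install it alongside the downloaded templates, whose index patterns it would overlap.

# Illustrative only: a simplified component template with shared OCSF mappings
curl -u adminuser:password -X PUT -H 'Content-Type: application/json' \
  -d '{
    "template": {
      "mappings": {
        "properties": {
          "time": { "type": "date" },
          "metadata": {
            "properties": {
              "product": { "properties": { "name": { "type": "keyword" } } }
            }
          }
        }
      }
    }
  }' https://my-opensearch-domain.es.amazonaws.com/_component_template/ocsf_metadata_example

# An index template for the DNS Activity pattern that composes the module above
curl -u adminuser:password -X PUT -H 'Content-Type: application/json' \
  -d '{
    "index_patterns": ["ocsf-4003*"],
    "composed_of": ["ocsf_metadata_example"]
  }' https://my-opensearch-domain.es.amazonaws.com/_index_template/ocsf_4003_example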

  2. Upload the component templates before the index templates. For example, the following Linux command line shows how to use the OpenSearch _component_template API to upload to your OpenSearch Service domain (change the domain URL and the credentials to appropriate values for your environment):
    ls component_templates | awk -F'_body' '{print $1}' | xargs -I{} curl -u adminuser:password -X PUT -H 'Content-Type: application/json' -d @component_templates/{}_body.json https://my-opensearch-domain.es.amazonaws.com/_component_template/{}
  3. After the component templates are successfully uploaded, upload the index templates:
    ls index_templates | awk -F'_body' '{print $1}' | xargs -I{} curl -u adminuser:password -X PUT -H 'Content-Type: application/json' -d @index_templates/{}_body.json https://my-opensearch-domain.es.amazonaws.com/_index_template/{}
  4. To verify that the index templates and component templates were uploaded successfully, navigate to OpenSearch Dashboards, choose the hamburger menu, then choose Index Management.
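You can also verify from the command line. The following quick check (using the same placeholder domain and credentials as the upload commands above) lists all index templates and component templates on the domain:

# List the installed index templates and component templates
curl -u adminuser:password https://my-opensearch-domain.es.amazonaws.com/_index_template
curl -u adminuser:password https://my-opensearch-domain.es.amazonaws.com/_component_template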

  5. In the navigation pane, choose Templates to see all the OCSF index templates.

  6. Choose Component templates to verify the OCSF component templates.

  7. After successfully uploading the templates, download the pre-built dashboards and other components required to visualize the Security Lake data in OpenSearch indices.
  8. To upload these to OpenSearch Dashboards, choose the hamburger menu, and under Management, choose Stack Management.
  9. In the navigation pane, choose Saved Objects.

  10. Choose Import.

  11. Choose Import, navigate to the downloaded file, then choose Import.

  12. Confirm the dashboard objects are imported correctly, then choose Done.

All the necessary index and component templates, index patterns, visualizations, and dashboards are now successfully installed.

Configure OpenSearch Ingestion

Each OpenSearch Ingestion pipeline has a single data source with one or more sub-pipelines, processors, and a sink. In our solution, the Security Lake managed Amazon S3 bucket is the source and your OpenSearch Service cluster is the sink. Before setting up OpenSearch Ingestion, you need to create the following IAM roles and set up the required permissions:

  • Pipeline role – Defines permissions to read from Amazon Security Lake and write to the OpenSearch Service domain
  • Management role – Defines permissions that allow the user to create, update, delete, and validate pipelines, and perform other management operations

The following figure shows the permissions and roles you need and how they interact with the solution services.

Before you create an OpenSearch Ingestion pipeline, the principal or the user creating the pipeline must have permissions to perform management actions on a pipeline (create, update, list, and validate). Additionally, the principal must have permission to pass the pipeline role to OpenSearch Ingestion. If you are performing these operations as a non-administrator, add the following permissions to the user creating the pipelines (replace {your-account-id} with your AWS account ID):

{
	"Version": "2012-10-17",
	"Statement": [
		{
			"Effect": "Allow",
			"Resource": "*",
			"Action": [
				"osis:CreatePipeline",
				"osis:ListPipelineBlueprints",
				"osis:ValidatePipeline",
				"osis:UpdatePipeline",
				"osis:GetPipelineBlueprint",
				"osis:GetPipeline",
				"osis:ListPipelines",
				"osis:GetPipelineChangeProgress"
			]
		},
		{
			"Effect": "Allow",
			"Action": [
				"iam:PassRole"
			],
			"Resource": [
				"arn:aws:iam::{your-account-id}:role/pipeline-role"
			]
		}
	]
}

Configure a read policy for the pipeline role

Security Lake subscribers only have access to the source data in the Region you selected when you created the subscriber. To give a subscriber access to data from multiple Regions, refer to Managing multiple Regions. To create a policy for read permissions, you need the name of the Amazon S3 bucket and the Amazon SQS queue created by Security Lake.

Complete the following steps to configure a read policy for the pipeline role:

  1. On the Security Lake console, choose Regions in the navigation pane.
  2. Choose the S3 location corresponding to the Region of the subscriber you created.

  3. Make a note of this Amazon S3 bucket name.

  4. Choose Subscribers in the navigation pane.
  5. Choose the subscriber OpenSearchIngestion that you created earlier.

  6. Take note of the Amazon SQS queue ARN under Subscription endpoint.

  7. On the IAM console, choose Policies in the navigation pane.
  8. Choose Create policy.
  9. In the Specify permissions section, choose JSON to open the policy editor.
  10. Remove the default policy and enter the following code (replace the S3 bucket name, Region, account ID, and SQS queue name with the corresponding values for your environment):
    {
    	"Version": "2012-10-17",
    	"Statement": [
    		{
    			"Sid": "ReadFromS3",
    			"Effect": "Allow",
    			"Action": "s3:GetObject",
    			"Resource": "arn:aws:s3:::{bucket-name}/*"
    		},
    		{
    			"Sid": "ReceiveAndDeleteSqsMessages",
    			"Effect": "Allow",
    			"Action": [
    				"sqs:DeleteMessage",
    				"sqs:ReceiveMessage"
    			],
    			"_comment": "Replace {your-account-id} with your AWS account ID",
    			"Resource": "arn:aws:sqs:{region}:{your-account-id}:{sqs-queue-name}"
    		}
    	]
    }
  11. Choose Next.
  12. For Policy name, enter read-from-securitylake.
  13. Choose Create policy.

You have successfully created the policy to read data from Security Lake and receive and delete messages from the Amazon SQS queue.

The complete process is shown below.

Configure a write policy for the pipeline role

We recommend using fine-grained access control (FGAC) with OpenSearch Service. When you use FGAC, you don’t have to use a domain access policy; you can skip the rest of this section and proceed to creating your pipeline role with the necessary permissions. If you use a domain access policy, you need to create a second policy (for this post, we call it write-to-opensearch) as an additional step to those in the previous section. Use the following policy code (replace {your-account-id} and {domain-name} with your values):

{
	"Version": "2012-10-17",
	"Statement": [
		{
			"Effect": "Allow",
			"Action": "es:DescribeDomain",
			"Resource": "arn:aws:es:*:{your-account-id}:domain/*"
		},
		{
			"Effect": "Allow",
			"Action": "es:ESHttp*",
			"Resource": "arn:aws:es:*:{your-account-id}:domain/{domain-name}/*"
		}
	]
}

If the configured role has permissions to access Amazon S3 and Amazon SQS across accounts, OpenSearch Ingestion can ingest data across accounts.

Create the pipeline role with necessary permissions

Now that you have created the policies, you can create the pipeline role. Complete the following steps:

  1. On the IAM console, choose Roles in the navigation pane.
  2. Choose Create role.
  3. Under Use cases for other AWS services, select OpenSearch Ingestion pipelines.
  4. Choose Next.
  5. Search for and select the policy read-from-securitylake.
  6. Search for and select the policy write-to-opensearch (if you’re using a domain access policy).
  7. Choose Next.
  8. For Role name, enter pipeline-role.
  9. Choose Create role.

Keep note of the role name; you will use it when configuring the OpenSearch Ingestion pipeline.

Now you can map the pipeline role to an OpenSearch backend role if you’re using FGAC. You can map the ingestion role to one of the predefined roles or create your own with the necessary permissions. For example, all_access is a built-in role that grants administrative permission to all OpenSearch functions. When deploying to a production environment, make sure to use a role with just enough permissions to write to your OpenSearch Service domain.
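For example, to map the pipeline role to the built-in all_access role for testing, you can add the role’s ARN as a backend role through the OpenSearch Security plugin REST API. The following is a minimal sketch using the same placeholder domain and credentials as earlier; note that PUT replaces any existing mapping for the role, and in production you would map to a role scoped to just the ocsf-* indices:

# Map the pipeline role's ARN as a backend role for all_access (testing only;
# PUT overwrites any existing mapping for the role)
curl -u adminuser:password -X PUT -H 'Content-Type: application/json' \
  -d '{
    "backend_roles": ["arn:aws:iam::{your-account-id}:role/pipeline-role"]
  }' https://my-opensearch-domain.es.amazonaws.com/_plugins/_security/api/rolesmapping/all_access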

Create the OpenSearch Ingestion pipeline

In this section, you use the pipeline role you created to create an OpenSearch Ingestion pipeline. Complete the following steps:

  1. On the OpenSearch Service console, choose OpenSearch Ingestion in the navigation pane.
  2. Choose Create pipeline.
  3. For Pipeline name, enter a name, such as security-lake-osi.
  4. In the Pipeline configuration section, choose Configuration blueprints and choose AWS-SecurityLakeS3ParquetOCSFPipeline.

  5. Under source, update the following information:
    1. Update the queue_url in the sqs section. (This is the SQS queue that Amazon Security Lake created when you created the subscriber. To get the URL, navigate to the Amazon SQS console and look for the queue created with a name in the format AmazonSecurityLake-abcde-Main-Queue.)
    2. Enter the Region to use for aws credentials.
    3. For sts_role_arn, enter the ARN of pipeline-role.

  6. Under sink, update the following information:
    1. Replace the hosts value in the opensearch section with your OpenSearch Service domain endpoint.
    2. For sts_role_arn, enter the ARN of pipeline-role.
    3. Set region to us-east-1 (the Region used for this post).
    4. For index, enter the index name pattern that was defined in the templates created in the previous section ("ocsf-cuid-${/class_uid}-${/metadata/product/name}-${/class_name}-%{yyyy.MM.dd}").
  7. Choose Validate pipeline to verify the pipeline configuration.

If the configuration is valid, a successful validation message appears; you can now proceed to the next steps.
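For reference, after these updates the pipeline configuration has roughly the following shape. This is a condensed sketch, not the authoritative blueprint contents; the queue URL, account ID, domain endpoint, and Region shown are placeholders to replace with your own values:

version: "2"
security-lake-pipeline:
  source:
    s3:
      # Security Lake writes Parquet objects; the codec converts records to JSON events
      codec:
        parquet:
      compression: "none"
      sqs:
        queue_url: "https://sqs.us-east-1.amazonaws.com/{your-account-id}/AmazonSecurityLake-abcde-Main-Queue"
      aws:
        region: "us-east-1"
        sts_role_arn: "arn:aws:iam::{your-account-id}:role/pipeline-role"
  sink:
    - opensearch:
        # Daily OCSF indices matching the index templates installed earlier
        hosts: ["https://search-my-domain.us-east-1.es.amazonaws.com"]
        index: "ocsf-cuid-${/class_uid}-${/metadata/product/name}-${/class_name}-%{yyyy.MM.dd}"
        aws:
          region: "us-east-1"
          sts_role_arn: "arn:aws:iam::{your-account-id}:role/pipeline-role"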

  8. Under Network, select Public for this post. Our recommendation is to select VPC access for an additional layer of security.
  9. Choose Next.
  10. Review the details and create the pipeline.

When the pipeline is active, you should see the security data ingested into your Amazon OpenSearch Service domain.

Visualize the security data

After OpenSearch Ingestion starts writing your data into your OpenSearch Service domain, you should be able to visualize the data using the pre-built dashboards you imported earlier. Navigate to OpenSearch Dashboards and choose any one of the installed dashboards.

For example, choosing DNS Activity shows dashboards of all DNS activity published in Amazon Security Lake.

This dashboard shows the top DNS queries by account and hostname. It also shows the number of queries per account. OpenSearch Dashboards are flexible; you can add, delete, or update any of these visualizations to suit your organization and business needs.
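Beyond the dashboards, you can query the OCSF data directly. The following is a sketch of a search against the DNS activity indices that aggregates the top queried hostnames. The query.hostname field path follows the OCSF DNS Activity class and is an assumption here; depending on how the field is mapped in your templates, you might need its .keyword subfield instead:

# Top queried hostnames across the DNS activity (4003) indices
curl -u adminuser:password -H 'Content-Type: application/json' \
  -d '{
    "size": 0,
    "aggs": {
      "top_hostnames": {
        "terms": { "field": "query.hostname" }
      }
    }
  }' 'https://my-opensearch-domain.es.amazonaws.com/ocsf-4003*/_search'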

Clean up

To avoid unwanted charges, delete the OpenSearch Service domain and OpenSearch Ingestion pipeline, and disable Amazon Security Lake.

Conclusion

In this post, you successfully configured Amazon Security Lake to send security data from different sources to OpenSearch Service through serverless OpenSearch Ingestion. You installed pre-built templates and dashboards to quickly get insights from the security data. Refer to Amazon OpenSearch Ingestion to find additional sources from which you can ingest data. For additional use cases, refer to Use cases for Amazon OpenSearch Ingestion.


About the authors

Muthu Pitchaimani is a Search Specialist with Amazon OpenSearch Service. He builds large-scale search applications and solutions. Muthu is interested in the topics of networking and security, and is based out of Austin, Texas.

Aish Gunasekar is a Specialist Solutions architect with a focus on Amazon OpenSearch Service. Her passion at AWS is to help customers design highly scalable architectures and help them in their cloud adoption journey. Outside of work, she enjoys hiking and baking.

Jimish Shah is a Senior Product Manager at AWS with 15+ years of experience bringing products to market in log analytics, cybersecurity, and IP video streaming. He’s passionate about launching products that offer delightful customer experiences, and solve complex customer problems. In his free time, he enjoys exploring cafes, hiking, and taking long walks.