AWS Big Data Blog

How FIS ingests and searches vector data for quick ticket resolution with Amazon OpenSearch Service

This post was co-written by Sheel Saket, Senior Data Science Manager at FIS, and Rupesh Tiwari, Senior Solutions Architect at Amazon Web Services.

Do you ever find yourself grappling with multiple defect logging mechanisms, scattered project management tools, and fragmented software development platforms? Have you experienced the frustration of lacking a unified view, hindering your ability to efficiently manage and identify common trending issues within your enterprise? Are you constantly facing challenges when it comes to addressing defects and their impact, causing disruptions in your production cycles?

If these questions resonate with you, then you’re not alone. FIS, a leading technology and services provider, has encountered these very challenges. In their quest for a solution, they teamed up with AWS to tackle these obstacles head-on. In this post, we take you on a journey through their collaborative project, exploring how they used Amazon OpenSearch Service to transform their operations, enhance efficiency, and gain valuable insights.

This post shares FIS’s journey in overcoming challenges and provides step-by-step instructions for provisioning the solution architecture in your AWS account. You’ll learn how to implement a transformative solution that empowers your organization with near-real-time data indexing and visualization capabilities.

In the following sections, we dive into the details of FIS’s journey and discover how they overcame these challenges, revolutionizing their approach to defect management and software development.

Challenges for near-real-time ticket visualization and search

FIS faced several challenges in achieving near-real-time ticket visualization and search capabilities, including the following:

  • Integrating ticket data from tens of different third-party systems
  • Overcoming API call thresholds and limitations from various systems
  • Implementing an efficient KNN vector search algorithm for resolving issues and performing trend analysis
  • Establishing a robust data ingestion and indexing process for real-time updates from 15,000 tickets per day
  • Ensuring unified access to ticket information across 20 development teams
  • Providing secure and scalable access to ticket data for up to 250 teams

Despite these challenges, FIS successfully enhanced their operational efficiency, enabled quick ticket resolution, and gained valuable insights through the integration of OpenSearch Service.

Let’s walk through the architecture and its mechanisms. The following sections provide step-by-step instructions for provisioning and implementing the solution in your AWS account using the AWS Management Console, along with a video tutorial.

Solution overview

The architecture diagram of FIS’s near-real-time data indexing and visualization solution incorporates various AWS services for specific functions. The solution uses GitHub as the data source, employs Amazon Simple Storage Service (Amazon S3) for scalable storage, manages APIs with Amazon API Gateway, performs serverless computing using AWS Lambda, and facilitates data streaming and ETL (extract, transform, and load) processes through Amazon Kinesis Data Streams and Amazon Kinesis Data Firehose. OpenSearch Service is employed for analytics and application monitoring. This architecture ensures a robust and scalable solution, enabling FIS to efficiently index and visualize data in near-real time. With these AWS services, FIS effectively manages their data pipeline and gains valuable insights for their business processes.

The following diagram illustrates the solution architecture.

Architecture Diagram

The workflow includes the following steps:

  1. GitHub webhook events stream data to both Amazon S3 and OpenSearch Service, facilitating real-time data analysis.
  2. An API Gateway REST API receives the webhook payloads and invokes a Lambda function, which processes and structures them.
  3. The Lambda function adds the structured data to a Kinesis data stream, enabling immediate data streaming and quick ticket insights.
  4. Kinesis Data Firehose streams the records from the Kinesis data stream to an S3 bucket, simultaneously creating an index in OpenSearch Service.
  5. OpenSearch Service uses the indexed data to provide near-real-time visualization and enable efficient ticket analysis through K-Nearest Neighbor (KNN) search, enhancing productivity and optimizing data operations.
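
As a rough illustration of step 5, the following Python sketch builds the request bodies that an OpenSearch k-NN index and query typically use. The field names (`title`, `embedding`) and the vector dimension are assumptions made for illustration; the post does not state which mapping or embedding model FIS used.

```python
import json


def knn_index_body(dimension):
    """Build an index definition with a knn_vector field (field names are hypothetical)."""
    return {
        "settings": {"index": {"knn": True}},  # enable k-NN search on this index
        "mappings": {
            "properties": {
                "title": {"type": "text"},
                "embedding": {"type": "knn_vector", "dimension": dimension},
            }
        },
    }


def knn_query_body(vector, k=5):
    """Build a query asking for the k documents whose embeddings are closest to `vector`."""
    return {
        "size": k,
        "query": {"knn": {"embedding": {"vector": vector, "k": k}}},
    }


if __name__ == "__main__":
    # Print both bodies so they can be pasted into a client or console of your choice.
    print(json.dumps(knn_index_body(dimension=3), indent=2))
    print(json.dumps(knn_query_body([0.1, 0.2, 0.3], k=2), indent=2))
```

The query vector would normally come from the same embedding model that produced the indexed vectors, so that distances between a new ticket and existing tickets are comparable.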

The following sections provide step-by-step instructions for setting up the solution. Additionally, we have created a video guide that demonstrates each step in detail. You are welcome to watch the video and follow along with this post if you prefer.

Prerequisites

You should have the following prerequisites:

Implement the solution

Complete the following steps to implement the solution:

  1. Create an OpenSearch Service domain.
  2. Create an S3 bucket named git-data.
  3. Create a Kinesis data stream named git-data-stream.
  4. Create a Firehose delivery stream named git-data-delivery-stream with git-data-stream as the source and git-data as the destination, and a buffer interval of 60 seconds.
  5. Create a Lambda function named git-webhook-handler with a timeout of 5 minutes. Add code to add data to the Kinesis data stream.
  6. Grant the Lambda function’s execution role permission to call PutRecord (the kinesis:PutRecord action) on the Kinesis data stream.
  7. Create a REST API in API Gateway named git-webhook-handler-api. Create a resource named git-data with a POST method, integrate it with the Lambda function git-webhook-handler that you created earlier, and deploy the REST API.
  8. Create a second Firehose delivery stream with the Kinesis data stream as the source and OpenSearch Service as the destination. Provide an AWS Identity and Access Management (IAM) role for Kinesis Data Firehose with the necessary permissions to create an index in OpenSearch Service. Finally, map the IAM role as a backend role in OpenSearch Service.
  9. Navigate to your GitHub repository and create a webhook to enable seamless integration with the solution. Copy the REST API URL and enter it as the payload URL of the newly created webhook.
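
The code for step 5 is not shown in this post. The following is a minimal sketch of what such a handler could look like, assuming the REST API uses Lambda proxy integration (so the webhook body and headers arrive in `event["body"]` and `event["headers"]`). The record fields, the `STREAM_NAME` environment variable, and the `handle` helper are hypothetical names chosen for this example.

```python
import json
import os

# Stream name from the walkthrough; the environment variable is an assumption.
STREAM_NAME = os.environ.get("STREAM_NAME", "git-data-stream")


def structure_event(event):
    """Flatten an API Gateway proxy event carrying a GitHub webhook into one record."""
    # Header names are case-insensitive, so normalize them before lookup.
    headers = {k.lower(): v for k, v in (event.get("headers") or {}).items()}
    payload = json.loads(event.get("body") or "{}")
    return {
        "event_type": headers.get("x-github-event", "unknown"),
        "action": payload.get("action"),
        "repository": (payload.get("repository") or {}).get("full_name"),
        "sender": (payload.get("sender") or {}).get("login"),
        "payload": payload,
    }


def handle(event, kinesis_client):
    """Write the structured record to the stream and echo the Kinesis receipt."""
    record = structure_event(event)
    result = kinesis_client.put_record(
        StreamName=STREAM_NAME,
        Data=json.dumps(record).encode("utf-8"),
        PartitionKey=record["repository"] or "unknown",
    )
    return {
        "statusCode": 200,
        "body": json.dumps(
            {"ShardId": result["ShardId"], "SequenceNumber": result["SequenceNumber"]}
        ),
    }


def lambda_handler(event, context):
    # boto3 is imported lazily so the helpers above can be exercised without AWS access.
    import boto3

    return handle(event, boto3.client("kinesis"))
```

Returning the ShardId and SequenceNumber in the response body makes it possible to confirm delivery from the caller’s side. Passing the client into `handle` keeps the logic easy to exercise locally with a stub.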

Test the solution

To test the solution, complete the following steps:

  1. Go to your GitHub repository and choose the Star button, and verify that you receive a response with a status code of 200.
  2. Check for the ShardId and SequenceNumber in the webhook’s recent deliveries to confirm that the event was added to the Kinesis data stream.

Kinesis data stream

  3. On the Kinesis console, use the Data Viewer to confirm the arrival of data records.

kinesis record data

  4. Navigate to OpenSearch Dashboards and choose Dev Tools.
  5. Search for the records and observe that all the Git events are displayed in the result pane.

opensearch devtool

  6. On the Amazon S3 console, open the bucket and view the data records.

s3 bucket records
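
If you prefer to exercise the API without starring a repository, a request like the following can stand in for the GitHub delivery. This is a sketch only: the URL is a placeholder for your deployed REST API, the payload is a trimmed-down imitation of a GitHub star event, and the expectation that the response body is JSON is an assumption about how your Lambda function replies.

```python
import json
import urllib.request

# Placeholder; replace with the invoke URL of your deployed REST API.
API_URL = "https://<api-id>.execute-api.<region>.amazonaws.com/<stage>/git-data"


def build_star_request(api_url):
    """Build a POST request that imitates a GitHub 'star' webhook delivery."""
    body = json.dumps({
        "action": "created",
        "repository": {"full_name": "example-org/example-repo"},
        "sender": {"login": "example-user"},
    }).encode("utf-8")
    return urllib.request.Request(
        api_url,
        data=body,
        headers={"Content-Type": "application/json", "X-GitHub-Event": "star"},
        method="POST",
    )


def send(request):
    """Send the request and return the HTTP status and the decoded JSON body."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status, json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    request = build_star_request(API_URL)
    print(request.get_method(), request.full_url)
    # Uncomment after replacing API_URL with a real endpoint:
    # print(send(request))
```

A status of 200 together with a ShardId and SequenceNumber in the response would indicate that the record reached the Kinesis data stream.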

Security

We adhere to IAM best practices to uphold security:

  1. Create a Lambda execution role for read/write operations on the Kinesis data stream.
  2. Create an IAM role for Kinesis Data Firehose to manage Amazon S3 and OpenSearch Service access.
  3. Map this IAM role in the OpenSearch Service security settings so that it has backend role privileges.
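
To make these roles more concrete, the following sketch generates example IAM policy documents as Python dictionaries. It covers only the write path for the Lambda function and the Amazon S3 delivery path for Kinesis Data Firehose; the account ID and Region are placeholders, the function names are hypothetical, and the exact policies FIS used are not given in this post.

```python
import json


def lambda_put_record_policy(region, account_id, stream_name="git-data-stream"):
    """Identity policy that lets the Lambda execution role write to one stream only."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["kinesis:PutRecord"],
            "Resource": f"arn:aws:kinesis:{region}:{account_id}:stream/{stream_name}",
        }],
    }


def firehose_trust_policy():
    """Trust policy that lets Kinesis Data Firehose assume the delivery role."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Service": "firehose.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }],
    }


def firehose_s3_delivery_policy(bucket_name="git-data"):
    """Identity policy that lets the delivery role write objects to one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": [
                "s3:AbortMultipartUpload",
                "s3:GetBucketLocation",
                "s3:GetObject",
                "s3:ListBucket",
                "s3:ListBucketMultipartUploads",
                "s3:PutObject",
            ],
            # Bucket-level and object-level ARNs are both needed for these actions.
            "Resource": [f"arn:aws:s3:::{bucket_name}", f"arn:aws:s3:::{bucket_name}/*"],
        }],
    }


if __name__ == "__main__":
    # Placeholder Region and account ID (assumptions for illustration).
    print(json.dumps(lambda_put_record_policy("us-east-1", "123456789012"), indent=2))
    print(json.dumps(firehose_trust_policy(), indent=2))
    print(json.dumps(firehose_s3_delivery_policy(), indent=2))
```

Scoping each statement to a single stream or bucket keeps the roles close to least privilege. The delivery role would additionally need read access to the Kinesis data stream and write access to the OpenSearch Service domain, which are omitted here for brevity.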

Clean up

To avoid incurring future charges, delete all the resources you created.

Benefits of near-real-time ticket visualization and search

Our demonstration used GitHub as the streaming data source. However, the solution can scale to incorporate multiple data sources from various services, which allows diverse data to be consolidated and visualized in near-real time using OpenSearch Service.

With the implementation of the solution described in this post, FIS effectively overcame all the challenges they faced.

The following list pairs each challenge with the benefit FIS achieved:

  • Integrating ticket data from multiple third-party systems – Near-real-time data streaming ensures an up-to-date information flow from third-party providers for timely insights
  • Overcoming API call thresholds and limitations imposed by different systems – Unrestricted data flow with no threshold or rate limiting enables seamless integration and continuous updates
  • Accommodating scalability requirements for up to 250 teams – The asynchronous, serverless architecture effortlessly scales more than 250 times larger without infrastructure modifications
  • Efficiently resolving tickets and performing trend analysis – OpenSearch Service semantic KNN search identifies duplicates and defects, and optimizes operations for improved efficiency
  • Gaining valuable insights for business processes – Artificial intelligence (AI) and machine learning (ML) analytics use the data stored in the S3 bucket, empowering deeper insights and informed decision-making
  • Ensuring secure access to ticket data and regulatory compliance – Secure data access protects data privacy and supports compliance with data protection regulations
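
To give a feel for how nearest-neighbor search can surface likely duplicate tickets, here is a small brute-force sketch in plain Python. It is a conceptual illustration only: the ticket IDs and three-dimensional vectors are made up, real embeddings have far more dimensions, and OpenSearch Service uses its own indexed search rather than this exhaustive comparison.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_tickets(query, tickets, k=2):
    """Return the k (ticket_id, similarity) pairs most similar to the query vector."""
    scored = [(ticket_id, cosine_similarity(query, vec)) for ticket_id, vec in tickets.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical embeddings: the first two tickets describe a similar problem.
tickets = {
    "TICKET-1": [0.9, 0.1, 0.0],
    "TICKET-2": [0.8, 0.2, 0.1],
    "TICKET-3": [0.0, 0.1, 0.9],
}
new_ticket = [0.85, 0.15, 0.05]

if __name__ == "__main__":
    for ticket_id, score in nearest_tickets(new_ticket, tickets, k=2):
        print(ticket_id, round(score, 3))
```

Tickets whose similarity to a new ticket exceeds a chosen threshold can be flagged as probable duplicates or linked to an existing resolution.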

Conclusion

FIS, in collaboration with AWS, successfully addressed several challenges to achieve near-real-time ticket visualization and search capabilities. With OpenSearch Service, FIS enhanced operational efficiency by resolving tickets quickly and performing trend analysis. With their data ingestion and indexing process, FIS processed 15,000 tickets per day in real time. The solution provided secure and scalable access to ticket data for more than 250 teams, enabling unified collaboration. FIS experienced a remarkable 30% reduction in ticket resolution time, empowering teams to quickly address issues.

As Sheel Saket, Senior Data Science Manager at FIS, states, “Our near-real-time solution transformed how we identify and resolve tickets, improving our overall productivity.”

Organizations can further improve the solution by adopting Amazon OpenSearch Ingestion for data ingestion, which offers cost savings and out-of-the-box data processing capabilities. By embracing this transformative solution, organizations can optimize their ticket management, drive productivity, and deliver exceptional experiences to customers.

Want to know more? You can reach out to FIS through their official contact page, follow FIS on Twitter, and visit the FIS LinkedIn page.


About the Authors

Rupesh Tiwari is a Senior Solutions Architect at AWS in New York City, with a focus on Financial Services. He has over 18 years of IT experience in the finance, insurance, and education domains, and specializes in architecting large-scale applications and cloud-native big data workloads. In his spare time, Rupesh enjoys singing karaoke, watching comedy TV series, and creating joyful moments with his family.

Sheel Saket is a Senior Data Science Manager at FIS in Chicago, Illinois. He has over 11 years of IT experience in the finance, insurance, and e-commerce domains, and specializes in architecting large-scale AI solutions and cloud MLOps. In his spare time, Sheel enjoys listening to audiobooks and podcasts and watching movies with his family.