
Solving the Insurance Industry’s Data Quality Problem

Written by Solomon Radley on Feb 3, 2020


Using data to inform business decisions only works when the data is correct. Unfortunately for the insurance industry’s data leaders, many data sources are riddled with inaccuracies.

Data is the lifeblood of the insurance industry. From telematics data in car insurance to geospatial data in the property sector and beyond – accurate and timely data empowers insurance companies to effectively assess and manage their portfolios of risks.

However, even the best data sources aren’t correct 100% of the time, and this can create a misleading picture of the exposures, risks and perils that should be associated with any given policy.

This presents a challenge for the industry’s data leaders, as guests at our Risk Intelligence and Realizing Risk roundtables in New York and London were keen to point out.

“We try to gather as much information as possible around risk at every location” – Shajy Mathai, Analytics Technical Officer, AIG

“The greatest challenge facing [our industry’s] data leaders is the quality and the integrity of data,” says Chris Wyard, Head of Technical Data at Allianz Insurance. “A significant contributor to that will be which external and enrichment data sources are being used.”

Basing risk assessments on false information could lead to significant losses over the long term. As such, the sheer volume of overlapping data sources available today raises a key question for the insurance industry’s data leaders:

Which ones are best?

Insurers Are Concerned About Poor Data Quality

Few of the guests at our two insurance industry roundtable events wanted to talk in depth about their organizations’ data maturity levels. But our attendee surveys show that data quality is a major concern for insurers and reinsurers alike.

None of our guests are certain they’re using the most accurate data to assess and price risk, and just 24% are ‘very confident’ that they are. Most describe themselves as ‘fairly confident’ about the quality of their data, while 35% say they’re ‘not very confident’. Some attendees also cited the timeliness of their data as a key concern. Outdated data is no good for assessing potential risks or perils.

This is particularly concerning when you consider that 82% of the attendees say accurate location-based data is at least ‘very important’ to their business operations, with 53% saying it’s ‘absolutely critical’.

“We increasingly look at ‘location’ as a master attribute in the context of our organization,” says Wyard. “We want all of our functions and all uses of our data to be aligned to that understanding of the location, so that we drive consistency in terms of the interactions we have with customers.”

“One of the major challenges I see quite often within organizations is that they have a lot of data and data gathered all over the place,” adds Alan Luu, AVP of Advanced Digital Analytics at Chubb. “They don't have ways to consolidate, validate and centralize the data. As a result, most of the time they can’t take advantage of all the data they have in their system.”

Finding Certainty in the Uncertain with Data

Given that no one data source is 100% reliable, some insurers have taken to acquiring as many different sources as they can get their hands on.

“Not all data are accurate,” explains Luu. “So, you buy the same piece of information from four or five different vendors.”

“We try to gather as much information as possible around risk at every location,” adds Shajy Mathai, Analytics Technical Officer at AIG. “The biggest challenge we find is that the cost of curation is extremely high. It's part of our cost in doing business, but it's challenging.”

These insurers then use their various data sources to validate one another. For example, if every source of location data reports the same geocode, number of floors, number of occupants and so on for a property, it’s reasonably safe to conclude that they’re right.

In cases where there are contradictions in the data, an insurer must weigh up the quality of each data source and decide what’s most likely to be the case. They can then assign a ‘certainty score’ to that datapoint that reflects the level of disagreement in the data.
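As a rough illustration, this cross-check can be expressed as a simple consensus calculation: compare what each vendor reports for the same attribute and score the data point by the share of sources that agree. The field names and scoring rule below are illustrative assumptions, not any one insurer’s actual method.

```python
# Hypothetical sketch of the cross-validation approach described above:
# compare the same attribute across several vendor feeds and derive a
# 'certainty score' from the level of agreement among them.
from collections import Counter

def certainty_score(values):
    """Return the consensus value and the share of sources that agree with it."""
    observed = [v for v in values if v is not None]
    if not observed:
        return None, 0.0
    consensus, votes = Counter(observed).most_common(1)[0]
    return consensus, votes / len(observed)

# Four vendors report the number of floors for the same property.
vendor_reports = {"vendor_a": 12, "vendor_b": 12, "vendor_c": 11, "vendor_d": 12}
floors, score = certainty_score(list(vendor_reports.values()))
print(floors, score)  # 12 0.75 -> three of the four sources agree
```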

“The biggest challenge we find is that the cost of curation is extremely high. It's part of our cost in doing business, but it's challenging” – Shajy Mathai, Analytics Technical Officer, AIG

“Somebody on the ground has to make a decision whether ‘this’ address in Thailand already exists in our system,” Mathai says. “We do this using complex matching algorithms based on geocoding data, etc.”
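A minimal sketch of what such matching can look like, assuming each record carries a geocode and an address string: treat an incoming location as a duplicate of an existing one when the geocodes fall within a small radius and the address text is similar. The thresholds, field names and helpers here are assumptions for illustration, not AIG’s actual algorithm.

```python
# Illustrative geocode-plus-text matching for deciding whether an address
# already exists in a location master dataset.
import math
from difflib import SequenceMatcher

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in metres."""
    r = 6_371_000
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp, dl = math.radians(lat2 - lat1), math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def is_same_location(new, existing, max_dist_m=50, min_similarity=0.85):
    """Flag a likely duplicate when both the geocodes and the address text agree."""
    dist = haversine_m(new["lat"], new["lon"], existing["lat"], existing["lon"])
    text_sim = SequenceMatcher(None, new["address"].lower(), existing["address"].lower()).ratio()
    return dist <= max_dist_m and text_sim >= min_similarity

new_record = {"address": "88 Sukhumvit Rd, Bangkok", "lat": 13.7402, "lon": 100.5610}
existing_record = {"address": "88 Sukhumvit Road, Bangkok", "lat": 13.7401, "lon": 100.5611}
print(is_same_location(new_record, existing_record))  # True -> treat as the same location
```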

At the same time, insurance industry data leaders combine this analysis of external data sources with their own internal records to curate their own ‘master’ datasets.

This process of eliminating false data and curating validated information is essential for any insurer that wants to automate its business processes. However, not every insurer has the resources that the likes of AIG and Chubb can spend on developing such a master dataset.

“We happen to be big and we are also committed to digital transformation,” Luu concludes. “I'm not sure if every other company in the insurance industry would have enough budget to obtain the same data.”

Data leaders at these companies must make tough decisions about which investments provide the most accurate view of risk, for the best value.


This is an excerpt from our Future of Insurance Data report, published in association with Pitney Bowes Software. Claim your copy today for even more insights into how AI is transforming the insurance industry.
