AI Is All the Rage Again – It’s Still About the Data
Breakthroughs in Chat-GPT and other generative artificial intelligence have thrust AI back onto the front pages. Newspapers, magazines, and network newscasts all feature stories on the potential positive and negative impacts of these new technologies on everything from schoolwork to the future structure of jobs in the global economy. As is typical when popular media cover new technology, they get some things right and some wrong. Chat-GPT (I will use this term generically to refer to all generative AI) is the new “magic wand” that can transform anything it is waved at. We’ve seen this many times before, and in many cases the reality has fallen short of the hype. One thing remains universal, though: AI is still about the data. And while Chat-GPT can make sense of structured, and especially unstructured, data better than many AI technologies, good results require good input, and that input is data.
One of Chat-GPT’s challenges is telling the truth. It tends to make things up, especially when it doesn’t have a sound fact base for answering a specific question. This raises questions about its usefulness in many situations. If you are writing an essay, it’s relatively simple to fact-check what Chat-GPT generates. Foolish is the writer who accepts and uses its raw output, no matter how factual it sounds. But real trouble can arise in use cases where automated decisions are made based on Chat-GPT’s output.
Let’s focus on casualty claim handling. Much is being written about how Chat-GPT will revolutionize the claim process by assisting adjusters with menial tasks and automating others. I believe it will, but carriers and others that deploy it in this context need to build safeguards against bad decisions based on faulty output. Building defenses against undesirable outcomes starts at the ground level, with the data a GPT tool is trained on. And the most basic requirement for providing sufficient “good” training data is managed data quality.
High-quality claim data doesn’t just happen; it results from a multi-step process that starts with well-defined and enforced data quality standards. A cross-functional group of claims, data science, data management, and IT professionals needs to define the level of quality the organization must maintain to use its data in the manner it wishes to.
Data collection and entry processes need to ensure this level of data quality. This is where most carriers have fallen short in the past, and they are paying a hefty price for it today. Most carriers I am familiar with are caught between what they want to do with their data and what that data will actually support.
Once data is collected, it needs to be validated. Are all fields filled in? Do they contain logical input? If not, what processes and procedures exist to correct invalid information? Claim adjusters work under time pressure every day. Management needs the courage to give them the extra time it takes to be thorough and accurate so that high-quality data is captured. It’s an added cost, but in my experience the benefits of high-quality analytics output outweigh the price.
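The two validation questions above (are all fields filled in, and do they contain logical input?) can be sketched in a few lines of Python. This is a minimal illustration, not a production rule set: the ClaimRecord fields and the specific checks are hypothetical and would need to be replaced with an organization's own data quality standards.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class ClaimRecord:
    # Hypothetical fields, chosen only for illustration.
    claim_id: str
    loss_date: Optional[date]
    report_date: Optional[date]
    paid_amount: Optional[float]
    injury_code: Optional[str]


def validate_claim(record: ClaimRecord) -> List[str]:
    """Return a list of problems; an empty list means the record passed."""
    problems = []

    # Completeness: are all fields filled in?
    for name, value in vars(record).items():
        if value is None or value == "":
            problems.append(f"{name} is missing")

    # Logic: do the filled-in fields make sense together?
    if (
        record.loss_date is not None
        and record.report_date is not None
        and record.report_date < record.loss_date
    ):
        problems.append("report_date is earlier than loss_date")
    if record.paid_amount is not None and record.paid_amount < 0:
        problems.append("paid_amount is negative")

    return problems
```

Returning the list of problems, rather than simply rejecting the record, leaves room for the correction process described above: each flagged record can be routed back to someone who can fix it.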
Ideally, data entry and validation would be sufficient to create nearly perfect data, but in practice that is unrealistic. Thus, data cleansing becomes a necessary part of the quality management process before data can be effectively integrated and analyzed. Modern data management tools can perform much of this cleansing by transforming null or out-of-bounds data elements into values that algorithms can use.
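As a rough sketch of what such a transformation might look like, the Python function below replaces missing values with the median of the in-range values and clips out-of-bounds values to the nearest bound. Both choices are assumptions made for illustration; the function name and bounds are hypothetical, and commercial tools offer many other strategies.

```python
from statistics import median
from typing import List, Optional


def cleanse_amounts(
    values: List[Optional[float]], lower: float, upper: float
) -> List[float]:
    """Fill missing values with the in-range median and clip the rest to [lower, upper]."""
    in_range = [v for v in values if v is not None and lower <= v <= upper]
    if not in_range:
        raise ValueError("no in-range values to derive a fill value from")
    fill = median(in_range)

    cleaned = []
    for v in values:
        if v is None:
            cleaned.append(fill)  # null -> median of the valid values
        else:
            cleaned.append(min(max(v, lower), upper))  # clip to the bounds
    return cleaned
```

Substituted values are estimates, not facts, so it is usually worth recording which elements were changed so that analysts can tell original data from cleansed data.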
Finally, these processes need to be monitored, reviewed, and adjusted regularly as part of an ongoing data quality management process. Dashboards and reports should be created to allow all relevant participants to see how the process is working. As mentioned earlier, many constituencies are involved in creating and maintaining high-quality data. Each area has some accountability and thus should have a view into how the process is proceeding and where problems exist when they occur.
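One simple measure that a data quality dashboard could track is the share of records in which each field is filled in. The Python sketch below computes that figure and flags fields that fall below a target; the function names, the dictionary-based record format, and the threshold are all hypothetical.

```python
from typing import Dict, List, Optional


def completeness_by_field(records: List[Dict[str, Optional[object]]]) -> Dict[str, float]:
    """Return, for each field, the fraction of records where it is filled in."""
    if not records:
        return {}
    fields = sorted({key for record in records for key in record})
    report = {}
    for field in fields:
        filled = sum(1 for record in records if record.get(field) not in (None, ""))
        report[field] = filled / len(records)
    return report


def fields_below_threshold(report: Dict[str, float], threshold: float) -> List[str]:
    """List the fields whose completeness falls below the agreed target."""
    return [field for field, rate in report.items() if rate < threshold]
```

Run on a schedule, a report like this gives each accountable group a shared view of where problems are appearing.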
I am very excited about Chat-GPT’s prospects to transform claim handling. But don’t forget that the “garbage in, garbage out” rule still applies. As I’ve written, it takes a village to attain and maintain high-quality data processes. This underscores the importance of working with partners like CLARA Analytics, who can not only help standardize your data but also provide a wealth of data from an industry data lake to ensure model accuracy and gain massive returns. Learn more about how CLARA works here.
Note: The process described above applies to structured data, but Chat-GPT is especially effective in analyzing and utilizing unstructured data. The key to using unstructured data lies not in managing its quality but in making it available to analytics systems, along with the metadata that ties it to the underlying claim. That’s a topic for a future blog.