Five Key Factors for Healthcare Buyers When Selecting an NLP Solution
By Calum Yacoubian, MD
Healthcare unceasingly produces massive amounts of data, driving industry leaders to reach an inevitable conclusion: In order to ensure that the best care can be delivered, payers and providers must turn to artificial intelligence (AI) to perform data analysis, the need for which has quickly exceeded human capacity.
Consider a few industry statistics that illustrate the depth of healthcare’s data challenges. Today, approximately 30% of the world’s data volume is generated by healthcare, according to RBC Capital Markets. By 2025, RBC states, the compound annual growth rate of data for healthcare will reach 36%—a rate that is 6%–11% faster than manufacturing, financial services, and media and entertainment. The typical hospital generates 50 petabytes of data per year—equivalent to about 11,000 4K movies. Caring for a single patient over one year, meanwhile, generates up to 80 megabytes of imaging and electronic health record (EHR) data.
Unsurprisingly, more healthcare organizations are realizing that clinicians and researchers simply do not have time to manually code and prepare data to capture features of interest. They are instead looking to AI-powered technologies such as natural language processing (NLP) to analyze and extract that data, yielding insights that drive better patient care, lower costs, and stronger operational performance.
What NLP can do for healthcare
Healthcare organizations need accurate, comprehensive patient data to guide decision-making around virtually all clinical and operational functions, including care coordination, risk adjustment, patient outreach, payment integrity, and performance in value-based contracts. Healthcare data presents a challenge, though, because it is both tremendously voluminous and largely unstructured.
Unstructured data is free-text information entered by clinicians that generally appears in the “notes” sections of EHRs and is not easily analyzed. Patient health is so complex that no matter how many check boxes or drop-down lists the EHR contains, they will never capture the nuance and subtlety of physician history-taking and documentation—and even if they did, the time taken to manually enter all of these details would only worsen already-prevalent physician burnout. An estimated 80% of healthcare data remains unstructured and untapped after creation.
Traditional methods of capturing unstructured data have generally involved chart reviews, which require clinical employees to manually read through patient records to seek out valuable pieces of data. In recent years, however, the industry has moved toward AI-driven technologies such as NLP to overcome the drawbacks of manual chart reviews.
NLP automates text mining, helping machines “read” text by simulating the human ability to understand languages. This enables the analysis of unlimited amounts of text-based data without some of the limitations inherent to humans, such as fatigue and bias. Furthermore, by having a machine process the data and only present the relevant information and insights to human reviewers, patient data unrelated to a given use case is not surfaced, thus enhancing privacy protections. Essentially, NLP allows computers to understand the nuanced meanings of clinical language within a medical record, such as identifying the differences between:
- A patient who is a smoker
- A patient who says she quit smoking five years ago
- A patient who says she is trying to quit smoking
Healthcare organizations must understand these differences because they use these nuggets of data to feed predictive models that can enhance patient treatment, identify gaps in care, and improve risk adjustment. As the industry has begun to realize more value from the insights produced by NLP, most organizations are asking “how” rather than “if” they should use the technology.
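To make the smoking example above concrete, here is a minimal sketch of how the three statuses might be told apart with hand-written rules. It is an illustration only, not how any particular NLP product works: the patterns, the status labels, and the function name `classify_smoking_status` are all hypothetical, and real clinical NLP relies on far richer linguistic and contextual analysis.

```python
import re

# Hypothetical, illustrative rules. Order matters: more specific phrasings
# are checked before the generic "smoker" pattern.
SMOKING_RULES = [
    ("current_smoker_trying_to_quit",
     re.compile(r"\b(trying|attempting|wants?)\s+to\s+(quit|stop)\s+smoking\b", re.I)),
    ("former_smoker",
     re.compile(r"\b(quit|stopped)\s+smoking\b|\b(former|ex)[- ]smoker\b", re.I)),
    ("non_smoker",
     re.compile(r"\b(non[- ]?smoker|never\s+smoked|denies\s+smoking)\b", re.I)),
    ("current_smoker",
     re.compile(r"\b(smoker|smokes)\b", re.I)),
]


def classify_smoking_status(note):
    """Return (status, evidence) for the first rule that matches the note.

    The evidence is the exact matched phrase, so a reviewer can see why
    the status was assigned. Returns ("unknown", None) if no rule matches.
    """
    for status, pattern in SMOKING_RULES:
        match = pattern.search(note)
        if match:
            return status, match.group(0)
    return "unknown", None


if __name__ == "__main__":
    for sentence in (
        "The patient is a smoker.",
        "She quit smoking five years ago.",
        "She is trying to quit smoking.",
    ):
        print(sentence, "->", classify_smoking_status(sentence))
```

Even this toy version shows why rule order and context matter: "trying to quit smoking" contains the words "quit smoking," so a naive keyword search would mislabel a current smoker as a former one.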
Not all NLP is created equal
NLP options have proliferated in the healthcare market, resulting in a sometimes-overwhelming array of choices for prospective buyers. Solutions that extract value and insights from free text come in many shapes and sizes, so having a checklist to guide decision-making is essential. Following are five key areas for consideration when choosing an NLP tool:
- Healthcare domain expertise: Like other industries and fields of study, healthcare comes with its own language and vocabulary. Therefore, healthcare organizations must select an NLP solution that has been proven to work efficiently with healthcare data. This caveat applies not only to the clinical contents of the healthcare documents, but also to the structure and format of the documents, which provide additional context.
- Flexibility of use cases: NLP tools must be flexible enough to deliver value across proven use cases (such as risk adjustment) while remaining customizable for new ones, such as population health and predictive analytics. With the pandemic accelerating the adoption of cloud computing, NLP solutions accessible through cloud application programming interfaces (APIs) have gained popularity. Cloud services offer healthcare organizations a way to quickly access the data buried in free text, and thus begin to obtain value from NLP, without the cost of building their own infrastructure. Cloud solutions can also offer customization and customer-specific NLP endpoints, combining convenience with flexibility of use cases.
- Open NLP pipeline: As NLP use has become more widespread in the industry, healthcare organizations are hiring more data science and text mining experts. Organizations beginning to leverage NLP more are best served with an enterprisewide strategy for deploying the technology. As a result, the technology platform they select should incorporate their teams’ expertise and preferred tools, allowing teams to use the information systems and workflows they have grown comfortable with. This is referred to as an “open NLP pipeline.”
- Transparency: Being able to reproduce and understand the output of NLP should be a high priority, because many healthcare organizations already have a trusting customer base that they cannot afford to lose. Therefore, the NLP solution they adopt should not be a black box; rather, its output should be traceable and repeatable so that it can be trusted.
- Scalability: Much of NLP’s transformative capability rests on its quick analysis of substantial amounts of data, which increases efficiency by significantly reducing manual clinical reviews. Consequently, NLP solutions must be able to swiftly process millions of documents to scale reliably with an organization’s text-mining needs.
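The transparency and scalability points in the list above can be illustrated with a short sketch. It assumes a simple term-matching extractor, which stands in for a real NLP engine; the `Mention` record and the functions `extract_mentions` and `process_corpus` are hypothetical names invented for this example. Each extracted item carries the document ID and character offsets it came from, which makes the output traceable, and documents are processed one at a time from a stream so the corpus never has to fit in memory.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    """One extracted term plus the exact location it was found (hypothetical schema)."""
    doc_id: str
    term: str
    start: int  # character offset where the match begins
    end: int    # character offset just past the match
    text: str   # the matched text exactly as written in the note


def extract_mentions(doc_id, text, terms):
    """Find every whole-word occurrence of each term, case-insensitively."""
    mentions = []
    for term in terms:
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        for match in pattern.finditer(text):
            mentions.append(
                Mention(doc_id, term, match.start(), match.end(), match.group(0))
            )
    # Sort so the same input always yields the same, repeatable output.
    return sorted(mentions, key=lambda m: (m.start, m.term))


def process_corpus(docs, terms):
    """Lazily process an iterable of (doc_id, text) pairs, one document at a time."""
    for doc_id, text in docs:
        yield from extract_mentions(doc_id, text, terms)
```

Because every result points back to a specific span of a specific document, a reviewer can check any finding against the source note, and running the same documents again produces the same output.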
In the coming years and decades, healthcare data is likely to grow exponentially, requiring healthcare organizations to look beyond the human capacity for mining medical records. For many organizations, NLP will enable the capture of insights that inform data-driven, evidence-based decision-making. No matter which platform NLP shoppers choose, they should do their homework and keep in mind the above considerations.
Calum Yacoubian, MD, is associate director of healthcare strategy at Linguamatics, an IQVIA company.