Automation Complacency: Avoiding the Pitfalls of AI Integration
By Matt Phillion
Health systems continue to accelerate their use of AI tools, from documentation and clinical support to patient triage. But as this growth continues, there’s one emerging risk going nearly unaddressed in the industry: automation complacency. Much of the discussion remains focused on accuracy, bias, and regulation, but what actually happens once clinicians start using AI tools every day?
Parallels exist to another long-recognized issue: alert fatigue. Just as clinicians can, and often do, become desensitized or overwhelmed by constant noncritical alerts, a steady stream of AI outputs can erode meaningful scrutiny, even when those clinicians know AI can be fallible or nondeterministic. And just as with alert fatigue, fading attention can have dangerous results.
As AI tools become increasingly embedded in routine workflows, there is a psychological dimension to the human-machine interaction that becomes a patient safety issue, not merely a technological one.
Part of the reason we’re not yet talking about automation complacency is the novelty of the technology, says Ben Scharfe, executive vice president of AI with Altera Digital Health.
“It’s very new and very emergent, and as a companion to that there’s a lot of pressure on everyone, every organization, and every industry, to show ROI and to adopt AI—and to not let those pesky concerns get in the way,” says Scharfe. “It’s not necessarily something people are thinking about.”
Scharfe worries that in the rush to scale AI tools and embrace them rapidly, we’ll become too comfortable.
“Essentially, this accumulation of small errors will go undetected for some time, a cave of undiscovered errors, a whole lot of missed calibration, oversight, and downstream patient harm events,” he says. “Right now, the primary vector in healthcare is ambient listening, and we could have these nuanced inconsistencies, something not heard properly. And at mass scale, we have providers who aren’t scrutinizing the note outputs. It’ll look like a note, it will bill properly, but these nuanced inaccuracies that propagate through the patient notes will have a cascading effect over time but will accumulate while being very hard to detect.”
Addressing this is a “very unglamorous road with no silver bullet,” Scharfe explains: it’s largely a matter of awareness and ownership.
“One of the things that gives me comfort is that the de facto vehicle for addressing ownership is liability, and we’ve seen some states draft legislation that the provider cannot defer liability to an AI system. That’s an eye-opening reality,” he says. “Essentially, they can’t hang their hat on, well, the AI wrote it wrong. Ultimately, you are fully responsible for the note.”
The technology can be beneficial, saving clinicians from getting bogged down in manual note-taking and the grunt work of entering notes into the record, but the trick is to reinvest that time into improving the quality of care. That includes being comfortable with the output and taking ownership of that data.
One recommendation Scharfe has for hospitals and healthcare organizations rolling out more complex AI systems and tools is to use phased rollouts so you can compare data.
“Ambient listening is easy to wrap your head around, but with areas like triaging or potentially clinical decision-making support, it’s important you experiment with that rollout,” he says. “Have a parallel run where you can take real data and compare it to your normal processes to understand those potential pitfalls. Right now, there’s a lot of hype with vendors saying they have 99% accuracy, but that doesn’t convey anything to specific health systems or patient populations, especially when these models are trained on very large data sets, potentially nationally or internationally. It’s not necessarily representative of your patient communities.”
A phased rollout lets you compare the AI’s recommendations against those of skilled providers and see whether they consistently match, which helps determine your level of confidence in the tool.
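To make that concrete, a parallel-run comparison can be as simple as logging each case with both the AI’s recommendation and the clinician’s independent decision, then measuring agreement. The Python sketch below is purely illustrative; the case structure, triage labels, and reporting format are assumptions, not anything Scharfe or Altera prescribes.

```python
from dataclasses import dataclass

@dataclass
class ParallelRunCase:
    case_id: str
    ai_recommendation: str    # e.g., "urgent" or "routine" (labels assumed for illustration)
    clinician_decision: str   # what the normal, human-driven process decided

def agreement_report(cases: list[ParallelRunCase]) -> float:
    """Summarize how often the AI matched clinicians during a parallel run."""
    matches = [c for c in cases if c.ai_recommendation == c.clinician_decision]
    rate = len(matches) / len(cases) if cases else 0.0
    print(f"Agreement: {len(matches)}/{len(cases)} cases ({rate:.1%})")
    # The disagreements are the real payoff: each one is a case to review by hand.
    for c in cases:
        if c.ai_recommendation != c.clinician_decision:
            print(f"  Review {c.case_id}: AI said {c.ai_recommendation!r}, "
                  f"clinician said {c.clinician_decision!r}")
    return rate

sample = [
    ParallelRunCase("A-001", "urgent", "urgent"),
    ParallelRunCase("A-002", "routine", "urgent"),   # a miss worth investigating
    ParallelRunCase("A-003", "routine", "routine"),
]
agreement_report(sample)
```

The disagreement list matters more than the headline rate: those are the cases that reveal where a model trained on national data diverges from your own patient population.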
A visceral experience
How do we make sure clinicians know the limitations of the technology, and their responsibility in using it, when they are bombarded by these new tools every day?
“I have an interesting suggestion in this regard. We can learn from other people’s mistakes, but we don’t always internalize the lessons unless we have a visceral, first-hand experience,” says Scharfe. “I don’t think we need a first-hand patient event, but we need to make the gut-punch real.”
For example, as you roll the technology out and develop training for it, create very subtly incorrect AI outputs and include them in the training you give providers.
“Maybe you only have five minutes to review five outputs. You need to get them done at the end of the clinical day,” he says. “Have them review those outputs and if they miss the incorrect one, it shows them the subtle ways things can go wrong, especially in a fast-paced situation.”
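A training team might operationalize this exercise along the following lines. The sketch is hypothetical, assuming plain-text notes and a simple pass/fail message; the function names and batch size are invented for illustration.

```python
import random

def build_review_batch(clean_notes: list[str], seeded_error_notes: list[str],
                       batch_size: int = 5) -> tuple[list[str], set[int]]:
    """Mix one subtly incorrect note into a batch of clean ones for a timed review."""
    batch = random.sample(clean_notes, batch_size - 1)
    batch.append(random.choice(seeded_error_notes))
    random.shuffle(batch)
    # Record where the planted error landed so the exercise can be scored.
    planted = {i for i, note in enumerate(batch) if note in seeded_error_notes}
    return batch, planted

def score_review(flagged_by_trainee: set[int], planted: set[int]) -> str:
    """Deliver the gut-punch feedback: did the reviewer catch the planted error?"""
    if planted <= flagged_by_trainee:
        return "Caught it. This is the scrutiny every note deserves."
    return "Missed it. This is how subtle errors slip through under time pressure."
```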
The same can go for teaching about AI hallucinations, Scharfe says.
“Every once in a while, you need to experience an AI very confidently hallucinating at you,” he says.
Both of these suggestions help remind everyone, including the clinicians, that they cannot and should not dismiss their own expertise.
“[Clinicians] are among the most educated corners of society. Those subtle, nuanced inaccuracies are really hard for the everyman to detect. But hopefully these systems are saving you time on frustrating tasks, time you can reinvest into ensuring the data is accurate,” says Scharfe.
Regulation and guidelines
Initially, under the last administration, federal regulations were proposed that would have provided sweeping guidance at the national level for using AI in healthcare, Scharfe notes.
“That legislation never materialized, and we’ve seen a change in favor of acceleration and not overly enforcing legislation,” he says.
That leaves the industry with a patchwork of state-level legislation, which makes it difficult for vendors to build systems that can address national and international needs.
“These guidelines often contradict each other or compete with each other, which creates a complex legal and legislative landscape to navigate,” says Scharfe. “Adding to this is that it’s emergent, and there’s not a lot of case law written. There will be landmark cases yet to come.”
The risks, legal and otherwise, vary widely depending on how organizations use the technology and what it is used for. Using AI to assist with billing carries one level of risk; helping with scheduling or supporting technology carries another; and advising a clinician about potential risks to a patient carries another still.
“At the farther end of that spectrum you potentially have FDA regulating software as a medical device,” he explains. “We’re going to see different levels of liability shared between providers, health systems, and vendors within the domain of these different use cases.”
The reality is, Scharfe notes, that patients seek out care because something is wrong. They don’t end up in the hospital because everything is going well.
“You’re going to have a spectrum of outcomes, and liability is no new concept. They’re constantly getting sued. So, when you introduce these systems, even if they don’t have anything to do with patient harm, they will quickly get tangled up,” he says.
Getting ahead of complacency
Enabling providers to get back time in their days offers a strong route toward avoiding these pitfalls, Scharfe explains.
“It’s an opportunity to reinvest their time in real ways,” he says. “It’s important for clinical leadership to continually remind and assert that you don’t want to be complacent; you don’t want to be the person who misses the mistake. Take ownership of that clinical note. Just as you can tell when AI wrote an email or a piece of content versus when a human did, the same goes for the clinical note. Don’t be the clinician who has the same note for every patient.”
Vendors can use this opportunity to build in reminders to clinicians of their responsibility, and to build in processes so that notes are not pushed out autonomously but actually require the provider to approve them.
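As a rough illustration, that approval gate can be modeled as a small state machine in which a note simply cannot reach the record without an explicit sign-off. The sketch below assumes a minimal three-state workflow; the class and status names are illustrative, not any vendor’s actual API.

```python
from enum import Enum, auto

class NoteStatus(Enum):
    DRAFT = auto()      # AI-generated, not yet reviewed
    APPROVED = auto()   # a provider has explicitly signed off
    PUBLISHED = auto()  # written to the patient record

class ClinicalNote:
    def __init__(self, text: str):
        self.text = text
        self.status = NoteStatus.DRAFT
        self.approved_by = None  # provider ID, set only by an explicit approval

    def approve(self, provider_id: str) -> None:
        """Record an explicit provider sign-off; there is no auto-approval path."""
        self.approved_by = provider_id
        self.status = NoteStatus.APPROVED

    def publish(self) -> None:
        """Refuse to push the note to the record unless a provider approved it."""
        if self.status is not NoteStatus.APPROVED:
            raise PermissionError("Note requires explicit provider approval before publishing.")
        self.status = NoteStatus.PUBLISHED
```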
Another step where vendors can improve, Scharfe says, is citations.
“When you have a system that’s making inferences or suggestions or summarizing information, give a link so the provider can check the material, so they don’t have to proactively flip through years of patient history,” he says. “Have that link at the user’s fingertips so they can double-check. AI can make mistakes and draw inaccurate conclusions. This is the clinical judgment we expect clinicians to have.”
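One way a vendor might carry such citations through its data model is to attach provenance to every AI-generated statement. The sketch below uses assumed names and fields rather than any real product’s schema.

```python
from dataclasses import dataclass, field

@dataclass
class Citation:
    source_id: str   # e.g., an encounter or document identifier in the chart
    link: str        # deep link the provider can click to verify the source
    excerpt: str     # the passage the AI relied on

@dataclass
class AiStatement:
    text: str
    citations: list[Citation] = field(default_factory=list)

    def render(self) -> str:
        """Display the AI's claim with its sources at the provider's fingertips."""
        sources = "; ".join(f"{c.source_id} <{c.link}>" for c in self.citations)
        return f"{self.text} (sources: {sources})" if sources else f"{self.text} (no sources)"
```

Rendering the sources inline keeps verification one click away rather than requiring a dig through years of history.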
Despite the challenges the industry faces, Scharfe sees an upward trajectory for how things will play out.
“The novelty of AI is that it’s a big shift, but historically we’ve hung our hats on software being deterministic. You have an input and a determined output. It’s consistent,” says Scharfe. “With AI, we’re working with variable data, and more data, so the inputs are more amorphous and the outputs less deterministic. I think the reality is we need testing and governance, and we need more of a psychological approach to training and behavioral adaptation.”
Matt Phillion is a freelance writer covering healthcare, cybersecurity, and more. He can be reached at matthew.phillion@gmail.com.