Blinded, Randomized Clinical Trial Suggests AI ‘Near-Ready For Prime Time’
By Deborah Borfitz
May 16, 2023 | In a first-of-its-kind, blinded and randomized clinical trial of artificial intelligence (AI) in cardiology, investigators at Cedars-Sinai Medical Center have shown that an “AI assistant” for assessing heart health works even better than sonographers at reading echocardiograms. The software was integrated with the institution’s picture archiving and communication system that securely stores and digitally transmits electronic images, a common fixture at most hospitals, according to cardiologist David Ouyang, M.D., principal investigator of the study.
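The article does not detail the integration mechanics, but echocardiogram studies on a PACS are typically stored and exchanged as DICOM files. As an illustrative sketch only, assuming DICOM export and the commonly used pydicom library rather than the trial's actual pipeline, this is roughly how a downstream AI service would pull the pixel data it needs:

```python
# Illustrative sketch: reading an echo study exported from PACS as DICOM.
# pydicom is a widely used library for this; the trial's real integration
# is not described in the article, and the file path here is hypothetical.
import pydicom

ds = pydicom.dcmread("echo_study.dcm")     # hypothetical exported study
print(ds.PatientID, ds.Modality)           # standard DICOM metadata fields
frames = ds.pixel_array                    # ultrasound cine loop as an array
print("Frame stack shape:", frames.shape)  # e.g., (n_frames, height, width, 3)
```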
As reported recently in Nature (DOI: 10.1038/s41586-023-05947-3), the study pitted sonographers against AI in evaluating 3,495 transthoracic echocardiogram studies. The big strength of the study, says Ouyang, was that cardiologists were not told which echocardiogram assessments were done by sonographers and which by AI. “We asked the cardiologists to overread and adjust as necessary and evaluated whether AI or sonographer initial interpretations were changed more frequently.”
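To make that design concrete, here is a minimal sketch, with details the article does not specify filled in as labeled assumptions: each study is randomly assigned 1:1 to an AI-first or sonographer-first arm, and the overreading cardiologist receives only the initial assessment, never the arm label.

```python
# Minimal sketch of a blinded, 1:1 randomized reading workflow.
# Study IDs and the record format are assumptions for illustration;
# this is not the trial's actual software.
import random

def randomize_1to1(study_ids, seed=0):
    """Shuffle studies, then split them evenly between the two arms."""
    rng = random.Random(seed)
    ids = list(study_ids)
    rng.shuffle(ids)
    half = len(ids) // 2
    return ({sid: "ai_first" for sid in ids[:half]} |
            {sid: "sonographer_first" for sid in ids[half:]})

arms = randomize_1to1(range(3495))  # 3,495 studies, as in the trial

def blinded_record(study_id, initial_assessment):
    # Blinding: whether the initial assessment came from AI or a
    # sonographer is withheld until after the cardiologist adjudicates.
    return {"study_id": study_id, "initial_assessment": initial_assessment}

record = blinded_record(0, 57.5)  # hypothetical initial LVEF value
```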
Among the findings: cardiologists more frequently agreed with the AI's initial assessment, correcting only 16.8% of the initial assessments made by AI versus 27.2% of those made by sonographers. Additionally, the AI-guided workflow was found to save time for both sonographers and cardiologists.
What this means, Ouyang says, is that “AI is near-ready for prime time.” Much of the AI technology already cleared by the Food and Drug Administration (FDA) has not gone through this degree of rigor, since the agency doesn’t require such products to undergo testing in clinical trials. But the Cedars-Sinai team felt the gold-standard assessment method was the only way to definitively demonstrate that its AI software works in the clinical workflow under physician supervision—that is, semi-autonomously, as is the case with most FDA-approved AI to date.
The same technology, which Cedars-Sinai’s Smidt Heart Institute developed in partnership with Stanford University, was validated against a historical dataset in 2020 (Nature, DOI: 10.1038/s41586-020-2145-8). This is all the validation typically required by the FDA, says Ouyang, “but retrospective review can be biased, cherry-picked, or change over time. Blinding and randomization, which we did for this trial, are foundational elements of clinical trials and rigorous medical science.”
Researchers hope to obtain FDA approval of their AI cardiac function assessment tool, a process that typically takes six to nine months. Thereafter, they would be free to distribute their AI assistant like any other software, Ouyang says.
The team has a patent on the technology, based on a convolutional neural network, but their goal is to make it openly available, he adds. They’re debating the idea of handing out the algorithm free of charge to select institutions for further real-world testing.
Added Precision
“Even with great clinicians, there is still under-diagnosis and delayed diagnosis [of heart problems], and I think that’s an area where we can use AI to help,” says Ouyang, speaking to the clinical importance of the software. That gap is due in part to a significant shortage of skilled sonographers to conduct echocardiograms and help interpret the results.
Echocardiograms measure left ventricular ejection fraction (LVEF), an indicator of how well the heart is pumping blood. Even when the imaging is done by top-notch sonographers, Ouyang explains, they sometimes miss important but subtle changes because they rely on manual, subjective human tracings. The supervised AI system effectively helps physicians “see things more precisely and minimize tedious steps in the workflow.”
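For context, LVEF is a simple ratio: the fraction of the left ventricle's end-diastolic volume ejected with each beat. The arithmetic itself is trivial; what the AI (or the sonographer) actually contributes is the tracing from which the two volumes are estimated. A minimal sketch:

```python
def lvef(edv_ml: float, esv_ml: float) -> float:
    """Left ventricular ejection fraction, as a percentage.

    edv_ml: end-diastolic volume (ventricle at its fullest), in mL
    esv_ml: end-systolic volume (ventricle fully contracted), in mL
    """
    return 100.0 * (edv_ml - esv_ml) / edv_ml

# A normal LVEF is roughly 50-70%; e.g., traced volumes of 120 mL and
# 48 mL yield an LVEF of 60%.
print(lvef(120, 48))  # 60.0
```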
The latest clinical trial should erase any skepticism physicians may have had about whether the AI assistant “truly works and lives up to the hype,” he says. “I think we met that bar.”
Some sonographers may need reassurance that their job is not at risk now or in the future, Ouyang adds. He says they need not worry. “The goal is to streamline parts of the work... as well as make care more efficient.”
Rigorous validation of the software could also lower the barrier to entry for imaging labs interested in doing echocardiograms for clinical trial purposes, says Ouyang. “The Achilles heel of echo is its imprecision, which is why a lot of trials rely on core labs [staffed] with skilled clinicians to overread the studies.”
Ouyang has at least two standard pieces of advice for other hospitals trying to figure out what good AI looks like. The first is to ask how much data was used in its training. While some FDA-approved AI has been trained on as few as 1,000 examples, Cedars-Sinai trained its algorithm on 150,000 different studies.
Second, he suggests asking whether the AI software has ever been evaluated in a blinded, unbiased setting. An unblinded review risks bias: reviewers who know an assessment came from AI, and who like the technology, may be “gentler” in judging whether it did a good job.
In related work, Ouyang has a clinical trial underway looking at AI-guided echocardiographic screening for cardiac amyloidosis, a rare disease. Multiple vendors are also starting to integrate AI into the hardware that is already being used by clinicians, he says. “I think we can expect to see a lot of deployment and implementation in the near future.”