AI model outperformed expert readers in study published in Nature Medicine
An artificial intelligence model for computer-aided reading of mammograms may improve the detection of breast cancer, according to a study co-authored by UMass Medical School breast imaging expert Gopal Vijayaraghavan, MD, MPH, and published Jan. 11 in the journal Nature Medicine.
“Mammograms are currently the best screening tool to detect breast cancer early, but reading and interpreting them is a visually challenging task, error-prone for even experienced radiologists,” said Dr. Vijayaraghavan, associate professor of radiology, who co-authored the retrospective study with lead author Bill Lotter, PhD, chief technology officer and co-founder of DeepHealth. “We want to improve the health of women in Massachusetts with reliable tools that assist clinicians.”
The study compared the performance of five fellowship-trained radiologists and the deep-learning AI model developed by DeepHealth. The AI model uses a complex pattern recognition algorithm to detect and classify areas of concern.
The retrospective analysis was conducted on screening mammograms, known as index exams, in which cancer was identified in 131 patients. Of these patients, 120 had a prior mammogram within the past two years in which cancer was not identified, known as a preindex exam. Readings of these exams were compared with readings of 154 age- and density-matched confirmed negative screenings conducted during the same period. All exams were for patients at UMass Memorial Medical Center, where Vijayaraghavan is chief of the Division of Breast Imaging.
Results of the 406 index, preindex and confirmed negative mammogram readings were tabulated and analyzed for sensitivity and specificity. The deep-learning algorithm outperformed the expert readers on both the index cases and the preindex examinations, with a 17.5 percent increase in sensitivity and a 16.2 percent increase in specificity. The deep-learning model also performed better than the earlier AI models that were tested.
Sensitivity is the ability of a test to correctly identify patients with the disease, and specificity is the ability of a test to correctly identify people without the disease. A highly sensitive test means that there are few false negative results, meaning fewer missed cases. A highly specific test means that there are few false positives.
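For readers who want to see how these two measures are calculated, here is a minimal sketch in Python. The function names and the counts are hypothetical and chosen only for illustration; they are not data from the study.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Share of patients with the disease that the test correctly flags."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives: int, false_positives: int) -> float:
    """Share of people without the disease that the test correctly clears."""
    return true_negatives / (true_negatives + false_positives)


# Hypothetical counts for illustration only (not results from the study):
# 100 exams with cancer, of which 90 were flagged and 10 were missed;
# 100 exams without cancer, of which 80 were cleared and 20 were flagged.
example_sensitivity = sensitivity(true_positives=90, false_negatives=10)
example_specificity = specificity(true_negatives=80, false_positives=20)

print(f"Sensitivity: {example_sensitivity:.1%}")
print(f"Specificity: {example_specificity:.1%}")
```

In this made-up example, the 10 missed cancers are the false negatives that lower sensitivity, and the 20 healthy exams flagged for follow-up are the false positives that lower specificity.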
“Our results provide evidence that AI can aid in earlier breast cancer detection. Importantly, the AI algorithms we evaluated were not previously trained on data from sites used in the study, demonstrating an ability to generalize to new clinics,” said Dr. Lotter. “Such generalization is a common challenge in AI that is essential for real-world utility.”
These promising results are foundational for a new grant awarded to Vijayaraghavan by the Massachusetts Life Sciences Center Women’s Health Capital Call to further study the efficacy of AI in screening mammograms.
“The retrospective study showed the potential for AI,” he said. “We plan to refine those tools and apply them in a prospective manner to study the true value of AI in screening mammograms to help us detect more cancers, detect them earlier, lower recall rates for inconclusive exams, avoid unnecessary biopsies, reduce women’s anxiety, and improve provider efficiency with increased throughput and shorter reading times.”