Assistance from an artificial intelligence (AI) algorithm with high diagnostic accuracy improved radiologist performance in detecting lung cancers on chest X-rays and increased human acceptance of AI suggestions, according to a study published in Radiology, a journal of the Radiological Society of North America (RSNA).
While AI-based image diagnosis has advanced rapidly in the medical field, the factors affecting radiologists’ diagnostic determinations in AI-assisted image reading remain underexplored. Researchers at South Korea-based Seoul National University looked at how these factors might influence the detection of malignant lung nodules during AI-assisted reading of chest X-rays.
In this retrospective study, 30 readers, including 20 thoracic radiologists with five to 18 years of experience and 10 radiology residents with two to three years of experience, assessed 120 chest X-rays without AI. Of the 120 chest radiographs assessed, 60 were from lung cancer patients (32 males) and 60 were controls (36 males). Patients had a median age of 67 years. In a second session, the readers reinterpreted the X-rays, assisted by either a high-accuracy or a low-accuracy AI. The readers were blinded to the fact that two different AIs were used.
Use of the high-accuracy AI improved readers' detection performance to a greater extent than the low-accuracy AI. Use of the high-accuracy AI also led to more frequent changes in reader determinations, a concept known as susceptibility.
“It is possible that the relatively large sample size in this study bolstered readers’ confidence in the AI’s suggestions,” says study lead author Chang Min Park, MD, PhD, from the department of radiology and Institute of Radiation Medicine at Seoul National University College of Medicine in Seoul. “We think this issue of human trust in AI is what we observed in the susceptibility in this study: humans are more susceptible to AI when using high diagnostic performance AI.”
Compared with the first reading session, readers assisted by the high diagnostic accuracy AI in the second session showed higher per-lesion sensitivity (0.63 versus 0.53) and higher specificity (0.94 versus 0.88). In contrast, readers assisted by the low diagnostic accuracy AI showed no improvement in either measure between the two reading sessions.
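For readers unfamiliar with these metrics, sensitivity and specificity are simple ratios computed from detection counts. The sketch below shows the arithmetic with illustrative counts chosen to match the reported values; the counts themselves are hypothetical, not the study's actual data:

```python
def sensitivity(true_positives, false_negatives):
    # Sensitivity (per-lesion): fraction of actual lesions the reader detected
    return true_positives / (true_positives + false_negatives)

def specificity(true_negatives, false_positives):
    # Specificity: fraction of lesion-free cases correctly called negative
    return true_negatives / (true_negatives + false_positives)

# Hypothetical counts for illustration only: 63 of 100 lesions detected,
# 94 of 100 normal cases correctly cleared -- yielding 0.63 and 0.94.
print(sensitivity(63, 37))  # 0.63
print(specificity(94, 6))   # 0.94
```

A reader's sensitivity can rise while specificity falls (or vice versa), which is why the study reports both: the high-accuracy AI improved the two together rather than trading one off against the other.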
“Our study suggests that AI can help radiologists, but only when the AI’s diagnostic performance meets or exceeds that of the human reader,” Park says.
The results underline the importance of using high diagnostic performance AI. However, Park notes that the definition of high diagnostic performance AI can vary depending on the task and the clinical context in which it will be used. For example, an AI model that can detect all abnormalities on chest X-rays may seem ideal. But in practice, such a model would have limited value in reducing the workload in a pulmonary tuberculosis mass screening setting.
“Therefore, our study suggests that clinically appropriate use of AI requires both the development of high-performance AI models for given tasks and considerations about the relevant clinical setting to which that AI will be applied,” Park says.
In the future, the researchers want to expand their work on human-AI collaboration to other abnormalities on chest X-rays and CT images.