In an analysis of more than 32,000 CT pulmonary angiograms, the AI triage tool matched radiologist interpretations in 97.8% of cases, while discordant findings highlighted the continued importance of radiologist oversight.


An artificial intelligence (AI) algorithm for pulmonary embolism (PE) detection demonstrated high concordance with radiologist interpretations in a large, real-world study published in Radiology: Artificial Intelligence.

Researchers from Northwell Health analyzed 32,501 CT pulmonary angiograms performed across their integrated healthcare system over an 18‑month period. Overall agreement between the AI tool (AIDOC, Tel Aviv, Israel) and radiologist interpretations was 97.8%. Concordance was higher for negative exams than for positive exams (98.18% vs 93.75%), underscoring the algorithm’s strength in helping to rule out PE.

Importantly, the findings also highlight the continued central role of radiologists. Among confirmed PE cases, 15% were correctly identified by radiologists but missed by the AI tool, demonstrating the diagnostic value of human-in-the-loop physician oversight of image interpretation after AI deployment.

Pulmonary embolism is a life‑threatening cardiovascular condition responsible for 5–10% of in-hospital deaths and more than 300,000 deaths annually in the United States. While several FDA-cleared AI tools are available for PE detection, few large-scale studies have evaluated their performance with a human-in-the-loop adjudication model in real-world clinical practice.

Real-World Use

Within the Northwell Health system, the AI algorithm analyzes CT pulmonary angiograms and flags suspected positive cases to assist radiologists with triage. In this study, researchers compared AI outputs with radiologist interpretations and examined cases where the two disagreed.

All discordant cases were independently reviewed and adjudicated by expert thoracic radiologists. Adjudication analysis revealed that radiologists were correct in 88.7% of disagreements (638 cases), while AI was correct in 11.3%.

Among discordant cases:

  • In 25% of discordant cases, the AI result was PE positive and the radiologist’s was PE negative; the radiologist’s negative result was correct in 85.6% of these.
  • In 75% of discordant cases, the AI result was PE negative and the radiologist’s was PE positive; the radiologist’s positive result was correct in 89.8% of these.
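
The discordant-case figures above can be cross-checked with a little arithmetic. The following is a rough sketch, not part of the study: it starts from the reported values (32,501 exams, 638 radiologist-correct disagreements, and the rounded percentages) and back-calculates approximate counts. The function name and every derived count are assumptions for illustration, and because the published percentages are rounded, the results may be off by a case or two from the study's actual tallies.

```python
# Back-of-the-envelope reconstruction of the discordant-case counts.
# Inputs are figures reported in the article; every derived count is an
# ESTIMATE inferred from rounded percentages, not a number stated in the study.

TOTAL_EXAMS = 32_501
RADIOLOGIST_CORRECT = 638           # reported: 88.7% of disagreements
RADIOLOGIST_CORRECT_SHARE = 0.887   # reported (rounded)


def estimate_discordant_breakdown():
    """Return approximate counts implied by the reported percentages."""
    # Total discordant exams implied by "638 cases = 88.7%".
    discordant = round(RADIOLOGIST_CORRECT / RADIOLOGIST_CORRECT_SHARE)
    ai_correct = discordant - RADIOLOGIST_CORRECT

    # Overall agreement implied by that discordant total.
    agreement_pct = 100 * (1 - discordant / TOTAL_EXAMS)

    # Split by direction of disagreement (reported as roughly 25% / 75%).
    ai_pos_rad_neg = round(0.25 * discordant)
    ai_neg_rad_pos = discordant - ai_pos_rad_neg

    # Radiologist-correct cases within each direction (85.6% and 89.8%).
    rad_correct_when_ai_pos = round(0.856 * ai_pos_rad_neg)
    rad_correct_when_ai_neg = round(0.898 * ai_neg_rad_pos)

    return {
        "discordant": discordant,
        "ai_correct": ai_correct,
        "agreement_pct": round(agreement_pct, 1),
        "ai_pos_rad_neg": ai_pos_rad_neg,
        "ai_neg_rad_pos": ai_neg_rad_pos,
        "rad_correct_total": rad_correct_when_ai_pos + rad_correct_when_ai_neg,
    }


if __name__ == "__main__":
    for name, value in estimate_discordant_breakdown().items():
        print(f"{name}: {value}")
```

Under these assumptions, the implied overall agreement rounds to the reported 97.8%, and the two directional subtotals add back up to roughly the 638 radiologist-correct cases, which suggests the published figures are internally consistent.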

“AI-informed radiologists achieved a sensitivity of 99.2% for pulmonary embolism detection. Radiologist-AI agreement was highest for acute and central emboli—the cases associated with the greatest clinical urgency and mortality risk. This suggests the algorithm is most reliable in precisely the clinical scenarios where triage has the greatest potential to impact patient outcomes,” says Shlomit Goldberg-Stein, MD, FACR, professor of radiology and director of AI at the Zucker School of Medicine at Hofstra/Northwell, in a release.  “Ultimately, these results demonstrate that the combined expertise of radiologists with AI offers the best potential to improve PE identification in clinical practice.”

Importance of Radiologist Oversight

While agreement between AI and radiologists was high, Matthew Barish, MD, FACR, professor of radiology and vice chair of radiology informatics at the Zucker School of Medicine at Hofstra/Northwell and CMIO of enterprise imaging at Northwell Health, says that radiologist oversight remained necessary.

“Of all positive cases, 2,733 (>85%) were initially identified by AI and confirmed by radiologists showing the value of AI triage. However, 483 cases (15%) were detected only with radiologist involvement demonstrating the importance of subsequent radiologist review when AI was negative,” says Barish in a release. “Interestingly, in 26 cases AI correctly identified the PE, but this was incorrectly rejected by the initial radiologist but subsequently confirmed by the adjudicator. These findings show the value of AI-triage while also demonstrating the continued role of the radiologist in the clinical pathway.”
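
The 99.2% sensitivity cited earlier can be approximately reproduced from the three counts Barish mentions. This is a hedged sketch rather than the study's method: it assumes those three groups together make up all adjudicator-confirmed PE cases (the article does not say so explicitly), and the function and variable names are made up for illustration.

```python
# Rough sensitivity check from the counts quoted in the article.
# ASSUMPTION: the three groups below cover every confirmed PE case;
# the article does not state the study's exact denominator.

CONFIRMED_BY_AI_AND_RADIOLOGIST = 2_733  # AI flagged, radiologist confirmed
RADIOLOGIST_ONLY = 483                   # AI negative, radiologist positive
AI_ONLY = 26                             # AI positive, radiologist rejected,
                                         # adjudicator later confirmed PE


def ai_informed_radiologist_sensitivity():
    """Share of assumed confirmed PE cases that radiologists called positive."""
    caught = CONFIRMED_BY_AI_AND_RADIOLOGIST + RADIOLOGIST_ONLY
    total_confirmed = caught + AI_ONLY
    return caught / total_confirmed


def radiologist_only_share():
    """Share of assumed confirmed PE cases found only through radiologist review."""
    total_confirmed = (
        CONFIRMED_BY_AI_AND_RADIOLOGIST + RADIOLOGIST_ONLY + AI_ONLY
    )
    return RADIOLOGIST_ONLY / total_confirmed


if __name__ == "__main__":
    print(f"AI-informed radiologist sensitivity: "
          f"{100 * ai_informed_radiologist_sensitivity():.1f}%")
    print(f"Radiologist-only share of PE cases: "
          f"{100 * radiologist_only_share():.1f}%")
```

The 26 AI-only cases are the ones that keep the radiologists' figure just under 100%, which is the point behind Barish's remark that both AI triage and radiologist review contribute.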

Pina C. Sanelli, MD, MPH, FACR, professor of radiology and vice chair of research at the Zucker School of Medicine at Hofstra/Northwell and director of the Harvey L. Neiman Health Policy Institute Policy Research and IMaging Effectiveness (PRIME) Center, adds in a release, “This large-scale evaluation demonstrates that AI achieves high agreement with radiologists in real-world clinical practice, extending beyond controlled investigational settings. However, some diagnoses would have been missed if hospitals had relied on either AI or radiologists alone, underscoring that the best outcomes are achieved when AI and radiologists work together using human-in-the-loop processes. 

“Our structured quality oversight process, including expert adjudication of discrepant cases, ensured diagnostic accuracy during deployment and extending into the post-deployment phase. We strongly encourage other practices to adopt similar safeguards to gain trust and experience with AI.”
