In an experimental study published in the journal Radiology, researchers at the University of Maryland School of Medicine (UMSOM) report that ChatGPT's answers to questions about breast cancer screening provide correct information the majority of the time; sometimes, though, the information is inaccurate.

The UMSOM researchers explained that in February 2023 they created a set of 25 questions related to advice on getting screened for breast cancer. They submitted each question to ChatGPT three times to see what responses were generated. Three radiologists fellowship-trained in mammography evaluated the responses and found them appropriate for 22 of the 25 questions. The chatbot did, however, provide one answer based on outdated information, and two other questions drew inconsistent responses that varied each time the same question was posed.

“We found ChatGPT answered questions correctly about 88% of the time, which is pretty amazing,” shared study corresponding author Paul Yi, MD, Assistant Professor of Diagnostic Radiology and Nuclear Medicine at UMSOM and Director of the UM Medical Intelligent Imaging Center (UM2ii). “It also has the added benefit of summarizing information into an easily digestible form for consumers to easily understand.”
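The 88% figure follows directly from the tally the researchers reported: 22 appropriate answers out of 25 questions, with one outdated answer and two inconsistent ones making up the rest. A minimal sketch of that arithmetic is below; the category labels and variable names are illustrative and are not taken from the study.

```python
# Tally as reported in the article (25 questions in total).
# The category labels are illustrative, not the study's own terms.
outcomes = {
    "appropriate": 22,    # rated appropriate by the radiologists
    "inappropriate": 1,   # based on outdated information
    "inconsistent": 2,    # responses varied between repeated submissions
}

total_questions = sum(outcomes.values())
appropriate_share = outcomes["appropriate"] / total_questions

print(f"{outcomes['appropriate']} of {total_questions} questions: {appropriate_share:.0%}")
```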

The researchers explained that ChatGPT correctly answered questions about the symptoms of breast cancer and who is at risk, as well as questions about the cost of mammograms and the recommended age and frequency for getting them. However, according to the researchers, its responses are not as comprehensive as what a person would normally find through a Google search.

“ChatGPT provided only one set of recommendations on breast cancer screening, issued from the American Cancer Society, but did not mention differing recommendations put out by the Centers for Disease Control and Prevention (CDC) or the US Preventive Services Task Force (USPSTF),” explained study lead author Hana Haver, MD, a radiology resident at the University of Maryland Medical Center.

In the one response the researchers deemed inappropriate, ChatGPT gave outdated advice on planning a mammogram around COVID-19 vaccination. The advice to delay a mammogram for four to six weeks after getting a COVID-19 shot was changed in February 2022, and the CDC endorses the USPSTF guidelines, which don’t recommend waiting. The researchers also found that ChatGPT gave inconsistent responses to questions about an individual’s personal risk of getting breast cancer and about where someone could get a mammogram.

“We’ve seen in our experience that ChatGPT sometimes makes up fake journal articles or health consortiums to support its claims,” says Yi. “Consumers should be aware that these are new, unproven technologies, and should still rely on their doctor, rather than ChatGPT, for advice.”

Yi and his colleagues are reportedly analyzing how ChatGPT fares on lung cancer screening recommendations. They are also identifying ways to make ChatGPT’s recommendations more accurate and complete, as well as understandable to those without a high level of education.

“With the rapid evolution of ChatGPT and other large language models, we have a responsibility as a medical community to evaluate these technologies and protect our patients from potential harm that may come from incorrect screening recommendations or outdated preventive health strategies,” stated Mark T. Gladwin, MD, Dean, UMSOM, Vice President for Medical Affairs, University of Maryland, and John Z. and Akiko K. Bowers Distinguished Professor.