Generative models such as DALL-E 2 could represent a promising future tool for image generation, augmentation, and manipulation in healthcare, study shows.
A new paper published in the Journal of Medical Internet Research describes how generative models such as DALL-E 2, a novel deep learning model for text-to-image generation, could represent a promising future tool for image generation, augmentation, and manipulation in healthcare.
Do generative models have sufficient medical domain knowledge to provide accurate and useful results? Physician Lisa C. Adams, MD, and colleagues explore this topic in their latest viewpoint titled “What Does DALL-E 2 Know About Radiology?”
First introduced by OpenAI in April 2022, DALL-E 2 is an artificial intelligence (AI) tool that has gained popularity for generating novel photorealistic images or artwork based on textual input. DALL-E 2’s generative capabilities are powerful, as it has been trained on billions of existing text-image pairs from the internet. To understand whether these capabilities can be transferred to the medical domain to create or augment data, researchers from Germany and the United States examined DALL-E 2’s radiological knowledge in creating and manipulating x-ray, CT, MRI, and ultrasound images.
The study’s authors found that DALL-E 2 has learned relevant representations of x-ray images and shows promising potential for text-to-image generation. Specifically, DALL-E 2 was able to create realistic x-ray images based on short text prompts, but it did not perform very well when given specific CT, MRI, or ultrasound image prompts. It was also able to reasonably reconstruct missing aspects within a radiological image.
It could also do much more: for example, it could create a complete, full-body radiograph using only one image of the knee as a starting point. However, DALL-E 2 was limited in its ability to generate images with pathological abnormalities.
Synthetic data generated by DALL-E 2 could greatly accelerate the development of new deep learning tools for radiology, as well as address privacy concerns related to data sharing between institutions. The study’s authors note that generated images should be subjected to quality control by domain experts to reduce the risk of incorrect information entering a generated data set.
They also emphasize the need for further research to fine-tune these models on medical data and incorporate medical terminology, in order to create powerful models for data generation and augmentation in radiology research. Although DALL-E 2 is not publicly available for fine-tuning, other generative models such as Stable Diffusion are, and these could be adapted to generate a variety of medical images.
Overall, this viewpoint published by JMIR Publications provides a promising outlook for the future of AI image generation in radiology. Further research and development in this area could lead to exciting new tools for radiologists and medical professionals.
While there are limitations to be addressed, the potential benefits of using tools like DALL-E 2 and ChatGPT in research and medical training and education are significant. To this end, JMIR Medical Education is now inviting submissions for a new e-collection on the use of generative language models in medical education, as announced in a recent editorial by Gunther Eysenbach, MD, MPH, FACMI.