Artificial intelligence (AI) can help get crucial medical screening into the hands of people across the world. University of Pittsburgh engineering researcher Jingtong Hu, PhD, is working to make sure the screening is effective and fair, no matter whose hand is holding it.

AI has been deployed across a broad range of health applications, such as detecting skin cancer, recognizing emotions, monitoring vital signs, and other medical imaging and diagnostic tasks. However, neural networks are only as good as the datasets on which they are trained, and minority groups are generally underrepresented in these datasets, which leads to a particularly insidious form of technological inequality.

Hu and his team at the University of Pittsburgh Swanson School of Engineering are building a distributed, inclusive data collection and learning framework that relies on smartphone apps, making it easy to participate in while protecting user privacy. The National Institutes of Health (NIH) has recently awarded Hu $1,744,696 for this work.

“Existing and easily accessible data sets are inherently biased. It’s not always easy for people in marginalized communities to participate in data collection and research, and these communities might also lack medical professionals,” says Hu. “AI could make critical healthcare more accessible for these communities; but without a dataset that accurately reflects the diversity of the population, AI could misdiagnose people that are under-represented during the data collection stage, thereby increasing healthcare disparities.”

Hu’s project would help prevent these disparities by developing an on-device learning framework that continuously learns from new users’ data as they use a mobile application. It will take advantage of federated learning (FL), in which multiple devices collaboratively train a shared model while the data stays on the devices. In FL, the models, rather than the user data, are shared with the cloud, which protects user privacy.
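
For readers curious how federated learning works in practice, the idea can be sketched in a few lines of Python. This is a minimal illustration of federated averaging, a common FL scheme, and not the team's actual framework: the tiny linear model, the function names (`local_update`, `federated_average`, `run_rounds`), and the learning-rate and round settings are all assumptions made for the example. Each simulated device trains on its own data, and only the resulting model weights are sent back to be combined.

```python
from typing import List, Tuple

Weights = List[float]            # [intercept, slope] of a toy linear model
Dataset = List[Tuple[float, float]]  # (x, y) pairs that never leave the device


def local_update(weights: Weights, data: Dataset,
                 lr: float = 0.1, epochs: int = 5) -> Weights:
    """Hypothetical on-device step: a few epochs of gradient descent
    fitting y ~ w0 + w1 * x on this device's private data."""
    w0, w1 = weights
    n = len(data)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for x, y in data:
            err = (w0 + w1 * x) - y
            g0 += err
            g1 += err * x
        w0 -= lr * g0 / n
        w1 -= lr * g1 / n
    return [w0, w1]


def federated_average(client_weights: List[Weights],
                      client_sizes: List[int]) -> Weights:
    """Server step: average the clients' weights, weighted by how many
    examples each client trained on. Only weights are seen here."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def run_rounds(clients: List[Dataset], rounds: int = 50) -> Weights:
    """Repeat: send the global model out, train locally, average the results."""
    global_w: Weights = [0.0, 0.0]
    sizes = [len(data) for data in clients]
    for _ in range(rounds):
        updates = [local_update(global_w, data) for data in clients]
        global_w = federated_average(updates, sizes)
    return global_w


if __name__ == "__main__":
    # Two simulated phones, each holding private samples of y = 2x + 1.
    phones = [[(0.0, 1.0), (1.0, 3.0)],
              [(2.0, 5.0), (3.0, 7.0), (4.0, 9.0)]]
    print(run_rounds(phones, rounds=300))
```

In this sketch the server only ever receives the two numbers that make up each device's model, never the underlying samples, which is the privacy property the paragraph above describes.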

“By using this method, not only can we improve the global model to be fairer by incorporating more equally represented data, but we can personalize the model for everyone. After all, the most important metrics for each user is the accuracy for him or herself,” says Hu. “A user could use our app to diagnose their skin condition, for example, to see if a skin issue is skin cancer or just normal eczema. Meanwhile, our algorithm will learn from the new images locally. Patients’ images will not be uploaded to the server; they will be analyzed on their own cell phones.”

Unlike existing frameworks, this framework would rely on unsupervised learning with data coming from a variety of smartphone models and other devices, allowing more people to participate in the study. The framework would also have to consider the fairness of different machine learning models. The team will develop a machine learning framework that automatically searches existing learning models and selects the architectures best suited to diverse datasets.
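
One simple way to picture a fairness-aware choice among models is to compare candidates by how well they serve their worst-served group rather than by overall accuracy alone. The Python sketch below illustrates that idea under stated assumptions: the worst-group-accuracy criterion, the function names (`group_accuracies`, `select_fairest`), and the toy predictions are all hypothetical and are not drawn from the team's planned search method.

```python
from typing import Dict, List, Tuple


def group_accuracies(preds: List[int], labels: List[int],
                     groups: List[str]) -> Dict[str, float]:
    """Accuracy computed separately for each demographic group."""
    correct: Dict[str, int] = {}
    total: Dict[str, int] = {}
    for p, y, g in zip(preds, labels, groups):
        total[g] = total.get(g, 0) + 1
        correct[g] = correct.get(g, 0) + (1 if p == y else 0)
    return {g: correct[g] / total[g] for g in total}


def select_fairest(candidates: Dict[str, List[int]], labels: List[int],
                   groups: List[str]) -> str:
    """Pick the candidate with the highest worst-group accuracy,
    breaking ties by overall accuracy (an assumed criterion)."""
    def score(name: str) -> Tuple[float, float]:
        preds = candidates[name]
        per_group = group_accuracies(preds, labels, groups)
        overall = sum(p == y for p, y in zip(preds, labels)) / len(labels)
        return (min(per_group.values()), overall)

    return max(candidates, key=score)


if __name__ == "__main__":
    # Toy validation set: group "a" is well represented, group "b" is not.
    labels = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
    groups = ["a"] * 8 + ["b"] * 2
    candidates = {
        # Perfect on group "a", wrong on every group "b" case.
        "model_x": [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
        # Slightly worse on group "a", correct on group "b".
        "model_y": [1, 1, 1, 1, 1, 0, 0, 0, 0, 0],
    }
    print(select_fairest(candidates, labels, groups))
```

In the toy data, the first candidate has the higher overall accuracy yet fails every case in the smaller group, so a selection rule based only on overall accuracy would hide exactly the kind of disparity the project aims to prevent.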