Abstract
Deep-learning-based diagnostic AI systems operating on medical images are beginning to match the performance of human experts. However, these data-hungry, complex systems are inherently black boxes and are therefore adopted only slowly for high-risk applications such as healthcare. This lack of transparency is exacerbated in recent large foundation models, which are trained in a self-supervised manner on millions of data points to generalise robustly across a range of downstream tasks. The embeddings they generate arise through a process that is not interpretable, and hence not easily trusted for clinical applications. To address this timely issue, we deploy conformal analysis to quantify the predictive uncertainty of a vision transformer (ViT)-based foundation model across patient demographics (sex, age, and ethnicity) for the task of skin lesion classification, using several public benchmark datasets. A significant advantage of conformal analysis is that it is model-independent: it provides not only a coverage guarantee at the population level but also an uncertainty score for each individual. We use it to demonstrate the effectiveness of these embeddings for specialised tasks such as diagnostic classification while reducing computational cost. Furthermore, the public benchmark datasets we used exhibit severe class imbalance in the number of samples per class; we therefore applied a model-agnostic dynamic F1-score-based sampling scheme during model training, which helped stabilise the class imbalance, and we investigate the effect of this bias-mitigation step on uncertainty quantification (UQ). Our results show how conformal analysis can serve as a fairness metric to evaluate the robustness of the feature embeddings of the foundation model (Google DermFoundation), advancing the trustworthiness and fairness of clinical AI.
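To illustrate the core idea of the conformal analysis described above, the following is a minimal sketch of split conformal classification: a calibration set is used to derive a score threshold that yields a population-level coverage guarantee, and the resulting per-sample prediction set gives an individual uncertainty score (a larger set means more uncertainty). This is a generic sketch with synthetic softmax outputs, not the paper's implementation; in the paper's setting the probabilities would come from a classifier head on the DermFoundation embeddings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for classifier softmax outputs on a held-out
# calibration set (illustration only, not real lesion data).
n_cal, n_classes = 500, 3
cal_probs = rng.dirichlet(np.ones(n_classes), size=n_cal)
cal_labels = rng.integers(0, n_classes, size=n_cal)

alpha = 0.1  # target miscoverage: aim for >= 90% marginal coverage

# Split conformal score: 1 - softmax probability of the true class.
cal_scores = 1.0 - cal_probs[np.arange(n_cal), cal_labels]

# Finite-sample-corrected quantile of the calibration scores.
q_level = np.ceil((n_cal + 1) * (1 - alpha)) / n_cal
qhat = np.quantile(cal_scores, q_level, method="higher")

def prediction_set(probs):
    """Return all classes whose conformal score is below the threshold."""
    return np.where(1.0 - probs <= qhat)[0]

# For a new test sample, the set size is a per-individual uncertainty score.
test_probs = rng.dirichlet(np.ones(n_classes))
print(prediction_set(test_probs))
```

Under exchangeability of calibration and test data, the set produced this way contains the true class with probability at least 1 - alpha, regardless of the underlying model.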
Citation
Bhattacharyya, S., Pal, U., & Chakraborti, T. (2026). Conformal uncertainty quantification to evaluate predictive fairness of foundation AI model for skin lesion classes across patient demographics. Health Information Science and Systems, 14(1). https://doi.org/10.1007/s13755-025-00412-z