Drop the shortcuts: image augmentation improves fairness and decreases AI detection of race and other demographics from medical images.

EBioMedicine

Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, MA, USA; Division of Pulmonary Critical Care and Sleep Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA; Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA.

Published: April 2024


Article Abstract

Background: AI models have been shown to learn race from medical images, leading to algorithmic bias. Our aim in this study was to enhance the fairness of medical image models by eliminating bias related to race, age, and sex. We hypothesise that models may be learning demographics via shortcut learning, and we combat this using image augmentation.

Methods: This study included 44,953 patients who identified as Asian, Black, or White (mean age 60.68 years ±18.21; 23,499 women), for a total of 194,359 chest X-rays (CXRs) from the MIMIC-CXR database. CheXpert images from 45,095 patients (mean age 63.10 years ±18.14; 20,437 women), for a total of 134,300 CXRs, were used for external validation. We also collected 1195 3D brain magnetic resonance imaging (MRI) scans from the ADNI database, from 273 participants (mean age 76.97 years ±14.22; 142 women). Deep learning (DL) models were trained on either non-augmented or augmented images and assessed using disparity metrics. The features learned by the models were analysed using task transfer experiments and model visualisation techniques.

Findings: In the detection of radiological findings, training a model on augmented CXR images reduced disparities in error rate among racial groups (-5.45%), age groups (-13.94%), and sex groups (-22.22%). For Alzheimer's disease (AD) detection, the model trained on augmented MRI images showed 53.11% and 31.01% reductions in error-rate disparities among age and sex groups, respectively. Image augmentation reduced the model's ability to identify demographic attributes and resulted in the model trained for clinical purposes incorporating fewer demographic features.
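As a rough illustration of how a change in error-rate disparity of this kind can be computed, the sketch below measures per-group error rates for two sets of predictions and reports the relative change in the gap between groups. The abstract does not define the disparity metric used in the study, so the max-minus-min gap here is an assumption, and the function names and toy labels are hypothetical and are not data from the paper.

```python
from collections import defaultdict


def error_rates_by_group(y_true, y_pred, groups):
    """Return {group: fraction of samples in that group that were misclassified}."""
    errors = defaultdict(int)
    counts = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        counts[group] += 1
        errors[group] += int(truth != pred)
    return {group: errors[group] / counts[group] for group in counts}


def disparity(rates):
    """Assumed disparity measure: gap between the highest and lowest group error rate."""
    return max(rates.values()) - min(rates.values())


def relative_change(baseline, augmented):
    """Percent change in disparity relative to the baseline (negative means a reduction)."""
    if baseline == 0:
        raise ValueError("baseline disparity is zero; relative change is undefined")
    return 100.0 * (augmented - baseline) / baseline


# Toy labels and predictions, invented purely for illustration.
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
pred_baseline = [1, 0, 1, 0, 0, 1, 1, 0]   # model trained without augmentation
pred_augmented = [1, 0, 1, 1, 0, 1, 1, 0]  # model trained with augmentation

rates_baseline = error_rates_by_group(y_true, pred_baseline, groups)
rates_augmented = error_rates_by_group(y_true, pred_augmented, groups)
change = relative_change(disparity(rates_baseline), disparity(rates_augmented))
print(f"relative change in disparity: {change:.2f}%")
```

A negative value corresponds to a smaller gap between groups after augmentation. In this toy data the gap narrows only because the error rate of group A rises, which shows that a disparity figure should be read alongside the per-group error rates themselves.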

Interpretation: The model trained using the augmented images was less likely to be influenced by demographic information in detecting image labels. These results demonstrate that the proposed augmentation scheme could enhance the fairness of interpretations by DL models when dealing with data from patients with different demographic backgrounds.

Funding: National Science and Technology Council (Taiwan), National Institutes of Health.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10945176
DOI: http://dx.doi.org/10.1016/j.ebiom.2024.105047

Similar Publications

Systematic analyses uncover plasma proteins linked to incident cardiovascular diseases.

Protein Cell

August 2025

Department of Neurology and National Center for Neurological Disorders, Huashan Hospital, State Key Laboratory of Medical Neurobiology and MOE Frontiers Center for Brain Science, Fudan University, Shanghai 200433, China.

Cardiovascular disease (CVD) research is hindered by limited comprehensive analyses of plasma proteome across disease subtypes. Here, we systematically investigated the associations between plasma proteins and cardiovascular outcomes in 53,026 UK Biobank participants over a 14-year follow-up. Association analyses identified 3,089 significant associations involving 892 unique protein analytes across 13 CVD outcomes.

Background: Poststroke cognitive impairment (PSCI) affects 30% to 50% of stroke survivors, severely impacting functional outcomes and quality of life. This study uses functional near-infrared spectroscopy (fNIRS) to assess task-evoked brain activation and its potential for stratifying the severity in patients with PSCI.

Method: A cross-sectional study was conducted at Nanchong Central Hospital between June 2023 and April 2024.

A robust deep learning-driven framework for detecting Parkinson's disease using EEG.

Comput Methods Biomech Biomed Engin

September 2025

Institute of Radio Physics and Electronics, University of Calcutta, Kolkata, India.

Parkinson's disease (PD) is a neurodegenerative condition that impairs motor functions. Accurate and early diagnosis is essential for enhancing well-being and ensuring effective treatment. This study proposes a deep learning-based approach for PD detection using EEG signals.

Introduction: Vision language models (VLMs) combine image analysis capabilities with large language models (LLMs). Because of their multimodal capabilities, VLMs offer a clinical advantage over image classification models for the diagnosis of optic disc swelling by allowing a consideration of clinical context. In this study, we compare the performance of non-specialty-trained VLMs with different prompts in the classification of optic disc swelling on fundus photographs.
