98%
921
2 minutes
20
This study addresses the challenges of confounding effects and interpretability in artificial-intelligence-based medical image analysis. Whereas existing literature often resolves confounding by removing confounder-related information from latent representations, this strategy risks affecting image reconstruction quality in generative models, thus limiting their applicability in feature visualization. To tackle this, we propose a different strategy that retains confounder-related information in latent representations while finding an alternative confounder-free representation of the image data. Our approach views the latent space of an autoencoder as a vector space, where imaging-related variables, such as the learning target (t) and confounder (c), have a vector capturing their variability. The confounding problem is addressed by searching a confounder-free vector which is orthogonal to the confounder-related vector but maximally collinear to the target-related vector. To achieve this, we introduce a novel correlation-based loss that not only performs vector searching in the latent space, but also encourages the encoder to generate latent representations linearly correlated with the variables. Subsequently, we interpret the confounder-free representation by sampling and reconstructing images along the confounder-free vector. The efficacy and flexibility of our proposed method are demonstrated across three applications, accommodating multiple confounders and utilizing diverse image modalities. Results affirm the method's effectiveness in reducing confounder influences, preventing wrong or misleading associations, and offering a unique visual interpretation for in-depth investigations by clinical and epidemiological researchers. The code is released in the following GitLab repository: https://gitlab.com/radiology/compopbio/ai_based_association_analysis.
Download full-text PDF |
Source |
---|---|
http://dx.doi.org/10.1016/j.media.2025.103529 | DOI Listing |
PLoS Comput Biol
September 2025
Faculty of Science, Cognitive and Systems Neuroscience Group, Swammerdam Institute for Life Sciences, University of Amsterdam, Amsterdam, the Netherlands.
Predictive coding (PC) proposes that our brains work as an inference machine, generating an internal model of the world and minimizing predictions errors (i.e., differences between external sensory evidence and internal prediction signals).
View Article and Find Full Text PDFIEEE Trans Comput Biol Bioinform
September 2025
The rapid advancement of single-cell sequencing technology has generated vast amounts of multi-omics data, presenting unprecedented opportunities for single-cell multi-omics clustering analysis. However, existing single-cell clustering algorithms focus on extracting shared representations, overlooking the interactions and correlations among cells. This oversight inevitably leads to biased or confounded cell clustering results.
View Article and Find Full Text PDFNat Biomed Eng
September 2025
Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA.
Phenotype-driven approaches identify disease-counteracting compounds by analysing the phenotypic signatures that distinguish diseased from healthy states. Here we introduce PDGrapher, a causally inspired graph neural network model that predicts combinatorial perturbagens (sets of therapeutic targets) capable of reversing disease phenotypes. Unlike methods that learn how perturbations alter phenotypes, PDGrapher solves the inverse problem and predicts the perturbagens needed to achieve a desired response by embedding disease cell states into networks, learning a latent representation of these states, and identifying optimal combinatorial perturbations.
View Article and Find Full Text PDFJ Med Internet Res
September 2025
Artificial Intelligence and Mathematical Modeling Lab, Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada.
Background: The H5N1 avian influenza A virus represents a serious threat to both animal and human health, with the potential to escalate into a global pandemic. Effective monitoring of social media during H5N1 avian influenza outbreaks could potentially offer critical insights to guide public health strategies. Social media platforms like Reddit, with their diverse and region-specific communities, provide a rich source of data that can reveal collective attitudes, concerns, and behavioral trends in real time.
View Article and Find Full Text PDFBrief Bioinform
August 2025
School of Computer Science, Xi'an Polytechnic University, 710048, Xi'an, China.
Cancer, with its inherent heterogeneity, is commonly categorized into distinct subtypes based on unique traits, cellular origins, and molecular markers specific to each type. However, current studies primarily rely on complete multi-omics datasets for predicting cancer subtypes, often overlooking predictive performance in cases where some omics data may be missing and neglecting implicit relationships across multiple layers of omics data integration. This paper introduces Multi-Layer Matrix Factorization (MLMF), a novel approach for cancer subtyping that employs multi-omics data clustering.
View Article and Find Full Text PDF