Interpretable generative deep learning: an illustration with single cell gene expression data.

Hum Genet

Freiburg Center for Data Analysis and Modeling, University of Freiburg, Freiburg, 79104, Germany.

Published: September 2022


Article Abstract

Deep generative models can learn the underlying structure of omics data, such as pathways or gene programs. We provide an introduction to and an overview of such techniques, illustrating their use with single-cell gene expression data. For example, the low-dimensional latent representations offered by approaches such as variational auto-encoders help clarify the relations between observed gene expression and experimental factors or phenotypes. Furthermore, because they specify a generative model for both the latent and the observed variables, deep generative models can generate synthetic observations, which allow us to assess the uncertainty in the learned representations. While deep generative models are useful for learning the structure of high-dimensional omics data by efficiently capturing non-linear dependencies between genes, they are sometimes difficult to interpret because of their neural network building blocks. More precisely, it is difficult to understand how the learned latent variables relate to observed variables, e.g., gene transcript abundances, and to external phenotypes. We therefore also illustrate current approaches for inferring the relationship between learned latent variables and observed variables as well as external phenotypes, thereby rendering deep learning approaches more interpretable. In an application to single-cell gene expression data, we demonstrate the utility of the discussed methods.
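
To make the two ideas in the abstract concrete (generating synthetic observations from latent variables, and relating latent variables back to genes), the following is a minimal sketch and not the method of the article. It assumes a toy decoder that is linear on the log scale instead of a neural network, a Poisson count model, two latent variables, and four hypothetical genes (`gene_a` to `gene_d`) with made-up loadings. It samples latent variables from a standard normal prior, generates synthetic counts, and then correlates each latent dimension with each gene's log-transformed expression as a simple post hoc interpretation step.

```python
import math
import random

# Hypothetical genes and decoder loadings (rows: genes, columns: latent dims).
# These values are illustrative assumptions, not taken from the article.
GENES = ["gene_a", "gene_b", "gene_c", "gene_d"]
LOADINGS = [[1.0, 0.0], [0.8, 0.0], [0.0, 1.0], [0.0, -0.9]]
BASELINE = 1.0  # shared log-rate offset for every gene


def sample_poisson(rate, rng):
    """Draw one Poisson count with Knuth's method (adequate for small rates)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def generate_cells(n_cells, seed=0):
    """Sample latent variables from N(0, I) and decode them into counts."""
    rng = random.Random(seed)
    latents, counts = [], []
    for _ in range(n_cells):
        z = [rng.gauss(0.0, 1.0) for _ in range(2)]
        # Toy decoder: log-rate is linear in z (stand-in for a neural network).
        rates = [
            math.exp(BASELINE + sum(w * zk for w, zk in zip(row, z)))
            for row in LOADINGS
        ]
        latents.append(z)
        counts.append([sample_poisson(rate, rng) for rate in rates])
    return latents, counts


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def latent_gene_correlations(latents, counts):
    """Correlate each latent dimension with log1p expression of each gene."""
    table = {}
    for g, name in enumerate(GENES):
        expression = [math.log1p(cell[g]) for cell in counts]
        table[name] = [
            pearson([z[k] for z in latents], expression) for k in range(2)
        ]
    return table


latents, counts = generate_cells(2000)
correlations = latent_gene_correlations(latents, counts)
for name, (r0, r1) in correlations.items():
    print(f"{name}: corr with z0 = {r0:+.2f}, corr with z1 = {r1:+.2f}")
```

In this toy setting the correlations recover which genes each latent dimension drives, because the decoder is linear by construction. With a neural network decoder the mapping is non-linear, which is the reason dedicated interpretation approaches of the kind the abstract refers to are needed.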

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9360114
DOI: http://dx.doi.org/10.1007/s00439-021-02417-6
