PathIntegrate: Multivariate modelling approaches for pathway-based multi-omics data integration.

PLoS Comput Biol

Section of Bioinformatics, Division of Systems Medicine, Department of Metabolism, Digestion, and Reproduction, Faculty of Medicine, Imperial College London, London, United Kingdom.

Published: March 2024


Citations

20

Article Abstract

As terabytes of multi-omics data are being generated, there is an ever-increasing need for methods facilitating the integration and interpretation of such data. Current multi-omics integration methods typically output lists, clusters, or subnetworks of molecules related to an outcome. Even with expert domain knowledge, discerning the biological processes involved is a time-consuming activity. Here we propose PathIntegrate, a method for integrating multi-omics datasets based on pathways, designed to exploit knowledge of biological systems and thus provide interpretable models for such studies. PathIntegrate employs single-sample pathway analysis to transform multi-omics datasets from the molecular level to the pathway level, and applies a predictive single-view or multi-view model to integrate the data. Model outputs include multi-omics pathways ranked by their contribution to the outcome prediction, the contribution of each omics layer, and the importance of each molecule in a pathway. Using semi-synthetic data, we demonstrate the benefit of grouping molecules into pathways to detect signals in low signal-to-noise scenarios, as well as the ability of PathIntegrate to precisely identify important pathways at low effect sizes. Finally, using COPD and COVID-19 data, we showcase how PathIntegrate enables convenient integration and interpretation of complex high-dimensional multi-omics datasets. PathIntegrate is available as an open-source Python package.
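
The molecular-to-pathway transformation described in the abstract can be illustrated with a minimal sketch. This is not the PathIntegrate package API: the function names, the toy data, and the pathway definitions below are all hypothetical. It assumes one simple form of single-sample pathway scoring (the mean z-score of a pathway's measured members) and replaces the predictive model with a plain case-versus-control difference in mean pathway score, purely to show how pathways could be ranked.

```python
from statistics import mean, pstdev

def zscore_columns(matrix):
    """Standardise each molecule (column) across samples."""
    scaled_cols = []
    for col in zip(*matrix):
        mu, sd = mean(col), pstdev(col)
        scaled_cols.append([(v - mu) / sd if sd > 0 else 0.0 for v in col])
    return [list(row) for row in zip(*scaled_cols)]

def pathway_scores(matrix, molecule_names, pathways):
    """Collapse a samples x molecules matrix into samples x pathways.

    Each pathway score is the mean z-score of its measured members
    (an assumed scoring rule, chosen for simplicity).
    """
    index = {name: i for i, name in enumerate(molecule_names)}
    z = zscore_columns(matrix)
    names, member_idx = [], []
    for pathway, members in pathways.items():
        idx = [index[m] for m in members if m in index]
        if idx:  # skip pathways with no measured molecules
            names.append(pathway)
            member_idx.append(idx)
    scores = [[mean([row[i] for i in idx]) for idx in member_idx] for row in z]
    return names, scores

def rank_pathways(names, scores, outcome):
    """Rank pathways by absolute case/control difference in mean score.

    A stand-in for model-based importance, not a predictive model.
    """
    ranked = []
    for j, name in enumerate(names):
        case = [row[j] for row, y in zip(scores, outcome) if y == 1]
        ctrl = [row[j] for row, y in zip(scores, outcome) if y == 0]
        ranked.append((name, abs(mean(case) - mean(ctrl))))
    return sorted(ranked, key=lambda item: item[1], reverse=True)

# Hypothetical toy data: genes and metabolites concatenated into one matrix.
molecules = ["HK1", "PFKM", "glucose", "CS", "citrate"]
pathways = {
    "Glycolysis": ["HK1", "PFKM", "glucose"],
    "TCA cycle": ["CS", "citrate"],
}
X = [
    [5.0, 6.0, 7.0, 1.0, 2.0],  # case
    [6.0, 7.0, 8.0, 2.0, 1.0],  # case
    [1.0, 2.0, 3.0, 1.0, 2.0],  # control
    [2.0, 1.0, 2.0, 2.0, 1.0],  # control
]
y = [1, 1, 0, 0]

names, scores = pathway_scores(X, molecules, pathways)
ranked = rank_pathways(names, scores, y)
print(ranked)
```

In this toy example the members of the glycolysis pathway come from two omics layers (genes and a metabolite), so a single pathway score summarises both. A fitted single-view or multi-view model, as the abstract describes, would take the place of the simple group difference used here.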


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10994553
DOI: http://dx.doi.org/10.1371/journal.pcbi.1011814


Similar Publications

Pseudoautosomal regions (PARs), located at the ends of sex chromosomes, harbor genes that may play a role in tumor pathology by regulating cell proliferation and the immune microenvironment. Gastric cancer (GC) is a prevalent and molecularly heterogeneous malignancy of the digestive system. However, studies on the role of PARs-related genes in GC are limited.


Lung cancer is the most common cause of cancer-related death worldwide. Recent advancements in targeted therapies and immunotherapies have achieved remarkable success. However, responses to treatment vary substantially among patients with lung cancer.


Summary: Spatial omics is a young and evolving field and as such shows rapid development of novel technologies and analysis methods to measure transcripts, proteins, metabolites, and post-translational modifications at high spatial resolution. These advances in technology have enabled the simultaneous generation of abundance profiles for multiple different omics types and associated microscopy imaging data, as well as their analysis in a spatial context. However, most analytical tools are designed for spatial transcriptomics platforms and are challenging to use in other contexts such as mass spectrometry-based measurements or metagenomics.


Background: Bladder cancer (BLCA) is a prevalent malignancy with substantial consequences for patient health. This study aimed to elucidate the underlying mechanisms of BLCA through integrated multi-omics analysis.

Methods: Tumor and adjacent tissues from BLCA patients underwent transcriptomic, whole-exome sequencing, metabolomic, and intratumoral microbiome analyses.


Introduction: Alzheimer's disease (AD) lacks effective biomarkers and disease-modifying therapies. This study explored transcriptomic dysregulation, immune-metabolic crosstalk, and drug repurposing opportunities in AD.

Methods: Transcriptomic datasets (GSE109887, GSE5281) were harmonized using batch correction.
