Citations: 20

Article Abstract

Background: Predicting protein-protein interactions (PPIs) from sequence data is a key challenge in computational biology. While various computational methods have been proposed, the utilization of sequence embeddings from protein language models, which contain diverse information, including structural, evolutionary, and functional aspects, has not been fully exploited. Additionally, there is a significant need for a comprehensive neural network capable of efficiently extracting these multifaceted representations.

Results: Addressing this gap, we propose xCAPT5, a novel hybrid classifier that uses the T5-XL-UniRef50 protein large language model to generate rich amino acid embeddings from protein sequences. The core of xCAPT5 is a multi-kernel deep convolutional siamese neural network, which captures intricate interaction features at both micro and macro levels and is integrated with the XGBoost algorithm to enhance PPI classification performance. By concatenating max and average pooling features in a depth-wise manner, xCAPT5 learns crucial features at low computational cost.
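The pooling step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that pooling is taken globally over sequence positions and that the max and average vectors are concatenated along the channel axis, and it applies the same function to both proteins to mimic the shared-weight siamese branches. The function names and toy inputs are hypothetical, and the convolutional layers and the XGBoost stage are omitted.

```python
from typing import List

# One row per residue, one column per feature channel (toy stand-in for
# the feature maps a convolutional branch would produce).
Matrix = List[List[float]]


def global_max_pool(x: Matrix) -> List[float]:
    """Maximum of each channel over all sequence positions."""
    return [max(col) for col in zip(*x)]


def global_avg_pool(x: Matrix) -> List[float]:
    """Mean of each channel over all sequence positions."""
    n = len(x)
    return [sum(col) / n for col in zip(*x)]


def pooled_features(x: Matrix) -> List[float]:
    """Concatenate max-pooled and average-pooled channels (assumed layout)."""
    return global_max_pool(x) + global_avg_pool(x)


def siamese_pair_features(a: Matrix, b: Matrix) -> List[float]:
    """Apply the same pooling to both proteins and join the two vectors."""
    return pooled_features(a) + pooled_features(b)


# Hypothetical inputs: two proteins of different lengths, two channels each.
protein_a = [[0.0, 2.0], [1.0, 4.0], [2.0, 0.0]]
protein_b = [[1.0, 1.0], [3.0, 5.0]]

features = siamese_pair_features(protein_a, protein_b)
print(features)  # [2.0, 4.0, 1.0, 2.0, 3.0, 5.0, 2.0, 3.0]
```

Because both pooling operations collapse the sequence axis, proteins of different lengths map to vectors of the same size, which is what lets a downstream classifier consume the pair as one fixed-length input.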

Conclusion: This study represents one of the initial efforts to extract informative amino acid embeddings from a large protein language model using a deep and wide convolutional network. Experimental results show that xCAPT5 outperforms recent state-of-the-art methods in binary PPI prediction, excelling in cross-validation on several benchmark datasets and demonstrating robust generalization across intra-species, cross-species, inter-species, and stringent similarity contexts.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10924985
DOI: http://dx.doi.org/10.1186/s12859-024-05725-6

Publication Analysis

Top Keywords

protein language: 12
language model: 12
deep wide: 8
embeddings protein: 8
neural network: 8
amino acid: 8
acid embeddings: 8
xcapt5: 5
protein: 5
xcapt5 protein-protein: 4

Similar Publications

Inflammatory gene expression profile of oral plasmablastic lymphoma.

Virchows Arch

September 2025

Department of Oral Surgery and Pathology, School of Dentistry, Universidade Federal de Minas Gerais, Minas Gerais, Av. Antônio Carlos, Pampulha, Belo Horizonte, 31270-901, Brazil.

Plasmablastic lymphoma (PBL) is a rare and aggressive non-Hodgkin lymphoma with a poor prognosis and short survival rates. It is classified as a large B-cell lymphoma subtype, but carries a plasmacytic immunophenotype. Therefore, PBL has pathogenetic overlaps with diffuse large B-cell lymphoma not otherwise specified (DLBCL NOS) and plasma cell neoplasms (PCNs).

Circadian oscillations of gene transcripts rely on a negative feedback loop executed by the activating BMAL1-CLOCK heterodimer and its negative regulators PER and CRY. Although circadian rhythms and CLOCK protein are mostly absent during embryogenesis, the lack of BMAL1 during prenatal development causes an early aging phenotype during adulthood, suggesting that BMAL1 performs an unknown non-circadian function during organism development that is fundamental for healthy adult life. Here, we show that BMAL1 interacts with TRIM28 and facilitates H3K9me3-mediated repression of transposable elements in naïve pluripotent cells, and that the loss of BMAL1 function induces a widespread transcriptional activation of MERVL elements, 3D genome reorganization and the acquisition of totipotency-associated molecular and cellular features.

Motivation: Recent pandemics have revealed significant gaps in our understanding of viral pathogenesis, exposing an urgent need for methods to identify and prioritize key host proteins (host factors) as potential targets for antiviral treatments. De novo generation of experimental datasets is limited by their heterogeneity, and for looming future pandemics, may not be feasible due to limitations of experimental approaches.

Results: Here we present TransFactor, a computational framework for predicting and prioritizing candidate host factors using only protein sequence data.

Background: Several studies have suggested that adult human dermal fibroblasts (HDFa) may be a potential alternative source to mesenchymal stem cells for cell therapies. This study aims to characterize HDFa, adipose-derived stem cells (ADMSCs) and dental pulp stem cells (DPSCs) to investigate their proliferation, differentiation potential, mitochondrial respiration, and metabolomic profile. We identified molecules and characteristics that would differentiate MSCs from different sources or confirm their uniformity.

The COVID-19 pandemic, caused by the continuously evolving SARS-CoV-2 virus, has presented persistent global health challenges. As novel variants emerge, many with enhanced transmissibility and immune evasion capabilities, concerns have intensified regarding the efficacy of existing vaccines and therapeutics. This review provides a comprehensive overview of the current landscape of COVID-19 vaccination, including the development and performance of monovalent and bivalent boosters, and examines their effectiveness against newly emerging variants of interest (VOIs) and variants under monitoring (VUMs), such as JN.