Article Abstract

Motivation: A significant portion of molecular biology investigates signalling pathways and thus depends on an up-to-date and complete resource of the functional protein-protein associations (PPAs) that constitute such pathways. Despite extensive curation efforts, major pathway databases are still notoriously incomplete. Relation extraction can help to gather such pathway information from biomedical publications. Current methods for extracting PPAs typically rely exclusively on scarce manually labelled data, which severely limits their performance.

Results: We propose PPA Extraction with Deep Language (PEDL), a method for predicting PPAs from text that combines deep language models and distant supervision. Due to the reliance on distant supervision, PEDL has access to an order of magnitude more training data than methods solely relying on manually labelled annotations. We introduce three different datasets for PPA prediction and evaluate PEDL for the two subtasks of predicting PPAs between two proteins, as well as identifying the text spans stating the PPA. We compared PEDL with a recently published state-of-the-art model and found that on average PEDL performs better in both tasks on all three datasets. An expert evaluation demonstrates that PEDL can be used to predict PPAs that are missing from major pathway databases and that it correctly identifies the text spans supporting the PPA.
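As an illustration of the distant supervision idea mentioned above, the following minimal sketch labels protein pairs by looking them up in a set of known associations rather than relying on manual annotation. It is not PEDL's implementation: the function name `build_bags`, the input format (sentences with protein mentions already recognised), and the toy knowledge base are all assumptions made for this example, and the actual pipeline in the PEDL repository may differ substantially.

```python
from itertools import combinations


def build_bags(sentences, known_pairs):
    """Group sentences by co-mentioned protein pair and attach a distant label.

    sentences: list of (text, [protein names mentioned in text]) tuples.
    known_pairs: iterable of (protein_a, protein_b) taken from a pathway
        database (a hypothetical toy knowledge base in this sketch).
    Returns a dict mapping an alphabetically sorted pair to its label and
    the sentences that mention both proteins.
    """
    # Store pairs in sorted order so that (A, B) and (B, A) are the same key.
    known = {tuple(sorted(pair)) for pair in known_pairs}
    bags = {}
    for text, proteins in sentences:
        for pair in combinations(sorted(set(proteins)), 2):
            # The label comes from the knowledge base, not from the sentence.
            bag = bags.setdefault(pair, {"label": pair in known, "sentences": []})
            bag["sentences"].append(text)
    return bags


# Toy input: protein mentions are assumed to be pre-annotated.
sentences = [
    ("MDM2 binds TP53.", ["MDM2", "TP53"]),
    ("TP53 and AKT1 were measured.", ["TP53", "AKT1"]),
    ("MDM2 ubiquitinates TP53.", ["TP53", "MDM2"]),
]
known_pairs = [("TP53", "MDM2")]

bags = build_bags(sentences, known_pairs)
for pair, bag in sorted(bags.items()):
    print(pair, bag["label"], len(bag["sentences"]))
```

Every sentence mentioning a known pair inherits the positive label, including sentences that do not actually state the association. Labels obtained this way are therefore noisy, which is one reason the second subtask in the abstract, identifying the text spans that state the PPA, is evaluated separately.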

Availability And Implementation: PEDL is freely available at https://github.com/leonweber/pedl. The repository also includes scripts to generate the datasets used and to reproduce the experiments from this article.

Supplementary Information: Supplementary data are available at Bioinformatics online.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7355289
DOI: http://dx.doi.org/10.1093/bioinformatics/btaa430
