Article Abstract

Relation extraction (RE) is a fundamental task for extracting gene-disease associations from biomedical text. Many state-of-the-art tools have limited capacity, as they can extract gene-disease associations only from single sentences or abstracts. A few studies have explored extracting gene-disease associations from full-text articles, but there is substantial room for improvement. In this work, we propose RENET2, a deep learning-based RE method that implements Section Filtering and ambiguous relations modeling to extract gene-disease associations from full-text articles. To address the scarcity of labels on full-text articles, we designed a novel iterative training data expansion strategy to build an annotated full-text dataset. In our experiments, RENET2 achieved an F1-score of 72.13% for extracting gene-disease associations from an annotated full-text dataset, which was 27.22, 30.30, 29.24 and 23.87% higher than BeFree, DTMiner, BioBERT and RENET, respectively. We applied RENET2 to (i) ∼1.89M full-text articles from PubMed Central, finding ∼3.72M gene-disease associations; and (ii) the LitCovid articles, ranking the top 15 proteins associated with COVID-19, supported by recent articles. RENET2 is an efficient and accurate method for full-text gene-disease association extraction. The source code, manually curated abstract/full-text training data, and results of RENET2 are available on GitHub.
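
For readers unfamiliar with the F1-score reported above, the following is a minimal sketch of how precision, recall and F1 can be computed when extracted gene-disease associations are compared against a curated gold set. It is not taken from the RENET2 code; the function name `precision_recall_f1` and the example gene-disease pairs are hypothetical and chosen only for illustration.

```python
from typing import Set, Tuple

# A gene-disease association is modeled here as a (gene, disease) pair.
# This representation is an assumption made for illustration only.
Pair = Tuple[str, str]


def precision_recall_f1(predicted: Set[Pair], gold: Set[Pair]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for predicted pairs against gold pairs."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical toy data, not results from the paper.
gold = {
    ("BRCA1", "breast cancer"),
    ("CFTR", "cystic fibrosis"),
    ("HTT", "Huntington disease"),
    ("APOE", "Alzheimer disease"),
}
predicted = {
    ("BRCA1", "breast cancer"),
    ("CFTR", "cystic fibrosis"),
    ("TP53", "cystic fibrosis"),
}

precision, recall, f1 = precision_recall_f1(predicted, gold)
print(f"precision={precision:.4f} recall={recall:.4f} F1={f1:.4f}")
```

In this toy case two of the three predicted pairs are correct and two of the four gold pairs are recovered, so precision is 2/3, recall is 1/2 and F1 is 4/7.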

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8256824
DOI: http://dx.doi.org/10.1093/nargab/lqab062

Publication Analysis

Top Keywords

gene-disease associations: 24
full-text articles: 16
training data: 12
extracting gene-disease: 12
full-text: 8
gene-disease: 8
full-text gene-disease: 8
relation extraction: 8
iterative training: 8
data expansion: 8

Similar Publications

With advances in next-generation sequencing technologies, individuals can seek genetic risk information for multiple conditions. However, feasibility and communication challenges could arise when multiple genetic tests, such as cancer predisposition testing and carrier screening for pregnancy planning, are offered simultaneously. Genetic screening introduces uncertainty from probabilistic results, ambiguous gene-disease associations, and complex variant interpretation, which intertwines with psychosocial concerns that affect decision-making and emotional well-being.

Purpose: Genetic variation underlying rare diseases in Arab populations is poorly understood, limiting effective carrier screening for recessive disorders, which are prevalent because of high rates of consanguinity.

Methods: Using the American College of Medical Genetics and Genomics/Association for Molecular Pathology guidelines, we curated pathogenic (P) and likely pathogenic (LP) variants in 1333 Arab Emirati families (346 internal cohort and 987 from the literature). We also analyzed P/LP variants in 1194 Emirati exomes, calculated allele frequencies, and estimated carrier rates for the associated recessive conditions.

Background: Oxytocin (OXT), a neuropeptide involved in social behaviors and emotions, exhibits bidirectional effects depending upon positive or negative environments. Our previous report highlighted dysregulation of OXT on striatocortical functional connectivity (FC) in bipolar disorder (BD) patients. We hypothesized that: (1) in healthy controls (HC), carriers of a "sensitive" OXTR allele would show altered FC, particularly in association with childhood trauma; and (2) this gene-brain relationship would be fundamentally altered or reversed in BD patients, reflecting a gene-disease interaction.

Introduction: Diabetes mellitus (DM) is a known risk factor for various cancers, but its relationship with head and neck squamous cell carcinoma (HNSCC) remains unclear. This study explores clinical and molecular links between DM and HNSCC through integrative analyses of patient data and bioinformatics.

Methods: A retrospective cohort of 728 HNSCC patients was analyzed to assess sex-specific co-occurrence with DM.
