Motivation: Procedures for structural modeling of protein-protein complexes (protein docking) produce a number of models which need to be further analyzed and scored. Scoring can be based on independently determined constraints on the structure of the complex, such as knowledge of amino acids essential for the protein interaction. Previously, we showed that text mining of residues in freely available PubMed abstracts of papers on studies of protein-protein interactions may generate such constraints. However, absence of post-processing of the spotted residues reduced usability of the constraints, as a significant number of the residues were not relevant for the binding of the specific proteins.
Results: We explored filtering of the irrelevant residues by two machine learning approaches, Deep Recursive Neural Network (DRNN) and Support Vector Machine (SVM) models, with different training/testing schemes. The results showed that the DRNN model is superior to the SVM model when training is performed on the PMC-OA full-text articles and the model is applied to classification (interface or non-interface) of the residues spotted in the PubMed abstracts. The reason is that SVM success is often determined by the similarity in data/text patterns in the training and the testing sets, whereas the sentence structures in the abstracts are, in general, different from those in the full-text articles. When both training and testing are performed on full-text articles or on abstracts, the performance of these models is similar. Thus, in such cases, there is no need to use the DRNN approach, which is computationally demanding, especially at the training stage.
Availability and Implementation: The code and the datasets generated in this study are available at https://gitlab.ku.edu/vakser-lab-public/text-mining/-/tree/2020-09-04.
Supplementary Information: Supplementary data are available at Bioinformatics online.
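The idea of scoring docking models against text-mined residue constraints can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the scoring rule (fraction of mined residues found at a model's interface), and all residue numbers are hypothetical assumptions chosen only to show why removing irrelevant residues can sharpen the ranking.

```python
from typing import Dict, List, Set, Tuple


def constraint_score(interface_residues: Set[int], mined_residues: Set[int]) -> float:
    """Fraction of text-mined residues that lie at a model's predicted interface.

    This scoring rule is an illustrative assumption, not the published method.
    """
    if not mined_residues:
        return 0.0
    return len(interface_residues & mined_residues) / len(mined_residues)


def rank_models(
    models: Dict[str, Set[int]], mined_residues: Set[int]
) -> List[Tuple[str, float]]:
    """Sort docking models from best to worst constraint score."""
    scored = [
        (name, constraint_score(residues, mined_residues))
        for name, residues in models.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical interface residues for three docking models.
models = {
    "model_1": {10, 11, 45, 46, 90},
    "model_2": {45, 46, 47, 88},
    "model_3": {5, 6, 7},
}

# Hypothetical residues spotted in abstracts; 120 stands in for a residue
# that is mentioned in the text but is not relevant for binding.
mined = {45, 46, 47, 120}

# The same set after a classifier (DRNN or SVM in the study) has labeled
# residue 120 as non-interface and it has been filtered out.
filtered = {45, 46, 47}

print(rank_models(models, mined))
print(rank_models(models, filtered))
```

In this toy case the irrelevant residue lowers every nonzero score, so the best model cannot reach a perfect score until the residue is filtered out.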
Download full-text PDF |
Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8088328 | PMC |
http://dx.doi.org/10.1093/bioinformatics/btaa823 | DOI Listing |
Genome Biol
September 2025
National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, China.
Background: Soil salinization represents a critical global challenge to agricultural productivity, profoundly impacting crop yields and threatening food security. The plant salt response is complex and dynamic, making it challenging to fully elucidate salt tolerance mechanisms and leading to gaps in our understanding of how plants adapt to and mitigate salt stress.
Results: Here, we conduct high-resolution time-series transcriptomic and metabolomic profiling of the extremely salt-tolerant maize inbred line, HLZY, and the salt-sensitive elite line, JI853.
Mamm Genome
September 2025
Department of Animal Health and Anatomy, Center for Animal Biotechnology and Gene Therapy, Universitat Autònoma de Barcelona, Travessera Dels Turons, 08193, Cerdanyola del Vallès, Barcelona, Spain.
The mouse remains the principal animal model for investigating human diseases due, among other reasons, to its anatomical similarities to humans. Despite its widespread use, the assumption that mouse anatomy is a fully established field with standardized and universally accepted terminology is misleading. Many phenotypic anatomical annotations do not refer to the authority or origin of the terminology used, while others inappropriately adopt outdated or human-centric nomenclature.
Sci Justice
September 2025
Department of Micro Traces Evidence Examination, Institute of Forensic Science, Beijing, China.
Homemade explosives (HMEs) present significant challenges to forensic investigations due to their diverse chemical compositions and varying construction methods. Identifying the origin of these explosives is crucial for linking evidence across crime scenes. To address this challenge, this study employs an advanced data mining technique to enhance the forensic analysis of a unique dataset consisting of 344 HME samples collected from 129 real cases in China over an eight-year period (2015-2022).
N Engl J Med
September 2025
Rwanda Biomedical Center, Kigali.
Background: On September 27, 2024, Rwanda reported an outbreak of Marburg virus disease (MVD), after a cluster of cases of viral hemorrhagic fever was detected at two urban hospitals.
Methods: We report key aspects of the epidemiology, clinical manifestations, and treatment of MVD during this outbreak, as well as the overall response to the outbreak. We performed a retrospective epidemiologic and clinical analysis of data compiled across all pillars of the outbreak response and a case-series analysis to characterize clinical features, disease progression, and outcomes among patients who received supportive care and investigational therapeutic agents.
JMIR Med Inform
September 2025
College of Medical Informatics, Chongqing Medical University, 1 Yixueyuan Road, Yuzhong District, Chongqing, 400016, China, 86 13500303273.
Background: Cirrhosis is a leading cause of noncancer deaths in gastrointestinal diseases, resulting in high hospitalization and readmission rates. Early identification of high-risk patients is vital for proactive interventions and improving health care outcomes. However, the quality and integrity of real-world electronic health records (EHRs) limit their utility in developing risk assessment tools.