Article Abstract

Over the past decade, information for precision disease medicine has accumulated in the form of textual data. To effectively utilize this expanding medical text, we proposed a multi-task learning framework with hard parameter sharing for knowledge graph construction (MKG), and then used it to automatically extract gastric cancer (GC)-related biomedical knowledge from the literature and identify GC drug candidates. In MKG, we designed three separate modules, MT-BGIPN, MT-SGTF and MT-ScBERT, for entity recognition, entity normalization, and relation classification, respectively. To address the challenges posed by the long and irregular naming of medical entities, MT-BGIPN used a bidirectional gated recurrent unit and an interactive pointer network, raising entity recognition accuracy to an average F1 value of 84.5% across datasets. In MT-SGTF, we employed term frequency-inverse document frequency (TF-IDF) and a gated attention unit, which together combine the semantic and characteristic features of entities, resulting in an average Hits@1 score of 94.5% across five datasets. MT-ScBERT integrated cross-text, entity, and context features, yielding an average F1 value of 86.9% across 11 relation classification datasets. Based on MKG, we then developed a specific knowledge graph for GC (MKG-GC), which encompasses a total of 9,129 entities and 88,482 triplets. Next, MKG-GC was used to predict potential GC drugs using a pre-trained language model called BioKGE-BERT and a drug-disease discriminant model based on CNN-BiLSTM. Remarkably, nine of the top ten predicted drugs have previously been reported as effective for gastric cancer treatment. Finally, an online platform was created for exploration and visualization of MKG-GC at https://www.yanglab-mi.org.cn/MKG-GC/.
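
The abstract reports results as Hits@1 (entity normalization) and F1 (entity recognition and relation classification). As a reading aid, the following is a minimal sketch of how these two metrics are commonly computed. It is not the authors' evaluation code; the function names, the concept IDs ("C1", "C2", ...), and the counts are hypothetical placeholders chosen only for illustration.

```python
from typing import Sequence


def hits_at_k(ranked: Sequence[Sequence[str]], gold: Sequence[str], k: int = 1) -> float:
    """Fraction of mentions whose gold concept ID appears in the top-k candidates."""
    if len(ranked) != len(gold):
        raise ValueError("ranked and gold must have the same length")
    if not gold:
        return 0.0
    hits = sum(1 for candidates, g in zip(ranked, gold) if g in candidates[:k])
    return hits / len(gold)


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall from raw counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: four entity mentions, each with a ranked candidate list.
ranked_candidates = [["C1", "C2"], ["C3", "C4"], ["C5", "C6"], ["C7", "C8"]]
gold_ids = ["C1", "C4", "C5", "C9"]

print(hits_at_k(ranked_candidates, gold_ids, k=1))  # gold is ranked first for 2 of 4 mentions
print(hits_at_k(ranked_candidates, gold_ids, k=2))  # gold is in the top two for 3 of 4 mentions
print(f1_score(tp=1, fp=1, fn=1))
```

Under this definition, a Hits@1 score of 94.5% would mean that the correct normalized concept is the top-ranked candidate for 94.5% of entity mentions.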

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10995799
DOI: http://dx.doi.org/10.1016/j.csbj.2024.03.021

Publication Analysis

Top Keywords

knowledge graph (12)
gastric cancer (12)
multi-task learning-based (8)
graph construction (8)
entity recognition (8)
relation classification (8)
mkg-gc (4)
mkg-gc multi-task (4)
knowledge (4)
learning-based knowledge (4)

Similar Publications

Pretraining plays a pivotal role in acquiring generalized knowledge from large-scale data, achieving remarkable successes as evidenced by large models in CV and NLP. However, progress in the graph domain remains limited due to fundamental challenges represented by feature heterogeneity and structural heterogeneity. Recent efforts have been made to address feature heterogeneity via Large Language Models (LLMs) on text-attributed graphs (TAGs) by generating fixed-length text representations as node features.

Introduction: Simulation-based training has been a vital part of medical education since Competency-Based Medical Education (CBME) was introduced, and new guidelines since 2023 include simulation as a mandatory teaching methodology. This method enables learners to build and develop both technical and non-technical abilities in a safe and controlled setting, enhancing their preparedness for real-life medical scenarios. Compared with traditional teaching, simulation-based training improves skill acquisition and retention, enhances learners' confidence, reduces anxiety, reinforces learning, corrects errors, and promotes reflective practice.

Drug-target interaction (DTI) prediction is essential for the development of novel drugs and the repurposing of existing ones. However, when drug and target features are applied to biological networks, the relational features of drug-target interactions are often not captured. Moreover, the corresponding multimodal models mainly depend on shallow fusion strategies, which results in suboptimal performance when capturing complex interaction relationships.

A Python-scripted software tool has been developed to help study the heterogeneity of gene changes, markedly or moderately expressed, when several experimental conditions are compared. The analysis workflow includes a scorecard that groups genes based on relative fold-change and statistical significance, and provides additional functions that facilitate knowledge extraction. The scorecard reports highlight unique patterns of gene regulation, such as genes whose expression is consistently up- or down-regulated across experiments, all of which are supported by graphs and summaries to characterize the dataset under investigation.

Spatial transcriptomics (ST) reveals gene expression distributions within tissues. Yet, predicting spatial gene expression from histological images still faces the challenges of limited ST data that lack prior knowledge, and insufficient capturing of inter-slice heterogeneity and intra-slice complexity. To tackle these challenges, we introduce FmH2ST, a foundation model-based method for spatial gene expression prediction.
