Category Ranking: 98% · Total Visits: 921 · Avg Visit Duration: 2 minutes · Citations: 20

Article Abstract

Graph neural networks (GNNs) have become a popular approach for semi-supervised graph representation learning. GNN research has generally focused on improving methodological details, while less attention has been paid to the importance of the labeled data itself. Yet for semi-supervised learning, the quality of the training data is vital. In this paper, we first introduce and elaborate on the problem of training data selection for GNNs. More specifically, focusing on node classification, we aim to select representative nodes from a graph so that GNNs trained on them achieve the best performance. Inspired by the popular lottery ticket hypothesis, typically applied to sparse architectures, we propose the following subset hypothesis for graph data: "When selecting a fixed-size training set from the dense training dataset, there exists a core subset that represents the properties of the dataset, and GNNs trained on this core subset can achieve a better graph representation." Equipped with this subset hypothesis, we present an efficient algorithm to identify the core data in the graph for GNNs. Extensive experiments demonstrate that GNNs trained on the selected data achieve performance improvements across various datasets and GNN architectures.
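The abstract does not detail the authors' selection algorithm, but the core idea of choosing a fixed-size, representative training subset can be sketched with a simple greedy k-center heuristic over node features. This is an illustrative stand-in, not the paper's method; the function name, the feature-distance criterion, and the toy data below are all assumptions.

```python
import numpy as np

def select_core_subset(features, k, seed=0):
    """Greedy k-center selection: pick k nodes whose feature vectors
    best cover the full node set. A rough proxy for selecting a
    'core subset' of training nodes; not the paper's algorithm."""
    rng = np.random.default_rng(seed)
    n = features.shape[0]
    selected = [int(rng.integers(n))]
    # distance from every node to its nearest already-selected node
    dists = np.linalg.norm(features - features[selected[0]], axis=1)
    for _ in range(k - 1):
        nxt = int(np.argmax(dists))  # farthest (least-covered) node
        selected.append(nxt)
        dists = np.minimum(dists, np.linalg.norm(features - features[nxt], axis=1))
    return selected

# Toy usage: 100 nodes with 16-dim features, pick 8 training nodes.
X = np.random.default_rng(1).normal(size=(100, 16))
core = select_core_subset(X, k=8)
```

In practice the selected indices would serve as the labeled training mask for a GNN, with the remaining nodes used for validation and testing.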

Source: http://dx.doi.org/10.1016/j.neunet.2024.106635

Publication Analysis

Top Keywords

graph neural (8), neural networks (8), training data (8), subset hypothesis (8), core subset (8), graph (7), gnns (7), data (6), finding core (4), core labels (4)

Similar Publications

Large language models (LLMs) have demonstrated transformative potential for materials discovery in condensed matter systems, but their full utility requires both broader application scenarios and integration with ab initio crystal structure prediction (CSP), density functional theory (DFT) methods, and domain knowledge to benefit future inverse materials design. Here, we develop an integrated computational framework combining language model-guided materials screening with genetic algorithm (GA) and graph neural network (GNN)-based CSP methods to predict new photovoltaic materials. This LLM + CSP + DFT approach successfully identifies a previously overlooked oxide material with unexpected photovoltaic potential.


Development of Coarse-Grained Lipid Force Fields Based on a Graph Neural Network.

J Chem Theory Comput

September 2025

Department of Materials Science and Engineering, City University of Hong Kong, Kowloon 999077, Hong Kong, China.

Coarse-grained (CG) lipid models enable efficient simulations of large-scale membrane events. However, achieving both speed and atomic-level accuracy remains challenging. Graph neural networks (GNNs) trained on all-atom (AA) simulations can serve as CG force fields, which have demonstrated success in CG simulations of proteins.


Hubs, influencers, and communities of executive functions: a task-based fMRI graph analysis.

Front Hum Neurosci

August 2025

Baptist Medical Center, Department of Behavioral Health, Jacksonville, FL, United States.

Introduction: This study investigates four subdomains of executive functioning (initiation, cognitive inhibition, mental shifting, and working memory) using task-based functional magnetic resonance imaging (fMRI) data and graph analysis.

Methods: We used fMRI data from healthy adults to construct brain connectomes and network graphs for each task, and analyzed global and node-level graph metrics.

Results: The bilateral precuneus and right medial prefrontal cortex emerged as pivotal hubs and influencers, emphasizing their crucial regulatory role in all four subdomains of executive function.
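The kind of node-level hub analysis described here can be illustrated with a minimal networkx sketch. The toy random graph, its size, and the specific metric choices are assumptions for illustration, not the study's actual connectomes or pipeline.

```python
import networkx as nx

# Toy stand-in for a task-based functional connectome
# (node count and edge density are illustrative only).
G = nx.erdos_renyi_graph(n=30, p=0.2, seed=42)

# Node-level metrics commonly used to flag hubs and influencers
degree = nx.degree_centrality(G)            # how connected a node is
betweenness = nx.betweenness_centrality(G)  # how often a node bridges shortest paths

# Global metrics summarizing the whole network
efficiency = nx.global_efficiency(G)
clustering = nx.average_clustering(G)

# Top-degree nodes would be reported as candidate hubs
hubs = sorted(degree, key=degree.get, reverse=True)[:3]
```

A community-detection step (e.g., modularity-based partitioning) would typically follow to identify the communities mentioned in the title; it is omitted here for brevity.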


Pretraining plays a pivotal role in acquiring generalized knowledge from large-scale data, with remarkable successes evidenced by large models in CV and NLP. However, progress in the graph domain remains limited due to two fundamental challenges: feature heterogeneity and structural heterogeneity. Recent efforts have addressed feature heterogeneity via Large Language Models (LLMs) on text-attributed graphs (TAGs) by generating fixed-length text representations as node features.


Phenotype-driven approaches identify disease-counteracting compounds by analysing the phenotypic signatures that distinguish diseased from healthy states. Here we introduce PDGrapher, a causally inspired graph neural network model that predicts combinatorial perturbagens (sets of therapeutic targets) capable of reversing disease phenotypes. Unlike methods that learn how perturbations alter phenotypes, PDGrapher solves the inverse problem and predicts the perturbagens needed to achieve a desired response by embedding disease cell states into networks, learning a latent representation of these states, and identifying optimal combinatorial perturbations.
