Pretraining plays a pivotal role in acquiring generalized knowledge from large-scale data, achieving remarkable successes as evidenced by large models in CV and NLP. However, progress in the graph domain remains limited due to fundamental challenges represented by feature heterogeneity and structural heterogeneity. Recent efforts have been made to address feature heterogeneity via Large Language Models (LLMs) on text-attributed graphs (TAGs) by generating fixed-length text representations as node features. These high-quality features reduce the previously critical role of graph structure, resulting in a modest performance gap between Graph Neural Networks (GNNs) and structure-agnostic Multi-Layer Perceptrons (MLPs). Motivated by this, we introduce a feature-centric pretraining perspective by treating graph structure as a prior and leveraging the rich, unified feature space to learn refined interaction patterns that generalize across graphs. Our framework, Graph Sequence Pretraining with Transformer (GSPT), samples node contexts through random walk and employs masked feature reconstruction to capture pairwise proximity in the LLM-unified feature space using a standard Transformer. By utilizing unified text representations rather than varying structures, GSPT alleviates structural heterogeneity and achieves significantly better transferability among graphs within the same domain. Our approach can be easily adapted to both node classification and link prediction, demonstrating promising empirical success on various datasets. The source code is publicly available at https://github.com/SongYYYY/GSPT.
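The abstract sketches the pretraining recipe: sample node contexts with random walks over the graph, mask a fraction of the LLM-derived node features along each walk, and train a standard Transformer to reconstruct the masked features. Below is a minimal PyTorch sketch of that masked-feature-reconstruction step; the class name, hyperparameters, and the assumption that random walks and LLM embeddings are precomputed are illustrative only and are not taken from the authors' released code (see the repository linked above for the actual implementation).

```python
# Hedged sketch of masked feature reconstruction over random-walk node sequences,
# assuming each walk is already converted to a sequence of LLM text embeddings.
# Positional encodings and the random-walk sampler are omitted for brevity.
import torch
import torch.nn as nn

class GraphSequencePretrainer(nn.Module):
    def __init__(self, feat_dim: int, d_model: int = 256, n_layers: int = 4, n_heads: int = 4):
        super().__init__()
        self.in_proj = nn.Linear(feat_dim, d_model)
        self.mask_token = nn.Parameter(torch.zeros(1, 1, d_model))  # learned placeholder for masked positions
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.out_proj = nn.Linear(d_model, feat_dim)

    def forward(self, walk_feats: torch.Tensor, mask_ratio: float = 0.3) -> torch.Tensor:
        # walk_feats: (batch, walk_len, feat_dim) -- LLM embeddings of the nodes on each walk
        x = self.in_proj(walk_feats)
        mask = torch.rand(x.shape[:2], device=x.device) < mask_ratio      # positions to hide
        x = torch.where(mask.unsqueeze(-1), self.mask_token.expand_as(x), x)
        recon = self.out_proj(self.encoder(x))
        # reconstruction loss computed only on masked positions
        return ((recon - walk_feats) ** 2).mean(dim=-1)[mask].mean()

# Usage with dummy data: 32 walks of length 16, 768-d node features
model = GraphSequencePretrainer(feat_dim=768)
loss = model(torch.randn(32, 16, 768))
loss.backward()
```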
Download full-text PDF | Source
---|---
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC12416796 | PMC
Proc Mach Learn Res
November 2024
J Affect Disord
September 2025
College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, Liaoning, China.
Major Depressive Disorder (MDD) poses a significant global health threat, impairing individual functioning and increasing socioeconomic burden. Accurate diagnosis is crucial for improving treatment outcomes. This study proposes Time-Frequency Text-Attributed DeepWalk (TF-TADW), a framework for MDD classification using resting-state functional MRI data.
Artif Intell Med
October 2025
Public Health Wales, United Kingdom.
As a systemic problem, public health cannot be addressed without considering other policy dimensions. Hence, a holistic approach across public policy areas is necessary to incorporate Health-for-All values into decision-making. However, such multisectoral interventions require public budgets that are effectively mapped into public health outcomes and indicators of their wider determinants.
Entropy (Basel)
March 2025
Electronic Information School, Wuhan University, Wuhan 430072, China.
Social networks contain complex graph structures and rich textual information. Text provides important information for various tasks, while graph structures offer multilevel context for the semantics of the text. Contemporary researchers tend to represent these kinds of data as text-attributed graphs (TAGs).
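To make the notion of a TAG concrete, the sketch below shows one minimal way such data can be represented: nodes carrying raw text attributes plus a set of edges. The class and field names are illustrative only and do not correspond to any standard library API.

```python
# Minimal illustration of a text-attributed graph (TAG): nodes carry raw text,
# edges carry the structure. Names are hypothetical, for exposition only.
from dataclasses import dataclass, field

@dataclass
class TextAttributedGraph:
    node_text: dict = field(default_factory=dict)  # node id -> raw text attribute
    edges: set = field(default_factory=set)        # undirected edges stored as (u, v) with u <= v

    def add_node(self, nid: int, text: str) -> None:
        self.node_text[nid] = text

    def add_edge(self, u: int, v: int) -> None:
        self.edges.add((min(u, v), max(u, v)))

# Example: two users connected by a follow edge, each described by a post.
g = TextAttributedGraph()
g.add_node(0, "Excited about graph pretraining!")
g.add_node(1, "LLM embeddings make great node features.")
g.add_edge(0, 1)
```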
bioRxiv
December 2024
Institute for Informatics, Data Science and Biostatistics (I2DB), Washington University School of Medicine, Washington University in St. Louis, St. Louis, MO, USA.
Artificial intelligence (AI) is revolutionizing scientific discovery because of its capability, following neural scaling laws, to integrate and analyze large-scale datasets and mine knowledge from them. Foundation models, including large language models (LLMs) and large vision models (LVMs), are among the most important foundations paving the way for general AI by pre-training on massive domain-specific datasets. Unlike the well-annotated, well-formatted, and integrated large textual and image datasets used for LLMs and LVMs, biomedical knowledge and datasets in the field of AI for Precision Health and Medicine (AI4PHM) are fragmented, with data scattered across publications and inconsistent databases that often use diverse nomenclature systems.