An Innovative and Efficient Diagnostic Prediction Flow for Head and Neck Cancer: A Deep Learning Approach for Multi-Modal Survival Analysis Prediction Based on Text and Multi-Center PET/CT Images.

Zhaonian Wang, Chundan Zheng, Xu Han, Wufan Chen, Lijun Lu

Diagnostics (Basel)

School of Biomedical Engineering, Southern Medical University, 1023 Shatai Road, Guangzhou 510515, China.

Published: February 2024

To comprehensively capture intra-tumor heterogeneity in head and neck cancer (HNC) and make full use of the valid information collected in clinical practice, we propose a novel multi-modal image-text fusion strategy aimed at improving prognostic prediction. We developed a tailored diagnostic algorithm for HNC built on a deep learning model that integrates both image and clinical text information. For image fusion, we used a cross-attention mechanism to fuse information between PET and CT; for text-image fusion, we used the Q-former architecture. We also improved the traditional prognostic model by introducing time as a variable in its construction, from which the corresponding prognostic results were obtained. We assessed the method on a compiled multicenter dataset and achieved commendable outcomes in multicenter validation. Our results for metastasis-free survival (MFS), recurrence-free survival (RFS), overall survival (OS), and progression-free survival (PFS) were 0.796, 0.626, 0.641, and 0.691, respectively. These results are notably better than those obtained with CT or PET alone, and better than those obtained without the clinical text information. Our model not only validates the effectiveness of multi-modal fusion in aiding diagnosis, but also provides insights for optimizing survival analysis. The study underscores the potential of our approach for enhancing prognosis and contributing to the advancement of personalized medicine in HNC.
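The cross-attention fusion step mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the token counts, feature dimensions, random projection matrices, and the names `cross_attention`, `pet_tokens`, and `ct_tokens` are all assumptions chosen for illustration. It shows only the generic single-head scaled dot-product form, with PET features acting as queries over CT features.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(query_tokens, context_tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product cross-attention.

    query_tokens:   (n_query, d_model)   e.g. PET feature tokens (assumed)
    context_tokens: (n_context, d_model) e.g. CT feature tokens (assumed)
    Returns the attended features (n_query, d_head) and the
    attention weights (n_query, n_context).
    """
    q = query_tokens @ w_q      # queries come from one modality
    k = context_tokens @ w_k    # keys come from the other modality
    v = context_tokens @ w_v    # values come from the other modality
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)
    return weights @ v, weights


# Hypothetical sizes and random features, for illustration only.
rng = np.random.default_rng(0)
n_pet, n_ct, d_model, d_head = 4, 6, 8, 5
pet_tokens = rng.normal(size=(n_pet, d_model))
ct_tokens = rng.normal(size=(n_ct, d_model))
w_q = rng.normal(size=(d_model, d_head))
w_k = rng.normal(size=(d_model, d_head))
w_v = rng.normal(size=(d_model, d_head))

fused, attn = cross_attention(pet_tokens, ct_tokens, w_q, w_k, w_v)
print(fused.shape, attn.shape)
```

Each row of `attn` is a probability distribution over the CT tokens, so every PET token receives a weighted summary of the CT features. A Q-former-style module typically reuses the same primitive with a small set of learned query tokens in place of `pet_tokens`; how the paper configures that module is not specified in the abstract.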

PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10888043
DOI: http://dx.doi.org/10.3390/diagnostics14040448
