Convolutional neural networks (CNNs) for brain tumor segmentation are generally developed using complete sets of magnetic resonance imaging (MRI) sequences for both training and inference. As such, these algorithms are not trained for realistic clinical scenarios in which some of the MRI sequences used for training are missing during inference. To increase clinical applicability, we proposed a cross-modal distillation approach that leverages the availability of multi-sequence MRI data during training to generate an enriched CNN model which uses only single-sequence MRI data for inference yet outperforms a conventional single-sequence CNN model. We assessed the performance of the proposed method for whole tumor and tumor core segmentation with multi-sequence MRI data available for training but only T1-weighted (T1) sequence data available for inference, using the BraTS 2018 dataset and an in-house dataset. Results showed that cross-modal distillation significantly improved the Dice score for both whole tumor and tumor core segmentation when only T1 sequence data were available for inference. In the evaluation on the in-house dataset, cross-modal distillation achieved average Dice scores of 79.04% and 69.39% for whole tumor and tumor core segmentation, respectively, whereas a single-sequence U-Net model using T1 sequence data for both training and inference achieved average Dice scores of 73.60% and 62.62%, respectively. These findings confirm cross-modal distillation as an effective way to increase the potential of single-sequence CNN models, such that segmentation performance is less compromised when MRI sequences are missing or only one MRI sequence is available for segmentation.
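As a rough illustration of the distillation idea described above, the sketch below trains a single-sequence "student" segmenter against both the ground-truth labels and the softened predictions of a frozen multi-sequence "teacher". This is a minimal PyTorch sketch under stated assumptions, not the paper's implementation: the tiny 2D networks, the single-channel student input, and hyperparameters such as `temperature` and `kd_weight` are illustrative placeholders.

```python
# Minimal sketch of cross-modal distillation for segmentation (illustrative only).
# A teacher sees all MRI sequences; a student sees a single sequence and is trained
# on ground-truth labels plus the teacher's softened predictions.
import torch
import torch.nn as nn
import torch.nn.functional as F

def tiny_segmenter(in_channels: int, num_classes: int = 3) -> nn.Module:
    # Stand-in for a U-Net; real models would be 3D and far deeper.
    return nn.Sequential(
        nn.Conv2d(in_channels, 16, 3, padding=1), nn.ReLU(),
        nn.Conv2d(16, num_classes, 1),
    )

teacher = tiny_segmenter(in_channels=4)   # all four MRI sequences stacked as channels
student = tiny_segmenter(in_channels=1)   # single sequence only
optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)

# Dummy batch: 4-sequence input, the first channel playing the role of the single sequence.
multi_seq = torch.randn(2, 4, 64, 64)
single_seq = multi_seq[:, :1]
labels = torch.randint(0, 3, (2, 64, 64))

temperature, kd_weight = 2.0, 0.5         # illustrative hyperparameters

with torch.no_grad():                     # teacher assumed pre-trained and frozen
    teacher_logits = teacher(multi_seq)

student_logits = student(single_seq)
seg_loss = F.cross_entropy(student_logits, labels)
kd_loss = F.kl_div(
    F.log_softmax(student_logits / temperature, dim=1),
    F.softmax(teacher_logits / temperature, dim=1),
    reduction="batchmean",
) * temperature ** 2
loss = seg_loss + kd_weight * kd_loss

optimizer.zero_grad()
loss.backward()
optimizer.step()
```

In this reading, the teacher's soft class probabilities carry multi-sequence information that the single-sequence labels alone cannot provide, which is what allows the student to exceed a plain single-sequence baseline.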
DOI: http://dx.doi.org/10.1109/TBME.2021.3137561
IEEE Trans Autom Sci Eng
March 2025
H. Milton Stewart School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, GA 30332, USA.
Early detection of Alzheimer's Disease (AD) is crucial for timely interventions and optimizing treatment outcomes. Integrating multimodal neuroimaging datasets can enhance the early detection of AD. However, models must address the challenge of incomplete modalities, a common issue in real-world scenarios, as not all patients have access to all modalities due to practical constraints such as cost and availability.
IEEE J Biomed Health Inform
August 2025
Epilepsy is a prevalent neurological disorder marked by sudden, brief episodes of excessive neuronal activity caused by abnormal electrical discharges, which may also lead to mental disorders. Most existing deep learning methods for epilepsy detection rely solely on unimodal EEG signals, neglecting the potential benefits of multimodal information. To address this, we propose a novel multimodal model, DistilCLIP-EEG, based on the CLIP framework, which integrates both EEG signals and text descriptions to capture comprehensive features of epileptic seizures.
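The snippet above only names the CLIP-style pairing of EEG signals with text descriptions; the sketch below shows the generic contrastive alignment such a framework builds on. The encoders, tensor shapes, and temperature are assumptions for illustration, not the DistilCLIP-EEG architecture.

```python
# Rough sketch of CLIP-style contrastive alignment between EEG windows and paired
# text descriptions (all encoder choices and shapes here are assumptions).
import torch
import torch.nn as nn
import torch.nn.functional as F

class EEGEncoder(nn.Module):
    def __init__(self, channels=19, samples=256, dim=128):
        super().__init__()
        self.net = nn.Sequential(nn.Flatten(), nn.Linear(channels * samples, dim))
    def forward(self, x):
        return F.normalize(self.net(x), dim=-1)

class TextEncoder(nn.Module):
    def __init__(self, vocab=1000, dim=128):
        super().__init__()
        self.emb = nn.EmbeddingBag(vocab, dim)   # crude bag-of-tokens stand-in
    def forward(self, token_ids):
        return F.normalize(self.emb(token_ids), dim=-1)

eeg_enc, txt_enc = EEGEncoder(), TextEncoder()
eeg = torch.randn(8, 19, 256)                    # batch of EEG windows
text = torch.randint(0, 1000, (8, 12))           # batch of tokenized descriptions

e, t = eeg_enc(eeg), txt_enc(text)
logits = e @ t.T / 0.07                          # cosine similarities / temperature
targets = torch.arange(8)                        # matching EEG/text pairs lie on the diagonal
loss = (F.cross_entropy(logits, targets) + F.cross_entropy(logits.T, targets)) / 2
```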
Sci Rep
August 2025
Department of Economics and Management, Suzhou Chien-Shiung Institute of Technology, Suzhou, 215411, China.
Emotion recognition in conversations (ERC), which involves identifying the emotional state of each utterance within a dialogue, plays a vital role in developing empathetic artificial intelligence systems. In practical applications, such as video-based recruitment interviews, customer service, health monitoring, intelligent personal assistants, and online education, ERC can facilitate the analysis of emotional cues, improve decision-making processes, and enhance user interaction and satisfaction. Current multimodal emotion recognition research faces several challenges, such as ineffective emotional information extraction from single modalities, underused complementary features, and inter-modal redundancy.
IEEE J Biomed Health Inform
August 2025
Multimodal emotion recognition has emerged as a promising direction for capturing the complexity of human affective states by integrating physiological and behavioral signals. However, challenges remain in addressing feature redundancy, modality heterogeneity, and insufficient inter-modal supervision. In this paper, we propose a novel Multimodal Disentangled Knowledge Distillation framework that explicitly disentangles modality-shared and modality-specific features and enhances cross-modal knowledge transfer via a graph-based distillation module.
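As a loose illustration of the disentanglement idea mentioned above, the toy sketch below splits each modality's features into shared and specific parts, pulls the shared parts together across modalities, and penalizes overlap between shared and specific parts. The paper's graph-based distillation module is not reproduced; all module names, dimensions, and loss weights are assumptions.

```python
# Toy sketch of modality-shared vs. modality-specific feature disentanglement with a
# simple cross-modal alignment term standing in for distillation (all names assumed).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ModalityEncoder(nn.Module):
    def __init__(self, in_dim, dim=64):
        super().__init__()
        self.shared = nn.Linear(in_dim, dim)     # features meant to be common across modalities
        self.specific = nn.Linear(in_dim, dim)   # features unique to this modality
    def forward(self, x):
        return self.shared(x), self.specific(x)

physio_enc = ModalityEncoder(in_dim=32)          # e.g. physiological signal features
behav_enc = ModalityEncoder(in_dim=48)           # e.g. behavioral (video/audio) features

physio, behav = torch.randn(16, 32), torch.randn(16, 48)
labels = torch.randint(0, 4, (16,))              # emotion classes

p_shared, p_spec = physio_enc(physio)
b_shared, b_spec = behav_enc(behav)

classifier = nn.Linear(64 * 4, 4)
logits = classifier(torch.cat([p_shared, p_spec, b_shared, b_spec], dim=-1))

task_loss = F.cross_entropy(logits, labels)
align_loss = F.mse_loss(p_shared, b_shared)      # pull shared features together across modalities
ortho = (F.normalize(p_shared, dim=-1) * F.normalize(p_spec, dim=-1)).sum(-1).pow(2).mean()
loss = task_loss + 0.1 * align_loss + 0.01 * ortho   # illustrative weights
```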
IEEE Trans Med Imaging
July 2025
Foundation models have significantly revolutionized the field of chest X-ray diagnosis with their ability to transfer across various diseases and tasks. However, previous works have predominantly utilized self-supervised learning from medical image-text pairs, which falls short in dense medical prediction tasks due to their sole reliance on such coarse pair supervision, thereby limiting their applicability to detailed diagnostics. In this paper, we introduce a Dense Chest X-ray Foundation Model (DCXFM), which utilizes mixed supervision types (i.e., ...).