Lightweight Multi-Stage Aggregation Transformer for robust medical image segmentation.

Xiaoyan Wang, Yating Zhu, Ying Cui, Xiaojie Huang, Dongyan Guo, Pan Mu, Ming Xia, Cong Bai, Zhongzhao Teng, Shengyong Chen

Med Image Anal

School of Computer Science and Engineering, Tianjin University of Technology, Tianjin, 300384, China.

Published: July 2025

Capturing rich multi-scale features is essential to address complex variations in medical image segmentation. Many hybrid networks have been developed to integrate the complementary benefits of convolutional neural networks (CNNs) and Transformers. However, existing methods suffer from either the huge computational cost of complicated networks or the unsatisfactory performance of lighter ones. How to fully exploit the advantages of both convolution and self-attention while designing networks that are both effective and efficient remains an unsolved problem. In this work, we propose a robust lightweight multi-stage hybrid architecture, named Multi-stage Aggregation Transformer version 2 (MA-TransformerV2), to extract multi-scale features with progressive aggregations for accurate segmentation of highly variable medical images at a low computational cost. Specifically, lightweight Trans blocks and lightweight CNN blocks are introduced in parallel into the dual-branch encoder module in each stage, and a vector quantization block is then incorporated at the bottleneck to discretize the features and discard redundancy. This design not only enhances the representation capabilities and computational efficiency of the model, but also makes the model interpretable. Extensive experimental results on public datasets show that our method outperforms state-of-the-art methods, including CNN-based, Transformer-based, and advanced hybrid CNN-Transformer-based models, as well as several lightweight models, in terms of both segmentation accuracy and model capacity. Code will be made publicly available at https://github.com/zjmiaprojects/MATransformerV2.
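The abstract does not say how the vector quantization block is implemented. As a rough illustration of the general idea only, the sketch below maps each continuous feature vector to its nearest entry in a small codebook, which is the basic lookup used in generic vector quantization. It is not the authors' code: the function names, the codebook, and the feature values are all hypothetical and chosen purely for demonstration.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(
    features: Sequence[Vector], codebook: Sequence[Vector]
) -> Tuple[List[int], List[Vector]]:
    """Map each feature vector to its nearest codebook entry.

    Returns the chosen code indices and the corresponding codebook vectors.
    """
    indices = [
        min(range(len(codebook)), key=lambda k: squared_distance(f, codebook[k]))
        for f in features
    ]
    return indices, [codebook[k] for k in indices]


# Hypothetical toy values (assumptions, not taken from the paper).
codebook = [(0.0, 0.0), (1.0, 1.0), (4.0, 0.0)]
features = [(0.1, -0.2), (0.9, 1.2), (3.5, 0.4), (1.1, 0.8)]

indices, quantized = quantize(features, codebook)
print(indices)    # one code index per input feature vector
print(quantized)  # the discrete stand-ins for the continuous features
```

In this toy run, four distinct continuous vectors are represented by only three codes, and two nearby inputs share the same code. That collapse of similar features onto a shared discrete entry is one plausible reading of what "discretize the features and discard redundancy" means in practice.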

DOI: http://dx.doi.org/10.1016/j.media.2025.103569

Similar Publications

MV2SwimNet: A lightweight transformer-based hybrid model for knee meniscus tears detection.

PLoS One

August 2025

Department of Electronics and Communication Engineering, Kuwait College of Science and Technology (KCST), Doha Area, Kuwait.

Vishesh Tanwar, Bhisham Sharma, Dhirendra Prasad Yadav, Julian L Webber, Abolfazl Mehbodniya

Knee ailments such as meniscus injuries affect millions of people globally, with research showing that more than 14% of the population over 40 years of age lives with meniscus-related conditions. Conventional diagnostic techniques, such as manual MRI interpretation, are labour-intensive, error-prone, and dependent on skilled radiologists, making an automatic and more accurate alternative indispensable. Current deep-learning solutions depend heavily on CNNs, which perform poorly at capturing long-range dependencies and global contextual information.

A Lightweight Multi-Stage Visual Detection Approach for Complex Traffic Scenes.

Sensors (Basel)

August 2025

School of Electronic Information and Electrical Engineering, Yangtze University, Jingzhou 434023, China.

Xuanyi Zhao, Xiaohan Dou, Jihong Zheng, Gengpei Zhang

In complex traffic environments, image degradation due to adverse factors such as haze, low illumination, and occlusion significantly compromises the performance of object detection systems in recognizing vehicles and pedestrians. To address these challenges, this paper proposes a robust visual detection framework that integrates multi-stage image enhancement with a lightweight detection architecture. Specifically, an image preprocessing module incorporating ConvIR and CIDNet is designed to perform defogging and illumination enhancement, thereby substantially improving the perceptual quality of degraded inputs.

Fast Anomaly Detection for Vision-Based Industrial Inspection Using Cascades of Null Subspace PCA Detectors.

Sensors (Basel)

August 2025

Department of Electrical and Computer Engineering, College of Engineering, King Abdulaziz University, Jeddah 21589, Saudi Arabia.

Muhammad Bilal, Muhammad Shehzad Hanif

Anomaly detection in industrial imaging is critical for ensuring quality and reliability in automated manufacturing processes. Although several recently reported methods have demonstrated impressive detection performance on standard benchmarks, they rely on computationally intensive CNN architectures and post-processing techniques, which requires access to high-end GPU hardware and limits practical deployment in resource-constrained settings. In this study, we introduce a novel anomaly detection framework that leverages feature maps from a lightweight convolutional neural network (CNN) backbone, MobileNetV2, and cascaded detection to achieve notable accuracy as well as computational efficiency.

Multispectral polarization image demosaicing using redundant Stokes representation.

Appl Opt

February 2025

Kazuma Shinoda, Tomoharu Ishiuchi

This paper proposes a deep-learning-based demosaicing algorithm, multispectral polarization demosaicing with redundant Stokes (MPD-RS), designed for multispectral polarization filter arrays. The proposed MPD-RS effectively learns the correlation across spatial, spectral, and polarization domains, utilizing a newly constructed dataset of multispectral polarization images (MSPIs). Initially, MPD-RS performs interpolation using a position-variant convolutional kernel to generate a preliminary MSPI.

Realization of extremely narrow divergence angle and ground test method toward quantum key distribution based on a medium-high orbit satellite.

Appl Opt

July 2025

Jincai Wu, Chenglin Zhou, Yongjian Tan, Zhihua Song, Zhiping He

The medium-high orbit quantum science experimental satellite is designed to conduct quantum communication experiments over distances of 10,000 km, both during daylight and at night, in order to establish a global quantum communication network and enable multiple quantum experiments. To ensure efficient links over such vast distances, a beam divergence of 3 µrad at 850 nm is required, posing significant challenges in designing lightweight, large-aperture telescope systems and detecting ultra-narrow divergence angles. Here, through analysis using a far-field diffraction model, we determined that the optimal aperture should be 660 mm and that the RMS wavefront aberration must be controlled to 1/9 at 632.
