Article Abstract

Advances in deep learning have driven extensive research validating Transformer-based, U-Net-style symmetric encoder-decoder architectures for medical image segmentation. However, standard attention computes token affinities across all spatial locations, which leads to prohibitive computational complexity and substantial memory demands. Recent work has tried to address these limitations with sparse attention, but existing approaches rely on handcrafted, content-agnostic sparsity patterns and are limited in their ability to model long-range dependencies. We propose MFFBi-Unet, a novel architecture that incorporates dynamic sparse attention through bi-level routing, allocating computation in a content-aware and more adaptable way. The encoder and decoder integrate BiFormer to improve semantic feature extraction and enable high-fidelity reconstruction of feature maps. A novel Multi-scale Feature Fusion (MFF) module in the skip connections combines multi-level contextual information with the processed multi-scale features. Extensive evaluations on multiple public medical benchmarks show consistent, significant advantages for our method. Notably, it achieves statistically significant improvements, outperforming state-of-the-art approaches such as MISSFormer by 2.02% and 1.28% in Dice score on the respective benchmarks.
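To make the idea of dynamic sparse attention through bi-level routing more concrete, the following is a minimal NumPy sketch of the general technique as it is commonly described for BiFormer: regions are first routed to their top-k most relevant regions using averaged queries and keys, and token-level attention is then computed only over the routed regions. It is an illustration under stated assumptions, not the authors' implementation. The function name `bi_level_routing_attention`, the projection matrices `wq`, `wk`, `wv`, the single attention head, and the assumption that tokens are already ordered region by region are all hypothetical simplifications; the sketch also omits any local context enhancement term.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def bi_level_routing_attention(x, num_regions, topk, wq, wk, wv):
    """Single-head sketch of bi-level routing attention (hypothetical helper).

    x: (N, C) tokens, assumed to be ordered so that each consecutive
       block of N // num_regions tokens forms one spatial region.
    wq, wk, wv: (C, C) projection matrices (illustrative, untrained).
    Returns the attended tokens (N, C) and the routing indices (R, topk).
    """
    n, c = x.shape
    t = n // num_regions  # tokens per region

    q = (x @ wq).reshape(num_regions, t, c)
    k = (x @ wk).reshape(num_regions, t, c)
    v = (x @ wv).reshape(num_regions, t, c)

    # Level 1: coarse region-to-region routing on averaged queries and keys.
    q_region = q.mean(axis=1)                  # (R, C)
    k_region = k.mean(axis=1)                  # (R, C)
    affinity = q_region @ k_region.T           # (R, R)
    routed = np.argsort(-affinity, axis=1)[:, :topk]  # (R, topk)

    # Level 2: fine token-to-token attention over the routed regions only.
    k_gathered = k[routed].reshape(num_regions, topk * t, c)
    v_gathered = v[routed].reshape(num_regions, topk * t, c)
    scores = q @ k_gathered.transpose(0, 2, 1) / np.sqrt(c)  # (R, t, topk*t)
    out = softmax(scores, axis=-1) @ v_gathered               # (R, t, C)
    return out.reshape(n, c), routed


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    tokens = rng.standard_normal((16, 8))  # 4 regions of 4 tokens, 8 channels
    wq, wk, wv = (rng.standard_normal((8, 8)) for _ in range(3))
    out, routed = bi_level_routing_attention(tokens, 4, 2, wq, wk, wv)
    print(out.shape, routed.shape)
```

In this sketch each query token attends to `topk * t` keys instead of all `N`, which is where the savings over dense attention would come from. Setting `topk` equal to `num_regions` recovers ordinary dense attention, which gives a simple sanity check.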

Source
http://dx.doi.org/10.1007/s12539-025-00740-4
