Advances in deep learning have driven extensive research validating Transformer-based, U-Net-style symmetric encoder-decoder architectures for medical image segmentation. However, because attention computes token affinities across all spatial locations, these models incur prohibitive computational complexity and substantial memory demands. Recent efforts have tried to address these limitations with sparse attention mechanisms. Yet existing approaches rely on manually designed, content-agnostic sparse attention patterns and therefore have limited capability to model long-range dependencies. We propose MFFBi-Unet, a novel architecture that incorporates dynamic sparse attention through bi-level routing, allocating computation in a content-aware and more adaptable manner. The encoder-decoder module integrates BiFormer to improve semantic feature extraction and to reconstruct high-fidelity feature maps. A novel Multi-scale Feature Fusion (MFF) module in the skip connections combines multi-level contextual information with processed multi-scale features. Extensive evaluations on multiple public medical benchmarks show that our method holds consistent advantages. Notably, it achieves statistically significant improvements, outperforming state-of-the-art approaches such as MISSFormer by 2.02% and 1.28% in Dice score on the respective benchmarks.
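As a rough illustration of the bi-level routing idea named in the abstract, the following NumPy sketch first routes at the region level and then applies token-level attention only inside the routed regions. It is a simplified sketch under stated assumptions, not the MFFBi-Unet or BiFormer implementation: the function name `bi_level_routing_attention`, the pre-partitioned `(regions, tokens, dim)` input layout, mean pooling for region descriptors, and the `topk` parameter are all illustrative choices, and learned projections, multiple heads, and any local-context branch are omitted.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def bi_level_routing_attention(q, k, v, topk):
    """Sparse attention with content-dependent region routing (illustrative).

    q, k, v: arrays of shape (num_regions, tokens_per_region, dim), i.e. the
    feature map is assumed to be already split into non-overlapping regions.
    topk: number of key/value regions each query region attends to.
    """
    n_regions, n_tokens, dim = q.shape

    # Level 1 (coarse): one descriptor per region, here a simple token mean.
    q_region = q.mean(axis=1)            # (R, dim)
    k_region = k.mean(axis=1)            # (R, dim)
    affinity = q_region @ k_region.T     # (R, R) region-to-region affinity

    # Keep the indices of the top-k most relevant regions for each region.
    routed = np.argsort(-affinity, axis=1)[:, :topk]   # (R, topk)

    # Gather the keys/values of the routed regions for every query region.
    k_gathered = k[routed].reshape(n_regions, topk * n_tokens, dim)
    v_gathered = v[routed].reshape(n_regions, topk * n_tokens, dim)

    # Level 2 (fine): token-to-token attention restricted to gathered tokens.
    scores = q @ k_gathered.transpose(0, 2, 1) / np.sqrt(dim)
    out = softmax(scores, axis=-1) @ v_gathered         # (R, tokens, dim)
    return out, routed


# Toy usage: 16 regions of 4 tokens each, 8 channels, 3 routed regions.
rng = np.random.default_rng(0)
q = rng.normal(size=(16, 4, 8))
k = rng.normal(size=(16, 4, 8))
v = rng.normal(size=(16, 4, 8))
out, routed = bi_level_routing_attention(q, k, v, topk=3)
print(out.shape, routed.shape)
```

In this sketch each query token scores only `topk * tokens_per_region` keys rather than every token in the image, which is where the savings over dense attention come from. Setting `topk` to the number of regions recovers ordinary full attention.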
DOI: http://dx.doi.org/10.1007/s12539-025-00740-4
Bioinspir Biomim
September 2025
University of Shanghai for Science and Technology, Shanghai, 200093, China.
Collective intelligence in biological groups can inspire the control of artificial complex systems, such as swarm robotics. However, modeling the social interactions between individuals remains a challenging task. Without loss of generality, we propose a deep attention network model that incorporates the principles of biological hard attention mechanisms, meaning that an individual pays attention to only one or two neighbors when making collective-motion decisions in a large group.
IEEE J Biomed Health Inform
September 2025
Accurate vascular segmentation is essential for coronary visualization and the diagnosis of coronary heart disease. This task involves the extraction of sparse tree-like vascular branches from volumetric space. However, existing methods have faced significant challenges due to discontinuous vascular segmentation and missing endpoints.
PLoS One
September 2025
School of Electrical and Information Engineering, Hunan Institute of Technology, Hengyang, Hunan, China.
Knowledge tracing can reveal students' level of knowledge in relation to their learning performance. Recently, many machine learning algorithms have been proposed for knowledge tracing and have achieved promising outcomes. However, most previous approaches cannot cope with long-sequence time-series prediction, which is more valuable than the short-sequence prediction widely used in current knowledge-tracing studies.
Photosynth Res
September 2025
College of Life Sciences, Shanghai Normal University, Shanghai, 200235, China.
Euglena sanguinea (Ehrenberg 1831) is one of the earliest reported species within the genus Euglena. Its prolific proliferation leading to red algal bloom has garnered significant scientific attention due to its ecological and environmental impacts. Despite this, research on E.
PLoS One
September 2025
School of Mechanical and Electrical Engineering, China University of Mining and Technology (Beijing), Beijing, China.
Multi-modal data fusion plays a critical role in enhancing the accuracy and robustness of perception systems for autonomous driving, especially for the detection of small objects. However, small object detection remains particularly challenging due to sparse LiDAR points and low-resolution image features, which often lead to missed or imprecise detections. Currently, many methods process LiDAR point clouds and visible-light camera images separately, and then fuse them in the detection head.