Recent developments underscore the potential of textual information to enhance learning models toward a deeper understanding of medical visual semantics. However, language-guided medical image segmentation still faces a key challenge: previous works embed textual information through implicit architectures, which yields segmentation results that are inconsistent with, and sometimes diverge significantly from, the semantics expressed by the language. To this end, we propose a novel cross-modal conditioned Reconstruction for Language-guided Medical Image Segmentation (RecLMIS) that explicitly captures cross-modal interactions, under the assumption that well-aligned medical visual features and medical notes can effectively reconstruct each other. We introduce conditioned interaction to adaptively predict patches and words of interest, which are then used as conditioning factors for mutual reconstruction to align with the regions described in the medical notes. Extensive experiments demonstrate the superiority of RecLMIS, which surpasses LViT by 3.74% mIoU on the MosMedData+ dataset and 1.89% mIoU on the QATA-CoV19 dataset, while achieving a relative reduction of 20.2% in parameter count and a 55.5% decrease in computational load. The code will be available at https://github.com/ShawnHuang497/RecLMIS.
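The "conditioned interaction plus mutual reconstruction" idea described in the abstract can be illustrated with a minimal numpy sketch. This is not the paper's actual RecLMIS implementation: the shapes, the top-k selection rule, and the attention-based reconstruction are illustrative assumptions chosen only to show the general mechanism of scoring patches/words of interest and reconstructing each modality from the other.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative shapes: 16 image patches and 8 report words in a shared 32-d space.
patches = rng.standard_normal((16, 32))
words = rng.standard_normal((8, 32))

def interest_scores(queries, keys):
    """Score each query by its strongest affinity to the other modality."""
    sim = queries @ keys.T / np.sqrt(keys.shape[1])
    return sim.max(axis=1)

def select_top_k(feats, scores, k):
    """Keep the k features predicted to be 'of interest' (illustrative rule)."""
    idx = np.argsort(scores)[-k:]
    return feats[idx]

# Conditioned interaction: predict patches and words of interest.
patch_sel = select_top_k(patches, interest_scores(patches, words), k=4)
word_sel = select_top_k(words, interest_scores(words, patches), k=4)

def reconstruct(targets, conditions):
    """Rebuild targets as attention-weighted mixtures of the other modality."""
    attn = targets @ conditions.T / np.sqrt(conditions.shape[1])
    attn = np.exp(attn - attn.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)
    return attn @ conditions

# Mutual reconstruction: each modality is reconstructed conditioned on the
# other's selected features; the residual would drive an alignment loss.
img_rec = reconstruct(patches, word_sel)
txt_rec = reconstruct(words, patch_sel)
loss = np.mean((img_rec - patches) ** 2) + np.mean((txt_rec - words) ** 2)
```

In a real model the scoring, selection, and reconstruction would be learned modules trained end to end with the segmentation objective; here they are fixed functions solely to make the data flow concrete.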
DOI: http://dx.doi.org/10.1109/TMI.2024.3523333
Comput Biol Med
August 2025
Department of Radiation Oncology, UTSW, United States of America.
Accurate prediction of head and neck cancer recurrence across medical institutions remains challenging due to inherent domain shifts in imaging data. Current domain generalization methods primarily focus on learning domain-invariant features from medical images, often overlooking structured clinical information that inherently exhibits cross-institutional consistency. To leverage clinical data and enhance the model's generalization, we propose an end-to-end Language-Guided Multimodal Domain Generalization (LGMDG) method.
J Biomed Inform
May 2025
School of Biomedical Engineering, Capital Medical University, Beijing, China; Laboratory for Clinical Medicine, Capital Medical University, Beijing, China; Beijing Key Laboratory of Fundamental Research on Biomechanics in Clinical Application, Capital Medical University, Beijing, China.
Medical Visual Question Answering (Med-VQA) is a critical multimodal task with the potential to address the scarcity and imbalance of medical resources. However, most existing studies overlook the limitations of the inconsistency in information density between medical images and text, as well as the long-tail distribution in datasets, which continue to make Med-VQA an open challenge. To overcome these issues, this study proposes a Language-Guided Progressive Fusion Network (LGPFN) with three key modules: Question-Guided Progressive Multimodal Fusion (QPMF), Language-Gate Mechanism (LGM), and Triple Semantic Feature Alignment (TriSFA).
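The question-guided, gated fusion that this abstract outlines can be sketched in a few lines of numpy. This is a hedged illustration, not the LGPFN architecture itself: the number of stages, the softmax attention, and the scalar sigmoid gate are all assumptions standing in for the paper's learned QPMF and LGM modules.

```python
import numpy as np

rng = np.random.default_rng(1)

q = rng.standard_normal(32)  # question embedding (illustrative 32-d)
# Three stages of image features, mimicking progressive multimodal fusion.
visual_stages = [rng.standard_normal((16, 32)) for _ in range(3)]

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

fused = np.zeros(32)
for feats in visual_stages:
    # Question-guided attention over this stage's 16 visual regions.
    attn = feats @ q / np.sqrt(32)
    attn = np.exp(attn - attn.max())
    attn /= attn.sum()
    stage_vec = attn @ feats
    # Language gate: a scalar in (0, 1) from question/stage agreement
    # decides how much of this stage's summary enters the fused vector.
    gate = sigmoid(stage_vec @ q / np.sqrt(32))
    fused = fused + gate * stage_vec
```

In the actual network the gate and attention would be parameterized layers trained jointly with the answer head; the loop above only makes the "progressive, language-gated" control flow concrete.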
IEEE Trans Med Imaging
April 2025
Nat Methods
July 2023
Center for Quantitative Cell Imaging, University of Wisconsin-Madison, Madison, WI, USA.
We dream of a future where light microscopes have new capabilities: language-guided image acquisition, automatic image analysis based on extensive prior training from biologist experts, and language-guided image analysis for custom analyses. Most of these capabilities have reached the proof-of-principle stage, but their implementation would be accelerated by efforts to gather appropriate training sets and to build user-friendly interfaces.
JAMA Netw Open
August 2020
Health Psychology and Clinical Science Program, The Graduate Center, City University of New York, New York.
Importance: Youth living with HIV make up one-quarter of new infections and have high rates of risk behaviors but are significantly understudied. Effectiveness trials in real-world settings are needed to inform program delivery.
Objective: To compare the effectiveness of the Healthy Choices intervention delivered in a home or community setting vs a medical clinic.