MMT: Cross Domain Few-Shot Learning via Meta-Memory Transfer.

Wenjian Wang, Lijuan Duan, Yuxi Wang, Junsong Fan, Zhaoxiang Zhang

IEEE Trans Pattern Anal Mach Intell

Published: December 2023

Few-shot learning aims to recognize novel categories solely relying on a few labeled samples, with existing few-shot methods primarily focusing on the categories sampled from the same distribution. Nevertheless, this assumption cannot always be ensured, and the actual domain shift problem significantly reduces the performance of few-shot learning. To remedy this problem, we investigate an interesting and challenging cross-domain few-shot learning task, where the training and testing tasks employ different domains. Specifically, we propose a Meta-Memory scheme to bridge the domain gap between source and target domains, leveraging style-memory and content-memory components. The former stores intra-domain style information from source domain instances and provides a richer feature distribution. The latter stores semantic information through exploration of knowledge of different categories. Under the contrastive learning strategy, our model effectively alleviates the cross-domain problem in few-shot learning. Extensive experiments demonstrate that our proposed method achieves state-of-the-art performance on cross-domain few-shot semantic segmentation tasks on the COCO-20^i, PASCAL-5^i, FSS-1000, and SUIM datasets and positively affects few-shot classification tasks on Meta-Dataset.
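The abstract does not say how the style-memory is implemented. As a rough illustration only, the sketch below assumes that "style" can be summarized by per-channel mean and standard deviation of a feature map (a common convention in AdaIN-style feature statistics, not something the abstract confirms). The class `StyleMemory`, its methods, and the first-in-first-out eviction rule are all hypothetical names and choices made for this example; it is not the authors' method.

```python
import random
from statistics import fmean, pstdev

EPS = 1e-5  # assumed constant; avoids division by zero for constant channels


def channel_stats(feature):
    """Return per-channel (mean, std) for a feature given as a list of channels."""
    return [(fmean(ch), pstdev(ch) + EPS) for ch in feature]


class StyleMemory:
    """Hypothetical fixed-size bank of per-channel style statistics."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = []

    def write(self, feature):
        # Store only the statistics (the assumed "style"), not the feature itself.
        self.slots.append(channel_stats(feature))
        if len(self.slots) > self.capacity:
            self.slots.pop(0)  # drop the oldest entry

    def restyle(self, feature, rng=random):
        # Normalise each channel, then re-apply a randomly chosen stored style.
        style = rng.choice(self.slots)
        out = []
        for ch, (mu, sigma), (new_mu, new_sigma) in zip(
            feature, channel_stats(feature), style
        ):
            out.append([(x - mu) / sigma * new_sigma + new_mu for x in ch])
        return out


# Usage: two channels, each flattened to two spatial positions.
memory = StyleMemory(capacity=2)
memory.write([[0.0, 2.0], [10.0, 14.0]])
restyled = memory.restyle([[1.0, 3.0], [5.0, 5.0]])
```

In this toy setup, re-styling keeps the relative layout of values within each channel while shifting the channel statistics toward a stored source style, which is one plausible reading of "provides a richer feature distribution".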

DOI: http://dx.doi.org/10.1109/TPAMI.2023.3306352

Similar Publications

Few-shot learning for highly accelerated 3D time-of-flight MRA reconstruction.

Magn Reson Med

September 2025

Centre for Integrative Neuroimaging, FMRIB Division, Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, UK.

Hao Li, Mark Chiew, Iulius Dragonu, Peter Jezzard, Thomas W Okell

Purpose: To develop a deep learning-based reconstruction method for highly accelerated 3D time-of-flight MRA (TOF-MRA) that achieves high-quality reconstruction with robust generalization using extremely limited acquired raw data, addressing the challenge of time-consuming acquisition of high-resolution, whole-head angiograms.

Methods: A novel few-shot learning-based reconstruction framework is proposed, featuring a 3D variational network specifically designed for 3D TOF-MRA that is pre-trained on simulated complex-valued, multi-coil raw k-space datasets synthesized from diverse open-source magnitude images and fine-tuned using only two single-slab experimentally acquired datasets. The proposed approach was evaluated against existing methods on acquired retrospectively undersampled in vivo k-space data from five healthy volunteers and on prospectively undersampled data from two additional subjects.

PM: A new prompting multi-modal model paradigm for few-shot medical image classification.

Comput Methods Programs Biomed

September 2025

Key Laboratory of Social Computing and Cognitive Intelligence (Ministry of Education), Dalian University of Technology, Dalian, 116024, China; School of Computer Science and Technology, Dalian University of Technology, Dalian, 116024, China. Electronic address:

Zhenwei Wang, Qiule Sun, Bingbing Zhang, Pengfei Wang, Jianxin Zhang

Background And Objective: Few-shot learning has emerged as a key technological solution to address challenges such as limited data and the difficulty of acquiring annotations in medical image classification. However, relying solely on a single image modality is insufficient to capture conceptual categories. Therefore, medical image classification requires a comprehensive approach to capture conceptual category information that aids in the interpretation of image content.

Guideline adherence in surgical decisions for T1 colorectal cancer after endoscopic resection: large language models vs clinicians.

Int J Surg

September 2025

Digestive Endoscopy Center, Shanghai Tenth People's Hospital, Tongji University School of Medicine, Shanghai, China.

Liangtang Zeng, Cao Qinxing, Junyuan Deng, Junnan Hu, Minghui Pang

Background: Patients with T1 colorectal cancer (CRC) often show poor adherence to guideline-recommended treatment strategies after endoscopic resection. To address this challenge and improve clinical decision-making, this study aims to compare the accuracy of surgical management recommendations between large language models (LLMs) and clinicians.

Methods: This retrospective study enrolled 202 patients with T1 CRC who underwent endoscopic resection at three hospitals.

Performance of vision language models for optic disc swelling identification on fundus photographs.

Front Digit Health

August 2025

Department of Ophthalmology, Stanford University, Palo Alto, CA, United States.

Kelvin Zhenghao Li, Tuyet Thao Nguyen, Heather E Moss

Introduction: Vision language models (VLMs) combine image analysis capabilities with large language models (LLMs). Because of their multimodal capabilities, VLMs offer a clinical advantage over image classification models for the diagnosis of optic disc swelling by allowing a consideration of clinical context. In this study, we compare the performance of non-specialty-trained VLMs with different prompts in the classification of optic disc swelling on fundus photographs.

Do large language models learn like humans: Interleaved and spaced practice in morphological learning.

Acta Psychol (Amst)

September 2025

Shanghai Jiao Tong University, China. Electronic address:

Ying Xiong, Shiyu Wu

This study investigates fundamental differences in the acquisition of morphological patterns by humans and large language models (LLMs) within an artificial language learning paradigm. Specifically, it compares how each system responds to variations in input structure (blocked versus interleaved sequences and juxtaposed versus spaced presentation) across verb classification and inflection tasks. While LLMs (GPT4mini, DeepSeek_V3, Llama3.
