Category Ranking: 98%

Total Visits: 921

Avg Visit Duration: 2 minutes

Citations: 20

Article Abstract

Prompt tuning, a recently emerging paradigm, adapts vision-language pre-trained models to new tasks efficiently by learning "soft prompts" for frozen models. In few-shot scenarios, however, its effectiveness is limited by sensitivity to initialization and by the time-consuming search for a good initialization, which hinders rapid adaptation. Prompt tuning also risks reducing the model's generalizability by overfitting scarce training samples. To overcome these challenges, we introduce a novel Gradient-RegulAted Meta-prompt learning (GRAM) framework that jointly meta-learns an efficient soft prompt initialization for better adaptation and a lightweight gradient regulating function for strong cross-domain generalizability, using only weakly labeled image-text pre-training data. This is achieved through a Cross-Modal Hierarchical Clustering algorithm that organizes the large-scale image-text data into a structured hierarchy, enabling robust meta-learning across diverse domains. Rather than being a specific prompt tuning method, GRAM can be easily incorporated into various prompt tuning methods in a model-agnostic way and brings consistent improvements to them. Further, we consider a more practical but challenging setting, test-time prompt tuning with only unlabeled test samples, and propose an improved structure-induced gradient regulating function that leverages the structured semantics of the meta-learning data for zero-shot generalization. This approach exploits the hierarchically clustered meta-learning data to model relationships between test-time data and meta-learning prototypes, transferring invariant knowledge without explicit annotations. We also introduce a structure-complexity-informed strategy for adaptively constructing meta-training tasks and generating prototypes, which accounts for the diverse semantics within hierarchical clusters of different complexities. Comprehensive experiments demonstrate the state-of-the-art few- and zero-shot generalizability of our method.
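To make the core mechanism concrete, here is a minimal PyTorch sketch of gradient-regulated prompt tuning: a soft prompt starts from a meta-learned initialization, and its raw gradient passes through a lightweight regulating function before each inner-loop update. The class name, the elementwise-gain form of the regulator, and the toy loss are illustrative assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

class GradientRegulatedPrompt(nn.Module):
    """Sketch: a soft prompt whose update uses a regulated gradient.
    Hypothetical structure; the actual GRAM regulator may differ."""

    def __init__(self, meta_init: torch.Tensor):
        super().__init__()
        # Soft prompt starts from the meta-learned initialization.
        self.prompt = nn.Parameter(meta_init.clone())
        # Assumed regulator form: an elementwise gain on the raw
        # gradient, meta-trained once and frozen during adaptation.
        self.register_buffer("gain", torch.ones_like(meta_init))

    def adapt_step(self, loss: torch.Tensor, lr: float = 1e-3) -> None:
        """One inner-loop update on a few-shot task."""
        (grad,) = torch.autograd.grad(loss, self.prompt)
        with torch.no_grad():
            # The regulated gradient replaces the raw gradient.
            self.prompt -= lr * (self.gain * grad)

# Toy usage with a placeholder loss over a 4-token, 8-dim prompt;
# in practice the loss would come from the frozen vision-language model.
meta_init = torch.randn(4, 8)
tuner = GradientRegulatedPrompt(meta_init)
loss = (tuner.prompt ** 2).sum()
tuner.adapt_step(loss)
```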

Source
http://dx.doi.org/10.1109/TPAMI.2025.3604454

Publication Analysis

Top Keywords

prompt tuning: 20
structure-induced gradient: 8
gradient regulating: 8
regulating function: 8
meta-learning data: 8
prompt: 6
tuning: 5
meta-learning: 5
data: 5
gradient regulation: 4

Similar Publications

Temporal modeling plays an important role in effectively adapting powerful pretrained text-image foundation models to text-video retrieval. However, existing methods often rely on additional heavy trainable modules, such as transformers or BiLSTMs, which are inefficient. In contrast, we avoid introducing such heavy components by leveraging frozen foundation models.


BACKGROUND: This study used CT imaging analyzed with deep learning techniques to assess the diagnostic accuracy of lung metastasis detection in patients with breast cancer. The aim of the research was to create and verify a system for detecting malignant and metastatic lung lesions that uses YOLOv10 and transfer learning. MATERIAL AND METHODS: From January 2023 to 2024, CT scans of 16 patients with breast cancer who had confirmed lung metastases were gathered retrospectively from Erzincan Mengücek Gazi Training and Research Hospital.


Large Language Models (LLMs) show promise in augmenting digital health applications. However, the development and scaling of large models face computational constraints, data-security concerns, and limited internet accessibility in some regions. We developed and tested Med-Pal, a medical domain-specific LLM chatbot fine-tuned with a fine-grained, expert-curated medication-enquiry dataset consisting of 1,100 question-and-answer pairs.


Pay more attention to the robustness of LLMs on adversarial prompt for instruction data mining.

Neural Netw

August 2025

National Key Laboratory of Parallel and Distributed Computing, College of Computer Science and Technology, National University of Defense Technology, Changsha, Hunan, 410073, China.

Instruction tuning has emerged as a paramount method for tailoring the behavior of LLMs. Recent studies have shown that LLMs can achieve high performance through fine-tuning on a limited quantity of high-quality instruction data. Instruction-Following Difficulty is one of the most representative approaches to instruction data mining; it selects as high-quality instruction data those samples where LLMs fail to generate responses that align with the provided instructions.
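For context, one published formulation of Instruction-Following Difficulty scores a sample by comparing the model's loss on the response with and without the instruction: a ratio near or above 1 suggests the instruction did not help, marking the pair as difficult. The sketch below assumes a Hugging Face causal-LM and tokenizer pair and simplifies token accounting, so treat it as illustrative rather than this paper's exact procedure.

```python
import torch

def ifd_score(model, tokenizer, instruction: str, response: str) -> float:
    """IFD-style score: loss(response | instruction) / loss(response).
    Assumes a Hugging Face causal LM; details vary across papers."""

    def response_loss(prefix: str) -> float:
        enc = tokenizer(prefix + response, return_tensors="pt",
                        add_special_tokens=False)
        n_prefix = len(tokenizer(prefix, add_special_tokens=False).input_ids)
        labels = enc.input_ids.clone()
        labels[:, :n_prefix] = -100  # score only the response tokens
        with torch.no_grad():
            return model(**enc, labels=labels).loss.item()

    return response_loss(instruction + "\n") / response_loss("")
```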


Medical Entity Linking in Low-Resource Settings with Fine-Tuning-Free LLMs.

Stud Health Technol Inform

September 2025

Chair of Medical Informatics, Institute of AI and Informatics in Medicine (AIIM), TUM University Hospital, Technical University of Munich, Munich, Germany.

Introduction: Medical entity linking is an important task in biomedical natural language processing, aiming to align textual mentions of medical concepts with standardized concepts in ontologies. Most existing approaches rely on supervised models or domain-specific embeddings, which require large datasets and significant computational resources.

Objective: The objective of this work is (1) to investigate the effectiveness of large language models (LLMs) in improving both candidate generation and disambiguation for medical entity linking through synonym expansion and in-context learning, and (2) to evaluate this approach against traditional string-matching and supervised methods.
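As a rough sketch of such a two-stage pipeline (not the authors' implementation), the snippet below pairs fuzzy string matching for candidate generation with an LLM prompt for in-context disambiguation. The function names, the k-candidate cutoff, and the `llm` callable are all assumptions.

```python
from difflib import SequenceMatcher

def link_entity(mention: str, ontology: dict[str, str], llm, k: int = 5) -> str:
    """Hypothetical linker: `ontology` maps concept IDs to names;
    `llm` is any callable str -> str returning the chosen concept ID."""
    # Stage 1: candidate generation by fuzzy string matching.
    candidates = sorted(
        ontology.items(),
        key=lambda kv: SequenceMatcher(None, mention.lower(),
                                       kv[1].lower()).ratio(),
        reverse=True,
    )[:k]
    # Stage 2: in-context disambiguation by the LLM.
    options = "\n".join(f"{cid}: {name}" for cid, name in candidates)
    prompt = (
        f"Mention: {mention}\nCandidate concepts:\n{options}\n"
        "Answer with the single best concept ID."
    )
    return llm(prompt).strip()
```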
