Pay more attention to the robustness of LLMs on adversarial prompt for instruction data mining.

Neural Netw

National Key Laboratory of Parallel and Distributed Computing, College of Computer Science and Technology, National University of Defense Technology, Hunan Changsha, 410073, China. Electronic address:

Published: August 2025


Category Ranking: 98%
Total Visits: 921
Avg Visit Duration: 2 minutes
Citations: 20

Article Abstract

Instruction tuning has emerged as a paramount method for tailoring the behaviors of LLMs. Recent studies have unveiled the potential for LLMs to achieve high performance through fine-tuning with a limited quantity of high-quality instruction data. Instruction-Following Difficulty is one of the most representative approaches in instruction data mining; it selects, as high-quality instruction data, the samples for which LLMs fail to generate responses that align with the provided instructions. Building upon this approach, we further investigate how the robustness of LLMs to adversarial prompts influences the selection of high-quality instruction data. This paper proposes a pioneering framework of high-quality instruction data mining for instruction tuning, focusing on the impact of LLMs' robustness to adversarial prompts. Our notable innovation is to generate adversarial instruction data by attacking the prompts associated with instruction samples. Then, we introduce an Adversarial Instruction-Following Difficulty (AIFD) metric, which utilizes complete instruction sample pairs to identify samples with high adversarial instruction difficulty as high-quality instruction data. In addition, to address cases where LLM responses deviate from user intent, we further introduce a novel Adversarial Instruction Output Embedding Consistency (AIOEC) method that relies solely on instruction prompts to mine high-quality online instruction data. We conduct extensive experiments on two benchmark datasets to assess the performance. The experimental results demonstrate the effectiveness of our two proposed methods. Moreover, the results underscore the critical practical significance of considering the robustness of LLMs to adversarial prompts for instruction data mining.
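To make the selection idea concrete, the following is a minimal sketch of difficulty-based sample selection. It assumes the commonly cited formulation of Instruction-Following Difficulty as the ratio between the average loss of the response given the instruction and the average loss of the response alone; the abstract does not state this formula, and it does not define AIFD or AIOEC, so none of the code below should be read as the paper's method. The function names and the log-probability inputs are hypothetical, and the per-token log-probabilities are assumed to come from whatever model scores the samples.

```python
from typing import Dict, List, Sequence


def mean_nll(token_logprobs: Sequence[float]) -> float:
    """Average negative log-likelihood over the response tokens."""
    if not token_logprobs:
        raise ValueError("token_logprobs must not be empty")
    return -sum(token_logprobs) / len(token_logprobs)


def ifd_score(cond_logprobs: Sequence[float],
              uncond_logprobs: Sequence[float]) -> float:
    """Assumed IFD-style ratio: loss(response | instruction) / loss(response).

    A higher value means the instruction helps the model less, which
    this sketch treats as a harder sample.
    """
    uncond = mean_nll(uncond_logprobs)
    if uncond == 0.0:
        raise ValueError("unconditioned loss is zero; ratio is undefined")
    return mean_nll(cond_logprobs) / uncond


def select_top_k(scores: Dict[str, float], k: int) -> List[str]:
    """Return the ids of the k samples with the highest difficulty score."""
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

An adversarial variant could, under the same assumptions, compute the same ratio with the log-probabilities obtained from a perturbed prompt and compare it with the clean score; how the paper actually combines the two is not stated in the abstract.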


Source
http://dx.doi.org/10.1016/j.neunet.2025.107989

Publication Analysis

Top Keywords

instruction data: 40
high-quality instruction: 20
instruction: 17
data mining: 16
robustness llms: 12
llms adversarial: 12
adversarial prompts: 12
adversarial instruction: 12
data: 10
adversarial: 8
