Pay more attention to the robustness of LLMs on adversarial prompt for instruction data mining.

Qiang Wang , Dawei Feng , Xu Zhang , Ao Shen , Yang Xu , Bo Ding , Huaimin Wang

Neural Netw

National Key Laboratory of Parallel and Distributed Computing, College of Computer Science and Technology, National University of Defense Technology, Hunan Changsha, 410073, China.

Published: August 2025

Instruction tuning has emerged as a key method for tailoring the behavior of LLMs. Recent studies have shown that LLMs can achieve high performance when fine-tuned on a small quantity of high-quality instruction data. Instruction-Following Difficulty (IFD) is one of the most representative approaches to instruction data mining: it selects, as high-quality instruction data, the samples for which the LLM fails to generate responses that align with the provided instructions. Building upon this approach, we investigate how the robustness of LLMs to adversarial prompts influences the selection of high-quality instruction data. This paper proposes a framework for high-quality instruction data mining for instruction tuning that focuses on the impact of LLMs' robustness to adversarial prompts. Our main innovation is to generate adversarial instruction data by attacking the prompts associated with instruction samples. We then introduce an Adversarial Instruction-Following Difficulty (AIFD) metric, which uses complete instruction sample pairs to identify samples with high adversarial instruction difficulty as high-quality instruction data. In addition, to address cases where LLM responses deviate from user intent, we introduce a novel Adversarial Instruction Output Embedding Consistency (AIOEC) method that relies solely on instruction prompts to mine high-quality online instruction data. We conduct extensive experiments on two benchmark datasets. The experimental results demonstrate the effectiveness of the two proposed methods and underscore the practical significance of considering the robustness of LLMs to adversarial prompts in instruction data mining.
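
The abstract does not give formulas for IFD or AIFD, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes the commonly cited formulation of an IFD-style score as the ratio between the model's average loss on a response given the instruction and its average loss on the response alone. Scoring the same ratio under an attacked prompt is our assumption about what an adversarial variant might look like. All function names and the toy log-probabilities are hypothetical; in practice the per-token log-probabilities would come from an LLM.

```python
from typing import Dict, List, Sequence


def mean_nll(token_logprobs: Sequence[float]) -> float:
    """Average negative log-likelihood over the response tokens."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return -sum(token_logprobs) / len(token_logprobs)


def ifd_style_ratio(cond_logprobs: Sequence[float],
                    uncond_logprobs: Sequence[float]) -> float:
    """Loss of the response given the prompt, divided by the loss of the
    response with no prompt. Higher values mean the prompt helps less."""
    base = mean_nll(uncond_logprobs)
    if base == 0.0:
        raise ValueError("unconditioned loss is zero; ratio is undefined")
    return mean_nll(cond_logprobs) / base


def rank_by_difficulty(scores: Dict[str, float], k: int) -> List[str]:
    """Return the ids of the k samples with the highest scores."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy per-token log-probabilities (made up for illustration only).
response_alone = [-2.0, -2.0]         # response scored without any prompt
given_clean_prompt = [-0.5, -0.5]     # response scored after the original prompt
given_attacked_prompt = [-1.5, -1.5]  # response scored after a perturbed prompt

clean_score = ifd_style_ratio(given_clean_prompt, response_alone)
attacked_score = ifd_style_ratio(given_attacked_prompt, response_alone)
print(clean_score, attacked_score)
```

In this toy case the attacked prompt yields a higher ratio than the clean one, which is the kind of signal a robustness-aware selection rule could rank on with `rank_by_difficulty`.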

DOI: http://dx.doi.org/10.1016/j.neunet.2025.107989
