Diversify and Conquer: Open-Set Disagreement for Robust Semi-Supervised Learning With Outliers.

Heejo Kong , Sung-Jin Kim , Gunho Jung , Seong-Whan Lee

IEEE Trans Neural Netw Learn Syst

Published: June 2025

Conventional semi-supervised learning (SSL) assumes that labeled and unlabeled data share an identical class distribution; in practice, however, this assumption is easily violated, as unlabeled data often include samples from unknown classes, i.e., outliers. These outliers are treated as noise, considerably degrading the performance of SSL models. To address this drawback, we propose a novel framework, diversify and conquer (DAC), to enhance SSL robustness in the context of open-set SSL (OSSL). In particular, we note that existing OSSL methods rely on prediction discrepancies between inliers and outliers from a single model trained on labeled data. This approach can easily fail when the labeled data are insufficient, leading to performance that is worse than that of naive SSL methods that do not account for outliers. In contrast, our approach exploits prediction disagreements among multiple models that are differently biased toward the unlabeled distribution. By leveraging the discrepancies arising from training on unlabeled data, our method enables robust outlier detection, even when the labeled data are underspecified. Our key contribution is constructing a collection of differently biased models through a single training process. By encouraging divergent heads to be differently biased toward outliers while making consistent predictions for inliers, we exploit the disagreement among these heads as a measure to identify unknown concepts. Extensive experiments demonstrate that our method significantly surpasses state-of-the-art OSSL methods across various protocols.
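
The idea of using head disagreement as an outlier score can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how disagreement is measured, so the mean pairwise L1 distance between the heads' class probabilities, the function names `disagreement_scores` and `flag_outliers`, and the threshold value are all assumptions made for illustration.

```python
from itertools import combinations
from typing import List, Sequence

# head_probs[h][n][c]: probability that head h assigns to class c for sample n.
HeadProbs = Sequence[Sequence[Sequence[float]]]


def disagreement_scores(head_probs: HeadProbs) -> List[float]:
    """Mean pairwise L1 distance between heads, per sample (assumed measure)."""
    num_heads = len(head_probs)
    if num_heads < 2:
        raise ValueError("at least two heads are needed to measure disagreement")
    num_samples = len(head_probs[0])
    pairs = list(combinations(range(num_heads), 2))
    scores = []
    for n in range(num_samples):
        total = 0.0
        for i, j in pairs:
            # L1 distance between the two heads' class distributions.
            total += sum(
                abs(p - q) for p, q in zip(head_probs[i][n], head_probs[j][n])
            )
        scores.append(total / len(pairs))
    return scores


def flag_outliers(head_probs: HeadProbs, threshold: float = 0.5) -> List[bool]:
    """Flag samples whose disagreement exceeds a hypothetical threshold."""
    return [score > threshold for score in disagreement_scores(head_probs)]


# Toy data: three heads, two samples, two classes.
# The heads agree on sample 0 (inlier-like) and diverge on sample 1.
head_probs = [
    [[0.9, 0.1], [0.9, 0.1]],  # head 0
    [[0.9, 0.1], [0.1, 0.9]],  # head 1
    [[0.8, 0.2], [0.5, 0.5]],  # head 2
]

if __name__ == "__main__":
    print(disagreement_scores(head_probs))
    print(flag_outliers(head_probs))
```

On this toy input, only the second sample is flagged, since the heads' predictions for it diverge while they nearly coincide on the first. In practice the threshold would have to be tuned or replaced by whatever criterion the full paper defines.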

DOI: http://dx.doi.org/10.1109/TNNLS.2025.3547801

Publication Analysis

Top Keywords: unlabeled data, labeled data, differently biased, diversify conquer, semi-supervised learning, ossl methods, data, outliers, ssl, conquer open-set

Similar Publications

Using Medication Dispensation Data to Identify Clusters with Similar Prescribing Patterns in Older Adults Living with Dementia.

Drugs Aging

September 2025

Dalla Lana School of Public Health, University of Toronto, V1 06, 2075 Bayview Avenue, Toronto, ON, M4N 3M5, Canada.

Abby Emdin , Therese A Stukel , Jennifer Bethell , Xuesong Wang , Andrea Iaboni

Background And Objectives: Older adults living with dementia are a heterogeneous group, which can make studying optimal medication management challenging. Unsupervised machine learning is a group of computing methods that rely on unlabeled data, that is, data in which the algorithm itself discovers patterns without the need for researchers to label the data with a known outcome. These methods may help us to better understand complex prescribing patterns in this population.

Multimodal self-supervised retinal vessel segmentation.

Neural Netw

September 2025

Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ), Shenzhen, China.

Pengshuai Yin , Jingqi Zhang , Huichou Huang , Ruirui Liu , Yanxia Liu

Automatic segmentation of retinal vessels from retinography images is crucial for timely clinical diagnosis. However, the high cost and specialized expertise required for annotating medical images often result in limited labeled datasets, which constrains the full potential of deep learning methods. Recent advances in self-supervised pretraining using unlabeled data have shown significant benefits for downstream tasks.

Enhancing Genetic Risk Prediction through Federated Semi-Supervised Transfer Learning with Inaccurate Electronic Health Record Data.

Stat Biosci

August 2024

Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA.

Yuying Lu , Tian Gu , Rui Duan

Large-scale genomics data combined with Electronic Health Records (EHRs) illuminate the path towards personalized disease management and enhanced medical interventions. However, the absence of "gold standard" disease labels makes the development of machine learning models a challenging task. Additionally, imbalances in demographic representation within datasets compromise the development of unbiased healthcare solutions.

Multimodal deep learning methods for speech and language rehabilitation: a cross-sectional observational study.

Disabil Rehabil Assist Technol

September 2025

School of Foreign Languages, Ningbo University of Technology, Ningbo, China.

Xinqiao Cen

Speech and language rehabilitation is essential for people with communication disorders arising from neurological conditions, developmental delays, or physical disabilities. With the advent of deep learning, we introduce an improved multimodal rehabilitation pipeline that incorporates audio, video, and text information to provide therapy that adapts to the individual patient. The technique uses a hierarchical multimodal transformer with cross-attention fusion, which allows it to jointly model speech acoustics, facial dynamics, lip articulation, and linguistic context.

Data-Driven Insights in Single-Molecule Break Junction Studies: A Comprehensive Review of the Data Analysis Methods.

Langmuir

September 2025

School of Artificial Intelligence, Guilin University of Electronic Technology, Guilin 541004, P. R. China.

Zhichao Pan , Yiheng Zhao , Ziyang Wang , Hengzhi Huang

Single-molecule electronics has emerged as a transformative field at the intersection of chemistry, physics, and nanotechnology, enabling the direct probing of charge transport phenomena at the molecular scale. The break junction technique, which measures conductance across metal-molecule-metal junctions, has become a cornerstone for studying single-molecule dynamics and quantum transport. However, interpreting the large-scale unlabeled conductance traces poses significant challenges.
