Mitigating catastrophic forgetting remains a fundamental challenge in incremental learning. This paper identifies a key limitation of the widely used softmax cross-entropy distillation loss: its inherent non-identifiability. To address this issue, we propose two complementary strategies: (1) adopting an imbalance-invariant distillation loss to mitigate the adverse effect of imbalanced weights during distillation, and (2) regularizing the original prediction/distillation loss with shift-sensitive alternatives, which render the optimization problem identifiable and proactively prevent imbalance from arising. These strategies form the foundation of five novel approaches that can be seamlessly integrated into existing distillation-based incremental learning frameworks such as LWF, LWM, and LUCIR. We validate the effectiveness of our approaches through extensive numerical experiments, demonstrating consistent improvements in predictive accuracy and substantial reductions in forgetting. For example, in a 10-task incremental learning setting on CIFAR-100, our methods improve the average accuracy of three widely used approaches - LWF, LWM, and LUCIR - by 11.8%, 11.5%, and 12.8%, respectively, while reducing their average forgetting rates by 16.5%, 16.8%, and 13.8%, respectively. Our code is publicly available at https://github.com/nexais/RethinkSoftmax.
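The non-identifiability mentioned in the abstract can be illustrated with a well-known property of softmax: adding the same constant to every logit leaves the output probabilities unchanged, so a softmax cross-entropy distillation loss cannot tell shifted logits apart. The sketch below is only an illustration of that property, not the authors' implementation. The function names, the example logits, and the mean-squared-error penalty on raw logits (used here as a stand-in for a "shift-sensitive" term) are all assumptions made for this example.

```python
import math

def softmax(logits):
    # Subtract the maximum for numerical stability; the result is unchanged.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_ce(teacher_logits, student_logits):
    # Cross-entropy between the teacher's and the student's softmax outputs.
    p = softmax(teacher_logits)
    q = softmax(student_logits)
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))

def logit_mse(teacher_logits, student_logits):
    # Hypothetical shift-sensitive penalty: mean squared error on raw logits.
    # This is an assumed stand-in, not a loss taken from the paper.
    n = len(teacher_logits)
    return sum((t - s) ** 2 for t, s in zip(teacher_logits, student_logits)) / n

# Illustrative logits (made up for this example).
teacher = [2.0, 0.5, -1.0]
student = [1.5, 0.7, -0.8]
shifted_student = [s + 5.0 for s in student]  # same constant added to every logit

ce_plain = distillation_ce(teacher, student)
ce_shifted = distillation_ce(teacher, shifted_student)
mse_plain = logit_mse(teacher, student)
mse_shifted = logit_mse(teacher, shifted_student)

print("CE distillation, original vs shifted:", ce_plain, ce_shifted)
print("Logit MSE, original vs shifted:", mse_plain, mse_shifted)
```

In this sketch the cross-entropy term is the same for the original and the shifted student logits up to floating-point error, while the logit-level penalty grows with the shift. That is the sense in which adding a shift-sensitive term can make the optimization problem identifiable.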
| Download full-text PDF | Source |
|---|---|
| http://dx.doi.org/10.1016/j.neunet.2025.108017 | DOI Listing |
PLoS One
September 2025
Department of Computer Science, COMSATS University Islamabad, Sahiwal, Pakistan.
The widespread dissemination of fake news presents a critical challenge to the integrity of digital information and erodes public trust. This urgent problem necessitates the development of sophisticated and reliable automated detection mechanisms. This study addresses this gap by proposing a robust fake news detection framework centred on a transformer-based architecture.
IEEE Trans Neural Netw Learn Syst
September 2025
In industrial scenarios, semantic segmentation of surface defects is vital for identifying, localizing, and delineating defects. However, new defect types constantly emerge with product iterations or process updates. Existing defect segmentation models lack incremental learning capabilities, and direct fine-tuning (FT) often leads to catastrophic forgetting.
J Vis Exp
August 2025
Chitkara University Institute of Engineering & Technology, Chitkara University.
Emotion annotation in code-mixed languages like Hinglish (Hindi-English) presents unique challenges due to linguistic complexity and resource constraints. This study introduces a hybrid active learning framework that combines lexical rules, machine learning, and iterative expert feedback to achieve cost-efficient, high-accuracy emotion annotation. Grounded in psychological theories of emotion, including Discrete Emotions Theory and Cognitive Appraisal Theory, the framework employs bilingual emotion dictionaries (e.
IEEE Trans Neural Netw Learn Syst
September 2025
Class incremental learning (CIL) offers a promising framework for continuous fault diagnosis (CFD), allowing networks to accumulate knowledge from streaming industrial data and recognize new fault classes. However, current CIL methods assume a balanced data stream, which does not match the long-tail distribution of fault classes in real industrial scenarios. To fill this gap, this article investigates the impact of long-tail bias in the data stream on the CIL training process through experimental analysis.
Traffic Inj Prev
September 2025
Department of Civil Engineering, Sardar Vallabhbhai National Institute of Technology, Surat, India.
Objective: This study aimed to identify dynamic spatiotemporal traffic factors influencing conflict risk levels on National Highways under heterogeneous traffic conditions in India. The research addresses gaps by capturing vehicle interactions using high-resolution UAV-based trajectory data and proposes a novel two-stage methodology for real-time conflict risk evaluation, moving beyond traditional binary risk classifications to a four-level framework (High, Moderate, Low, No-Risk).
Methods: Over 40,000 conflict risk sequences were classified into four severity levels using the Modified Time-to-Collision (MTTC) surrogate safety measure.