Category Ranking: 98%
Total Visits: 921
Avg Visit Duration: 2 minutes
Citations: 20

Article Abstract

Breast cancer is the most common invasive cancer in women and the second leading cause of cancer death in females, and it can be classified as benign or malignant. Breast cancer research and prevention have attracted growing attention from researchers in recent years. At the same time, the development of data mining methods provides an effective way to extract useful information from complex databases, from which predictions, classifications, and clusterings can be made. The generic notion of knowledge distillation is that a network of higher capacity acts as a teacher and a network of lower capacity acts as a student, and several knowledge distillation pipelines are known. However, previous work on knowledge distillation with label smoothing regularization presents experiments and results that break this general notion, showing that knowledge distillation also works when a student model distills a teacher model, i.e., reverse knowledge distillation, and even that a poorly trained teacher model can train a student model to reach equivalent results. Building on the ideas from those works, we propose a novel bilateral knowledge distillation regime that enables multiple interactions between teacher and student models, i.e., teaching and distilling each other and eventually improving each other's performance, and we evaluate our results on the BACH histopathology image dataset for breast cancer. The ResNeXt29 and MobileNetV2 models, pretrained and already tested on the ImageNet dataset, are used for transfer learning on our dataset, and we obtain a final accuracy of more than 96% using this novel bilateral KD approach.
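Illustrative sketch (not from the article): the bilateral regime described above alternates teacher and student roles so the two networks teach and distill each other within a training round. The PyTorch code below is a minimal sketch of that idea under stated assumptions: torchvision's ResNeXt-50 and MobileNetV2 stand in for the paper's ResNeXt29/MobileNetV2 pair (the paper starts from ImageNet-pretrained weights, omitted here so the sketch runs without downloads), the four output classes match BACH's categories, and the temperature, loss weighting, and optimizer settings are illustrative choices rather than the authors' values.

```python
import torch
import torch.nn.functional as F
from torch import optim
from torchvision import models

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    # Soft term: KL divergence between temperature-softened distributions,
    # scaled by T*T as is conventional, plus a hard cross-entropy term.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

# Two networks of different capacity; each alternately plays teacher and student.
net_a = models.resnext50_32x4d(num_classes=4)   # stand-in for the larger network
net_b = models.mobilenet_v2(num_classes=4)      # 4 classes, as in the BACH dataset
opt_a = optim.SGD(net_a.parameters(), lr=0.01, momentum=0.9)
opt_b = optim.SGD(net_b.parameters(), lr=0.01, momentum=0.9)

def bilateral_step(images, labels):
    # Phase 1: net_a teaches net_b (net_a's logits serve as frozen targets).
    logits_a = net_a(images)
    loss_b = distillation_loss(net_b(images), logits_a.detach(), labels)
    opt_b.zero_grad()
    loss_b.backward()
    opt_b.step()

    # Phase 2: the updated net_b teaches net_a (the reverse direction).
    logits_b = net_b(images)
    loss_a = distillation_loss(logits_a, logits_b.detach(), labels)
    opt_a.zero_grad()
    loss_a.backward()
    opt_a.step()

# Example usage with a dummy batch shaped like 224x224 histopathology crops.
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, 4, (8,))
bilateral_step(images, labels)
```

Detaching the peer's logits in each phase keeps the two updates independent, which mirrors the alternating teach/distill interaction described in the abstract.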

Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8550839 (PMC)
http://dx.doi.org/10.1155/2021/4019358 (DOI Listing)

Publication Analysis

Top Keywords

knowledge distillation: 28
breast cancer: 16
novel approach: 8
bilateral knowledge: 8
distillation label: 8
label smoothing: 8
smoothing regularization: 8
capacity acts: 8
student model: 8
teacher model: 8

Similar Publications

Data-free knowledge distillation via text-noise fusion and dynamic adversarial temperature.

Neural Netw

September 2025

School of Computer Science, South China Normal University, Guangzhou, 510631, Guangdong, China; School of Artificial Intelligence, South China Normal University, Foshan, 528225, Guangdong, China.

Data-Free Knowledge Distillation (DFKD) has achieved significant breakthroughs, enabling the effective transfer of knowledge from teacher neural networks to student neural networks without reliance on original data. However, a significant challenge faced by existing methods that attempt to generate samples from random noise is that the noise lacks meaningful information, such as class-specific semantic information. Consequently, the absence of meaningful information makes it difficult for the generator to map this noise to the ground-truth data distribution, resulting in the generation of low-quality training samples.

View Article and Find Full Text PDF

Melioristic Gerontology: Using Pragmatism to Reframe the Study of Aging.

Gerontologist

September 2025

Graduate Center for Gerontology, University of Kentucky, Lexington, KY, USA.

Aging populations in places around the globe face looming challenges from large-scale mega-trends. Gerontology needs to develop approaches for helping older people and their communities respond and share knowledge from those approaches. Based in the philosophy of pragmatism, we make a case for a 'melioristic gerontology' to focus gerontologists on those needs.

View Article and Find Full Text PDF

The spectacular success of training large models on extensive datasets highlights the potential of scaling up for exceptional performance. To deploy these models on edge devices, knowledge distillation (KD) is commonly used to create a compact model from a larger, pretrained teacher model. However, as models and datasets rapidly scale up in practical applications, it is crucial to consider the applicability of existing KD approaches originally designed for limited-capacity architectures and small-scale datasets.

View Article and Find Full Text PDF

Objectives: To gain an in-depth understanding of the real support priorities and perceptions of caregivers of individuals receiving care with end-stage heart failure regarding hospice care.

Design: A qualitative descriptive approach was employed.

Participants And Setting: Using a purposive sampling approach, 16 primary caregivers of individuals receiving care with end-stage heart failure from a tertiary hospital in Hangzhou, Zhejiang province, were selected as interview participants.

View Article and Find Full Text PDF

Knowledge distillation (KD) aims to transfer knowledge from a large-scale teacher model to a lightweight one, significantly reducing computational and storage requirements. However, the inherent learning capacity gap between the teacher and student often hinders the sufficient transfer of knowledge, motivating numerous studies to address this challenge. Inspired by the progressive approximation principle in the Stone-Weierstrass theorem, we propose expandable residual approximation (ERA), a novel KD method that decomposes the approximation of residual knowledge into multiple steps, reducing the difficulty of mimicking the teacher's representation through a divide-and-conquer approach.

View Article and Find Full Text PDF
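
Illustrative sketch (not from any of the listed papers): the last snippet above describes expandable residual approximation as decomposing the approximation of residual knowledge into multiple steps. The Python code below sketches that general divide-and-conquer idea under loose assumptions; the feature dimension, number of steps, linear heads, and MSE objective are hypothetical choices, not the paper's design.

```python
# Generic multi-step residual approximation of a teacher feature, as a rough
# illustration of the divide-and-conquer idea described above. All dimensions,
# the linear heads, and the MSE objective are hypothetical.
import torch
from torch import nn
import torch.nn.functional as F

class MultiStepResidualApproximator(nn.Module):
    def __init__(self, dim=256, num_steps=3):
        super().__init__()
        # Each head is trained to mimic whatever part of the teacher's
        # representation the previous heads still leave unexplained.
        self.heads = nn.ModuleList(nn.Linear(dim, dim) for _ in range(num_steps))

    def forward(self, student_feat, teacher_feat):
        approx = torch.zeros_like(teacher_feat)
        losses = []
        for head in self.heads:
            residual = (teacher_feat - approx).detach()  # remaining gap to close
            step = head(student_feat)
            losses.append(F.mse_loss(step, residual))
            approx = approx + step.detach()
        return sum(losses)

# Example with dummy 256-d pooled features from a hypothetical student/teacher pair.
student_feat = torch.randn(8, 256)
teacher_feat = torch.randn(8, 256)
loss = MultiStepResidualApproximator()(student_feat, teacher_feat)
loss.backward()
```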