FedDSS: A data-similarity approach for client selection in horizontal federated learning.

Int J Med Inform

SingHealth-Duke NUS Paediatrics Academic Clinical Programme, Duke-NUS Medical School, Singapore, 169857, Singapore; Children's Intensive Care Unit, KK Women's and Children's Hospital, Singapore, 229899, Singapore.

Published: December 2024


Article Abstract

Background And Objective: Federated learning (FL) is an emerging distributed learning framework allowing multiple clients (hospitals, institutions, smart devices, etc.) to collaboratively train a centralized machine learning model without disclosing personal data. It has the potential to address several healthcare challenges, including a lack of training data, data privacy, and security concerns. However, model learning under FL is affected by non-i.i.d. data: because data distributions vary across clients, the model can diverge severely and its performance drops. To address this problem, we propose FedDSS (Federated Data Similarity Selection), a framework that uses a data-similarity approach to select clients without compromising client data privacy.
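To make the collaborative training step concrete, the following is a minimal sketch of federated averaging (FedAvg), the baseline that FedDSS is compared against. It assumes each client returns a flat list of model parameters together with its local sample count; the function name `fedavg` and this data layout are illustrative assumptions, not details taken from the paper.

```python
from typing import List, Sequence


def fedavg(client_weights: Sequence[Sequence[float]],
           client_sizes: Sequence[int]) -> List[float]:
    """Average client parameter vectors, weighted by local sample count.

    Illustrative only: real models hold many tensors, but the weighting
    rule is the same for each parameter.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    dim = len(client_weights[0])
    global_weights = [0.0] * dim
    for weights, size in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all clients must share one parameter shape")
        for i, w in enumerate(weights):
            # Each client contributes in proportion to its data volume.
            global_weights[i] += w * size / total
    return global_weights


if __name__ == "__main__":
    # Two hypothetical clients holding 1 and 3 samples respectively.
    print(fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
```

Only parameters and sample counts leave each client, so raw records stay local. Under non-i.i.d. data the client updates can pull in different directions, which is the divergence problem that client selection aims to reduce.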

Methods: FedDSS comprises a statistical-based data similarity metric, an N-similar-neighbor network, and a network-based selection strategy. We assessed FedDSS's performance against that of FedAvg in i.i.d. and non-i.i.d. settings with two public pediatric sepsis datasets (PICD and MIMICIII). Selection fairness was measured using entropy. Simulations were repeated five times to evaluate average loss, true positive rate (TPR), and entropy.
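The abstract does not specify the similarity metric or how the neighbor network is built, so the sketch below is only one plausible reading. It assumes each client shares a vector of summary statistics (hypothetical), uses Euclidean distance as a stand-in similarity measure (an assumption), and links every client to its N closest peers. It also shows Shannon entropy over per-client selection counts as a fairness measure; the natural logarithm is an assumption, since the abstract does not state the base.

```python
import math
from typing import Dict, List, Sequence


def n_similar_neighbors(summaries: Dict[str, Sequence[float]],
                        n: int) -> Dict[str, List[str]]:
    """Link each client to the n clients with the closest summary vectors.

    Euclidean distance is a placeholder for the paper's statistical
    similarity metric, which the abstract does not describe.
    """
    network: Dict[str, List[str]] = {}
    for client, vec in summaries.items():
        ranked = sorted(
            (math.dist(vec, other_vec), other)
            for other, other_vec in summaries.items()
            if other != client
        )
        network[client] = [name for _, name in ranked[:n]]
    return network


def selection_entropy(counts: Sequence[int]) -> float:
    """Shannon entropy (natural log) of how often each client was selected.

    Higher values mean selections are spread more evenly; the maximum
    is log(K) for K clients.
    """
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


if __name__ == "__main__":
    # Hypothetical summary vectors, not data from the study.
    stats = {"A": [0.0, 0.0], "B": [1.0, 0.0], "C": [5.0, 5.0]}
    print(n_similar_neighbors(stats, n=1))
    print(selection_entropy([5, 5, 5, 5]))
```

Sharing only aggregate statistics rather than records is one way a similarity measure could remain compatible with the privacy goal stated in the abstract.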

Results: In the i.i.d. setting on PICD, FedDSS achieved a higher TPR from the 9th round onward and surpassed 0.6 three rounds earlier than FedAvg. On MIMICIII, FedDSS's loss decreased significantly from the 13th round, and its TPR exceeded 0.8 by the 2nd round, two rounds ahead of FedAvg (4th round). In the non-i.i.d. setting, FedDSS achieved TPR > 0.7 by the 4th round and > 0.8 by the 7th round, earlier than FedAvg (5th and 11th rounds, respectively). In both settings, FedDSS showed reasonable fairness (entropy of 2.2 and 2.1).
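The results are reported as the first communication round at which TPR crosses a threshold. A small helper along these lines could compute that from a per-round TPR history; the function name and the sample values are hypothetical and are not figures from the study.

```python
from typing import Optional, Sequence


def first_round_above(tpr_per_round: Sequence[float],
                      threshold: float) -> Optional[int]:
    """Return the 1-based round at which TPR first exceeds the threshold.

    Returns None if the threshold is never exceeded.
    """
    for round_number, tpr in enumerate(tpr_per_round, start=1):
        if tpr > threshold:
            return round_number
    return None


if __name__ == "__main__":
    # Made-up TPR curve for illustration only.
    history = [0.50, 0.70, 0.85, 0.90]
    print(first_round_above(history, 0.8))
```

Comparing this value between two selection strategies gives the "rounds ahead" figures used in the abstract.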

Conclusion: We demonstrated that FedDSS contributes to improved learning in FL by achieving faster convergence, reaching the desired TPR with fewer communication rounds, and potentially enhancing sepsis prediction (TPR) over FedAvg.

Source
DOI: http://dx.doi.org/10.1016/j.ijmedinf.2024.105650

Similar Publications

Hybrid quantum enhanced federated learning for cyber attack detection.

Sci Rep

December 2024

Department of Computer Science and Engineering, E.G.S. Pillay Engineering College, Nagapattinam, Tamil Nadu, 611002, India.

Cyber-attacks pose a significant threat and have become a critical issue for network security in the digital world. Conventional detection procedures are centralized and often struggle with concerns such as data privacy and communication overhead. As a result, conventional methods are unable to adapt quickly to different threats.

Challenges with Literature-Derived Data in Machine Learning for Yield Prediction: A Case Study on Pd-Catalyzed Carbonylation Reactions.

J Phys Chem A

December 2024

Centre for Computational Chemistry, School of Chemistry and Molecular Engineering, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China.

The application of machine learning (ML) to predict reaction yields has shown remarkable accuracy when based on high-throughput computational and experimental data. However, the accuracy significantly diminishes when leveraging literature-derived data, highlighting a gap in the predictive capability of the current ML models. This study, focusing on Pd-catalyzed carbonylation reactions, reveals that even with a data set of 2512 reactions, the best-performing model reaches only an of 0.

Deep cross-modal hashing retrieval has recently made significant progress. However, existing methods generally learn hash functions with pairwise or triplet supervisions, which involves learning the relevant information by splicing partial similarity between data pairs; notably, this approach only captures the data similarity locally and incompletely, resulting in sub-optimal retrieval performance. In this paper, we propose a novel Multi-Relational Deep Hashing (MRDH) approach, which can fully bridge the modality gap by comprehensively modeling the similarity relationship between data in different modalities.

Predicting protein-ligand binding affinity presents a viable solution for accelerating the discovery of new lead compounds. The recent widespread application of machine learning approaches, especially graph neural networks, has brought new advancements in this field. However, some existing structure-based methods treat protein macromolecules and ligand small molecules in the same way and ignore the data heterogeneity, potentially leading to incomplete exploration of the biochemical information of ligands.
