Background: Privacy is of increasing interest in the present big data era, particularly the privacy of medical data. Specifically, differential privacy has emerged as the standard method for preservation of privacy during data analysis and publishing.
Objective: Using machine learning techniques, we applied differential privacy with diverse parameters to medical data, verified the feasibility of our algorithms on synthetic data, and examined the balance between data privacy and utility.
Methods: All data were normalized to a range between -1 and 1, and the bounded Laplacian method was applied to prevent the generation of out-of-bound values after applying the differential privacy algorithm. To preserve the cardinality of the categorical variables, we performed postprocessing via discretization. The algorithm was evaluated using both synthetic and real-world data (from the eICU Collaborative Research Database). We evaluated the difference between the original data and the perturbed data using the misclassification rate for categorical data and the mean squared error for continuous data. Further, we compared the performance of classification models that predict in-hospital mortality using real-world data.
Results: The misclassification rate of categorical variables ranged between 0.49 and 0.85 when the value of ε was 0.1, and it converged to 0 as ε increased. When ε was between 10 and 10, the misclassification rate rapidly dropped to 0. Similarly, the mean squared error of the continuous variables decreased as ε increased. The performance of the model developed from perturbed data converged to that of the model developed from original data as ε increased. In particular, the accuracy of a random forest model developed from the original data was 0.801, and this value ranged from 0.757 to 0.81 when ε was 10 and 10, respectively.
Conclusions: We applied local differential privacy to medical domain data, which are diverse and high dimensional. Higher noise may offer enhanced privacy, but it simultaneously hinders utility. We should choose an appropriate degree of noise for data perturbation to balance privacy and utility depending on specific situations.
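The Methods describe normalizing each variable to [-1, 1], perturbing it with a bounded Laplace mechanism, and discretizing perturbed categorical values back to their original levels. The sketch below illustrates one way such a pipeline could look; the resampling-based bounding, the function names, and the example feature range are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def normalize(x, lo, hi):
    """Scale a feature from [lo, hi] into [-1, 1], as described in Methods."""
    return 2.0 * (x - lo) / (hi - lo) - 1.0

def bounded_laplace(value, epsilon, sensitivity=2.0, lower=-1.0, upper=1.0, rng=None):
    """Add Laplace noise while keeping the output inside [lower, upper].

    Resampling (rejection) is one common way to realize a bounded Laplace
    mechanism; the paper may use a different bounding strategy.
    Sensitivity defaults to 2.0 because values are normalized to [-1, 1].
    """
    rng = np.random.default_rng() if rng is None else rng
    scale = sensitivity / epsilon
    while True:
        noisy = value + rng.laplace(0.0, scale)
        if lower <= noisy <= upper:
            return noisy

def perturb_categorical(code, n_levels, epsilon, rng=None):
    """Perturb an integer-coded categorical variable, then discretize back.

    The category is mapped onto a grid in [-1, 1], perturbed with the bounded
    Laplace mechanism, and snapped to the nearest level so that cardinality is
    preserved (the postprocessing step described in Methods).
    """
    rng = np.random.default_rng() if rng is None else rng
    grid = np.linspace(-1.0, 1.0, n_levels)          # one point per category level
    noisy = bounded_laplace(grid[code], epsilon, rng=rng)
    return int(np.argmin(np.abs(grid - noisy)))      # nearest-level discretization

# Example: perturb a continuous vital sign and a 4-level categorical variable.
rng = np.random.default_rng(0)
heart_rate = normalize(92.0, lo=30.0, hi=200.0)      # hypothetical feature range
print(bounded_laplace(heart_rate, epsilon=1.0, rng=rng))
print(perturb_categorical(code=2, n_levels=4, epsilon=1.0, rng=rng))
```

Comparing original and perturbed columns with the misclassification rate (categorical) or the mean squared error (continuous) then yields the kind of utility-versus-ε curves reported in the Results.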
Full text: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8663640 (PMC) | http://dx.doi.org/10.2196/26914 (DOI)
PLOS Digit Health
September 2025
Department of Dermatology, Stanford University, Stanford, California, United States of America.
Large Language Models (LLMs) are increasingly deployed in clinical settings for tasks ranging from patient communication to decision support. While these models have been shown to exhibit race-based and binary gender biases, anti-LGBTQIA+ bias remains understudied despite documented healthcare disparities affecting these populations. In this work, we evaluated the potential of LLMs to propagate anti-LGBTQIA+ medical bias and misinformation.
Front Digit Health
August 2025
KASTEL Security Research Labs, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany.
In medical environments, time-continuous data such as electrocardiographic records require a distinct approach to anonymization, because preserving their spatio-temporal integrity is essential for utility. These environments generate a wide array of data types that are highly sensitive with respect to the patient's well-being and of substantial interest to researchers. A significant proportion of this data may be useful to researchers beyond the original purposes for which it was collected.
J Med Internet Res
September 2025
Fujian Psychiatric Center, Fujian Clinical Research Center for Mental Disorders, Xianyue Hospital Affiliated to Xiamen Medical College, Xiamen, China.
Background: In the digital health era, telemedicine has become a key driver of health care reform and innovation globally. Understanding the factors influencing residents' choices of telemedicine services is crucial for optimizing service design, enhancing user experience, and developing effective policy measures.
Objective: This study aims to explore the key factors influencing Chinese residents' choices of telemedicine services, including consultation fee, physician qualifications, appointment waiting time, scope of services, privacy protection, and service hours.
Bioinform Adv
August 2025
Department of Anatomy and Cell Biology, Medical School OWL, Bielefeld University, Bielefeld 33615, Germany.
Motivation: The growing use of transcriptomic data from platforms like Nanostring GeoMx DSP demands accessible and flexible tools for differential gene expression analysis and heatmap generation. Current web-based tools often lack transparency, modifiability, and independence from external servers, creating barriers for researchers seeking customizable workflows and raising data privacy and security concerns. Additionally, tools that can be used by individuals with minimal bioinformatics expertise provide an inclusive solution, empowering a broader range of users to analyze complex data effectively.
Artif Intell Med
November 2025
Department of Nuclear Medicine, Huzhou Central Hospital, Fifth School of Clinical Medicine of Zhejiang Chinese Medical University, Huzhou, 313001, China. Electronic address:
Positron Emission Tomography-Computed Tomography (PET-CT) evaluation is critical for liver lesion diagnosis. However, data scarcity, privacy concerns, and cross-institutional imaging heterogeneity impede accurate deep learning model deployment. We propose a Federated Transfer Learning (FTL) framework that integrates federated learning's privacy-preserving collaboration with transfer learning's pre-trained model adaptation, enhancing liver lesion segmentation in PET-CT imaging.
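The snippet describes a setup in which each institution fine-tunes a shared pre-trained model on its own data and only model updates are aggregated centrally. Below is a minimal, generic sketch of that pattern using federated averaging over a linear stand-in model; the client data, update rule, and function names are hypothetical illustrations, not the authors' FTL framework.

```python
import numpy as np

def local_update(global_weights, client_data, lr=0.01, epochs=1):
    """Hypothetical local fine-tuning step: each site adapts the shared
    (pre-trained) weights on its own data, which never leaves the site."""
    w = global_weights.copy()
    X, y = client_data
    for _ in range(epochs):
        preds = X @ w                      # linear stand-in for a segmentation model
        grad = X.T @ (preds - y) / len(y)  # gradient of a squared-error loss
        w -= lr * grad
    return w

def fedavg(client_updates, client_sizes):
    """Federated averaging: aggregate site updates weighted by local data size."""
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_updates, client_sizes))

# Toy rounds of federated fine-tuning across three hypothetical hospitals.
rng = np.random.default_rng(0)
pretrained = rng.normal(size=8)            # stands in for pre-trained model weights
clients = [(rng.normal(size=(n, 8)), rng.normal(size=n)) for n in (40, 25, 60)]

weights = pretrained
for _ in range(5):
    updates = [local_update(weights, data) for data in clients]
    weights = fedavg(updates, [len(y) for _, y in clients])
```

In the setting the snippet describes, a segmentation network pre-trained on external data would take the place of the linear stand-in; the privacy benefit comes from exchanging only model weights rather than raw PET-CT images.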