Background: The advancement of information technology has immensely increased the quality and volume of health data. This has led to an increase in observational studies, as well as to a greater threat of privacy invasion. Recently, a distributed research network based on the common data model (CDM) has emerged, enabling collaborative international medical research without sharing patient-level data. Although the CDM database for each institution is built inside a firewall, the risk of re-identification requires management. Hence, this study aims to elucidate the perceptions CDM users have towards CDM and risk management for re-identification.
Methods: The survey, targeted to answer specific in-depth questions on CDM, was conducted from October to November 2020. We targeted well-experienced researchers who actively use CDM. Basic statistics (total number and percent) were computed for all covariates.
Results: There were 33 valid respondents. Of these, 43.8% suggested that additional anonymization was unnecessary beyond the "minimum cell count" policy, which obscures a cell with a value lower than a certain number (usually 5) in shared results to minimize the liability of re-identification due to rare conditions. During extract-transform-load processes, 81.8% of respondents assumed that the re-identification risk of structured data is under control. However, respondents noted that dates of birth and death were highly re-identifiable information. The majority of respondents (n = 22, 66.7%) conceded the possibility that unstructured data containing identifiers exist in the table.
Conclusion: Overall, CDM users generally attributed high reliability for privacy protection to the intrinsic nature of CDM. There was little demand for additional de-identification methods. However, unstructured data in the CDM were suspected to have risks. The necessity for a coordinating consortium to define and manage the re-identification risk of CDM was urged.
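The "minimum cell count" policy mentioned in the Results can be illustrated with a minimal Python sketch. It is not the implementation used by any particular CDM tool: the function name `suppress_small_cells`, the `"<5"` display label, and the choice to leave zero counts visible are assumptions made for illustration, and only the threshold of 5 comes from the abstract ("usually 5").

```python
from typing import Dict, Union

# Assumed default threshold; the abstract only says "usually 5".
MIN_CELL_COUNT = 5


def suppress_small_cells(
    counts: Dict[str, int], threshold: int = MIN_CELL_COUNT
) -> Dict[str, Union[int, str]]:
    """Return a copy of `counts` with small non-zero cells masked.

    Assumption: a zero count is left visible, and any count in the
    range 1..threshold-1 is replaced by the label "<threshold".
    """
    masked: Dict[str, Union[int, str]] = {}
    for label, n in counts.items():
        if 0 < n < threshold:
            masked[label] = f"<{threshold}"  # hide the exact small count
        else:
            masked[label] = n  # large enough (or zero) to share as-is
    return masked


# Hypothetical aggregate counts, invented purely for this example.
example = {"condition_a": 120, "condition_b": 3, "condition_c": 0, "condition_d": 5}
print(suppress_small_cells(example))
```

Under these assumptions, only `condition_b` would be masked in the shared result; a count equal to the threshold is reported unchanged.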
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9259248 | PMC |
| http://dx.doi.org/10.3346/jkms.2022.37.e205 | DOI Listing |
Front Genet
August 2025
Center for Biomedical Ethics and Society, Vanderbilt University Medical Center, Nashville, TN, United States.
Research carried out by Vanderbilt University's and Medical Center's federally-funded transdisciplinary, highly interactive GetPreCiSe Center in Excellence for ELSI research on genomic privacy, involving over 40 scholars across computer and social sciences, law, and the humanities, is summarized by dividing the work into five categories: (1) the nature of risks posed by collection of genetic data; (2) legal and scientific methods of minimizing those risks; (3) methods of safely increasing the scope of genetic databases; (4) public perceptions of genetic privacy; and (5) cultural depictions of genetic privacy. While this research shows that the risk of unauthorized re-identification is often overstated, it also identifies possible ways privacy can be compromised. Several technical and legal methods for reducing privacy risks are described, most of which focus not on collection of the data, but rather on regulating data security, access, and use once it is collected.
Sci Rep
August 2025
Faculty of Medicine, Qilu Institute of Technology, 3028, East Jingshi Road, Jinan, 250000, Shandong, China.
Heart failure is a significant global health challenge with high mortality rates. This study examines the association between glycemic variability and short-term mortality in critically ill heart failure patients. Data from the eICU Collaborative Research Database (eICU-CRD) and the Medical Information Mart for Intensive Care (MIMIC-IV) database were analyzed, including 23,744 heart failure patients.
Clin Trials
August 2025
Edinburgh Clinical Trials Unit (ECTU), Usher Institute, The University of Edinburgh, Edinburgh, UK.
Background: The motivations to share anonymised datasets from clinical trials within the scientific community are increasing. Many anonymised datasets are now publicly available for secondary research. However, it is uncertain whether they pose a privacy risk to the involved participants.
J Genet Couns
August 2025
Scripps Research Translational Institute, La Jolla, California, USA.
Genetic data, more than ever, are a sought-after asset, often transferred or repurposed under broad privacy policies with limited transparency and passive consent. With the increasing frequency of company mergers and acquisitions within the genetics and genomics industry, growing and underacknowledged risks to patient data privacy have emerged. Though informed consent is a foundational element of clinical genetics, our current process rarely addresses what happens to patient data during business transitions.
NPJ Digit Med
August 2025
Department of Anesthesiology, University of Michigan Medical School, Ann Arbor, MI, USA.
The sensitive nature of electronic health records (EHR) and wearable data presents challenges in sharing biomedical resources while minimizing re-identification risks. This article introduces an end-to-end, titratable pipeline that generates privacy-preserving "digital twin" datasets from complex EHR and wearable-device records (Apple Watch data from 3029 participants) using DataSifter and Synthetic Data Vault (SDV) methods. Various obfuscation levels were applied (DataSifter: small, medium, large; SDV: CTGAN, Gaussian Copula) and benchmarked using utility (statistical fidelity, machine learning performance) and privacy (re-identification risk, detection likelihood) metrics.