Citations

20

Article Abstract

Background: Medical data that include patient information require privacy protection. A distributed research network (DRN) is one approach to the challenge of protecting privacy while encouraging multi-institutional clinical research. A DRN standardizes multi-institutional data into a common structure and terminology, called a common data model (CDM), and shares only analysis results. Even though the data themselves are not shared, it is necessary to measure how well a DRN protects patient privacy in practice.

Objective: This study aimed to quantify the privacy risk of a DRN by comparing different deidentification levels focusing on personal health identifiers (PHIs) and quasi-identifiers (QIs).

Methods: We identified the PHIs and QIs in an Observational Medical Outcomes Partnership (OMOP) CDM that threaten privacy, based on the 18 identifiers of the Health Insurance Portability and Accountability Act of 1996 (HIPAA) and on previous studies. To compare privacy risk under different privacy policies, we generated limited and safe harbor data sets, based on 16 PHIs and 12 QIs, from the Synthetic Public Use File 5 Percent (SynPUF5PCT) data set, which is a public data set of the OMOP CDM. Using minimum cell size and equivalence class methods, we measured the privacy risk reduction as a trust differential gap obtained by comparing the two data sets. We also measured the gap in randomly sampled records from the two data sets to adjust for the number of PHI or QI records.
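The minimum cell size and equivalence class measurements described in the Methods can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the toy records, and the choice to express risk as the percentage of records in classes at or below the minimum cell size are all assumptions. The gap is taken as the limited percentage minus the safe harbor percentage, which matches how the Results report it.

```python
from collections import Counter


def equivalence_class_sizes(records, keys):
    """Count how many records share each combination of values for `keys`."""
    return Counter(tuple(rec[k] for k in keys) for rec in records)


def risk_percent(records, keys, min_cell_size=1):
    """Percentage of records whose equivalence class has at most
    `min_cell_size` members (assumed risk definition, for illustration)."""
    if not records:
        return 0.0
    sizes = equivalence_class_sizes(records, keys)
    at_risk = sum(n for n in sizes.values() if n <= min_cell_size)
    return 100.0 * at_risk / len(records)


def max_equivalence_class(records, keys):
    """Size of the largest set of mutually indistinguishable records."""
    return max(equivalence_class_sizes(records, keys).values(), default=0)


# Hypothetical toy records: the limited set keeps full dates and ZIP codes,
# the safe harbor set generalizes them to a year and a 3-digit prefix.
limited = [
    {"birth_date": "1950-03-14", "zip": "02139"},
    {"birth_date": "1950-07-02", "zip": "02139"},
    {"birth_date": "1962-11-30", "zip": "10027"},
    {"birth_date": "1962-01-05", "zip": "10027"},
]
safe_harbor = [
    {"birth_date": "1950", "zip": "021"},
    {"birth_date": "1950", "zip": "021"},
    {"birth_date": "1962", "zip": "100"},
    {"birth_date": "1962", "zip": "100"},
]

keys = ["birth_date", "zip"]
limited_risk = risk_percent(limited, keys)      # every limited record is unique
safe_risk = risk_percent(safe_harbor, keys)     # no safe harbor record is unique
gap = limited_risk - safe_risk                  # trust differential gap (assumed form)
```

In this toy case every limited record is unique and no safe harbor record is, so the gap is as large as it can be. Real CDM tables would fall somewhere in between.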

Results: The gaps averaged 31.448% and 73.798% for PHIs and QIs, respectively, with a minimum cell size of one, which represents a unique record in a data set. Among PHIs, the national provider identifier had the highest gap of 71.236% (71.244% and 0.007% in the limited and safe harbor data sets, respectively). The maximum equivalence class size, that is, the size of the largest set of indistinguishable records, averaged 771. In 1000 random samples of PHIs, Device_exposure_start_date had the highest gap of 33.730% (87.705% and 53.975% in the data sets). Among QIs, Death had the highest gap of 99.212% (99.997% and 0.784% in the data sets). In 1000, 10,000, and 100,000 random samples of QIs, Device_treatment had the highest gaps of 12.980% (99.980% and 87.000% in the data sets), 60.118% (99.831% and 39.713%), and 93.597% (98.805% and 5.207%), respectively, and in 1 million random samples, Death had the highest gap of 99.063% (99.998% and 0.934% in the data sets).
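The random-sample figures above show the second percentage in each pair falling as the sample grows. One plausible reading, offered here as an assumption rather than as the authors' explanation, is that a small sample rarely draws two records from the same equivalence class, so generalized values still look unique. The sketch below illustrates that effect on a hypothetical column; the population, sample sizes, and function names are invented for illustration.

```python
import random
from collections import Counter


def unique_percent(values):
    """Percentage of values that occur exactly once (minimum cell size of one)."""
    if not values:
        return 0.0
    counts = Counter(values)
    return 100.0 * sum(1 for v in values if counts[v] == 1) / len(values)


def sampled_unique_percent(values, sample_size, seed=0):
    """Uniqueness percentage within a random sample drawn without replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    sample = rng.sample(values, min(sample_size, len(values)))
    return unique_percent(sample)


# Hypothetical generalized column: 100,000 records over 1,000 distinct values,
# so each value is shared by 100 records in the full data set.
population = [i % 1000 for i in range(100_000)]

full = unique_percent(population)                      # no value is unique overall
small = sampled_unique_percent(population, 100)        # most sampled values look unique
large = sampled_unique_percent(population, 50_000)     # uniqueness mostly disappears
```

Comparing `small` with `large` shows why a gap measured on 1000 records can differ sharply from one measured on 1 million, even when the underlying deidentification rule is unchanged.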

Conclusions: In this study, we verified and quantified the privacy risk of PHIs and QIs in the DRN. Although this study used limited PHIs and QIs for verification, the privacy limitations found in this study could be used as a quality measurement index for deidentification of multi-institutional collaboration research, thereby increasing DRN safety.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8204238
DOI: http://dx.doi.org/10.2196/24940

Publication Analysis

Top Keywords

data sets: 28
phis qis: 20
privacy risk: 16
highest gap: 16
data: 15
data set: 12
random samples: 12
privacy: 11
personal health: 8
health identifiers: 8

Similar Publications

On use of adaptive cluster sampling for variance estimation.

J Appl Stat

February 2025

Department of Mathematics & Statistics, International Islamic University, Islamabad, Pakistan.

Adaptive cluster sampling is particularly helpful whenever the target population is unique, dispersed unevenly, concealed or difficult to find. In the current investigation, under an adaptive cluster sampling approach, we propose a ratio-product-logarithmic type estimator employing a single auxiliary variable for the estimation of finite population variance. The bias and mean square error of the proposed estimator are developed by using simulation as well as real data sets.

When analyzing real data sets, statisticians often face the question that the data are heterogeneous and it may not necessarily be possible to model this heterogeneity directly. One natural option in this case is to use the methods based on finite mixtures. The key question in these techniques often is what is the best number of mixtures or, depending on the focus of the analysis, the best number of sub-populations when the model is otherwise fixed.

Machine Learning-Aided Screening and Design Rule Discovery for LWIR-Transparent Optical Materials.

J Chem Inf Model

September 2025

Department of Chemistry and Biochemistry, University of Arizona, Tucson, Arizona 85721-0041, United States.

The development of low-cost, high-performance materials with enhanced transparency in the long-wavelength infrared (LWIR) region (800-1250 cm⁻¹ / 8-12.5 μm) is essential for advancing thermal imaging and sensing technologies. Traditional LWIR optics rely on costly inorganic materials, limiting their broader deployment.

Priorities for 'out-of-hours' home-based palliative care for professionals, patients, and family caregivers: A qualitative interview study.

Int J Nurs Stud

August 2025

Florence Nightingale Faculty of Nursing Midwifery and Palliative Care, Cicely Saunders Institute of Palliative Care, Policy and Rehabilitation, King's College London, Bessemer Road, London SE5 9PJ, UK; Sussex Community NHS Foundation Trust, Brighton General Hospital, Elm Grove, Brighton, East Sussex

Background: People with advanced illness at home, and their families, rely on 'out-of-hours' services provided by community, primary and specialist palliative care services. Home is commonly expressed as the preferred place to be cared for and die, and an increasing proportion of people are dying at home, but what constitutes 'good' care is poorly understood from the combined perspectives of healthcare professionals and patients and family caregivers.

Objective: To understand the convergence and divergence of the perspectives of healthcare professionals with those of patients and family caregivers, on priorities for home-based palliative care in the 'out-of-hours' period in the UK.

Solvation Structure of Np in a Noncomplexing Environment.

Inorg Chem

September 2025

Pacific Northwest National Laboratory, Richland, Washington 99352, United States.

The solvation structure of an Np ion in an aqueous, noncomplexing and nonoxidizing environment of trifluoromethanesulfonic (triflic) acid was investigated with X-ray absorption spectroscopy (XAS) combined with ab initio molecular dynamics (AIMD) and time-dependent density functional theory (TDDFT) calculations. Np L-edge X-ray absorption near-edge structure (XANES) and extended X-ray absorption fine structure (EXAFS) data were collected for Np in 1, 3, and 7 M triflic acid using a laboratory-scale spectrometer and separately at a synchrotron facility, producing data sets in excellent agreement. TDDFT calculations revealed a weak pre-edge feature not previously reported for Np L-edge XANES.
