A Comparative Analysis of Speed and Accuracy for Three Off-the-Shelf De-Identification Tools.

Paul M Heider, Jihad S Obeid, Stéphane M Meystre

AMIA Jt Summits Transl Sci Proc

Biomedical Informatics Center, Medical University of South Carolina, Charleston, SC.

Published: May 2020

A growing quantity of health data is being stored in Electronic Health Records (EHRs). The free-text clinical notes in these records contain patient and treatment information that is important for research, but they also contain Personally Identifiable Information (PII), which cannot be freely shared within the research community without compromising patient confidentiality and privacy rights. Significant work has gone into automated approaches to text de-identification, the process of removing or redacting PII. However, few studies have examined the performance of existing de-identification pipelines in a controlled comparative analysis. In this study, we use publicly available corpora to analyze speed and accuracy differences between three de-identification systems that can be run off-the-shelf: Amazon Comprehend Medical PHId, Clinacuity's CliniDeID, and the National Library of Medicine's (NLM) Scrubber. No single system dominated all the compared metrics. NLM Scrubber was the fastest, while CliniDeID generally had the highest accuracy.
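The abstract does not state which accuracy metrics were compared. As an illustration only, de-identification output is commonly scored with precision, recall, and F1 over annotated PII spans. The sketch below assumes exact-match scoring on hypothetical `(start, end, type)` spans; the function name, span format, and sample data are invented for this example and are not taken from the study.

```python
from typing import NamedTuple, Set, Tuple

# Hypothetical span format: (start offset, end offset, PII type).
Span = Tuple[int, int, str]


class Scores(NamedTuple):
    precision: float
    recall: float
    f1: float


def score_spans(gold: Set[Span], predicted: Set[Span]) -> Scores:
    """Exact-match precision, recall, and F1 between gold and predicted spans."""
    true_pos = len(gold & predicted)  # spans found with identical offsets and type
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return Scores(precision, recall, f1)


# Made-up annotations for one note: two of three gold spans are found,
# and one predicted span is spurious.
gold = {(0, 10, "NAME"), (25, 35, "DATE"), (50, 62, "PHONE")}
predicted = {(0, 10, "NAME"), (25, 35, "DATE"), (70, 75, "NAME")}
scores = score_spans(gold, predicted)
```

In de-identification, recall is often weighted more heavily than precision, because a missed PII span is a privacy leak while a spurious redaction only costs readability. Partial-overlap or token-level matching would be a different, more lenient scoring choice.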

Full text (PMC): http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7233098
