Article Abstract

A growing quantity of health data is being stored in Electronic Health Records (EHR). The free-text section of these clinical notes contains important patient and treatment information for research but also contains Personally Identifiable Information (PII), which cannot be freely shared within the research community without compromising patient confidentiality and privacy rights. Significant work has been invested in investigating automated approaches to text de-identification, the process of removing or redacting PII. Few studies have examined the performance of existing de-identification pipelines in a controlled comparative analysis. In this study, we use publicly available corpora to analyze speed and accuracy differences between three de-identification systems that can be run off-the-shelf: Amazon Comprehend Medical PHId, Clinacuity's CliniDeID, and the National Library of Medicine's Scrubber. No single system dominated all the compared metrics. NLM Scrubber was the fastest while CliniDeID generally had the highest accuracy.

Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7233098

Similar Publications

Background: Survivors of critical illness frequently face physical, cognitive and psychological impairments after intensive care. Sensorimotor impairments potentially have a negative impact on participation. However, comprehensive understanding of sensorimotor recovery and participation in survivors of critical illness is limited.

Dietary inflammatory index and the risk of colorectal adenomas and cancer: a systematic review and dose-response meta-analysis.

Nutr J

September 2025

Department of Gastroenterology and Hepatology, Hangzhou Red Cross Hospital, 208 Huancheng Dong Road, Hangzhou, 310003, Zhejiang Province, China.

Background: The potential association between dietary inflammatory index (DII) and colorectal cancer (CRC) risk, as well as colorectal adenomas (CRA) risk, has been extensively studied, but the findings remain inconclusive. We conducted this systematic review and dose-response meta-analysis to investigate the relationship between the DII and CRC and CRA.

Methods: We comprehensively searched the PubMed, Embase, Cochrane Library, and Web of Science databases for cohort and case-control studies reporting the relationship between DII and CRA, or between DII and CRC, as of 15 July 2025.

Background: Between November 2023 and March 2024, coastal Kenya experienced another wave of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infections detected through our continued genomic surveillance. Herein, we report the clinical and genomic epidemiology of SARS-CoV-2 infections from 179 individuals (a total of 185 positive samples) residing in the Kilifi Health and Demographic Surveillance System (KHDSS) area (~900 km²).

Methods: We analyzed genetic, clinical, and epidemiological data from SARS-CoV-2 positive cases across pediatric inpatient, health facility outpatient, and homestead community surveillance platforms.

Background: Gastric cancer is one of the most common cancers worldwide, with its prognosis influenced by factors such as tumor clinical stage, histological type, and the patient's overall health. Recent studies highlight the critical role of lymphatic endothelial cells (LECs) in the tumor microenvironment. Perturbations in LEC function in gastric cancer, marked by aberrant activation or damage, disrupt lymphatic fluid dynamics and impede immune cell infiltration, thereby modulating tumor progression and patient prognosis.

Background: Most RNA-seq datasets harbor genes with extreme expression levels in some samples. Such extreme outliers are usually treated as technical errors and are removed from the data before further statistical analysis. Here we focus on the patterns of such outlier gene expression to investigate whether they provide insights into the underlying biology.