Development of a Natural Language Processing Model for Extracting Kidney Biopsy Pathology Diagnoses.

Article Abstract

Rationale & Objective: Kidney biopsy reports are in a nonindexed text format, and the diagnosis requires labor-intensive manual abstraction. Natural language processing (NLP) has not been rigorously tested for kidney biopsy diagnosis extraction. Our objective was to develop an accurate model to extract the biopsy diagnosis from free-text reports.

Study Design: Text classification using NLP.

Setting & Participants: 2,666 patients with 3,042 native kidney biopsy reports in the Portable Document Format, from June 2016 to December 2023.

Predictor: Kidney biopsy diagnosis.

Outcomes: The performance of the NLP algorithm across all diagnoses and for the 20 most common diagnoses, measured by precision, recall, F1 score, and area under the receiver operating characteristic curve (AUROC).
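As an illustration of the per-diagnosis metrics named above, the following is a minimal sketch that computes precision, recall, and F1 for one diagnosis label treated one-vs-rest. The function name, the abbreviated labels, and the toy data are hypothetical and are not taken from the study.

```python
def precision_recall_f1(y_true, y_pred, label):
    """Compute one-vs-rest precision, recall, and F1 for a single label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)  # true positives
    fp = sum(1 for t, p in pairs if t != label and p == label)  # false positives
    fn = sum(1 for t, p in pairs if t == label and p != label)  # false negatives

    # Guard against division by zero when a label is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1


# Toy example with hypothetical abbreviated labels (not study data).
y_true = ["DKD", "FSGS", "DKD", "LN"]
y_pred = ["DKD", "DKD", "FSGS", "LN"]
print(precision_recall_f1(y_true, y_pred, "DKD"))
```

Averaging these per-label values across diagnoses is one common way to arrive at a single summary score; the abstract does not state which averaging scheme was used.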

Analytical Approach: A domain expert manually abstracted the diagnosis, and a renal pathologist validated a random subset (n = 200). Structured Query Language server and Python processed reports into machine-readable free text. We used PubMed Bidirectional Encoder Representations from Transformers to develop our NLP algorithm. We randomly split the reports into training (80%; n = 2,434) and testing (20%; n = 608) sets to train the NLP system. We further divided the testing set into 20% validation and 80% fine-tuning sets.
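The 80/20 random split described above can be sketched as follows. This is a minimal illustration, assuming a simple shuffle with a fixed seed; the function name, the seed, and the use of integer report identifiers are hypothetical, and the abstract does not describe the authors' actual splitting procedure.

```python
import random


def split_reports(report_ids, train_fraction=0.8, seed=0):
    """Shuffle report identifiers and split them into training and testing sets."""
    ids = list(report_ids)
    random.Random(seed).shuffle(ids)  # seeded so the split is reproducible
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]


# 3,042 reports, as stated in the abstract; identifiers here are placeholders.
train_ids, test_ids = split_reports(range(3042))
print(len(train_ids), len(test_ids))  # 2434 608
```

With 3,042 reports, rounding 80% gives 2,434 training reports and leaves 608 for testing, which matches the counts reported in the abstract.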

Results: The median age was 57 years; 50% of participants were female, 29% African American, and 23% Hispanic. The 5 most frequent glomerular diagnoses were diabetic kidney disease (23.7%), focal segmental glomerulosclerosis (15.5%), lupus nephritis (9.7%), immunoglobulin A nephropathy (8.9%), and membranous nephropathy (7.2%). The Cohen kappa coefficient for interrater reliability was 0.76. PubMed Bidirectional Encoder Representations from Transformers, fine-tuned on the training set, achieved an average AUROC of 0.95 on the testing set across all diagnoses, with an F1 score of 0.57. For the 20 most common diagnoses, the AUROC was 0.97 with an F1 score of 0.72.

Limitations: Single-center study; limited sample size; use restricted to research purposes.

Conclusions: We demonstrate an accurate and scalable NLP system to extract the primary diagnosis from free-text kidney biopsy reports, which can facilitate epidemiologic studies and identify patients for clinical trial recruitment.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC12311501
DOI: http://dx.doi.org/10.1016/j.xkme.2025.101047

Publication Analysis

Top Keywords

kidney biopsy: 24
biopsy reports: 12
natural language: 8
language processing: 8
biopsy diagnosis: 8
diagnosis free-text: 8
nlp algorithm: 8
common diagnoses: 8
pubmed bidirectional: 8
bidirectional encoder: 8
