Article Abstract

Objective: As the storage of clinical data has transitioned into electronic formats, medical informatics has become increasingly relevant in providing diagnostic aid. The purpose of this review is to evaluate machine learning models that use text data for diagnosis and to assess the diversity of the included study populations.

Methods: We conducted a systematic literature review on three public databases. Two authors reviewed every abstract for inclusion. Articles were included if they used or developed machine learning algorithms to aid in diagnosis. Articles focusing on imaging informatics were excluded.

Results: From 2,260 identified papers, we included 78. Of the machine learning models used, neural networks were relied upon most frequently (44.9%). Studies had a median population of 661.5 patients, and diseases and disorders of 10 different body systems were studied. Of the 35.9% (n = 28) of papers that included race data, 57.1% (n = 16) of study populations were majority White, 14.3% were majority Asian, and 7.1% were majority Black. In 75% (n = 21) of papers, White was the largest racial group represented. Of the papers included, 43.6% (n = 34) included the sex ratio of the patient population.
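The percentages in the Results can be checked against the counts given alongside them. The following is a minimal sketch, not part of the original review: the helper `pct` and the variable names are illustrative assumptions, and only counts stated in the abstract are used (78 included papers, 28 reporting race, 16 majority White, 21 with White as the largest group, 34 reporting sex ratio).

```python
def pct(count, total):
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)

# Counts taken from the abstract; the names are hypothetical labels.
TOTAL_PAPERS = 78
PAPERS_WITH_RACE = 28
MAJORITY_WHITE = 16
WHITE_LARGEST_GROUP = 21
PAPERS_WITH_SEX_RATIO = 34

recomputed = {
    "papers reporting race": pct(PAPERS_WITH_RACE, TOTAL_PAPERS),
    "majority White (of race-reporting)": pct(MAJORITY_WHITE, PAPERS_WITH_RACE),
    "White largest group (of race-reporting)": pct(WHITE_LARGEST_GROUP, PAPERS_WITH_RACE),
    "papers reporting sex ratio": pct(PAPERS_WITH_SEX_RATIO, TOTAL_PAPERS),
}

for label, value in recomputed.items():
    print(f"{label}: {value}%")
```

Note that the race-composition percentages use the 28 race-reporting papers as the denominator, while the race-reporting and sex-ratio percentages use all 78 included papers.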

Discussion: With the power to build robust algorithms supported by massive quantities of clinical data, machine learning is shaping the future of diagnostics. Limitations of the underlying data create potential biases, especially if patient demographics are unknown or not included in the training.

Conclusion: As the movement toward clinical reliance on machine learning accelerates, both recording demographic information and using diverse training sets should be emphasized. Extrapolating algorithms to demographics beyond the original study population leaves large gaps for potential biases.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9132735
DOI: http://dx.doi.org/10.1055/s-0042-1749119

Similar Publications

Aim: The purpose of this study was to assess the accuracy of a customized deep learning model based on CNN and U-Net for detecting and segmenting the second mesiobuccal canal (MB2) of maxillary first molar teeth on cone beam computed tomography (CBCT) scans.

Methodology: CBCT scans of 37 patients were imported into 3D slicer software to crop and segment the canals of the mesiobuccal (MB) root of the maxillary first molar. The annotated data were divided into two groups: 80% for training and validation and 20% for testing.

Obsessive-compulsive disorder (OCD) is a chronic and disabling condition affecting approximately 3.5% of the global population, with diagnosis on average delayed by 7.1 years or often confounded with other psychiatric disorders.

Early prediction of orthodontic gingival enlargement using S100A4: a biomarker-based risk stratification model.

Odontology

September 2025

Department of Periodontics, Saveetha Dental College and Hospital, Saveetha Institute of Medical and Technical Sciences, Saveetha University, Chennai, Tamil Nadu, India.

Orthodontic-induced gingival enlargement (OIGE) affects approximately 15-30% of patients undergoing orthodontic treatment and remains largely unpredictable, often relying on subjective clinical assessments made after irreversible tissue changes have occurred. S100A4 is a well-characterized marker of activated fibroblasts involved in pathological tissue remodeling. This was a cross-sectional precision biomarker study that analyzed gingival tissue samples from three groups: healthy controls (n = 60), orthodontic patients without gingival enlargement (n = 31), and patients with clinically diagnosed OIGE (n = 61).

Purpose: The study aims to compare the treatment recommendations generated by four leading large language models (LLMs) with those from 21 sarcoma centers' multidisciplinary tumor boards (MTBs) of the sarcoma ring trial in managing complex soft tissue sarcoma (STS) cases.

Methods: We simulated STS-MTBs using four LLMs: Llama 3.2-vision:90b, Claude 3.
