Background: Artificial intelligence (AI) seems promising for diagnosing pneumonia on chest x-rays (CXR), but deep learning (DL) algorithms have primarily been compared with radiologists, whose diagnoses are not always accurate. Therefore, we evaluated the accuracy of DL in diagnosing pneumonia on CXR using a more robust reference diagnosis.
Methods: We trained a DL convolutional neural network model to diagnose pneumonia and evaluated its accuracy in two prospective pneumonia cohorts including 430 patients, for whom the reference diagnosis was determined a posteriori by a multidisciplinary expert panel using multimodal data. The performance of the DL model was compared with that of senior radiologists and emergency physicians reviewing CXRs and that of radiologists reviewing computed tomography (CT) performed concomitantly.
Results: Radiologists and DL showed a similar accuracy on CXR for both cohorts (p ≥ 0.269): cohort 1, radiologist 1 75.5% (95% confidence interval 69.1-80.9), radiologist 2 71.0% (64.4-76.8), DL 71.0% (64.4-76.8); cohort 2, radiologist 70.9% (64.7-76.4), DL 72.6% (66.5-78.0). The accuracy of radiologists and DL was significantly higher (p ≤ 0.022) than that of emergency physicians (cohort 1 64.0% [57.1-70.3], cohort 2 63.0% [55.6-69.0]). Accuracy was significantly higher for CT (cohort 1 79.0% [72.8-84.1], cohort 2 89.6% [84.9-92.9]) than for CXR readers including radiologists, clinicians, and DL (all p-values < 0.001).
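The accuracies above are proportions reported with 95% confidence intervals. As a minimal sketch of how such an interval can be computed, the Python below uses the Wilson score method. The abstract states neither the interval method nor the per-cohort sample sizes, so the function name `wilson_interval` and the counts (151 correct readings out of 200) are illustrative assumptions, not values taken from the study.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion.

    z=1.96 corresponds to a two-sided 95% confidence level.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


# Hypothetical counts: the abstract does not report them.
correct, total = 151, 200
low, high = wilson_interval(correct, total)
print(f"accuracy {correct / total:.1%}, 95% CI {low:.1%} to {high:.1%}")
```

With these assumed counts the sketch gives an accuracy of 75.5% and an interval of roughly 69.1% to 80.9%. Other interval methods (for example, exact binomial) would give slightly different bounds.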
Conclusions: When compared with a robust reference diagnosis, the performance of AI models in identifying pneumonia on CXRs was lower than previously reported but similar to that of radiologists and better than that of emergency physicians.
Relevance Statement: The clinical relevance of AI models for pneumonia diagnosis may have been overestimated. AI models should be benchmarked against a robust multimodal reference diagnosis to avoid overestimating their performance.
Trial Registration: NCT02467192 and NCT01574066.
Key Points:
• We evaluated an open-access convolutional neural network (CNN) model to diagnose pneumonia on CXRs.
• The CNN was validated against a strong multimodal reference diagnosis.
• In our study, the CNN performance (area under the receiver operating characteristic curve 0.74) was lower than that previously reported when validated against radiologists' diagnosis (0.99 in a recent meta-analysis).
• The CNN performance was significantly higher than that of emergency physicians (p ≤ 0.022) and comparable to that of board-certified radiologists (p ≥ 0.269).
Download full-text PDF |
Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10834924 | PMC |
http://dx.doi.org/10.1186/s41747-023-00416-y | DOI Listing |
J Eval Clin Pract
September 2025
Department of Orthopedics and Traumatology, Medical Faculty, University of Health Sciences, Antalya, Turkey.
Aims And Objective: The field of medical statistics has experienced significant advancements driven by integrating innovative statistical methodologies. This study aims to conduct a comprehensive analysis to explore current trends, influential research areas, and future directions in medical statistics.
Methods: This paper maps the evolution of statistical methods used in medical research based on 4,919 relevant publications retrieved from the Web of Science.
Dermatitis
September 2025
From the Department of Dermatology, Venereology and Leprology, All India Institute of Medical Sciences (AIIMS), Bhopal, India.
Contact dermatitis (CD), which includes both allergic CD and irritant CD, is a common inflammatory condition that can pose significant diagnostic challenges. Although patch testing is the gold standard for identifying causative allergens for allergic contact dermatitis (ACD), it is time-consuming, subjective, and requires expert interpretation. Recent advancements in artificial intelligence (AI), particularly in machine learning (ML) and deep learning, have shown promise in improving the accuracy, efficiency, and accessibility of CD diagnosis and management.
Electromagn Biol Med
September 2025
Computer Science and Business Systems, Sri Krishna College of Engineering and Technology, Coimbatore, India.
Subject-independent emotion detection from electroencephalography (EEG) using Vibrational Mode Decomposition and deep learning is hampered by the scarcity of labelled EEG datasets encompassing a variety of emotions. Collecting labelled EEG data over a wide range of emotional states from a broad and varied population is challenging and resource-intensive. As a result, models trained on small or biased datasets may fail to generalize well to unseen individuals or emotional states, resulting in lower accuracy and robustness in real-world applications.
Nan Fang Yi Ke Da Xue Xue Bao
August 2025
School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, China.
Objectives: We propose a myocardial infarction (MI) detection and localization model to improve the diagnostic accuracy for MI and assist clinical decision-making.
Methods: The proposed model was constructed based on multi-scale field residual blocks fusion modified channel attention (MSF-RB-MCA). The model utilizes lead II electrocardiogram (ECG) signals to detect and localize MI, and extracts different levels of feature information through the multi-scale field residual block.
Ren Fail
December 2025
Department of Nephrology, The Affiliated Hospital of Qingdao University, Qingdao, China.
Large language models (LLMs) represent a transformative advance in artificial intelligence, with growing potential to impact chronic kidney disease (CKD) management. CKD is a complex, highly prevalent condition requiring multifaceted care and substantial patient engagement. Recent developments in LLMs, including conversational AI, multimodal integration, and autonomous agents, offer novel opportunities to enhance patient education, streamline clinical documentation, and support decision-making across nephrology practice.