Large language models (LLMs) can potentially enhance the accessibility and quality of medical information. This study evaluates the reliability and quality of responses generated by ChatGPT-4, an LLM-driven chatbot, compared to those written by physicians, focusing on otorhinolaryngological advice in real-world, text-based workflows. Responses from a public social media forum were anonymized, and ChatGPT-4 generated corresponding replies. A panel of seven board-certified otorhinolaryngologists assessed both sets of responses using six criteria: overall quality, empathy, alignment with medical consensus, information accuracy, inquiry comprehension, and harm potential. Ordinal logistic regression analysis identified factors influencing response quality. ChatGPT-4 responses were preferred in 70.7% of cases and were significantly longer (median: 162 words) than physician responses (median: 67 words; P < .0001). The chatbot's responses received higher ratings across all criteria, with key predictors of this higher quality being greater empathy, stronger alignment with medical consensus, lower potential for harm, and fewer inaccuracies. ChatGPT-4 consistently outperformed physicians in generating responses that adhered to medical consensus, demonstrated accuracy, and conveyed empathy. These findings suggest that integrating AI tools into text-based healthcare consultations could help physicians better address complex, nuanced inquiries and provide high-quality, comprehensive medical advice.
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC12215459 | PMC |
| http://dx.doi.org/10.1038/s41598-025-06769-1 | DOI Listing |
J Imaging Inform Med
September 2025
Department of Diagnostic, Interventional and Pediatric Radiology (DIPR), Inselspital, Bern University Hospital and University of Bern, Bern, Switzerland.
Large language models (LLMs) have been successfully used for data extraction from free-text radiology reports. Most current studies were conducted with LLMs accessed via an application programming interface (API). We evaluated the feasibility of using open-source LLMs, deployed on limited local hardware resources, for data extraction from free-text mammography reports, using a common data element (CDE)-based structure.
Virchows Arch
September 2025
Department of Oral Surgery and Pathology, School of Dentistry, Universidade Federal de Minas Gerais, Minas Gerais, Av. Antônio Carlos, Pampulha, Belo Horizonte, 31270-901, Brazil.
Plasmablastic lymphoma (PBL) is a rare and aggressive non-Hodgkin lymphoma with a poor prognosis and short survival rates. It is classified as a large B-cell lymphoma subtype, but carries a plasmacytic immunophenotype. Therefore, PBL has pathogenetic overlaps with diffuse large B-cell lymphoma not otherwise specified (DLBCL NOS) and plasma cell neoplasms (PCNs).
Eur Spine J
September 2025
Centre Hospitalier Universitaire de Tours, Tours, France.
Purpose: Degenerative lumbar spinal stenosis (DLSS) represents an increasing challenge due to the aging population. The natural course of untreated DLSS is largely unknown. For the acute DLSS decompensations, the main concern remains the opportunity and timing of surgery, i.
Nat Hum Behav
September 2025
Key Laboratory for Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou, China.
Understanding how sentences are represented in the human brain, as well as in large language models (LLMs), poses a substantial challenge for cognitive science. Here we develop a one-shot learning task to investigate whether humans and LLMs encode tree-structured constituents within sentences. Participants (total N = 372, native Chinese or English speakers, and bilingual in Chinese and English) and LLMs (for example, ChatGPT) were asked to infer which words should be deleted from a sentence.
Endocr J
September 2025
Institute of Liberal Arts and Science, Kanazawa University, Kanazawa, Japan.
GPT-4o, a general-purpose large language model, has a Retrieval-Augmented Variant (GPT-4o-RAG) that can assist in dietary counseling. However, research on its application in this field remains lacking. To bridge this gap, we used the Japanese National Examination for Registered Dietitians as a standardized benchmark for evaluation.