As large language models (LLMs) find growing application in medicine, their potential for patient education and clinical decision support is becoming increasingly prominent. Given the complex pathogenesis, diverse treatment options, and lengthy rehabilitation periods of spinal cord injury (SCI), patients frequently turn to advanced online resources for relevant medical information. This study analyzed responses from four LLMs (ChatGPT-4o, Claude-3.5 Sonnet, Gemini-1.5 Pro, and Llama-3.1) to 37 SCI-related questions spanning pathogenesis, risk factors, clinical features, diagnostics, treatments, and prognosis. Quality and readability were assessed with the Ensuring Quality Information for Patients (EQIP) tool and Flesch-Kincaid metrics, respectively, and accuracy was independently scored by three senior spine surgeons using consensus scoring. Performance varied among the models. Gemini ranked highest on EQIP scores, suggesting superior information quality. Although the readability of all four LLMs was generally low, requiring college-level reading ability, all were able to simplify complex content effectively. Notably, ChatGPT led in accuracy, achieving significantly more "Good" ratings (83.8%) than Claude (78.4%), Gemini (54.1%), and Llama (62.2%). Comprehensiveness scores were high across all models. The LLMs also exhibited strong self-correction abilities: after being prompted for revision, the accuracy of ChatGPT and Claude responses improved by 100% and 50%, respectively, and both Gemini and Llama improved by 67%. This study represents the first systematic comparison of leading LLMs in the context of SCI. While Gemini excelled in response quality, ChatGPT provided the most accurate and comprehensive responses.
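The Flesch-Kincaid Grade Level used above is a simple formula over sentence, word, and syllable counts. Below is a minimal Python sketch of how such a score can be computed, using a naive vowel-group syllable counter; the study's exact tooling is not specified, and dedicated packages such as textstat handle edge cases more carefully:

import re

def count_syllables(word):
    # Naive heuristic: count runs of consecutive vowels (at least one per word).
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_kincaid_grade(text):
    # Standard Flesch-Kincaid Grade Level:
    # 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * (len(words) / sentences) + 11.8 * (syllables / len(words)) - 15.59

print(flesch_kincaid_grade("Spinal cord injury disrupts motor and sensory pathways below the level of the lesion."))

Scores of roughly 13 and above correspond to the college-level reading demand reported for the four models.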
DOI: http://dx.doi.org/10.1007/s10916-025-02170-7
J Med Internet Res
September 2025
Department of Information Systems and Cybersecurity, The University of Texas at San Antonio, 1 UTSA Circle, San Antonio, TX, 78249, United States, 1 (210) 458-6300.
Background: Adverse drug reactions (ADRs) present significant challenges in health care, where early prevention is vital for effective treatment and patient safety. Traditional supervised learning methods struggle with heterogeneous health care data because of its unstructured nature, regulatory constraints, and restricted access to sensitive personally identifiable information.
Objective: This review aims to explore the potential of federated learning (FL) combined with natural language processing and large language models (LLMs) to enhance ADR prediction.
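As context for the federated setup this review examines: in federated learning, each site trains on its own records and only model parameters leave the institution. A minimal Python sketch of the standard FedAvg aggregation step is shown below; the array shapes, site count, and sample sizes are illustrative assumptions, not details from the review:

import numpy as np

def federated_average(site_updates):
    # site_updates: list of (weights, n_samples) pairs, where weights is a
    # 1-D NumPy array of model parameters trained locally at one site.
    total = sum(n for _, n in site_updates)
    # FedAvg: average parameters weighted by each site's local sample count,
    # so raw patient text never leaves the site.
    return sum(w * (n / total) for w, n in site_updates)

# Toy example: three sites holding different amounts of local ADR-report data.
global_weights = federated_average([
    (np.array([0.2, 1.0]), 500),
    (np.array([0.4, 0.8]), 300),
    (np.array([0.1, 1.2]), 200),
])
print(global_weights)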
JMIR Med Inform
September 2025
Department of Hepatobiliary and Vascular Surgery, First Affiliated Hospital of Chengdu Medical College, Chengdu, China.
Background: Primary liver cancer, particularly hepatocellular carcinoma (HCC), poses significant clinical challenges due to late-stage diagnosis, tumor heterogeneity, and rapidly evolving therapeutic strategies. While systematic reviews and meta-analyses are essential for updating clinical guidelines, their labor-intensive nature limits timely evidence synthesis.
Objective: This study proposes an automated literature screening workflow powered by large language models (LLMs) to accelerate evidence synthesis for HCC treatment guidelines.
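To illustrate the kind of screening step such a workflow automates, here is a hedged Python sketch that asks a chat-completion model to label one abstract as include or exclude against stated criteria. The prompt wording, model name, criteria, and use of the OpenAI SDK are assumptions for illustration, not the authors' actual pipeline:

from openai import OpenAI  # any chat-completion client would do; shown here as one option

client = OpenAI()

def screen_abstract(abstract, criteria):
    # Ask the model for a single-word include/exclude decision against the criteria.
    prompt = (
        "You are screening studies for a systematic review on HCC treatment.\n"
        f"Inclusion criteria: {criteria}\n"
        f"Abstract: {abstract}\n"
        "Answer with exactly one word: INCLUDE or EXCLUDE."
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model; not specified by the study
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return resp.choices[0].message.content.strip().upper().startswith("INCLUDE")

# Usage: apply the same decision rule across a list of candidate abstracts.
abstracts = ["Example abstract describing a randomized trial of systemic therapy for HCC."]
decisions = [screen_abstract(a, "randomized trials of systemic therapy for unresectable HCC") for a in abstracts]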
J Speech Lang Hear Res
September 2025
Department of Communication Sciences and Disorders, University of Wisconsin-Madison.
Purpose: Speech disfluencies are common in individuals who do not stutter, with estimates suggesting a typical rate of six per 100 words. Factors such as language ability, processing load, planning difficulty, and communication strategy influence disfluency. Recent work has indicated that bilinguals may produce more disfluencies than monolinguals, but the factors underlying disfluency in bilingual children are poorly understood.
PLoS One
September 2025
School of Law, Society and Criminology, UNSW, Sydney, New South Wales, Australia.
In this paper we analyse gender-based biases in the language of complex legal judgments. Our aims are: (i) to determine the extent to which purported biases discussed in the literature by feminist legal scholars are identifiable from the language of legal judgments themselves, and (ii) to uncover new forms of bias represented in the data that may promote further analysis and interpretation of the functioning of the legal system. We consider a large set of 2530 judgments in family law in Australia over a 20-year period, examining the way that male and female parties to a case are spoken to and about, by male and female judges, in relation to their capacity to provide care for the children subject to the decision.
PLOS Digit Health
September 2025
Department of Dermatology, Stanford University, Stanford, California, United States of America.
Large Language Models (LLMs) are increasingly deployed in clinical settings for tasks ranging from patient communication to decision support. While these models demonstrate race-based and binary gender biases, anti-LGBTQIA+ bias remains understudied despite documented healthcare disparities affecting these populations. In this work, we evaluated the potential of LLMs to propagate anti-LGBTQIA+ medical bias and misinformation.