Citations: 20

Article Abstract

Background: Artificial intelligence (AI), particularly large language models (LLMs), has demonstrated versatility across many applications but faces challenges in specialized domains such as neurology. This study evaluates a specialized LLM's capability and trustworthiness in complex neurological diagnosis, comparing its performance with that of neurologists in simulated clinical settings.

Methods: We deployed GPT-4 Turbo (OpenAI, San Francisco, CA, US) through Neura (Sciense, New York, NY, US), an AI infrastructure with a dual-database architecture integrating "long-term memory" and "short-term memory" components on a curated neurological corpus. Five representative clinical scenarios were presented to 13 neurologists and the AI system. Participants formulated differential diagnoses based on initial presentations, followed by definitive diagnoses after receiving conclusive clinical information. Two senior academic neurologists blindly evaluated all responses, while an independent investigator assessed the verifiability of AI-generated information.

Results: AI achieved a significantly higher normalized score (86.17%) than neurologists (55.11%, p < 0.001). For differential diagnosis questions, AI scored 85% versus 46.15% for neurologists, and for final diagnosis, 88.24% versus 70.93%. AI obtained the maximum score in 15 of its 20 evaluations and responded in under 30 s, compared with the neurologists' average of 9 min. All AI-provided references were classified as relevant, and no hallucinatory content was detected.

Conclusions: A specialized LLM demonstrated superior diagnostic performance compared with practicing neurologists across complex clinical challenges. This indicates that appropriately harnessed LLMs with curated knowledge bases can achieve domain-specific relevance in complex clinical disciplines, suggesting potential for AI as a time-efficient asset in clinical practice.
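The headline figures in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: it uses the percentages and timings quoted in the abstract, and all variable and function names are hypothetical. Because the AI response time is reported only as "under 30 s", the speed ratio is a lower bound, not a measured value.

```python
# Figures quoted in the abstract: (AI score %, neurologist score %).
# Names here are illustrative and not taken from the study.
scores = {
    "overall": (86.17, 55.11),
    "differential": (85.00, 46.15),
    "final": (88.24, 70.93),
}


def gap(ai: float, neuro: float) -> float:
    """Percentage-point difference, AI minus neurologists."""
    return round(ai - neuro, 2)


gaps = {name: gap(ai, neuro) for name, (ai, neuro) in scores.items()}

# 15 maximum scores out of 20 AI evaluations.
max_score_share = 15 / 20

# Neurologists averaged 9 min; the AI took under 30 s.
# Using 30 s as the AI time gives a lower bound on the speed ratio.
min_speed_ratio = (9 * 60) / 30

for name, value in gaps.items():
    print(f"{name}: {value} percentage points")
print(f"share of maximum scores: {max_score_share:.0%}")
print(f"speed ratio: at least {min_speed_ratio:.0f}x")
```

The gap is widest for differential diagnosis and narrowest for final diagnosis, which matches the abstract's pattern of neurologists scoring higher once conclusive clinical information was available.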

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC12025783
DOI: http://dx.doi.org/10.3390/brainsci15040347

Publication Analysis

Top Keywords

large language: 8
neurologists complex: 8
complex clinical: 8
neurologists: 7
clinical: 6
specialized: 4
specialized large: 4
language model: 4
model outperforms: 4
outperforms neurologists: 4

Similar Publications

A plain language summary of the MIRACLE study: benralizumab in people in Asia with severe asthma.

Immunotherapy

September 2025

Guangzhou Institute of Respiratory Health, State Key Laboratory of Respiratory Disease, National Clinical Research Center for Respiratory Disease, National Center for Respiratory Medicine, The First Affiliated Hospital of Guangzhou Medical University, Guangzhou, China.

Applications driven by large language models (LLMs) are reshaping higher education by offering innovative tools that enhance learning, streamline administrative tasks, and support scholarly work. However, their integration into education institutions raises ethical concerns related to bias, misinformation, and academic integrity, necessitating thoughtful institutional responses. This article explores the evolving role of LLMs in midwifery higher education, providing historical context, key capabilities, and ethical considerations.

A growing literature explores the representational detail of infants' early lexical representations, but no study has investigated how exposure to real-life acoustic-phonetic variation impacts these representations. Indeed, previous experimental work with young infants has largely ignored the impact of accent exposure on lexical development. We ask how routine exposure to accent variation affects 6-month-olds' ability to detect mispronunciations.

Objectives: The primary aim of this study was to compare resource utilization between lower and higher-risk brief resolved unexplained events (BRUE) in the general (GED) and pediatric (PED) emergency departments.

Methods: We conducted a retrospective chart review of BRUE cases from a large health system over 6-and-a-half years. Our primary outcome was the count of diagnostic tests per encounter.

Large language models (LLMs) have been successfully used for data extraction from free-text radiology reports. Most current studies were conducted with LLMs accessed via an application programming interface (API). We evaluated the feasibility of using open-source LLMs, deployed on limited local hardware resources, for data extraction from free-text mammography reports, using a common data element (CDE)-based structure.
