Background: Large language models (LLMs), such as ChatGPT, have demonstrated potential in synthesizing complex clinical information, yet concerns persist regarding their accuracy and reliability in specialized domains. This study addresses a gap in the literature by evaluating the accuracy and reliability of ChatGPT-4o in oral and maxillofacial traumatology.
Material And Methods: A total of 188 oral and maxillofacial trauma-related questions were selected from a comprehensive resource. Thirty questions were randomly chosen and submitted to ChatGPT-4o, with the session reset to "new chat" mode for every repetition to eliminate potential memory bias. Accuracy was scored on a 3-point Likert scale. Reliability was assessed with weighted kappa (κ) and the Intraclass Correlation Coefficient (ICC), and internal consistency was evaluated using both Cronbach's alpha (α) and McDonald's omega (ω).
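As a rough illustration of how reliability statistics of this kind are often computed, the sketch below uses scikit-learn and pingouin on hypothetical data. The two scoring runs, the 3-point scores, and the long/wide layouts are assumptions for demonstration only; they are not the study's actual data or analysis pipeline, and McDonald's omega (which typically requires a factor model) is omitted.

```python
# Minimal sketch (assumption: two hypothetical scoring runs of the same 30
# questions on a 3-point scale; not the study's actual data or pipeline).
import numpy as np
import pandas as pd
import pingouin as pg
from sklearn.metrics import cohen_kappa_score

rng = np.random.default_rng(42)
run1 = rng.integers(1, 4, size=30)  # 3-point Likert scores, run 1 (hypothetical)
run2 = rng.integers(1, 4, size=30)  # 3-point Likert scores, run 2 (hypothetical)

# Weighted kappa (linear weights) between the two runs
kappa = cohen_kappa_score(run1, run2, weights="linear")

# Intraclass correlation: long format with questions as targets, runs as raters
long = pd.DataFrame({
    "question": np.tile(np.arange(30), 2),
    "run": np.repeat(["run1", "run2"], 30),
    "score": np.concatenate([run1, run2]),
})
icc = pg.intraclass_corr(data=long, targets="question", raters="run", ratings="score")

# Cronbach's alpha over the wide (questions x runs) score matrix
alpha, alpha_ci = pg.cronbach_alpha(data=pd.DataFrame({"run1": run1, "run2": run2}))

print(f"weighted kappa = {kappa:.3f}, Cronbach's alpha = {alpha:.3f}")
print(icc[["Type", "ICC", "CI95%"]])
```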
Results: The accuracy rates for comprehensive and adequate responses were 38% (95% CI: 32.5% - 43.5%) and 58% (95% CI: 52.1% - 63.3%), respectively. Weighted kappa (κ = 0.469) and ICC (0.503) indicated moderate reliability. Internal consistency was excellent by Cronbach's alpha (α = 0.904) and good by McDonald's omega (ω = 0.860).
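For context, confidence intervals of this form can be reproduced with a standard proportion interval; the sketch below uses statsmodels. The denominator of 300 scored responses and the count of 114 "comprehensive" answers are assumed values chosen only to illustrate the calculation, as the abstract does not state the exact counts.

```python
# Illustrative only: n = 300 scored responses and 114 "comprehensive" ratings
# are assumed values, not figures taken from the paper.
from statsmodels.stats.proportion import proportion_confint

count, nobs = 114, 300  # assumed: 114 comprehensive answers out of 300 responses
low, high = proportion_confint(count, nobs, alpha=0.05, method="normal")
print(f"accuracy = {count / nobs:.0%}, 95% CI: {low:.1%} - {high:.1%}")
# prints: accuracy = 38%, 95% CI: 32.5% - 43.5% (same scale as the reported interval)
```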
Conclusions: ChatGPT-4o showed promise as an adjunct tool for providing supplementary educational content, verifying critical information, and supporting decision-making in oral and maxillofacial traumatology. Its current limitations warrant further research. Future enhancements in LLMs and prompt engineering may help optimize their clinical applicability and alignment with evidence-based standards.
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC12395565 | PMC |
| http://dx.doi.org/10.4317/medoral.27229 | DOI Listing |