Artificial Intelligence in Triaging Patient Questions: An Evaluation of a Large Language Model for Distal Radius Fractures.

J Am Acad Orthop Surg

From the University of Colorado School of Medicine, Aurora, CO (Kahan, Wellborn, Lauder, and Federer), Denver Health Medical Center, Denver, CO (Lauder), Duke University School of Medicine, Durham, NC (Berchuck and Pean), Duke AI Health, Duke University School of Medicine, Durham, NC (Shen), and Re

Published: August 2025


Article Abstract

Introduction: Large language models (LLMs) are promising tools for clinical decision support but require thorough validation to ensure safety and reliability. This study assessed a knowledge and intelligence messaging interface (KIMI; RevelAi Health), an LLM enhanced with retrieval-augmented generation configured with American Academy of Orthopaedic Surgeons guidelines for distal radius fracture management and a persistent system-prompt layer. The goal was to evaluate KIMI's efficacy in acuity triaging and generating appropriate patient-facing responses for distal radius fracture management.

Methods: We analyzed KIMI-generated responses to 100 simulated patient queries. Four clinical experts independently assessed responses for guideline concordance, safety, clarity, and acuity. Probabilities for adequate scoring in all domains were modeled. Bayesian mixed-effects logistic regression and ordered logistic regression models were used for binary and ordinal scoring outcomes, respectively, to account for repeated measures and within-reviewer correlations.
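The two model families named in the Methods can be written out generically as follows. This is a sketch under stated assumptions, not the authors' specification: the abstract does not give the covariates, priors, or random-effects structure, so the reviewer and query random intercepts ($u_j$, $v_i$) and all symbols below are illustrative.

```latex
% i indexes queries, j indexes reviewers (assumed indexing).
% y_{ij}: binary rating (e.g., safe / not safe); z_{ij}: ordinal acuity rating with K levels.
\begin{align}
\operatorname{logit}\Pr(y_{ij} = 1)
  &= \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + u_j + v_i, \\
\operatorname{logit}\Pr(z_{ij} \le k)
  &= \kappa_k - \left(\mathbf{x}_{ij}^{\top}\boldsymbol{\gamma} + u'_j + v'_i\right),
  \qquad k = 1, \dots, K-1, \\
u_j \sim \mathcal{N}(0, \sigma_u^2), \quad
v_i &\sim \mathcal{N}(0, \sigma_v^2), \quad
u'_j \sim \mathcal{N}(0, \sigma_{u'}^2), \quad
v'_i \sim \mathcal{N}(0, \sigma_{v'}^2).
\end{align}
```

In this parameterization, $\exp(\gamma_m)$ is the odds ratio for being rated in a higher acuity category per unit increase in covariate $m$, and the random intercepts absorb the correlation among ratings that share a reviewer or a query.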

Results: Reviewer evaluations of KIMI responses demonstrated high performance across safety and quality domains. Posterior average probability of responses being rated as safe was 94.2% (95% credible interval [CI]: 91.2 to 96.9), as concordant was 88.7% (95% CI: 85.0 to 92.0), and as clear was 93.7% (95% CI: 90.5 to 96.5). Posterior average probability of exact agreement between reviewer-assigned and LLM-assigned acuity levels was 62.9% (95% CI: 58.0 to 67.7). Surgical queries were associated with slightly higher safety ratings (95.4% versus 91.3%) and acuity agreement (63.9% versus 60.6%) than nonsurgical queries. Query category markedly influenced acuity agreement. LLM-assigned acuity was markedly associated with reviewer-assigned acuity across all models even when adjusting for both query type and category (odds ratio = 2.66; 95% CI: 1.81 to 3.83).
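To make the reported odds ratio easier to read, the sketch below converts an odds ratio into a shift in probability. The function name `shift_probability` and the baseline probabilities are hypothetical and chosen only for illustration; only the odds ratio and its interval bounds come from the abstract. In an ordered logistic model the odds ratio applies to the cumulative odds of a higher acuity rating, so this is a simplified reading rather than a reproduction of the study's analysis.

```python
def shift_probability(baseline_prob: float, odds_ratio: float) -> float:
    """Return the probability after multiplying the baseline odds by odds_ratio."""
    if not 0.0 < baseline_prob < 1.0:
        raise ValueError("baseline_prob must lie strictly between 0 and 1")
    if odds_ratio <= 0.0:
        raise ValueError("odds_ratio must be positive")
    baseline_odds = baseline_prob / (1.0 - baseline_prob)  # probability -> odds
    new_odds = baseline_odds * odds_ratio                  # apply the odds ratio
    return new_odds / (1.0 + new_odds)                     # odds -> probability


# Point estimate and 95% credible interval bounds reported in the abstract.
REPORTED_OR = 2.66
REPORTED_OR_INTERVAL = (1.81, 3.83)

if __name__ == "__main__":
    # Hypothetical baseline probabilities, not taken from the study.
    for baseline in (0.25, 0.50, 0.75):
        low = shift_probability(baseline, REPORTED_OR_INTERVAL[0])
        mid = shift_probability(baseline, REPORTED_OR)
        high = shift_probability(baseline, REPORTED_OR_INTERVAL[1])
        print(f"baseline {baseline:.2f} -> {mid:.3f} (interval {low:.3f} to {high:.3f})")
```

The same odds ratio produces different absolute changes depending on the baseline, which is why it should not be read as a fixed percentage-point increase.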

Discussion: KIMI generated responses that were generally safe, clinically concordant, and clearly communicated. These findings support the feasibility of deploying enhanced LLMs for asynchronous patient engagement in low-to-moderate risk care coordination settings.

Source
http://dx.doi.org/10.5435/JAAOS-D-25-00456

Similar Publications

Use of artificial intelligence for classification of fractures around the elbow in adults according to the 2018 AO/OTA classification system.

BMC Musculoskelet Disord

September 2025

Department of Clinical Sciences at Danderyds Hospital, Department of Orthopedic Surgery, Karolinska Institutet, Stockholm, 182 88, Sweden.

Background: This study evaluates the accuracy of an Artificial Intelligence (AI) system, specifically a convolutional neural network (CNN), in classifying elbow fractures using the detailed 2018 AO/OTA fracture classification system.

Methods: A retrospective analysis of 5,367 radiograph exams visualizing the elbow from adult patients (2002-2016) was conducted using a deep neural network. Radiographs were manually categorized according to the 2018 AO/OTA system by orthopedic surgeons.

Articular tuberculosis is a rare condition, with extrapulmonary presentations most commonly appearing in joints such as the hip or knee. It is usually associated with conditions like immunosuppression or a history of pulmonary tuberculosis. Diagnosis involves imaging or pathology, and treatment typically involves surgical intervention along with medication.

Background: Evidence supporting surgery in elderly patients with distal radius fractures is limited, but displaced fractures may benefit from surgery. This study aimed to determine whether casting is noninferior to surgery for patients aged 65 years or older with substantially displaced intra-articular (AO type C) distal radius fractures.

Methods: This multicenter randomized controlled noninferiority trial included 138 patients (mean age 76 years, SD 6.

Intraarticular osteotomy for adult Madelung deformity: Case report.

JPRAS Open

September 2025

Clínica Cavadas, Paseo de Facultades 1, 46021 Valencia, Spain.

Madelung deformity is a hemi-epiphyseal dysplasia of the radioulnar axis. The prominent feature is radial deformity secondary to premature closure of the volar-ulnar side of the distal radial physis. The distal radius is malaligned with excessive ulnar and volar tilt, shortening, and concomitant ulna plus deformity.

Introduction: Fractures are a common occurrence in childhood, with approximately one-third of boys and girls sustaining at least one fracture before the age of 17. Both-bone forearm fractures, particularly those involving the radius and ulna, are more common in the non-dominant hand and in boys and usually involve the distal portions of both bones. If not properly treated, these injuries can have a significant impact on limb function.
