Article Abstract

Background: The application of artificial intelligence and large language models in the medical field requires an evaluation of their accuracy in providing medical information. This study aimed to assess the performance of Chat Generative Pre-trained Transformer (ChatGPT) models 3.5 and 4 in solving orthopedic board-style questions.

Methods: A total of 160 text-only questions from the Orthopedic Surgery Department at Seoul National University Hospital, conforming to the format of the Korean Orthopedic Association board certification examinations, were input into the ChatGPT 3.5 and ChatGPT 4 programs. The questions were divided into 11 subcategories. The accuracy rates of the initial answers provided by ChatGPT 3.5 and ChatGPT 4 were analyzed. In addition, inconsistency rates of answers were evaluated by regenerating the responses.

Results: ChatGPT 3.5 answered 37.5% of the questions correctly, while ChatGPT 4 showed an accuracy rate of 60.0% (p < 0.001). ChatGPT 4 demonstrated superior performance across most subcategories, except for the tumor-related questions. The rates of inconsistency in answers were 47.5% for ChatGPT 3.5 and 9.4% for ChatGPT 4.

Conclusions: ChatGPT 4 showed the ability to pass orthopedic board-style examinations, outperforming ChatGPT 3.5 in accuracy rate. However, inconsistencies in response generation and instances of incorrect answers with misleading explanations require caution when applying ChatGPT in clinical settings or for educational purposes.
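
The percentages in the Results can be turned back into question counts and checked for rough consistency with the reported significance level. The sketch below is only an illustration: the abstract does not say which statistical test was used, so the unpaired chi-square test without continuity correction is an assumption, and the function names are hypothetical. Because both models answered the same 160 questions, a paired test such as McNemar's may be more appropriate, but it needs per-question results that the abstract does not give.

```python
import math

TOTAL_QUESTIONS = 160  # number of text-only questions stated in the Methods


def count_from_rate(rate_percent, total=TOTAL_QUESTIONS):
    """Convert a reported percentage into the nearest whole number of questions."""
    return round(rate_percent / 100 * total)


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]].

    No continuity correction is applied (an assumption, not the paper's method).
    """
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


def p_value_1df(chi2):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


# Correct-answer counts implied by the reported accuracy rates.
correct_35 = count_from_rate(37.5)  # 60 of 160
correct_4 = count_from_rate(60.0)   # 96 of 160

chi2 = chi_square_2x2(
    correct_35, TOTAL_QUESTIONS - correct_35,
    correct_4, TOTAL_QUESTIONS - correct_4,
)
p = p_value_1df(chi2)

print(f"ChatGPT 3.5 correct: {correct_35}/{TOTAL_QUESTIONS}")
print(f"ChatGPT 4 correct:   {correct_4}/{TOTAL_QUESTIONS}")
print(f"chi-square = {chi2:.2f}, p = {p:.2g}")
```

Under these assumptions the p-value falls below 0.001, in line with the abstract. The same helper applied to the inconsistency rates (47.5% and 9.4%) gives 76 and 15 questions, respectively.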

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11262944
DOI: http://dx.doi.org/10.4055/cios23179

Similar Publications

This study investigates how scientists, educators, and ethics committee members in Türkiye perceive the opportunities and risks posed by generative AI and the ethical implications for science and education. This study uses a 22-question survey developed by the EOSC-Future and RDA AIDV Working Group. The responses were gathered from 62 universities across 208 universities in Türkiye, with a completion rate of 98.

Pollution from past industrial activities can remain unnoticed for years or even decades because the pollutant has only recently gained attention or been identified by measurements. Modeling the emission history of pollution is essential for estimating population exposure and apportioning potential liability among stakeholders. This paper proposes a novel approach for reconstructing the history of polychlorinated dibenzo-p-dioxin (PCDD) and polychlorinated dibenzofuran (PCDF) pollution from municipal solid waste incinerators (MSWIs) with unknown past emissions.

Objective: Management of aortic stenosis (AS) requires integrating complex clinical, imaging, and risk stratification data. Large language models (LLMs) such as ChatGPT and Gemini AI have shown promise in healthcare, but their performance in valvular heart disease, particularly AS, has not been thoroughly assessed. This study systematically compared ChatGPT and Gemini AI in addressing guideline-based and clinical scenario questions related to AS.

Improving ChatGPT's Performance in Orthopedics: Opportunities Using the CRISPE Framework.

JOSPT Methods

June 2025

Department of Physical Therapy, Steinhardt School of Culture, Education, and Human Development, New York University, New York, NY.

ChatGPT has been increasingly used in clinical practice, education, and research. In orthopedic research, ChatGPT's accuracy in clinical decision-making has been a major concern, with results ranging from 33% to 80% accuracy. Inaccuracies from ChatGPT can be harmful to clinicians, trainees, or patients when responses appear plausible, are trusted, and acted upon.

Protein kinases are central regulators of cell signaling and play pivotal roles in a wide array of diseases, most notably cancer and autoimmune disorders. The clinical success of kinase inhibitors, such as imatinib and osimertinib, has firmly established kinases as valuable drug targets. However, the development of selective, potent inhibitors remains challenging due to the conserved nature of the ATP-binding site, off-target effects, resistance mutations, and patient-specific variability.
