Multiple-choice questions (MCQs) are widely used in health science education because they are an efficient way to evaluate knowledge, from simple recall to complex clinical reasoning. Creating high-quality MCQs, however, is time-consuming and requires expertise in question composition. Advances in artificial intelligence (AI), especially large language models (LLMs), offer the potential for rapid generation of high-quality, consistent, and course-specific MCQs. Here we discuss the benefits and drawbacks of using this technology to generate MCQs, including ensuring the accuracy and fairness of questions, along with technical, ethical, and privacy considerations. We offer practical guiding principles for implementing AI-generated MCQs and outline future research areas related to their impact on student learning and educational quality.
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC12340502
DOI: http://dx.doi.org/10.1016/j.crphys.2025.100160
Wien Klin Wochenschr
September 2025
Medizinische Klinik und Poliklinik IV, LMU-Klinikum München, München, Germany.
Objective: The study aims to elucidate a possible effect of individual reflection (IR) or group reflection (GR) on short-term and long-term memory retention in a large group lecture-based environment.
Methods: In this quasi-experimental study, 656 medical students were enrolled to compare the impact of IR and GR directly after the lectures and 2 months later. Students were divided into two groups and given two different lectures using IR or GR in a crossover design.
Acta Neurochir (Wien)
September 2025
Department of Neurosurgery, Istinye University, Istanbul, Turkey.
Background: Recent studies suggest that large language models (LLMs) such as ChatGPT are useful tools for medical students and residents preparing for examinations. These studies, especially those conducted with multiple-choice questions, emphasize that the knowledge level and response consistency of LLMs are generally acceptable; however, further optimization is needed in areas such as case discussion, interpretation, and language proficiency. Therefore, this study aimed to evaluate the performance of six distinct LLMs on Turkish and English neurosurgery multiple-choice questions and to assess their accuracy and consistency in a specialized medical context.
J Nurs Educ
September 2025
Wolters Kluwer Health, New York, New York.
Background: Examinations are used widely in nursing education to evaluate knowledge attainment. New item types were introduced in April 2023 by the National Council of State Boards of Nursing (NCSBN) for use on the Next Generation National Council Licensure Examination for Registered Nurses (NGN NCLEX-RN). Little evidence exists on how much time is needed for examinations that use the new item types.
Assist Technol
September 2025
Occupational Therapy Doctorate, Gannon University, Ruskin, Florida, USA.
Previous research found that occupational therapy practitioners desired more training in assistive technology. This study provides further evidence on which assistive technology categories should be included in the education of occupational therapists in the United States, based on the practice setting. Participants were recruited through snowball sampling and were included if they were certified occupational therapists practicing in the United States.
J Craniofac Surg
September 2025
University of Miami Miller School of Medicine, Miami, FL.
The objective was to compare the accuracy of two large language models, GPT-4o and o3-Mini, against medical student performance on otolaryngology-focused, USMLE-style multiple-choice questions. With permission from AMBOSS, we extracted 146 Step 2 CK questions tagged "Otolaryngology" and stratified them by AMBOSS difficulty (levels 1-5). Each item was presented verbatim to GPT-4o and o3-Mini through their official APIs, and outputs were scored as correct or incorrect.
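The workflow the abstract describes (verbatim prompts to the official APIs, binary scoring) is straightforward to reproduce. Below is a minimal sketch using the OpenAI Python SDK; the model identifiers follow the abstract, but the sample question, answer key, prompt wording, and letter-extraction step are hypothetical stand-ins, since the AMBOSS items themselves are licensed content.

```python
# Minimal sketch: present MCQs to two models and score the replies.
# Assumes OPENAI_API_KEY is set in the environment.
# The question below is a made-up placeholder, not AMBOSS content.
from openai import OpenAI

client = OpenAI()

questions = [
    {
        "stem": "A 45-year-old presents with unilateral hearing loss and "
                "tinnitus. What is the best next step?",
        "choices": {"A": "Audiometry", "B": "CT temporal bone",
                    "C": "MRI internal auditory canal", "D": "Tympanometry"},
        "answer": "A",  # placeholder key, for illustration only
    },
]

def ask(model: str, q: dict) -> str:
    """Send one MCQ and return the single-letter choice the model gives."""
    options = "\n".join(f"{k}. {v}" for k, v in q["choices"].items())
    prompt = (f"{q['stem']}\n{options}\n"
              "Answer with the single letter of the best choice.")
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    # Naive extraction: take the first character of the reply as the letter.
    return resp.choices[0].message.content.strip()[0].upper()

for model in ("gpt-4o", "o3-mini"):
    correct = sum(ask(model, q) == q["answer"] for q in questions)
    print(f"{model}: {correct}/{len(questions)} correct")
```

In practice, a study like this would also need retry logic, stricter parsing of the model's reply, and per-difficulty stratification of the scores; the sketch shows only the core query-and-score loop.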