We evaluated the efficacy of large language models (LLMs), specifically Generative Pre-trained Transformer 4 (GPT-4), in predicting pregnancy following in vitro fertilization (IVF) treatment and compared its accuracy with the results of an original published study. Our findings revealed that GPT-4 can autonomously develop and refine advanced machine learning models for pregnancy prediction with minimal human intervention. The prediction accuracy was 0.79, and the area under the receiver operating characteristic curve (AUROC) was 0.89, matching or exceeding the metrics reported in the original study (0.78 for accuracy and 0.87 for AUROC). The results suggest that LLMs can facilitate data processing, optimize machine learning models for predicting IVF success rates, and provide data interpretation methods. This capacity can help bridge the knowledge gap between data scientists and medical personnel in addressing pressing clinical challenges. However, more experiments on larger and more diverse datasets are needed to validate and promote broader applications of LLMs in assisted reproduction.
Download full-text PDF |
Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11683553 | PMC |
http://dx.doi.org/10.1111/aogs.14989 | DOI Listing |
Drug Saf
September 2025
The MITRE Corporation, 202 Burlington Rd, Bedford, MA, 01730, USA.
Acta Neurochir (Wien)
September 2025
Department of Neurosurgery, Istinye University, Istanbul, Turkey.
Background: Recent studies suggest that large language models (LLMs) such as ChatGPT are useful tools for medical students or residents when preparing for examinations. These studies, especially those conducted with multiple-choice questions, emphasize that the level of knowledge and response consistency of the LLMs are generally acceptable; however, further optimization is needed in areas such as case discussion, interpretation, and language proficiency. Therefore, this study aimed to evaluate the performance of six distinct LLMs for Turkish and English neurosurgery multiple-choice questions and assess their accuracy and consistency in a specialized medical context.
Arch Gynecol Obstet
September 2025
Department of Obstetrics and Gynaecology, IRCCS San Raffaele Scientific Institute, 20132, Milan, Italy.
Objectives: Recommendations regarding the use of third-trimester ultrasound lack universal consensus. Yet, there is evidence which supports its value in assessing fetal growth, fetal well-being, and a number of pregnancy-related complications. This literature review evaluates the available scientific evidence regarding its applications, usefulness, and the timing of the third-trimester scan in a low-risk population.
J Glaucoma
September 2025
Harvard Medical School, Boston, MA.
Purpose: Large language models (LLMs) can assist patients who seek medical knowledge online to guide their own glaucoma care. Understanding the differences in LLM performance on glaucoma-related questions can inform patients about the best resources to obtain relevant information.
Methods: This cross-sectional study evaluated the accuracy, comprehensiveness, quality, and readability of LLM-generated responses to glaucoma inquiries.
Disabil Rehabil Assist Technol
September 2025
International Communication College, Jilin International Studies University, Changchun, Jilin, China.
Background: Conventional automated writing evaluation systems typically provide insufficient support for students with special needs, especially in the acquisition of tonal languages such as Chinese, primarily because of rigid feedback mechanisms and limited customisation.
Objective: This research develops the Context-aware Hierarchical AI Tutor for Writing Enhancement (CHATWELL), an intelligent tutoring platform that incorporates optimised large language models to deliver instantaneous, customised, and multi-dimensional writing assistance for Chinese language learners, with special consideration for those with cognitive learning barriers.
Methods: CHATWELL employs a hierarchical AI framework with a four-tier feedback mechanism designed to accommodate diverse learning needs.