Background/aims: Large language models (LLMs) have substantial potential to enhance the efficiency of academic research. The accuracy and performance of LLMs in systematic reviews, a core part of evidence building, have yet to be studied in detail.
Methods: We introduced two LLM-based approaches to systematic review: an LLM-enabled fully automated approach (LLM-FA) utilising three different GPT-4 plugins (Consensus GPT, Scholar GPT and GPT-4's web browsing modes) and an LLM-facilitated semi-automated approach (LLM-SA) using the GPT-4 application programming interface (API). We benchmarked these approaches against three published systematic reviews that reported the prevalence of diabetic retinopathy in different populations (general population, pregnant women and children).
Results: The three published reviews comprised 98 papers in total. Across these three reviews, in the LLM-FA approach, Consensus GPT correctly identified 32.7% (32 out of 98) of papers, while Scholar GPT and GPT-4's web browsing modes identified only 19.4% (19 out of 98) and 6.1% (6 out of 98), respectively. By contrast, the LLM-SA approach not only correctly included 82.7% (81 out of 98) of these papers but also correctly excluded 92.2% of 4497 irrelevant papers.
Conclusions: Our findings suggest LLMs are not yet capable of autonomously identifying and selecting relevant papers in systematic reviews. However, they hold promise as an assistive tool to improve the efficiency of the paper selection process in systematic reviews.
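The inclusion rates quoted in the Results can be re-derived from the reported counts. The following is a minimal sketch of that arithmetic only, not the authors' analysis code; the function name `inclusion_rate` and the dictionary labels are chosen here for illustration. The exclusion figure (92.2% of 4497) is left out because the abstract does not give the underlying count.

```python
# Counts are taken from the Results paragraph of the abstract.
# The labels and the helper name are illustrative assumptions.
TOTAL_PAPERS = 98

identified = {
    "Consensus GPT": 32,
    "Scholar GPT": 19,
    "GPT-4 web browsing": 6,
    "LLM-SA": 81,
}


def inclusion_rate(found: int, total: int) -> float:
    """Percentage of benchmark papers recovered, rounded to one decimal."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * found / total, 1)


# Map each approach to its inclusion rate in percent.
rates = {name: inclusion_rate(n, TOTAL_PAPERS) for name, n in identified.items()}

for name, rate in rates.items():
    print(f"{name}: {rate}%")
```

Each rate is simply the number of benchmark papers an approach recovered divided by the 98 papers in the three published reviews.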
Download full-text PDF |
Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC12320601 | PMC |
http://dx.doi.org/10.1136/bjo-2024-326254 | DOI Listing |
J Appl Res Intellect Disabil
September 2025
Department of Pedagogy, Faculty of Education and Social Work, University of Valladolid, Valladolid, Spain.
Background: Mental health (MH) problems are more common in people with intellectual disabilities (ID), yet under-diagnosis persists, which may be partly due to a lack of appropriate assessment tools. This study presents a systematic review of instruments used to assess MH problems in Spanish-speaking adults with ID.
Method: Following PRISMA guidelines, a search was conducted in Web of Science, PsycINFO, and Scopus using terms related to ID, MH and assessment.
Aesthet Surg J Open Forum
September 2025
Laser-assisted lipolysis (LAL) for arm fat reduction has gained popularity compared with traditional liposuction. The authors of this study aim to quantify changes in arm circumference through LAL and compare outcomes between treatments with and without suction. A Preferred Reporting Items for Systematic Reviews and Meta-Analyses-compliant systematic review was conducted from inception until May 2024, and meta-analysis was performed using Stata.
Front Toxicol
August 2025
One Health Research Group, Faculty of Health Science, Universidad de Las Americas, Quito, Ecuador.
Background: Each year, approximately 100 million cases of bee and wasp stings are reported globally, with the majority resulting in mild reactions. However, in rarer instances, these stings can lead to severe and potentially fatal outcomes, including ischemic or hemorrhagic cerebral events. This article aims to synthesize and analyze the current evidence on the association between bee and wasp stings and the occurrence of ischemic and hemorrhagic strokes.
Front Pediatr
August 2025
Department of Neonatal Research, Inova Health Services, Falls Church, VA, United States.
Introduction: Neonatal sepsis is a dysregulated immune response to bloodstream infection causing serious disease and death. Our review seeks to integrate the knowledge gained from studies of multiple molecular methods (such as genomics, metabolomics, transcriptomics, and the gut microbiome) in the setting of neonatal sepsis that may improve the diagnosis, classification, and treatment of the disease. Sepsis claims over 200,000 lives annually worldwide and remains a top 10 cause of infant mortality in the US.
Front Nutr
August 2025
Faculty of Medicine, Department of Psychiatry, Medical University of Gdańsk, Gdańsk, Poland.
Mood disorders, including major depressive disorder (MDD) and bipolar disorder (BP), significantly impact global health, with MDD affecting over 300 million people and BP affecting approximately 2% of the world's population. Ketamine, originally an anesthetic, has emerged as a promising treatment for patients with treatment-resistant depression (TRD), due to its unique pharmacological properties, such as N-methyl-D-aspartate (NMDA) receptor antagonism and anti-inflammatory effects. The potential of ketamine in treating depression has sparked debate regarding its effects on appetite.