Purpose: The aim is to assess GPT-4V's (OpenAI) diagnostic accuracy and its capability to identify glaucoma-related features compared to expert evaluations.
Design: Evaluation of multimodal large language models for reviewing fundus images in glaucoma.
Subjects: A total of 300 fundus images from 3 public datasets (ACRIMA, ORIGA, and RIM-ONE v3), comprising 139 glaucomatous and 161 nonglaucomatous cases, were analyzed.
Methods: Preprocessing ensured each image was centered on the optic disc. GPT-4's vision-preview model (GPT-4V) assessed each image for various glaucoma-related criteria: image quality, image gradability, cup-to-disc ratio, peripapillary atrophy, disc hemorrhages, rim thinning (by quadrant and clock hour), glaucoma status, and estimated probability of glaucoma. Each image was analyzed twice by GPT-4V to evaluate consistency in its predictions. Two expert graders independently evaluated the same images using identical criteria. Comparisons between GPT-4V's assessments, expert evaluations, and dataset labels were made to determine accuracy, sensitivity, specificity, and Cohen kappa.
Main Outcome Measures: The main parameters measured were the accuracy, sensitivity, specificity, and Cohen kappa of GPT-4V in detecting glaucoma compared with expert evaluations.
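For readers unfamiliar with these measures, the following is a minimal sketch of how accuracy, sensitivity, specificity, and Cohen kappa can be computed for a binary glaucoma call against a reference label. It is not the authors' analysis code: the function name `binary_agreement_metrics`, the 0/1 label encoding, and the toy label vectors are all assumptions made for illustration, and the toy data are not taken from the study.

```python
from typing import Dict, Sequence


def binary_agreement_metrics(
    reference: Sequence[int], predicted: Sequence[int]
) -> Dict[str, float]:
    """Compare binary calls (1 = glaucoma, 0 = nonglaucoma) with a reference.

    Hypothetical helper for illustration only; the label encoding is assumed.
    """
    if len(reference) != len(predicted) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    pairs = list(zip(reference, predicted))
    tp = sum(1 for r, p in pairs if r == 1 and p == 1)  # true positives
    tn = sum(1 for r, p in pairs if r == 0 and p == 0)  # true negatives
    fp = sum(1 for r, p in pairs if r == 0 and p == 1)  # false positives
    fn = sum(1 for r, p in pairs if r == 1 and p == 0)  # false negatives
    n = tp + tn + fp + fn

    nan = float("nan")
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn) if (tp + fn) else nan
    specificity = tn / (tn + fp) if (tn + fp) else nan

    # Cohen kappa: observed agreement corrected for the agreement expected
    # by chance, given each rater's marginal rate of positive/negative calls.
    p_observed = accuracy
    p_expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / (n * n)
    kappa = (p_observed - p_expected) / (1 - p_expected) if p_expected != 1 else nan

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "kappa": kappa,
    }


# Toy example with made-up labels (not data from the study).
toy_reference = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
toy_predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
toy_metrics = binary_agreement_metrics(toy_reference, toy_predicted)
print(toy_metrics)
```

The same function could be applied to two runs of the same model on the same images to quantify repeatability, since Cohen kappa is symmetric in its two inputs.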
Results: GPT-4V successfully provided glaucoma assessments for all 300 fundus images across the datasets, although approximately 35% required multiple prompt submissions. GPT-4V's overall accuracy in glaucoma detection on the ACRIMA, ORIGA, and RIM-ONE datasets (0.68, 0.70, and 0.81, respectively) was slightly lower than that of expert grader 1 (0.78, 0.80, and 0.88, respectively) and expert grader 2 (0.72, 0.78, and 0.87, respectively). In glaucoma detection, GPT-4V showed variable agreement across datasets and expert graders, with Cohen kappa values ranging from 0.08 to 0.72. In terms of feature detection, GPT-4V demonstrated high consistency (repeatability) in image gradability, with an agreement accuracy of ≥89%, and substantial agreement in rim thinning and cup-to-disc ratio assessments, although kappas were generally lower than expert-to-expert agreement.
Conclusions: GPT-4V shows promise as a tool in glaucoma screening and detection through fundus image analysis, demonstrating generally high agreement with expert evaluations of key diagnostic features, although agreement did vary substantially across datasets.
Financial Disclosures: Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11773068 | PMC |
| http://dx.doi.org/10.1016/j.xops.2024.100667 | DOI Listing |
Front Pharmacol
August 2025
Department of Clinical Pharmacy, Meizhou People's Hospital (Huangtang Hospital), Meizhou, China.
Objective: Laronidase is the first enzyme replacement therapy approved for the treatment of mucopolysaccharidosis type I (MPS I). However, its adverse events (AEs) have not been investigated in real-world settings. The aim of this study was to investigate AEs associated with laronidase using the Food and Drug Administration Adverse Event Reporting System (FAERS).
Standard Automated Perimetry (SAP) is the mainstay for monitoring glaucoma progression and has been accepted by the U.S. Food and Drug Administration (FDA) as a trial endpoint, but only under stringent criteria of ≥7 dB loss in five pre-specified test locations.
Ophthalmol Glaucoma
September 2025
Department of Ophthalmology, Columbia University Irving Medical Center, New York, New York.
The assessment of the human visual field, a concept explored since ancient Greece, underwent a critical transformation in the 19th century with the advent of objective measurement techniques. Early methodologies concentrated on mapping the outer limits of vision, a practice known as perimetry. However, the focus soon shifted toward campimetry (although the name perimetry remained), which involves assessing defects within the central/paracentral visual field, a crucial development for diagnosing diseases such as glaucoma.
J Alzheimers Dis
September 2025
Boston University Chobanian & Avedisian School of Medicine, Boston, MA, USA.
Background: Slit Guidance Ligand 2 (SLIT2) binds Roundabout (ROBO) guidance receptors to direct axon pathfinding and neuron migration during nervous system development. SLIT2 expression has previously been linked to dementia risk. Objective: To study the association between SLIT2 expression in human vitreous humor and plasma samples and neurocognitive test scores in a cross-sectional cohort study utilizing a novel, highly sensitive Meso Scale Discovery (MSD) assay for SLIT2 detection.
Comput Biol Med
September 2025
Department of Mathematics, Faculty of Education, Kafkas University, Kars, Turkey.
The increasing prevalence and severity of eye diseases worldwide underscore the urgent need for advanced diagnostic tools and interventions to address the growing burden on global public health. The study on eye disease classification holds significant relevance due to its potential impact on enhancing early detection, diagnosis, and treatment of various ocular conditions. Timely and accurate identification of eye diseases such as cataracts, glaucoma and diabetic retinopathy is crucial for preventing vision loss and improving overall patient outcomes.