Objectives: Eye-related conditions are a prevalent and growing problem, affecting the sight of at least 2.2 billion people worldwide. Many patients take their questions or concerns to the internet before consulting their healthcare provider, which can influence their health behavior. Given the popularity of large language model (LLM)-based artificial intelligence (AI) chat platforms such as ChatGPT, the suitability of their generated content needs to be better understood. We aim to evaluate the accuracy, comprehensiveness, and readability of ChatGPT's responses to ophthalmology-related medical inquiries.
Methodology: Twenty-two ophthalmology patient questions were generated based on commonly searched symptoms on Google Trends and used as inputs on ChatGPT. Flesch Reading Ease (FRE) and Flesch-Kincaid Grade Level (FKGL) formulas were used to evaluate response readability. Two English-speaking, board-certified ophthalmologists evaluated the accuracy, comprehensiveness, and clarity of the responses as proxies for appropriateness. Other validated tools, including QUEST, DISCERN, and an urgency scale, were used for additional quality metrics. Responses were analyzed using descriptive statistics and comparative tests.
Results: All responses scored a 2.0 for QUEST Tone and 1.0 for Complementarity. DISCERN Uncertainty had a mean of 3.86 ± 0.48, with no responses receiving a 5. Urgency to seek care scores averaged 2.45 ± 0.60, with only the narrow-angle glaucoma response prompting an ambulance call. Readability scores resulted in a mean FRE of 45.3 ± 9.98 and FKGL of 10.1 ± 1.74. These quality assessment scores showed no significant differences between categories of conditions. The ophthalmologists' reviews rated 15/22 (68.18%) of responses as appropriate. The mean scores for accuracy, comprehensiveness, and clarity were 4.41 ± 0.73, 4.89 ± 0.32, and 4.55 ± 0.63, respectively, with comprehensiveness ranking significantly higher than the other aspects (p < 0.01). The responses for glaucoma and cataract had the lowest appropriateness ratings.
Conclusions: ChatGPT generally gave appropriate responses to common ophthalmology questions, with high ratings for comprehensiveness, clarity, and support for medical professional follow-up. Performance varied by condition, with weaker appropriateness in responses related to glaucoma and cataract.
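The two readability measures named in the methodology follow standard published formulas, which the sketch below reproduces for illustration. It is not the authors' tooling: the function names are hypothetical, and the syllable counter is a crude vowel-group heuristic assumed here for the sake of a self-contained example, so scores on real text will differ somewhat from those of dedicated readability software.

```python
import re


def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Flesch Reading Ease: higher scores mean easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid Grade Level: approximate U.S. school grade."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def count_syllables(word: str) -> int:
    """Crude heuristic (an assumption of this sketch): count vowel groups,
    dropping a trailing silent 'e' unless the word ends in 'le'."""
    lowered = word.lower()
    count = len(re.findall(r"[aeiouy]+", lowered))
    if lowered.endswith("e") and not lowered.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def text_counts(text: str) -> tuple[int, int, int]:
    """Return (words, sentences, syllables) for a block of English text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return len(words), len(sentences), syllables


if __name__ == "__main__":
    sample = "The cat sat. The dog ran!"
    w, s, syl = text_counts(sample)
    print(flesch_reading_ease(w, s, syl), flesch_kincaid_grade(w, s, syl))
```

Both formulas depend only on average sentence length and average syllables per word, so longer sentences and longer words lower the FRE and raise the FKGL. The reported mean FKGL of 10.1 corresponds roughly to a tenth-grade reading level.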
Download full-text PDF |
Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC12349890 | PMC |
http://dx.doi.org/10.7759/cureus.87920 | DOI Listing |
J Histotechnol
September 2025
Department of Pathology, Peking University Third Hospital, Beijing, China.
Amyloidosis encompasses a spectrum of rare disorders characterized by extracellular amyloid deposition. Achieving an accurate early diagnosis of systemic amyloidosis necessitates biopsy-specific pathological evaluation. Formalin-fixed, paraffin-embedded liver biopsy specimens were examined using Congo red staining, electron microscopy, immunohistochemistry (IHC), immunofluorescence, and Congo red-assisted laser microdissection with mass spectrometry (LMD/MS).
Clin Anat
September 2025
Division in Anatomy and Developmental Biology, Department of Oral Biology, Human Identification Research Institute, BK21 FOUR Project, Yonsei University College of Dentistry, Seoul, South Korea.
Plantar melanomas present unique diagnostic and surgical challenges owing to substantial regional variations in skin thickness. Although the Breslow thickness remains the primary criterion for staging and surgical excision, its application on plantar melanoma is complicated by the inherent thickness of the glabrous plantar epidermis, which may lead to tumor depth overestimation. Accurate assessment of plantar skin thickness is essential for optimizing staging accuracy and refining surgical margins.
Front Rehabil Sci
August 2025
Department of Neurosurgery, David Geffen School of Medicine, University of California, Los Angeles, CA, United States.
Introduction: Spinal cord injury (SCI) presents a significant burden to patients, families, and the healthcare system. The ability to accurately predict functional outcomes for SCI patients is essential for optimizing rehabilitation strategies, guiding patient and family decision making, and improving patient care.
Methods: We conducted a retrospective analysis of 589 SCI patients admitted to a single acute rehabilitation facility and used the dataset to train advanced machine learning algorithms to predict patients' rehabilitation outcomes.
Int J Chron Obstruct Pulmon Dis
September 2025
The First Clinical Medical College of Lanzhou University, Lanzhou, People's Republic of China.
Chronic Obstructive Pulmonary Disease (COPD) is a prevalent chronic respiratory disorder characterized by airway inflammation and irreversible airflow limitation. Its marked heterogeneity and complexity pose significant challenges to traditional clinical assessments in terms of prognostic prediction and personalized management. In recent years, the exploration of biomarkers has opened new avenues for the precise evaluation of COPD, particularly through multi-biomarker prediction models and integrative multimodal data strategies, which have substantially improved the accuracy and reliability of prognostic assessments.
Front Genet
August 2025
Center for Applied Genetic Technologies, University of Georgia, Athens, GA, United States.
This study introduces a Drought Adaptation Index (DAI), derived from Best Linear Unbiased Prediction (BLUP), as a method to assess drought resilience in switchgrass (Panicum virgatum L.). A panel of 404 genotypes was evaluated under drought-stressed (CV) and well-watered (UC) conditions over four consecutive years (2019-2022).