Objective: The Radpeer system is central to the quality assurance process in many radiology practices. Previous studies have shown poor agreement between physicians in the evaluation of their peers. The purpose of this study was to assess the reliability of the Radpeer scoring system.
Materials And Methods: A sample of 25 discrepant cases was extracted from our quality assurance database. Images were made anonymous; associated reports and identities of interpreting radiologists were removed. Indications for the studies and descriptions of the discrepancies were provided. Twenty-one subspecialist attending radiologists rated the cases using the Radpeer scoring system. Multirater kappa statistics were used to assess interrater agreement, both with the standard scoring system and with dichotomized scores to reflect the practice of further review for cases rated 3 and 4. Subgroup analyses were conducted to assess subspecialist evaluation of cases.
Results: Interrater agreement was slight to fair compared with that expected by chance. For the group of 21 raters, the kappa values were 0.11 (95% CI, 0.06-0.16) with the standard scoring system and 0.20 (95% CI, 0.13-0.27) with dichotomized scores. There was disagreement about whether a discrepancy had occurred in 20 cases. Subgroup analyses did not reveal significant differences in the degree of interrater agreement.
Conclusion: The identification of discrepant interpretations is valuable for the education of individual radiologists and for larger-scale quality assurance and quality improvement efforts. Our results show that a ratings-based peer review system is unreliable and subjective for the evaluation of discrepant interpretations. Resources should be devoted to developing more robust and objective assessment procedures, particularly those with clear quality improvement goals.
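The abstract reports multirater kappa values for the standard four-point scores and for dichotomized scores, but does not say which multirater kappa was used. The sketch below assumes Fleiss' kappa, which requires every case to be rated by the same number of raters, and assumes that dichotomizing means grouping scores 1-2 against scores 3-4. The function names and the example ratings are hypothetical and are not taken from the study.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of cases, each a list of rater scores.

    Assumption: every case has the same number of ratings (at least two).
    """
    n_cases = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(case) != n_raters for case in ratings):
        raise ValueError("each case needs the same number (>= 2) of ratings")

    counts = [Counter(case) for case in ratings]
    categories = sorted({score for case in ratings for score in case})

    # Observed agreement: mean over cases of the share of agreeing rater pairs.
    per_case = [
        (sum(c[k] ** 2 for k in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_observed = sum(per_case) / n_cases

    # Chance agreement: sum of squared overall category proportions.
    total = n_cases * n_raters
    p_expected = sum((sum(c[k] for c in counts) / total) ** 2 for k in categories)

    if p_expected == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_observed - p_expected) / (1.0 - p_expected)


def dichotomize(ratings, threshold=3):
    """Map scores to 0 (below threshold) or 1 (at or above threshold)."""
    return [[int(score >= threshold) for score in case] for case in ratings]


if __name__ == "__main__":
    # Made-up scores for 4 cases and 3 raters, for illustration only.
    example = [[1, 2, 3], [2, 2, 4], [3, 3, 4], [1, 1, 2]]
    print("standard scores:", round(fleiss_kappa(example), 3))
    print("dichotomized:   ", round(fleiss_kappa(dichotomize(example)), 3))
```

Collapsing four categories into two removes disagreements that stay on one side of the threshold, so the dichotomized kappa can differ from the four-category value, as it does in the reported results (0.20 versus 0.11).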
DOI: http://dx.doi.org/10.2214/AJR.12.8972
Clin Kidney J
September 2025
Department of Nephrology, CHU Lyon, Lyon, France.
Background: Patients receiving haemodialysis (HD) experience symptoms that impact quality of life. This study assessed the concordance of symptoms and symptom severity of HD patients and their perception by nurses and nephrologists.
Methods: A cross-sectional, observational study using the 30-item Dialysis Symptom Index (DSI) questionnaire was conducted in six dialysis centres in France from 1 March 2022 to 30 June 2023.
Glob Ment Health (Camb)
July 2025
Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK.
Problem-solving therapy (PST) is a brief psychological intervention often implemented for depression. Currently, there are no tools with well-evidenced reliability to measure PST fidelity. This pilot study aimed to measure the inter-rater reliability and agreement of the Problem-Solving Therapy Fidelity (PROOF) scale, which comprises a binary 14-item adherence subscale and an 8-item competence subscale.
Fluids Barriers CNS
September 2025
Department of Medical Sciences, Neurology, Uppsala University, Uppsala, Sweden.
Background: Idiopathic normal pressure hydrocephalus (iNPH) predominantly manifests with gait disturbances, yet clinical assessments are vulnerable to confirmation bias, particularly post-shunt surgery. Blinded video evaluations are a method to enhance objectivity in gait assessment, but their reliability has never been systematically investigated. The aim was to evaluate the inter-rater reliability of blinded gait assessments in iNPH patients and to investigate how these assessments correlate with the Hellström iNPH scale and patient-reported health status following shunt surgery.
Eur Radiol Exp
September 2025
Center for MR-Research, University Children's Hospital Zurich, University of Zurich, Zurich, Switzerland.
Background: Fetal MRI is increasingly used to investigate fetal lung pathologies, and super-resolution (SR) algorithms could be a powerful clinical tool for this assessment. Our goal was to investigate whether SR reconstructions result in an improved agreement in lung volume measurements determined by different raters, also known as inter-rater reliability.
Materials And Methods: In this single-center retrospective study, fetal lung volumes calculated from both SR reconstructions and the original images were analyzed.
Seizure
August 2025
Serviço de Neurologia, Departamento de Neurociências e Saúde Mental, Hospital Santa Maria, Unidade Local de Saúde Santa Maria, Lisboa, Portugal; Centro de Estudos Egas Moniz, Faculdade de Medicina, Universidade de Lisboa, 1649-028 Lisboa, Portugal; Laboratório de EEG/Sono, Serviço de Neurologi
Introduction: Subtle involuntary movements in patients with impaired awareness may suggest non-convulsive status epilepticus (NCSE), but their diagnostic accuracy is unclear. Since electroencephalography (EEG) is not always available, clinicians often rely on motor signs for early diagnosis. We aimed to characterize these movements and evaluate interrater agreement and diagnostic accuracy among specialists.