Interrater agreement in the evaluation of discrepant imaging findings with the Radpeer system.

Leila C Bender, Ken F Linnau, Eric N Meier, Yoshimi Anzai, Martin L Gunn

AJR Am J Roentgenol

Department of Radiology, University of Washington, Box 359728, 325 9th Ave, Seattle, WA 98104, USA.

Published: December 2012

Objective: The Radpeer system is central to the quality assurance process in many radiology practices. Previous studies have shown poor agreement between physicians in the evaluation of their peers. The purpose of this study was to assess the reliability of the Radpeer scoring system.

Materials And Methods: A sample of 25 discrepant cases was extracted from our quality assurance database. Images were made anonymous; associated reports and identities of interpreting radiologists were removed. Indications for the studies and descriptions of the discrepancies were provided. Twenty-one subspecialist attending radiologists rated the cases using the Radpeer scoring system. Multirater kappa statistics were used to assess interrater agreement, both with the standard scoring system and with dichotomized scores to reflect the practice of further review for cases rated 3 and 4. Subgroup analyses were conducted to assess subspecialist evaluation of cases.
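To illustrate the kind of multirater kappa described above, here is a minimal Python sketch. It assumes Fleiss' kappa as the multirater statistic (the abstract does not name the variant used) and assumes that dichotomizing means grouping scores 1 and 2 against scores 3 and 4. The function names `fleiss_kappa` and `dichotomize` and the toy ratings are hypothetical and are not study data.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of cases, each a list of labels (one per rater).

    Assumption: every case is rated by the same number of raters.
    """
    n_cases = len(ratings)
    n_raters = len(ratings[0])
    if any(len(case) != n_raters for case in ratings):
        raise ValueError("every case needs the same number of ratings")

    categories = sorted({label for case in ratings for label in case})
    counts = [Counter(case) for case in ratings]

    # Observed agreement: share of agreeing rater pairs, averaged over cases.
    per_case = [
        (sum(c[k] ** 2 for k in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_observed = sum(per_case) / n_cases

    # Chance agreement from the overall share of ratings in each category.
    shares = [sum(c[k] for c in counts) / (n_cases * n_raters) for k in categories]
    p_chance = sum(s * s for s in shares)

    if p_chance == 1.0:
        return float("nan")  # undefined when only one category is ever used
    return (p_observed - p_chance) / (1.0 - p_chance)


def dichotomize(ratings, threshold=3):
    """Map scores to 1 (score >= threshold, i.e. 3 or 4) or 0 (score 1 or 2)."""
    return [[1 if score >= threshold else 0 for score in case] for case in ratings]


# Hypothetical toy data: 3 cases, 4 raters each, scores on a 1-4 scale.
toy = [[1, 1, 2, 3], [2, 2, 2, 2], [1, 3, 4, 4]]
print(fleiss_kappa(toy))
print(fleiss_kappa(dichotomize(toy)))
```

Collapsing the four-point scale to two categories changes both the observed and the chance agreement, so the dichotomized kappa can come out higher or lower than the four-category value.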

Results: Interrater agreement was slight to fair compared with that expected by chance. For the group of 21 raters, the kappa values were 0.11 (95% CI, 0.06-0.16) with the standard scoring system and 0.20 (95% CI, 0.13-0.27) with dichotomized scores. There was disagreement about whether a discrepancy had occurred in 20 cases. Subgroup analyses did not reveal significant differences in the degree of interrater agreement.
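The descriptors "slight" and "fair" match the widely used Landis and Koch bands for kappa. The sketch below maps a kappa value to those bands. Whether the authors applied exactly these cut-offs is an assumption, and the function name `landis_koch_label` is hypothetical.

```python
def landis_koch_label(kappa):
    """Return the Landis and Koch descriptor for a kappa value."""
    if kappa < 0:
        return "poor"
    # Upper bound of each band, in ascending order.
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label
    raise ValueError("kappa cannot exceed 1")


# Point estimates and the upper confidence limit reported in the Results.
for value in (0.11, 0.20, 0.27):
    print(value, landis_koch_label(value))
```

Under these bands both point estimates fall in the lowest positive band, and only the upper confidence limit of the dichotomized kappa reaches the next one.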

Conclusion: The identification of discrepant interpretations is valuable for the education of individual radiologists and for larger-scale quality assurance and quality improvement efforts. Our results show that a ratings-based peer review system is unreliable and subjective for the evaluation of discrepant interpretations. Resources should be devoted to developing more robust and objective assessment procedures, particularly those with clear quality improvement goals.

DOI: http://dx.doi.org/10.2214/AJR.12.8972

Top Keywords: interrater agreement; quality assurance; scoring system; evaluation discrepant; radpeer system; radpeer scoring; standard scoring; dichotomized scores; subgroup analyses; discrepant interpretations

Similar Publications

Concordance of symptoms perceived by patients receiving haemodialysis and those reported by nurses and nephrologists: a cross-sectional, multicentre, observational study using the REIN registry.

Clin Kidney J

September 2025

Department of Nephrology, CHU Lyon, Lyon, France.

Abdallah Guerraoui, Julie Haesebaert, Fabien Subtil, William Hanf, Caroline Pelletier

Background: Patients receiving haemodialysis (HD) experience symptoms that impact quality of life. This study assessed the concordance of symptoms and symptom severity of HD patients and their perception by nurses and nephrologists.

Methods: A cross-sectional, observational study using the 30-item Dialysis Symptom Index (DSI) questionnaire was conducted in six dialysis centres in France from 1 March 2022 to 30 June 2023.

Development and preliminary inter-rater reliability of the new PROOF tool to measure fidelity of problem-solving therapy for depression delivered by non-specialists in a low-resource African setting.

Glob Ment Health (Camb)

July 2025

Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK.

Lily Cooke, Tarisai Bere, Amelia Stanton, Walter Mangezi, Steven A Safren

Problem-solving therapy (PST) is a brief psychological intervention often implemented for depression. Currently, there are no tools with well-evidenced reliability to measure PST fidelity. This pilot study aimed to measure the inter-rater reliability and agreement of the Problem-Solving Therapy Fidelity (PROOF) scale, comprising a binary 14-item adherence subscale and an 8-item competence subscale.

Blinded gait assessment in idiopathic normal pressure hydrocephalus: reliability and correlation with clinical and patient-reported outcomes.

Fluids Barriers CNS

September 2025

Department of Medical Sciences, Neurology, Uppsala University, Uppsala, Sweden.

Maria Ekblom, Dag Nyholm, Lena Zetterberg, Katarina Laurell, Johan Virhammar

Background: Idiopathic normal pressure hydrocephalus (iNPH) predominantly manifests with gait disturbances, yet clinical assessments are vulnerable to confirmation bias, particularly post-shunt surgery. Blinded video evaluations are a method to enhance objectivity in gait assessment, but their reliability has never been systematically investigated. The aim was to evaluate the inter-rater reliability of blinded gait assessments in iNPH patients and to investigate how these assessments correlate with the Hellström iNPH scale and patient-reported health status following shunt surgery.

Lung volume segmentation in fetal MRI: super-resolution reconstructions improve inter-rater reliability.

Eur Radiol Exp

September 2025

Center for MR-Research, University Children's Hospital Zurich, University of Zurich, Zurich, Switzerland.

Kelly Payette, Julia Geiger, Michael Zellner, Céline Steger, Christian J Kellenberger

Background: Fetal MRI is increasingly used to investigate fetal lung pathologies, and super-resolution (SR) algorithms could be a powerful clinical tool for this assessment. Our goal was to investigate whether SR reconstructions result in an improved agreement in lung volume measurements determined by different raters, also known as inter-rater reliability.

Materials And Methods: In this single-center retrospective study, fetal lung volumes calculated from both SR reconstructions and the original images were analyzed.

Involuntary movements in patients with impaired awareness: A comparative study of phenomenology and neurophysiological evaluation.

Seizure

August 2025

Serviço de Neurologia, Departamento de Neurociências e Saúde Mental, Hospital Santa Maria, Unidade Local de Saúde Santa Maria, Lisboa, Portugal; Centro de Estudos Egas Moniz, Faculdade de Medicina, Universidade de Lisboa, 1649-028 Lisboa, Portugal; Laboratório de EEG/Sono, Serviço de Neurologi

Pedro Coelho, Linda Azevedo Kauppila, Ana Catarina Franco, Carla Bentes, Anabela Valadas

Introduction: Subtle involuntary movements in patients with impaired awareness may suggest non-convulsive status epilepticus (NCSE), but their diagnostic accuracy is unclear. Since electroencephalography (EEG) is not always available, clinicians often rely on motor signs for early diagnosis. We aimed to characterize these movements and evaluate interrater agreement and diagnostic accuracy among specialists.
