Achieving Inter-Rater Agreement and Inter-Rater Reliability to Assess Fidelity of an Occupation-Based Coaching (OBC) Clinical Trial Intervention.

Amy Ann Abbott, Julia Shin, Kathryn Carlson, Marion Russell, Yongyue Qi, Hannah Storm, Vanessa Dawn Jewell

Br J Occup Ther

University of North Carolina, Chapel Hill, NC, USA.

Published: March 2025

Introduction: Establishing inter-rater agreement and reliability confirms that multiple raters evaluate observed interventions consistently, which in turn helps ensure that clinical research interventions are delivered as intended by the trial protocol.

Purpose: Using the Guidelines for Reporting Reliability and Agreement Studies, we (a) exemplified the steps to establish inter-rater reliability and inter-rater agreement on the occupation-based coaching Video Evaluation Tool and (b) evaluated best practices that promoted high inter-rater reliability and inter-rater agreement between blinded raters prior to starting a pilot randomized controlled trial. The randomized controlled trial examined the preliminary effectiveness of occupation-based coaching via telehealth for rural families with children living with type 1 diabetes to improve family quality of life, participation, self-efficacy, and child health outcomes.

Method: We created a library of 13 occupation-based coaching videos portraying a range of evaluations, scores, and ratings. Inter-rater agreement and reliability on the occupation-based coaching Video Evaluation Tool were established through iterations of (a) blinded rater training, (b) data collection using the tool, and (c) statistical analysis using Cohen's kappa and Cronbach's alpha.
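
As an illustration of the agreement statistic named in the Method, the following is a minimal sketch of Cohen's kappa for two raters who code the same items. The function name `cohens_kappa` and the ten occurrence/non-occurrence codes are hypothetical and are not taken from the study; the sketch only shows how observed agreement is corrected for chance agreement.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of items on which both raters gave the same label.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the two raters' marginal label proportions.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical data (not from the study): occurrence (1) or
# non-occurrence (0) codes from two raters for ten checklist items.
rater_1 = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
rater_2 = [1, 1, 0, 1, 0, 1, 0, 0, 1, 1]
kappa = cohens_kappa(rater_1, rater_2)
print(f"kappa = {kappa:.3f}")  # prints "kappa = 0.783"
```

Here the raters agree on 9 of 10 items (0.90), chance agreement is 0.54, and kappa is therefore 0.36 / 0.46, or about 0.783.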

Findings: Occurrence and Non-Occurrence Checklist (κ = 0.881, p < 0.001); "Caregiver Talk" and "Interventionist Talk Analysis" (ICC = 0.991-0.999, p < 0.001); Evidence of Independent Capacity Rating (ICC = 0.867, p = 0.006).
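
The intraclass correlation coefficient (ICC) reported above can be sketched as follows. The abstract does not state which ICC form was used, so this example assumes ICC(2,1) in the Shrout and Fleiss notation (two-way random effects, absolute agreement, single rater). The function name `icc_2_1` and the sample scores are hypothetical, and the sketch does not compute the p-values.

```python
def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `scores` is a list of rows, one row per rated video and one column
    per rater. The choice of ICC form is an assumption of this sketch.
    """
    n = len(scores)
    k = len(scores[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need at least 2 subjects and 2 raters, no missing cells")
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    # Two-way ANOVA sums of squares: subjects (rows), raters (columns), residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    denominator = ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    if denominator == 0:
        raise ValueError("ICC is undefined when the scores have no variance")
    return (ms_rows - ms_error) / denominator


# Hypothetical data (not from the study): percentage of caregiver talk
# time scored by two raters on five coaching videos.
talk_scores = [[62, 60], [45, 47], [71, 70], [38, 40], [55, 54]]
print(f"ICC(2,1) = {icc_2_1(talk_scores):.3f}")
```

Absolute-agreement forms such as this one penalize a rater who scores systematically higher or lower than the other, whereas consistency forms do not, so the choice of form should match how the fidelity scores will be used.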

Conclusion: Strong inter-rater reliability and inter-rater agreement were established by engaging two blinded raters in multifaceted training, integrating real-life clients and contexts into the instrumentation and training, and precisely defining rubric criteria. With these practices, high inter-rater reliability and agreement can be achieved in clinical research involving interventions and instruments that are highly subjective and individualized. To support greater scientific confidence in the intervention effect, it is necessary to develop a multidomain fidelity framework and to establish high inter-rater agreement and reliability for the instruments before a clinical trial is implemented.

Full text (PMC): http://www.ncbi.nlm.nih.gov/pmc/articles/PMC12033844
DOI: http://dx.doi.org/10.1177/03080226241283292

Similar Publications

Implementing a Resource-Light and Low-Code Large Language Model System for Information Extraction from Mammography Reports: A Pilot Study.

J Imaging Inform Med

September 2025

Department of Diagnostic, Interventional and Pediatric Radiology (DIPR), Inselspital, Bern University Hospital and University of Bern, Bern, Switzerland.

Fabio Dennstädt, Simon Fauser, Nikola Cihoric, Max Schmerder, Paolo Lombardo

Large language models (LLMs) have been successfully used for data extraction from free-text radiology reports. Most current studies were conducted with LLMs accessed via an application programming interface (API). We evaluated the feasibility of using open-source LLMs, deployed on limited local hardware resources for data extraction from free-text mammography reports, using a common data element (CDE)-based structure.

Validation of a standardised approach to collect sociodemographic and social needs data in Canadian primary care: cross-sectional study of the SPARK tool.

BMJ Open

September 2025

Upstream Lab, MAP Centre for Urban Health Solutions, Li Ka Shing Knowledge Institute, Unity Health Toronto, Toronto, Ontario, Canada

Leanne Kosowan, Alan Katz, Dana Howse, Itunuoluwa Adekoya, Alannah Delahunty-Pike

Objective: This study validates the previously tested Screening for Poverty And Related social determinants to improve Knowledge of and access to resources ('SPARK Tool') against comparison questions from well-established national surveys (Post Survey Questionnaire (PSQ)) to inform the development of a standardised tool to collect patients' demographic and social needs data in healthcare.

Design: Cross-sectional study.

Setting: Pan-Canadian study of participants from four Canadian provinces (SK, MB, ON and NL).

Validation of the Applicability and Standard Revision of the Canadian Agility and Movement Skill Assessment in Chinese Children Aged 8-12.

Percept Mot Skills

September 2025

College of Physical Education, Shandong Normal University, Jinan, China.

Xiaojin Mao, Yunjiao Yang, Han Xie, Botian Wang, Wenhao Li

This study aims to assess the applicability of the Canadian Agility and Movement Skill Assessment (CAMSA) in Chinese children aged 8-12 and to undertake preliminary revisions for areas found to be unsuitable. A randomized sample of 911 children aged 8-12 underwent testing. The results showed that difficulty coefficients for time scores among 8-9-year-olds were relatively low (.

Evaluating the completeness of postoperative endoscopic recurrence assessment in Crohn's disease patients with Kono-S anastomoses.

Eur J Gastroenterol Hepatol

August 2025

Department of Gastroenterology and Hepatology, Monash Health.

Nikita Parkash, Charlotte Keung, Sally J Bell, Gregory T Moore

Background and Aims: Despite therapeutic advances, resection rates in Crohn's disease remain high. Kono-S is a novel anastomosis for ileocolonic resections; however, its altered configuration may challenge standard endoscopic assessment, particularly in the absence of validated scoring tools. This study evaluated the endoscopic assessment of Kono-S anastomosis anatomy and recurrence stratification using the Rutgeerts score.

Development and preliminary inter-rater reliability of the new PROOF tool to measure fidelity of problem-solving therapy for depression delivered by non-specialists in a low-resource African setting.

Glob Ment Health (Camb)

July 2025

Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK.

Lily Cooke, Tarisai Bere, Amelia Stanton, Walter Mangezi, Steven A Safren

Problem-solving therapy (PST) is a brief psychological intervention often implemented for depression. Currently, there are no tools with well-evidenced reliability to measure PST fidelity. This pilot study aimed to measure the inter-rater reliability and agreement of the Problem-Solving Therapy Fidelity (PROOF) scale, comprising a binary 14-item adherence subscale and an 8-item competence subscale.
