Introduction: Establishing inter-rater agreement and reliability confirms that multiple raters evaluate observed interventions consistently, which helps ensure that clinical research interventions are delivered as intended by the trial protocol.
Purpose: Using the Guidelines for Reporting Reliability and Agreement Studies, we (a) exemplified the steps to establish inter-rater reliability and inter-rater agreement on the occupation-based coaching Video Evaluation Tool and (b) evaluated best practices that promoted high inter-rater reliability and inter-rater agreement between blinded raters prior to starting a pilot randomized controlled trial. The randomized controlled trial examined the preliminary effectiveness of occupation-based coaching via telehealth for rural families with children living with type 1 diabetes to improve family quality of life, participation, self-efficacy, and child health outcomes.
Method: We created a library of 13 occupation-based coaching videos portraying a range of evaluations, scores, and ratings. The inter-rater agreement and reliability on the occupation-based coaching Video Evaluation Tool were established through the iterations of (a) blinded rater training, (b) data collection using the tool, and (c) statistical analysis using Cohen's kappa and Cronbach's alpha.
Findings: Occurrence and Non-Occurrence Checklist (κ = 0.881, p < 0.001); "Caregiver Talk" and "Interventionist Talk Analysis" (ICC = 0.991-0.999, p < 0.001); Evidence of Independent Capacity Rating (ICC = 0.867, p = 0.006).
Conclusion: Strong inter-rater reliability and inter-rater agreement were established by engaging two blinded raters through multifaceted training, integrating real-life clients and contexts into the instrumentation and training, and precisely defining rubric criteria. By employing such practices, high inter-rater reliability and agreement can be achieved in clinical research involving interventions and instruments that are highly subjective and individualized. To support greater scientific confidence in the intervention effect, it is necessary to develop a multidomain fidelity framework and to establish high inter-rater agreement and reliability in the instruments before clinical trials are implemented.
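The Method names Cohen's kappa as the agreement statistic, and the Findings report it for the Occurrence and Non-Occurrence Checklist. As a minimal sketch of how that statistic is computed for two raters, the Python below derives kappa from observed and chance agreement. The function name `cohens_kappa` and the ratings are hypothetical illustrations, not the study's data or the authors' analysis code.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)

    # Observed agreement: share of items given the same label by both raters.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: sum over labels of the product of each rater's
    # marginal proportion for that label.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if p_expected == 1.0:
        # Both raters used one identical label throughout; kappa is formally
        # undefined here, and this sketch treats it as perfect agreement.
        return 1.0
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical checklist ratings (1 = occurrence, 0 = non-occurrence).
rater_1 = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
rater_2 = [1, 1, 0, 1, 0, 1, 0, 0, 1, 1]

kappa = cohens_kappa(rater_1, rater_2)
print(round(kappa, 3))  # 0.783
```

In this made-up example the raters agree on 9 of 10 items (0.90), chance agreement is 0.54, and kappa is therefore (0.90 - 0.54) / (1 - 0.54), about 0.78. Kappa is lower than raw percent agreement because it discounts the agreement expected by chance alone.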
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC12033844 | PMC |
http://dx.doi.org/10.1177/03080226241283292 | DOI Listing |
J Imaging Inform Med
September 2025
Department of Diagnostic, Interventional and Pediatric Radiology (DIPR), Inselspital, Bern University Hospital and University of Bern, Bern, Switzerland.
Large language models (LLMs) have been successfully used for data extraction from free-text radiology reports. Most current studies were conducted with LLMs accessed via an application programming interface (API). We evaluated the feasibility of using open-source LLMs, deployed on limited local hardware resources for data extraction from free-text mammography reports, using a common data element (CDE)-based structure.
BMJ Open
September 2025
Upstream Lab, MAP Centre for Urban Health Solutions, Li Ka Shing Knowledge Institute, Unity Health Toronto, Toronto, Ontario, Canada
Objective: This study validates the previously tested Screening for Poverty And Related social determinants to improve Knowledge of and access to resources ('SPARK Tool') against comparison questions from well-established national surveys (Post Survey Questionnaire (PSQ)) to inform the development of a standardised tool to collect patients' demographic and social needs data in healthcare.
Design: Cross-sectional study.
Setting: Pan-Canadian study of participants from four Canadian provinces (SK, MB, ON and NL).
Percept Mot Skills
September 2025
College of Physical Education, Shandong Normal University, Jinan, China.
This study aims to assess the applicability of the Canadian Agility and Movement Skill Assessment (CAMSA) in Chinese children aged 8-12 and to undertake preliminary revisions for areas found to be unsuitable. A randomized sample of 911 children aged 8-12 underwent testing. The results showed that difficulty coefficients for time scores among 8-9-year-olds were relatively low (.
Eur J Gastroenterol Hepatol
August 2025
Department of Gastroenterology and Hepatology, Monash Health.
Background And Aims: Despite therapeutic advances, resection rates in Crohn's disease remain high. Kono-S is a novel anastomosis for ileocolonic resections; however, its altered configuration may challenge standard endoscopic assessment, particularly in the absence of validated scoring tools. This study evaluated the endoscopic assessment of Kono-S anastomosis anatomy and recurrence stratification using the Rutgeerts score.
Glob Ment Health (Camb)
July 2025
Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK.
Problem-solving therapy (PST) is a brief psychological intervention often implemented for depression. Currently, there are no tools with well-evidenced reliability to measure PST fidelity. This pilot study aimed to measure the inter-rater reliability and agreement of the Problem-Solving Therapy Fidelity (PROOF) scale, comprising a binary 14-item adherence subscale and an 8-item competence subscale.