U.S. GRADE Network blog: Reliability of Risk of Bias Assessments of Non-randomized Studies Improves After Customized Training

We previously reported on a paper published in 2020 assessing the inter-rater reliability (IRR) and inter-consensus reliability (ICR) of the Risk of Bias in Non-Randomized Studies of Interventions (ROBINS-I) tool, developed in 2016, and the Risk of Bias instrument for NRS of Exposures (ROB-NRSE) tool, developed in 2018. This paper found that reliability generally tended to be poor for these tools, while risk of bias assessments took evaluators, on average, 48 minutes for the ROBINS-I tool and almost 37 minutes for the ROB-NRSE.

Now, a new publication from the same group has examined the effect of training on the reliability of these tools. An international team of reviewers with a median of 5 years of experience with risk of bias assessment first applied the ROBINS-I and ROB-NRSE tools to a list of 44 non-randomized studies of interventions and exposures, respectively, using only the 53 pages of publicly available guidance. Then, the reviewers received an abridged and customized training document which was tailored specifically to the topic area of the reviews, included simplified guidance for assessing risk of bias, and also provided additional guidance related to more advanced concepts. The reviewers then re-assessed the studies' risk of bias after a several-weeks-long wash-out period.

Changes in the inter-rater reliability (IRR) for the ROBINS-I (top) and ROB-NRSE tools (bottom) from before and after a customized training intervention.

The training intervention improved the IRR of the ROBINS-I tool, generally improving the range of within-domain reliability while the reliability of the overall bias rating improved from "poor" to "fair." Meanwhile, the ICR improved substantially, with the overall rating's reliability improving from "poor" to "near perfect." Improvements were also observed after training in the application of the ROB-NRSE tool, with IRR of the overall bias improving significantly from "slight" to "near perfect" while its ICR improved from "poor" to "near perfect." For both tools, the pre-to-post-intervention correlations between reviewers' scores were poor, suggesting that the training did have an impact on these measures independent of a simple learning effect. While customized training was associated with a decrease in evaluator burden for the ROBINS-I tool, this did not hold true for the ROB-NRSE.

The findings of this analysis suggest that the use of a customized, shortened guidance tool specifically tailored to the topical content of a review, including simplified guidance for decision-making within each domain, can improve the reliability of resulting risk of bias assessments. The authors suggest that future reviewers create such guidance based on the specific needs and considerations of their topic area, and publish these tools along with the review.

Jeyaraman MM, Robson RC, Copstein L et al. (2021). Customized guidance/training improved the psychometric properties of methodologically rigorous risk of bias instruments for non-randomized studies. J Clin Epidemiol, in-press.

Manuscript available here.

Friday, May 14, 2021

Reliability of Risk of Bias Assessments of Non-randomized Studies Improves After Customized Training

Blog Archive