U.S. GRADE Network blog: May 2021

Friday, May 14, 2021

Reliability of Risk of Bias Assessments of Non-randomized Studies Improves After Customized Training

We previously reported on a paper published in 2020 assessing the inter-rater reliability (IRR) and inter-consensus reliability (ICR) of the Risk of Bias in Non-Randomized Studies of Interventions (ROBINS-I) tool, developed in 2016, and the Risk of Bias instrument for Non-Randomized Studies of Exposures (ROB-NRSE) tool, developed in 2018. That paper found that reliability tended to be poor for both tools, while risk of bias assessments took evaluators, on average, 48 minutes for the ROBINS-I tool and almost 37 minutes for the ROB-NRSE.

Now, a new publication from the same group has examined the effect of training on the reliability of these tools. An international team of reviewers with a median of 5 years of experience with risk of bias assessment first applied the ROBINS-I and ROB-NRSE tools to a list of 44 non-randomized studies of interventions and exposures, respectively, using only the 53 pages of publicly available guidance. Then, the reviewers received an abridged and customized training document that was tailored specifically to the topic area of the reviews, included simplified guidance for assessing risk of bias, and provided additional guidance on more advanced concepts. After a wash-out period of several weeks, the reviewers re-assessed the studies' risk of bias.

Changes in the inter-rater reliability (IRR) for the ROBINS-I (top) and ROB-NRSE (bottom) tools before and after a customized training intervention.

The training intervention improved the IRR of the ROBINS-I tool: the range of within-domain reliability generally improved, and the reliability of the overall bias rating rose from "poor" to "fair." Meanwhile, the ICR improved substantially, with the overall rating's reliability improving from "poor" to "near perfect." Improvements were also observed after training in the application of the ROB-NRSE tool, with the IRR of the overall bias rating improving significantly from "slight" to "near perfect" while its ICR improved from "poor" to "near perfect." For both tools, the pre-to-post-intervention correlations between reviewers' scores were poor, suggesting that the training did have an impact on these measures independent of a simple learning effect. While customized training was associated with a decrease in evaluator burden for the ROBINS-I tool, this did not hold true for the ROB-NRSE.
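For readers less familiar with how labels such as "slight" or "fair" arise, the sketch below computes a chance-corrected agreement statistic for two reviewers and maps it onto a set of verbal bands. It is only an illustration: this post does not state which statistic or cut-offs the authors used, so Cohen's kappa and the Landis and Koch benchmarks are assumptions here, the top band is named "near perfect" to match the wording above, and the ratings are invented.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who judged the same studies."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must judge the same non-empty set of studies")
    n = len(rater_a)
    # Proportion of studies on which the two raters gave the same judgment.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's category frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one and the same category throughout
    return (observed - expected) / (1 - expected)


def agreement_label(kappa):
    """Verbal band for a kappa value (assumed Landis and Koch style cut-offs)."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
                         (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "near perfect"


# Hypothetical overall risk of bias judgments for six studies.
reviewer_1 = ["low", "moderate", "serious", "serious", "critical", "moderate"]
reviewer_2 = ["low", "serious", "serious", "serious", "critical", "low"]

kappa = cohen_kappa(reviewer_1, reviewer_2)
print(f"kappa = {kappa:.2f} ({agreement_label(kappa)})")
```

In this invented example the reviewers agree on four of six studies, but because some of that agreement would be expected by chance, the chance-corrected statistic is lower than the raw 67 percent agreement.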

The findings of this analysis suggest that the use of a customized, shortened guidance tool specifically tailored to the topical content of a review, including simplified guidance for decision-making within each domain, can improve the reliability of resulting risk of bias assessments. The authors suggest that future reviewers create such guidance based on the specific needs and considerations of their topic area, and publish these tools along with the review.

Jeyaraman MM, Robson RC, Copstein L et al. (2021). Customized guidance/training improved the psychometric properties of methodologically rigorous risk of bias instruments for non-randomized studies. J Clin Epidemiol, in-press.

Manuscript available here.

Tuesday, May 4, 2021

Restricting Systematic Searches to English-only is a Viable Shortcut in Most, but Perhaps Not All, Topics in Medicine

In the limitations sections of systematic reviews on any topic, it is not uncommon for the authors to discuss how language limitations within their search may have restricted the breadth of evidence presented. For instance, if the reviewers speak only English, the review is likely limited to publications and journals in that language. But how much of a difference does such a limitation make in terms of the overall conclusions of a systematic review? According to a new paper in the Journal of Clinical Epidemiology, probably not much, though it may depend on the specific area of medicine under investigation.

While other methods reviews have previously examined this question, Dobrescu and colleagues extended the range of topics to methods reviews that included systematic reviews within the realm of complementary and alternative medicine, yielding four reviews not examined by prior studies. Specifically, the authors looked for methods reviews that compared English-only literature searches with unrestricted searches and whose primary outcomes were differences in treatment effect estimates, certainty of evidence ratings, or conclusions resulting from the language restrictions.

The search yielded eight studies investigating the impact of language restrictions in anywhere from 9 to 147 systematic reviews in medicine. Overall, the exclusion of non-English articles had a greater impact on estimates of treatment effects and the statistical significance of findings in reviews of complementary and alternative medicine versus conventional medicine topics. Most commonly, the exclusion of non-English studies led to a loss of statistical significance in these topic areas.

The methods studies examined found that the exclusion of non-English studies of conventional medicine topics led to small to moderate changes in the estimate of effect; however, exclusion of non-English studies shrank the observed effect size in complementary and alternative medicine topics by 63 percent. Two studies examined whether language restrictions influenced authors' overall conclusions, generally finding no effect.

The figure above shows the frequency of languages of the excluded reviews examined.
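As a rough illustration of how a relative change in an effect estimate can be quantified, the sketch below pools a handful of studies twice, once with all languages and once restricted to English, and reports the percentage change. The study values are invented for illustration and are not taken from the paper, the fixed-effect inverse-variance pooling is an assumption about method rather than a description of what the included methods studies did, and the result is not meant to reproduce the 63 percent figure.

```python
def pooled_effect(studies):
    """Fixed-effect inverse-variance pooled estimate.

    Each study is a tuple of (effect, variance, language).
    """
    weights = [1.0 / variance for _, variance, _ in studies]
    weighted_sum = sum(w * effect for w, (effect, _, _) in zip(weights, studies))
    return weighted_sum / sum(weights)


def percent_change(full, restricted):
    """Relative change of the restricted estimate versus the full one."""
    return (restricted - full) / full * 100.0


# Hypothetical studies: (effect estimate, variance, publication language).
studies = [
    (0.50, 0.04, "en"),
    (0.30, 0.05, "en"),
    (0.90, 0.10, "zh"),
    (0.70, 0.20, "de"),
]

all_languages = pooled_effect(studies)
english_only = pooled_effect([s for s in studies if s[2] == "en"])

print(f"all languages: {all_languages:.3f}")
print(f"English only:  {english_only:.3f}")
print(f"change:        {percent_change(all_languages, english_only):.1f}%")
```

In this made-up data set the non-English studies report larger effects, so dropping them lowers the pooled estimate; with real data the change could go in either direction.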

The authors conclude that when it comes to systematic reviews of conventional medicine topics, their findings are in line with those of previous methods studies which demonstrate little to no effect of language restrictions and suggest that restricting a search to English-only should not greatly impact the findings or conclusions of a review. However, the effect appears greater in the realm of complementary and alternative medicine, perhaps due to the greater proportion of non-English studies published in this field. Thus, systematic reviewers attempting to synthesize the evidence on an alternative medicine topic should be cognizant of their choices regarding language restriction and the potential implications they may have on their ultimate findings.

Dobrescu A, Nussbaumer SB, Klerings I et al. (2021). Restricting evidence syntheses of interventions to English-language publications is a viable methodological shortcut for most medical topics: A systematic review. J Clin Epidemiol, epub ahead of print.

Manuscript available from publisher's website here.