Thursday, July 29, 2021

New GRADE guidance on assessing imprecision in a network meta-analysis

Imprecision is one of the major domains of the GRADE framework and is used to assess whether to rate down the certainty of evidence for an outcome of interest. In a traditional ("pairwise") meta-analysis, which compares two interventions, exposures, or tests against one another, two considerations are made: the confidence interval around the absolute estimate of effect, and the optimal information size (OIS). If the bounds of the confidence interval cross a threshold for a meaningful effect, and/or if the sample size in the meta-analysis does not meet the OIS, then one should consider rating down for imprecision.

In the context of small sample sizes, confidence intervals around an effect may be fragile, meaning they could change substantially with additional information. Considering the OIS along with the bounds of the confidence interval helps address this concern when rating the certainty of evidence to develop a clinical recommendation. This is typically done by assessing whether the sample size of the meta-analysis reaches the size that a traditional power analysis would require for a given effect size.
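As a rough illustration of the power-analysis idea, the sketch below estimates an OIS for a binary outcome using the standard two-proportion sample size formula. The function names, the 5% two-sided alpha, the 80% power, and the example control risk and relative risk reduction are all assumptions chosen for illustration; they are not values taken from the GRADE guidance.

```python
import math
from statistics import NormalDist


def ois_total(control_risk, relative_risk_reduction, alpha=0.05, power=0.80):
    """Approximate total sample size (both arms) for a two-arm comparison
    of proportions. Hypothetical helper; defaults are illustrative only."""
    p1 = control_risk
    p2 = control_risk * (1 - relative_risk_reduction)
    if not (0 < p1 < 1 and 0 < p2 < 1) or p1 == p2:
        raise ValueError("risks must lie in (0, 1) and must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # quantile for desired power
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    n_per_group = (z_alpha + z_beta) ** 2 * variance_sum / (p1 - p2) ** 2
    return 2 * math.ceil(n_per_group)


def meets_ois(total_participants, control_risk, relative_risk_reduction):
    """True if the pooled sample size reaches the estimated OIS."""
    return total_participants >= ois_total(control_risk, relative_risk_reduction)


if __name__ == "__main__":
    # Assumed example: 20% control-group risk, 25% relative risk reduction.
    print(ois_total(0.20, 0.25))
    print(meets_ois(900, 0.20, 0.25))
```

A meta-analysis whose pooled sample falls short of this number would be a candidate for rating down, even if its confidence interval looks reassuringly narrow.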

However, in a network meta-analysis, both direct and indirect comparisons are made across various interventions or tests. Thus, especially if the inclusion of indirect comparisons changes the overall estimate of effect, considering only the sample size involved in the direct comparisons would be misleading. 


A new GRADE guidance paper lays out how to assess imprecision in the context of a network meta-analysis:

  • If the 95% confidence interval crosses a decision-making threshold, rate down for imprecision. Thresholds should ideally be set a priori. Consider rating down by two or even three levels depending on the degree of imprecision and how the resulting certainty of evidence will be communicated. For example, if imprecision is the only concern for an outcome, rating down by one level means saying that an intervention or test "likely" or "probably" increases or decreases a given outcome, whereas rating down by two levels means saying only that it "may" have this effect.
  • If the 95% confidence interval does not cross a decision-making threshold, consider whether the effect size may be inflated. If a point estimate is far enough away from a threshold, even a relatively wide CI may not cross it. Further, relatively large effect sizes from smaller pools of evidence may shrink with future research.
    • In the case of a large effect size, consider whether the OIS is met. If the number of patients contributing to an NMA does not meet this number, consider rating down by one, two, or three levels depending on how wide the CI is.
    • If the upper limit of a confidence interval around a relative risk is 3 or more times the lower limit, the OIS has likely not been met. Similarly, for odds ratios, an upper-to-lower-limit ratio exceeding 2.5 suggests the OIS has likely not been met.
  • Alternatively, when the effect size is modest and plausible and the confidence interval does not cross a threshold, one likely does not need to rate down for imprecision.
  • Avoid "double dinging" for imprecision if this limitation has already been addressed by rating down elsewhere.
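
The checks in the list above can be sketched in code. This is a minimal illustration only, not an implementation from the guidance paper: the function names and return strings are hypothetical, the cut-offs (3 for relative risks, 2.5 for odds ratios) are the ones quoted above, and the judgment of whether an effect is "large" is left to the user as an input.

```python
def crosses_threshold(ci_lower, ci_upper, threshold):
    """True if the decision threshold lies strictly inside the CI."""
    return ci_lower < threshold < ci_upper


def ois_likely_not_met(ci_lower, ci_upper, measure):
    """Rule of thumb on the ratio of upper to lower CI limits.
    measure is "RR" (relative risk) or "OR" (odds ratio)."""
    if ci_lower <= 0 or ci_upper < ci_lower:
        raise ValueError("expected 0 < ci_lower <= ci_upper")
    ratio = ci_upper / ci_lower
    if measure == "RR":
        return ratio >= 3.0   # "3 or more times" the lower limit
    if measure == "OR":
        return ratio > 2.5    # ratio "exceeding 2.5"
    raise ValueError("measure must be 'RR' or 'OR'")


def imprecision_flag(abs_ci, threshold, rel_ci, measure, large_effect):
    """Hypothetical helper combining both checks.
    abs_ci: (lower, upper) of the absolute effect, compared to threshold.
    rel_ci: (lower, upper) of the relative effect, used for the ratio rule.
    large_effect: reviewer's judgment that the effect may be inflated."""
    if crosses_threshold(abs_ci[0], abs_ci[1], threshold):
        return "rate down: CI crosses threshold"
    if large_effect and ois_likely_not_met(rel_ci[0], rel_ci[1], measure):
        return "consider rating down: OIS likely not met"
    return "no rating down suggested"


if __name__ == "__main__":
    # Assumed numbers, for illustration only.
    print(imprecision_flag((-30, 5), 0, (0.40, 1.10), "RR", False))
    print(imprecision_flag((-60, -10), 0, (0.20, 0.80), "RR", True))
    print(imprecision_flag((-25, -15), 0, (0.70, 0.85), "RR", False))
```

How many levels to rate down, and whether imprecision has already been accounted for elsewhere, remain judgment calls that a sketch like this does not capture.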

Brignardello-Petersen R, Guyatt GH, Mustafa RA, et al. (2021). GRADE guidelines 33. Addressing imprecision in a network meta-analysis. J Clin Epidemiol (in press).

Manuscript available at the publisher's website here.