Tuesday, March 10, 2020

Research Shorts: U.S. Guideline Developers Inconsistently Applying Criteria for Appropriate Evidence Grading

Contributed by Philipp Dahm, MD, MHSc, FACS

Guideline Developers in the United States were Inconsistent in Applying Criteria for Appropriate GRADE Use


Our study was motivated by the anecdotal observation that many US-based organizations appeared to endorse the GRADE approach but did not necessarily apply it to the fullest extent. We therefore sought to study this issue formally by applying six published criteria for appropriate GRADE use. We limited the search to guidelines from US-based organizations that were included in the National Guideline Clearinghouse (NGC), which implied that they met certain minimal criteria for evidence-based guidelines. Our search covered January 2011 through June 2018, after which the NGC lost its funding and ceased to exist in that form.

Among guideline documents from 315 organizations included in the database, 135 organizations were from the US and were represented by at least one guideline. Our analysis ultimately included 67 guideline documents from 44 organizations. The vast majority of these guidelines were from professional organizations, mostly related to the field of internal medicine and its subspecialties. With regard to domains for rating the certainty of evidence, only one in 10 guidelines was explicit about including all five criteria for downgrading (study limitations, indirectness, inconsistency, imprecision, and publication bias) for a body of evidence from randomized trials and all three domains (large magnitude of effect, dose-response gradient, and direction of residual bias) for rating up a body of evidence from non-randomized trials. Over half of the guidelines described explicit consideration of all four central domains (certainty of evidence, balance of benefits and harms, patients’ values and preferences, and resource utilization) for moving from evidence to recommendations. All guidelines included the certainty of evidence, and the vast majority also addressed the balance of desirable and undesirable consequences. When comparing guidelines published in 2011-2014 versus 2015-2018, rates of appropriate use were higher for nearly all criteria, but only one main criterion reached statistical significance, namely the reporting of evidence summaries supporting recommendations.

The take-home messages from this study are that one in three US-based organizations developing evidence-based guidelines report the use of GRADE, but that adherence to published criteria is quite inconsistent. As GRADE finds increasing uptake worldwide, continued efforts in training guideline methodologists and panel members will be important to ensure appropriate application of GRADE methodology.


Dixon C, Dixon PE, Sultan S, Mustafa R, Morgan RL, Murad MH, Falck-Ytter Y, Dahm P. Guideline Developers in the United States were Inconsistent in Applying Criteria for Appropriate GRADE Use. Journal of Clinical Epidemiology. 2020 Mar 4.