Wednesday, February 19, 2020

Research Shorts: Informative statements to communicate the findings of reviews

Contributed by Madelin Siedler, 2019/2020 U.S. GRADE Network Research Fellow

When authors of systematic reviews utilize the GRADE approach to evaluate the certainty of evidence in their findings, they should present this information in a way that is clear, consistent, and useful to the reader. In a recent article from the GRADE series (GRADE guidelines 26) in the Journal of Clinical Epidemiology, Santesso and colleagues present recommendations for communicating the effect size and certainty of evidence within a systematic review. These statements were informed by years of research, feedback, and discussion, including the qualitative input of around 100 methodology experts and a survey of 110 respondents of diverse backgrounds and levels of GRADE expertise.

The final result was a table of suggested statements organized by the certainty of the evidence and the size of the effect based on the point estimate. To use this tool, systematic review authors first need to determine thresholds for the size of the effect (i.e., whether the effect on an outcome is trivial, small, moderate, or large, or whether there is no effect). This can be done with “full contextualization,” in which the outcome is considered in relation to all other critical outcomes, or with “partial contextualization,” in which the outcome is considered on its own.
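As a rough illustration of the thresholding step, mapping a point estimate to one of these effect-size categories might look like the sketch below. The function name and the threshold values are invented for illustration; real thresholds come from the review team's contextualization exercise.

```python
# Hypothetical sketch: once a review team has set contextualized thresholds
# for an outcome, map the point estimate to an effect-size category.
# Threshold values here are invented, not taken from the paper.

def effect_category(point_estimate, thresholds):
    """thresholds: ascending cut points (trivial, small, moderate) for the
    absolute effect; anything beyond the last cut point is 'large'."""
    magnitude = abs(point_estimate)
    labels = ["trivial", "small", "moderate"]
    for cutoff, label in zip(thresholds, labels):
        if magnitude < cutoff:
            return label
    return "large"

# Example: risk difference per 1,000 patients, with invented cut points
# at 5 (trivial), 20 (small), and 50 (moderate).
print(effect_category(12, thresholds=(5, 20, 50)))  # -> small
```

The category returned would then select the corresponding row of informative statements in the authors' table.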

The suggested statements generated from the table can be used throughout the text of a systematic review, from the abstract to the discussion, and as part of any review type, such as those examining the accuracy of test strategies. The included language is also simple enough to be included as part of a plain language summary or other consumer-facing materials.


Santesso N, Glenton C, Dahm P, Garner P, Akl E, Alper B, Brignardello-Petersen R, Carrasco-Labra A, De Beer H, Hultcrantz M, Kuijpers T, Meerpohl J, Morgan R, Mustafa R, Skoetz N, Sultan S, Wiysonge C, Guyatt G, Schünemann HJ. GRADE guidelines 26: Informative statements to communicate the findings of systematic reviews of interventions. Journal of Clinical Epidemiology. 2019 Nov 9.

Manuscript available here on publisher's site.

Tuesday, February 11, 2020

Don’t Sell Your Guideline Short – Remember to Report! (Part 1)

Contributed by Madelin Siedler, 2019/2020 U.S. GRADE Network Research Fellow

The development of a high-quality, evidence-based clinical guideline is no small feat. It requires significant time and effort from content experts, methodologists, and organizational staff and typically takes one to two years or more from start to finish.

Given the effort and hours that go into guideline development, it’s all too easy – and all too common – for the reporting of the development process to significantly undersell a guideline’s quality. This matters because published analyses assessing the quality of guidelines will likely rely only on what is reported or referenced in the text of the guideline itself. In other words, guidelines that do not adequately report the methods used to develop their recommendations will be under-appraised in the published literature – and this can lead to a gross underestimation of a guideline-developing organization’s work as a whole.

Quality and Reporting Standards: A Brief Review

Over the past decade, a number of standard sets, reporting checklists, and appraisal tools have been published to assist guideline developers in the reporting of their methods and to provide ways for researchers to assess the quality of these guidelines. These standards and methods of appraisal include but are not limited to:
  • The Appraisal of Guidelines for Research and Evaluation (AGREE) II tool (2010)
  • The National Academy of Medicine (formerly the Institute of Medicine [IOM]) Standards for Trustworthy Clinical Practice Guidelines (2011)
  • The Guidelines International Network (G-I-N) Key Components of High-Quality and Trustworthy Guidelines (2012)
  • The World Health Organization (WHO) Handbook for Guideline Development (2nd ed., 2014)
  • The Reporting Items for practice Guidelines in HealThcare (RIGHT) Statement (2017)


Report, or it didn’t happen.

A guideline may be developed using the most watertight, rigorous methods, but if those methods are not adequately described either in the text of the guideline or in a referenced external document, an assessor will likely under-appraise its quality. To ensure the most accurate appraisal possible, guideline developers should consider the following helpful tips:
  • Create a guideline template including boilerplate text that meets as many reporting criteria as possible, such as a general description of the systematic review and recommendations development processes; competing interest statements for all involved authors and guideline panel members; a description of the method used to assess certainty of evidence and grade the strength of recommendations; and a clear table at the beginning of the document listing all clinical questions and resulting recommendations.
  • Maintain an up-to-date, in-depth description of the guideline development process on the website of the guideline-producing organization. Refer to this page specifically in the text of the guideline. This allows both guideline end-users and potential assessors to view the development process in depth without requiring too much space in the guideline document itself. 
  • When in doubt, refer it out. If there are supplemental texts to the guideline that include information related to the development process – such as an underlying systematic review or a list of authors’ conflict of interest disclosures – make sure these documents are clearly referenced in the guideline text and made easily accessible in the online version via hyperlinks. 
  • Don’t make assumptions. Even aspects of the development process that seem obvious, such as whether the guideline was externally reviewed, will likely not be credited in a published quality assessment if they are not explicitly mentioned. 
  • Always be specific. Do not force the end-user of a guideline to guess who the guideline is for, the clinical questions driving it, or the appropriate scenarios in which to apply the recommendations. Using the PICO (Population, Intervention, Comparison, Outcome) format to explicitly describe the clinical questions and resulting recommendations is a fail-safe way to ensure your guideline is specific enough to be useful. 


Stay tuned for Part 2, where we provide a list of commonly overlooked items in published guidelines and discuss how to instantly improve the quality assessment of a guideline.

Monday, February 3, 2020

Research Shorts: From test accuracy to patient-important outcomes and recommendations

Contributed by Madelin Siedler, 2019/2020 U.S. GRADE Network Research Fellow

The potential risks and benefits of a screening or diagnostic testing strategy extend beyond the immediate impact and accuracy of the test itself. The result of testing will determine the available next steps and options for follow-up and management, and therefore will affect various patient-important outcomes in addition to potential resource utilization and equity considerations. These downstream consequences, and the certainty of evidence in these consequences, need to be considered when formulating recommendations surrounding testing. In a July 2019 paper published as part 22 of the Journal of Clinical Epidemiology’s GRADE guidelines series, Schünemann and colleagues provide suggestions for assessing certainty of evidence and determining recommendations for diagnostic tests and strategies.

While a collection of randomized controlled trial evidence examining the downstream consequences of various testing strategies is ideal in this scenario, such data are sparse. In lieu of this, guideline authors should develop a framework that includes each possible testing and follow-up treatment scenario, starting with the test in question and ending with patient-important outcomes.


H.J. Schünemann et al. / Journal of Clinical Epidemiology 111 (2019) 69–82

As seen in this USPSTF sample framework, evidence begins with accuracy studies and ends with patient-important endpoints.

This will allow the panel to visually link all relevant existing data together and develop clinical questions that are answerable with the evidence at hand. Data on the accuracy of a given test will help inform the expected number of false negatives and positives, which would then lead to potentially important downstream consequences - such as anxiety or a missed diagnosis - in addition to the effects of treating a diagnosed condition. The estimates of these beneficial and harmful potential outcomes should ideally come from a systematic review of evidence which can then be assessed for certainty. 

H.J. Schünemann et al. / Journal of Clinical Epidemiology 111 (2019) 69–82

The authors suggest providing one overall rating of the quality of evidence that takes into account the certainty of the diagnostic, prognostic, and management data that are available. Guideline panels should determine which outcomes of these bodies of evidence are critical and ascribe an overall rating based on the lowest level of certainty of the critical outcomes. 
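This "lowest certainty among critical outcomes" rule can be sketched in a few lines. The function and outcome names below are hypothetical, purely to illustrate the logic:

```python
# Minimal sketch of GRADE's rule: the overall certainty rating equals the
# lowest certainty level among the outcomes the panel has judged critical.
# Outcome names and ratings below are invented examples.

# GRADE certainty levels, ordered from lowest to highest.
LEVELS = ["very low", "low", "moderate", "high"]

def overall_certainty(outcomes):
    """Return the lowest certainty level among critical outcomes.

    `outcomes` maps an outcome name to a (certainty, is_critical) pair.
    """
    critical = [cert for cert, is_crit in outcomes.values() if is_crit]
    if not critical:
        raise ValueError("at least one critical outcome is required")
    return min(critical, key=LEVELS.index)

ratings = {
    "mortality": ("moderate", True),        # critical
    "false positives": ("low", True),       # critical
    "test discomfort": ("very low", False), # important but not critical
}
print(overall_certainty(ratings))  # -> low
```

Note that the non-critical outcome ("very low" here) does not drag the overall rating down; only critical outcomes count.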


Schünemann HJ, Mustafa RA, Brozek J, Santesso N, Bossuyt PM, Steingart KR, Leeflang M, Lange S, Trenti T, Langendam M, Scholten R. GRADE guidelines: 22. The GRADE approach for tests and strategies—from test accuracy to patient-important outcomes and recommendations. Journal of Clinical Epidemiology. 2019 Jul 1;111:69-82.

Manuscript available here on publisher's site.

Wednesday, January 22, 2020

Research Shorts: Rating the certainty in evidence in the absence of a single estimate of effect

Contributed by Madelin Siedler, 2019/2020 U.S. GRADE Network Research Fellow

When a pooled estimate from a meta-analysis of several studies is not available to guide the rating of the GRADE domains, how should one make a final determination of the certainty of evidence? 


Evidence from a 30,000-foot view

In their 2017 paper published in Evidence-Based Medicine, Murad and colleagues describe methods for applying GRADE when bodies of evidence are either sparse or too disparate to pool. A systematic review, for instance, may only provide a narrative synthesis of the current evidence given these limitations. When a neat estimate of effect presented as part of a tidy forest plot is not available, it is necessary to use one’s best judgment to rate the domains by taking a broader view. In these cases, Murad et al. recommend the following approach:
  • Risk of Bias: Judge the risk of bias across all studies that include the outcome of interest.
  • Inconsistency: Consider the direction and size of the estimates of effect from each study. Generally, do they all tell the same story, or do they vary considerably?
  • Indirectness: Make an overall judgment about the amount of directness or indirectness of the body of evidence, given your specific question (always consider your population, intervention, outcome, and comparator[s] of interest). Generally, are the studies synthesized answering questions similar to yours? Or might the dissimilarities be enough to lower your trust in the estimate of effect as it pertains to your question?
  • Imprecision: Examine the total information size of all studies (number of events for binary outcomes, or number of participants for continuous outcomes) as well as each study’s reported confidence interval for this outcome. If there are fewer than 400 total events or participants, or if the confidence intervals from most studies - or the largest - include no effect, imprecision is likely present.
  • Publication bias: Suspect publication bias if there is a small number of only positive studies, or if data were reported in trial registries but never published.
As always, one may consider rating up the certainty of evidence from observational studies when the majority of studies examined show a large magnitude of effect, a dose-response gradient, or plausible residual confounding that would only reduce the demonstrated effect (suggesting the true effect is at least as large as observed).
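The imprecision heuristic described above (fewer than 400 total events or participants, or most confidence intervals including no effect) can be sketched as a simple check. The helper name and the numbers in the example are illustrative, not taken from the paper:

```python
# Illustrative sketch of the imprecision heuristic: suspect imprecision if
# the total information size is below 400 events/participants, or if most
# studies' confidence intervals include no effect (e.g., a ratio crossing 1.0).

def imprecision_suspected(study_sizes, cis, null_value=1.0):
    """study_sizes: events (binary outcomes) or participants (continuous
    outcomes) per study. cis: (lower, upper) confidence intervals, e.g.
    for risk ratios, where null_value means no effect."""
    total = sum(study_sizes)
    if total < 400:
        return True
    crossing = sum(1 for lo, hi in cis if lo <= null_value <= hi)
    return crossing > len(cis) / 2  # most CIs include no effect

# Three small trials: 120 + 90 + 150 = 360 total events, below 400.
print(imprecision_suspected([120, 90, 150],
                            [(0.7, 1.4), (0.8, 1.1), (0.6, 0.9)]))  # -> True
```

A real assessment would of course weigh this alongside the largest study's interval and any pre-specified optimal information size, as the authors discuss.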


Murad MH, Mustafa RA, Schünemann HJ, Sultan S, Santesso N. Rating the certainty in evidence in the absence of a single estimate of effect. BMJ Evidence-Based Medicine. 2017 Jun 1;22(3):85-7.

Manuscript available here on publisher's site.

Monday, January 20, 2020

Research Shorts: Assessing the certainty of evidence in the importance of outcomes or values and preferences

Contributed by Madelin Siedler, 2019/2020 U.S. GRADE Network Research Fellow

The rating of outcomes in terms of their importance is a key aspect of GRADE guideline development, as is the rating of the certainty of the evidence that will inform clinical decision-making. However, it is often difficult to rate the certainty of evidence on the importance of outcomes – assuming there is any evidence to draw from at all. In their July 2019 article published in the Journal of Clinical Epidemiology, Zhang and colleagues describe how to assess the certainty of a body of evidence used to determine the relative importance of outcomes.



The GRADE domains that present the most challenges when rating this kind of evidence are inconsistency and imprecision. Assuming there is more than one study, the assessment of inconsistency should include judging how much the reported importance of outcomes varies across studies, exploring potential sources of this inconsistency (such as differences in populations or in the instruments used), and rating down when the inconsistency is not explained by these sources. The assessment of imprecision should consider sample size first; in cases where no quantitative synthesis is available, sample size may be the only consideration. Otherwise, assuming the information size meets a pre-defined threshold, the evidence may still be rated down if the confidence intervals around the estimates of relative importance cross a pre-defined decision-making threshold.


Y. Zhang et al. (2019)/Journal of Clinical Epidemiology

The authors warn against attempts to rate the certainty of evidence in the variability of outcome importance – in other words, how much the perceived importance of any outcome varies from one individual to the next. If both inconsistency and imprecision are ruled out as potential sources of observed variance, then true variability may exist. In these cases, guideline panels should consider the formation of a conditional recommendation based on differences in values and preferences.

The article also provides guidance for assessing publication bias and rating up.


Zhang Y, Coello PA, Guyatt GH, Yepes-Nuñez JJ, Akl EA, Hazlewood G, Pardo-Hernandez H, Etxeandia-Ikobaltzeta I, Qaseem A, Williams Jr JW, Tugwell P. GRADE guidelines: 20. Assessing the certainty of evidence in the importance of outcomes or values and preferences—inconsistency, imprecision, and other domains. Journal of Clinical Epidemiology. 2019 Jul 1;111:83-93.

Manuscript available here on publisher's site.