Showing posts with label Research. Show all posts

Wednesday, July 3, 2024

EF Scholar Success Story: Nirjhar Ruth Ghosh

 New publication alert!

Nirjhar Ruth Ghosh, who attended the winter 2022 Systematic Review workshop as an Evidence Foundation scholar, recently published the results of her hard work in The Journal of Nutrition! The project, "Evidence-Based Practice Competencies Among Nutrition Professionals and Students: A Systematic Review," was originally presented by Ghosh to her fellow attendees of the virtual workshop. The full results of the systematic review can now be read at this link.

In an accompanying editorial lauding the publication, Francene M. Steinberg wrote that "Ghosh et al. have provided a foundation for further consideration of steps to advance interprofessional competencies in EBP to optimize clinical nutrition decision making and patient care outcomes. Improved clarity about core competencies, innovative EBP curriculum and pedagogic approaches, and more rigorous research evaluations of EBP application and outcomes are all necessary components of the path forward."



Congratulations, Ruth!

Interested in being our next scholar success story? Applications for scholarships to attend our fully virtual GRADE Guideline Development Workshop are now open (deadline: August 31). Learn more at https://evidencefoundation.org/scholarships





Friday, April 2, 2021

New Review of Pragmatic Trials Reveals Insights, Identifies Gaps

As opposed to an "explanatory" or "mechanistic" randomized controlled trial (RCT), which seeks to examine the effect of an intervention under tightly controlled circumstances, "pragmatic" or "naturalistic" trials study interventions and their outcomes when used in more real-world, generalizable settings. One example of such a study might include the use of registry data to examine interventions and outcomes as they occur in the "real world" of patient care. However, there are currently few standards for identifying, reporting, and discussing the results of such "pragmatic RCTs." A new paper by Nicholls and colleagues aims to provide an overview of the current landscape of this methodological genre.

The authors searched for and synthesized 4,337 trials using keywords such as "pragmatic," "real world," "registry based," and "comparative effectiveness" to map how pragmatic trials are presented in the RCT literature. Overall, only about 22% (964) of these trials were identified as "pragmatic" RCTs in the title, abstract, or full text; just over half of these (55%) used the term in the title or abstract, while the remaining 45% described the work as a pragmatic trial only in the full text. 

78.1% (3,368) of the trials indicated that they were registered. However, only about 6% were indexed in PubMed as pragmatic trials, and only 0.5% were labeled with the MeSH topic of Pragmatic Clinical Trial. The median target enrollment of pragmatic trials was 440 participants (interquartile range [IQR]: 244 to 1,200); the median achieved accrual was 414 (IQR: 216 to 1,147). The largest trial included 933,789 participants; the smallest enrolled 60.

Overall, pragmatic trials were more likely to be centered in North America and Europe and to be funded by non-industry sources. Behavioral interventions, rather than drug- or device-based ones, were most common in these trials. Not infrequently, the trials were mislabeled or contained erroneous data in their registration information. The fact that only about half of the sample was clearly labeled as "pragmatic" means that such trials may go undetected by search strategies less sensitive than the one the authors used.

Authors of pragmatic trials can improve the quality of the field by clearly labeling their work as such, registering their trials, and ensuring that registered data are accurate and up to date. The authors also suggest that taking a broader view of what constitutes a "pragmatic RCT" raises questions about proper ethical standards when research is conducted on a large scale with multiple lines of responsibility. Finally, the mechanisms used to obtain consent in these trials should be further examined in light of the finding that many pragmatic trials fail to achieve their participant enrollment goals.

Manuscript available from the publisher's website here. 

Nicholls SG, Carroll K, Hey SP, et al. (2021). A review of pragmatic trials found a high degree of diversity in design and scope, deficiencies in reporting and trial registry data, and poor indexing. J Clin Epidemiol (ahead of print). 


Monday, September 14, 2020

Timing and Nature of Financial Conflicts of Interest Often Go Unreported, Systematic Survey Finds

The proper disclosure and management of financial conflicts of interest (FCOI) in a published randomized controlled trial is vital to alerting readers to the sources of funding for the research and to other financial factors that may influence the design, conduct, or reporting of the trial.

A recently published cross-sectional survey by Hakoum and colleagues examined the nature of FCOI reporting in a sample of 108 published trials and found that 99% reported individual author disclosures, while only 6% reported potential sources of FCOI at the institutional level. Individual authors reported a median of 2 FCOIs. Among the 2,972 FCOIs reported by 806 individuals, the greatest proportions came from personal fees other than employment income (50%) and from grants (34%). Further, of those disclosing individual FCOI, a large majority (85%) reported funds provided by private-for-profit entities. Notably, only one-third (33%) of these disclosures included the timing of the funding in relation to the trial, 17% reported the relationship between the funding source and the trial, and just 1% reported the monetary value.


 

Using multivariable regression, the authors found that reporting of FCOI by individual authors was positively associated with nine factors, most strongly with the authors being from an academic institution (OR: 2.981; 95% CI: 2.415 – 3.680), with funding coming from an entity other than a private-for-profit one (OR: 2.809; 95% CI: 2.274 – 3.470), and with the first author being affiliated with an institution in a low- or middle-income country (OR: 2.215; 95% CI: 1.512 – 3.246).
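Associations like these are typically reported as odds ratios with Wald confidence intervals, which can be computed directly from a 2x2 table. A minimal sketch (the counts below are invented for illustration, not taken from the study):

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald 95% CI from a 2x2 table:
    a = exposed with outcome, b = exposed without,
    c = unexposed with outcome, d = unexposed without."""
    or_ = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # standard error of log(OR)
    lo = math.exp(math.log(or_) - z * se)
    hi = math.exp(math.log(or_) + z * se)
    return or_, lo, hi

# Hypothetical counts: 60 of 100 academic authors disclosed FCOI
# versus 30 of 100 non-academic authors.
print(odds_ratio_ci(60, 40, 30, 70))
```

A confidence interval that excludes 1.0, as in the study's reported estimates, indicates a statistically significant association.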

 

More explicit and complete reporting of FCOIs, the authors conclude, may improve readers’ level of trust in the results of a published trial and in the authors presenting them. To improve the nature and transparency of FCOI reporting, researchers may consider disclosing details related to the funding’s source, including the timing of the funding in relation to the conduct and publication of the trial, the relationship between the funding source and the trial, and the monetary value of the support.

Hakoum, M.B., Noureldine, H., Habib, J.R., Abou-Jaoude, E.A., Raslan, R., Jouni, H., ... & Akl, E.A. (2020). Authors of clinical trials seldom reported details when declaring their individual and institutional financial conflicts of interest: A cross-sectional survey. J Clin Epidemiol 127:49-58.

Manuscript available from the publisher's website here.

Tuesday, September 8, 2020

Assessing Health-Related Quality of Life Improvement in the Modern Anticancer Therapy Era

Recent breakthroughs in anticancer therapies such as small-molecule drugs and immunotherapies have made improvements in Health-Related Quality of Life (HRQOL) possible for cancer patients over the course of treatment. In a recent paper published in the Journal of Clinical Epidemiology, Cottone and colleagues are the first to propose a framework for assessing change in HRQOL over time in these patients, built around two metrics: Time to HRQOL Improvement (TTI) and Time to Sustained HRQOL Improvement (TTSI).

In the proposed framework, TTI is based on the time to the “first clinically meaningful improvement occurring in a given scale or in at least one among different scales” – for instance, a minimal important difference (MID) of 5 points on the European Organization for Research and Treatment of Cancer (EORTC) Quality of Life Questionnaire – Core 30 (QLQ-C30). The authors suggest using the first post-treatment score as the baseline measurement for monitoring improvements over time. “Sustained improvement” is defined as the first improvement that is not followed by a deterioration that meets or exceeds the MID.
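These definitions lend themselves to a simple sketch. Assuming higher scores mean better HRQOL and a complete series of assessments (both simplifying assumptions for illustration; the paper's framework additionally handles censoring and competing events), TTI and sustained improvement might be computed as:

```python
def time_to_improvement(times, scores, mid, baseline=None):
    """Time of the first clinically meaningful improvement (>= MID above
    baseline). Higher scores = better HRQOL. Returns None if never reached."""
    if baseline is None:
        baseline = scores[0]
    for t, s in zip(times, scores):
        if s - baseline >= mid:
            return t
    return None

def time_to_sustained_improvement(times, scores, mid, baseline=None):
    """Time of the first improvement that is not followed by a later
    deterioration of >= MID relative to the improved score."""
    if baseline is None:
        baseline = scores[0]
    for i, (t, s) in enumerate(zip(times, scores)):
        if s - baseline >= mid and all(s - later < mid for later in scores[i + 1:]):
            return t
    return None

# Hypothetical monthly fatigue scores with MID = 5:
times, scores = [0, 3, 6, 9, 12], [50, 56, 54, 50, 58]
print(time_to_improvement(times, scores, mid=5))            # first improvement
print(time_to_sustained_improvement(times, scores, mid=5))  # sustained one
```

In this invented series the first improvement (month 3) is later undone by a drop that meets the MID, so the sustained improvement occurs only at month 12, illustrating why TTI and TTSI can differ.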

 

The use of Kaplan-Meier curves and Cox proportional hazards models is inappropriate for these outcomes, the authors argue, because they do not account for possible competing events, such as disease progression, toxicity, or an earlier improvement in another scale when multiple scales are used. They propose the Fine-Gray competing-risks model for the evaluation of TTI and TTSI and pilot it in a case study of 124 newly diagnosed chronic myeloid leukemia patients undergoing first-line treatment with nilotinib.
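The competing-risks logic can be illustrated with the nonparametric cumulative incidence function (the Aalen-Johansen estimate that descriptive competing-risks analyses, and regression models such as Fine-Gray, build on). A minimal sketch with invented data; this is not the authors' model, just the quantity it targets:

```python
def cumulative_incidence(times, events, event_of_interest=1):
    """Nonparametric cumulative incidence of one event type in the presence
    of competing events. events: 0 = censored, 1, 2, ... = event types.
    Returns a list of (time, CIF) steps."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0   # probability of being free of *any* event just before t
    cif = 0.0
    steps = []
    i = 0
    while i < len(data):
        t = data[i][0]
        tied = [e for tt, e in data if tt == t]          # all subjects at time t
        d_interest = sum(1 for e in tied if e == event_of_interest)
        d_any = sum(1 for e in tied if e != 0)
        cif += surv * d_interest / at_risk               # mass added at t
        surv *= 1 - d_any / at_risk                      # any event removes risk
        steps.append((t, cif))
        at_risk -= len(tied)
        i += len(tied)
    return steps

# Invented data: improvement (1) at t=1 and t=3, a competing event (2) at
# t=2, and one censored subject (0) at t=4.
print(cumulative_incidence([1, 2, 3, 4], [1, 2, 1, 0]))
```

Treating the competing event as mere censoring and computing 1 − Kaplan-Meier would overestimate the cumulative probability of improvement, which is the core of the authors' argument against standard survival methods here.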


Time To Improvement (TTI) and Time to Sustained Improvement (TTSI) can be used to elucidate differences in HRQOL responses to treatment based on baseline characteristics. Here, the figure shows TTSI in fatigue scores based on hemoglobin level at baseline.


Using this model, the authors found that improvements in fatigue scores appeared more quickly than those in physical functioning when measuring scores from baseline (pre-treatment), but upon using first post-treatment score as the baseline, the differences between improvement rates in fatigue and physical functioning diminished. Additionally, a lower baseline hemoglobin level was associated with earlier sustained improvements in fatigue.

 

While the proposed method of evaluating TTI and TTSI has some limitations, such as lower statistical power than other ways of tracking changes in HRQOL over time, it also has notable strengths. In particular, this method can be used to elucidate differences between treatment approaches that show similar survival outcomes so that the approach with shorter TTI and TTSI can be favored.


Cottone, F., Collins, G.S., Anota, A., Sommer, K., Giesinger, J.M., Kieffer, J.M., ... & Efficace, F. (2020). Time to health-related quality of life improvement analysis was developed to enhance evaluation of modern anticancer therapies. J Clin Epidemiol 127:9-18.


Manuscript available from publisher's website here. 

Thursday, August 6, 2020

New Systematic Review Suggests Nonconcordance with COI Reporting Databases is Widespread, but Methodological Quality of Studies is Variable

Disclosure of conflicts of interest (COI) is a major point of concern in the development of guidelines as well as original research papers. Over the years, multiple studies have aimed to elucidate just how closely the disclosures of individual authors track with their reported COI in open databases. A new systematic review of 27 such studies, recently published online in the Journal of Clinical Epidemiology, compiles the findings of these studies into some eyebrow-raising statistics while also examining their methodological quality.

 

In their review, El-Rayess and colleagues found that although the methodological quality of studies assessing the concordance of authors’ COI disclosures within papers against public databases varied widely, a median of 81.2% of authors across 20 studies had “nonconcordant” disclosures (ranging from 41.8% to 98.6% across all studies), and that more than half of these (43.4% of all authors) were “completely nonconcordant” (ranging from 15% to 89.5% across all studies). What’s more, among seven studies that analyzed company reporting at the individual level, between 23.1% and 85.4% of companies did not report their payments to authors.



For the five studies that analyzed disclosures at the study rather than the individual author level, all found at least some degree of discordance between in-study disclosures and database reports. The rate of nonconcordant disclosures among these studies ranged from 6% to 92.6%.

 

The authors note that ulterior motives of authors are just one potential explanation for the high observed rate of nonconcordant COI disclosure and reporting. Vague instructions and parameters set by journals during the article submission process may undermine efforts to transparently report any and all potential sources of conflict, be they financial, intellectual, or otherwise. In addition, the authors found that studies of COI reporting with higher methodological quality tended to report lower estimates of nonconcordance, meaning that the overall combined estimates may be artificially inflated – for instance, because some studies did not distinguish whether potential COI sources were relevant to the topic of the articles analyzed. The authors note potential sources of nondirectional error as well, such as how differences in COI categories between in-paper disclosures and reference databases were handled, which further lowers confidence in the current estimate.




In sum, the recent review by El-Rayess et al. shows that authors’ COI disclosures in their published works are often at odds with publicly available reports of these relationships; however, the overall degree of nonconcordance remains uncertain. Those planning future analyses of COI disclosure policies may want to use this paper as a roadmap for improving our certainty about the actual magnitude of the issue.


El-Rayess, H., Khamis, A.M., Haddad, S., Ghaddara, H.A., Hakoum, M., Ichkhanian, Y., Bejjani, M., and Akl, E.A. Assessing concordance of financial conflicts of interest disclosures with payments' databases: A systematic survey of the health literature. J Clin Epidemiol 127:19-28.


Manuscript available at the publisher's website here.

Friday, May 1, 2020

Grey Matters: An Introduction to the Grey Literature and Where to Find It

Within the methods section of many a systematic review, it is common to come across the term "grey literature." Put plainly, grey literature comprises pieces of evidence that are not formally published in a book or peer-reviewed journal article. 

Examples of grey literature that can be valuable to a systematic review include:
  • conference abstracts and proceedings
  • clinical study reports
  • dissertations and theses
  • journal preprints

Searching the "grey lit" has several important benefits:
  • It expands the reach of a systematic review beyond the scope of the databases mined by a search, increasing the chance of finding pieces of evidence that may be helpful to the final synthesis of data.
  • It helps reduce the impact of potential publication bias on the findings of a review.
  • It keeps the review current by including upcoming data from recent conferences, doctoral work, and other yet-to-be-published sources.

Ideally, a search of the grey literature should be used in tandem with other forms of hand-searching, including searching the citations of included articles and of well-known reviews on similar topics.

Where to Find Grey Literature

Below are some resources that list helpful links for exploring the grey literature:

Friday, April 10, 2020

Rapid Guidelines in GRADE Pt. II: Rapid Recs in the Real World

In Part I of our series on rapid guidelines, we discussed the utility and terminology of rapid recommendations: those recommendations made in response to an urgent public health issue with timeframes ranging from a few short hours up to three months' time.

Who develops rapid guidelines?

In the first of a three-part series on rapid guidance published in 2018, Kowalski and colleagues conducted a systematic survey of the methodologies and processes of rapid guideline-producing organizations. Nomenclature used to identify these documents varied by organization, from “rapid advice guideline” to “interim guidance” to “short clinical guideline.” While the quality of these documents as assessed with the AGREE II tool was variable, it was greater in documents from the WHO and NICE than it was for the CDC or other smaller organizations. While NICE guidelines were of higher quality as assessed with the domains of AGREE II, they took substantially more time to develop than those from WHO.

It's important to note that while terminology differs between organizations, the word "interim" has been used to connote a response to an emergent public health issue with shorter time frames than a rapid guideline - typically on the order of 1-3 weeks. 

Organization | Nomenclature | AGREE II domain score range (lowest – highest) | Development timeline (from manual)
WHO | Rapid advice guideline | 54 – 92 | 1-3 months
NICE | Short clinical guideline | 81 – 94 | 11-13 months
CDC | Interim guidance | 10 – 82 | Not reported
Other | Interim guidelines, interim position statement, clinical guidelines | 21 – 67 | Not reported

Common Challenges and Facilitators to Rapid Guideline Development

While both the World Health Organization (WHO) and the National Institute for Health and Care Excellence (NICE) reported using a systematic review of the evidence to guide recommendations, common issues among developers included a lack of reporting on the management of conflicts of interest, on external review, and on the process for drafting recommendations.

In follow-up qualitative interviews with guideline-developing staff from WHO, participants cited a lack of adequate staffing, monetary resources, and evidence as key obstacles to the development of rapid guidelines. While developing a systematic review is likely one of the more time-consuming elements of a rapid guideline process, most participants agreed that it is a fundamental part of developing trustworthy guidance and should not be skipped if at all possible. 

Participants also indicated that the external/peer review process can add unwanted time to the development of rapid guidelines. To this end, developers can consider limiting peer review to the final draft only, as well as constraining its ability to drastically change recommendations in a way that would require reconvening the guideline panel. Virtual conferencing technology was named as a facilitator of quicker development schedules by reducing the need for face-to-face meetings. 

For a checklist to guide the development of rapid recommendations, see the G-I-N/McMaster checklist extension for rapid guidelines.


Kowalski, S.C., Morgan, R.L., Falavigna, M. et al. Development of rapid guidelines: 1. Systematic survey of current practices and methods. Health Res Policy Sys 16, 61 (2018).

Manuscript available at the publisher's website here.  

Florez, I.D., Morgan, R.L., Falavigna, M. et al. Development of rapid guidelines: 2. A qualitative study with WHO guideline developers. Health Res Policy Sys 16, 62 (2018).

Manuscript available at the publisher's website here.  







Thursday, March 26, 2020

Extremely Serious Research Short: GRADE’s terminology for rating down by three levels

Contributed by Madelin Siedler, 2019/2020 U.S. GRADE Network Research Fellow

Since its inception two decades ago, GRADE methodology has needed to evolve along with the arrival of new ways of assessing the evidence. One such evolution has come with the introduction of tools for assessing risk of bias in non-randomized studies, such as the Risk Of Bias In Non-randomized Studies of Interventions (ROBINS-I) tool and the Risk Of Bias In Non-randomized Studies of Exposures (ROBINS-E) tool.

Because these tools assess the risk of bias in a non-randomized study against the benchmark of a hypothetical randomized target trial, evidence assessed with them begins at a higher starting point than evidence assessed with alternatives such as the Newcastle-Ottawa Scale. When rating down in classic GRADE, by contrast, non-randomized studies start as low-certainty evidence before any rating up or down occurs. This means that a body of evidence assessed with ROBINS-I or ROBINS-E starts as high-certainty evidence and may require a reduction of three levels if very serious risk of bias is present. In other words, a three-level reduction for a study assessed with ROBINS-I or ROBINS-E is analogous to a two-level reduction for a non-randomized study assessed with another method.
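The arithmetic of starting points and level reductions can be sketched as a clamp over an ordered list of certainty levels (an illustration of the logic described above, not an official GRADE tool):

```python
# GRADE certainty levels, ordered from lowest to highest.
LEVELS = ["very low", "low", "moderate", "high"]

def rate(start, down=0, up=0):
    """Apply rating-down and rating-up steps to a starting certainty level,
    clamping the result to the valid range."""
    i = LEVELS.index(start) - down + up
    return LEVELS[max(0, min(len(LEVELS) - 1, i))]

# ROBINS-I/E: non-randomized evidence starts high; very serious risk of
# bias can warrant rating down three levels...
print(rate("high", down=3))   # -> "very low"
# ...which lands at the same certainty as a two-level reduction from the
# classic low starting point for non-randomized studies.
print(rate("low", down=2))    # -> "very low"
```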

A rating by any other name…

In order to determine what exactly this new three-level reduction should be called, members of the GRADE Working Group conducted a survey of 225 participants recruited via social media, the Guidelines International Network (G-I-N), and other sources. Just over one-third (34.2%) were members of the GRADE Working Group and all respondents had participated in guideline development in some capacity. The results are presented in a newly published article as part of a new “GRADE Notes” series in the Journal of Clinical Epidemiology.

Within the survey, participants were asked to rate the following terms for this novel three-level reduction, from least (1) to most-favored (4):

  • Critically serious
  • Extremely serious
  • Most serious
  • Very, very serious


Respondents' average ranking of terms. 


“Extremely serious” took the lead as the most favorably ranked term with an average score of 3.19, with “critically serious” a close second at 3.12. Respondents found “extremely serious” the most agreeable due to its clarity and the fact that it seemed to “naturally” follow the existing two-level term, “very serious.”

The term “extremely serious” can now be found within the GRADEpro application when rating the certainty of evidence within non-randomized studies while utilizing the ROBINS-I or ROBINS-E instruments.



Piggott T, Morgan RL, Cuello-Garcia CA, Santesso N, Mustafa RA, Meerpohl JJ, Schünemann HJ, GRADE Working Group. GRADE notes: Extremely Serious, GRADE’s Terminology for Rating Down by 3-Levels. Journal of Clinical Epidemiology. 2019 Dec 19.

Manuscript available here on publisher's site.

Tuesday, March 10, 2020

Research Shorts: U.S. Guideline Developers Inconsistently Applying Criteria for Appropriate Evidence Grading

Contributed by Philipp Dahm, MD, MHSc, FACS

Guideline Developers in the United States were Inconsistent in Applying Criteria for Appropriate GRADE Use


Our study was motivated by the anecdotal observation that many US-based organizations appeared to endorse the GRADE approach but did not necessarily apply it to the fullest extent. We therefore sought to formally study this issue by applying six published criteria of appropriate GRADE use. We limited the search to guidelines from US-based organizations included in the National Guideline Clearinghouse (NGC), which implied that they met certain minimal criteria for evidence-based guidelines. Our search reached back to January 2011 and extended to June 2018, after which the NGC lost its funding and ceased to exist in that form.

Of the 315 organizations with guideline documents in the database, 135 were US-based and represented by at least one guideline. Our analysis ultimately included 67 guideline documents from 44 organizations. The vast majority of these guidelines were from professional organizations, mostly related to the field of internal medicine and its subspecialties. With regard to domains for rating the certainty of evidence, only one in ten guidelines was explicit about including all five criteria for rating down (study limitations, indirectness, inconsistency, imprecision, and publication bias) for a body of evidence from randomized trials and all three domains (large magnitude of effect, dose-response gradient, and direction of residual bias) for rating up a body of evidence from non-randomized studies. Over half of the guidelines described explicit consideration of all four central domains (certainty of evidence, balance of benefits and harms, patients’ values and preferences, and resource utilization) for moving from evidence to recommendations. All guidelines included the certainty of evidence, and the vast majority also addressed the balance of desirable and undesirable consequences. When comparing guidelines published in 2011-2014 versus 2015-2018, rates of appropriate use were higher for nearly all criteria, but only one main criterion reached statistical significance: the reporting of evidence summaries supporting recommendations.

The take-home messages from this study are that one in three US-based organizations developing evidence-based guidelines report using GRADE, but that adherence to published criteria is quite inconsistent. As GRADE sees increasing uptake worldwide, continued efforts to train guideline methodologists and panel members will be important to ensure appropriate application of GRADE methodology.


Dixon C, Dixon PE, Sultan S, Mustafa R, Morgan RL, Murad MH, Falck-Ytter Y, Dahm P. Guideline Developers in the United States were Inconsistent in Applying Criteria for Appropriate GRADE Use. Journal of Clinical Epidemiology. 2020 Mar 4.

Wednesday, February 19, 2020

Research Shorts: Informative statements to communicate the findings of reviews

Contributed by Madelin Siedler, 2019/2020 U.S. GRADE Network Research Fellow

When authors of systematic reviews utilize the GRADE approach to evaluate the certainty of evidence in their findings, they should present this information in a way that is clear, consistent, and useful to the reader. In a recent article from the GRADE series (GRADE guidelines 26) in the Journal of Clinical Epidemiology, Santesso and colleagues present recommendations for communicating the effect size and certainty of evidence within a systematic review. These statements were informed by years of research, feedback, and discussion, including the qualitative input of around 100 methodology experts and a survey of 110 respondents of diverse backgrounds and levels of GRADE expertise.

The final result was a table of suggested statements organized by the certainty of evidence and the size of the effect based on the point estimate. To use this tool, systematic review authors first need to determine thresholds for the size of the effect (i.e., whether the effect on an outcome is trivial, small, moderate, or large, or whether there is no effect). This can be accomplished with “full contextualization,” in which the outcome is considered in relation to all other critical outcomes, or “partial contextualization,” in which it is considered in relation to the standalone value of the single outcome.

The suggested statements generated from the table can be used throughout the text of a systematic review, from the abstract to the discussion, and as part of any review type, such as those examining the accuracy of test strategies. The included language is also simple enough to be included as part of a plain language summary or other consumer-facing materials.
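The pattern of these statements can be illustrated with a small template function. The wording below mimics the general structure (“likely,” “may,” “the evidence is very uncertain”) but is not the verbatim published wording, which also varies with effect size categories such as “little to no difference”:

```python
# Illustrative verb phrase for each certainty level (not the exact
# published GRADE guidelines 26 wording).
VERB = {
    "high": "results in",
    "moderate": "likely results in",
    "low": "may result in",
}

def informative_statement(intervention, certainty, size, direction, outcome):
    """Build a GRADE-style informative statement from its components."""
    if certainty == "very low":
        return (f"The evidence is very uncertain about the effect of "
                f"{intervention} on {outcome}")
    return f"{intervention} {VERB[certainty]} a {size} {direction} in {outcome}"

# Hypothetical example values:
print(informative_statement("Drug X", "moderate", "large", "reduction", "mortality"))
```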


Santesso N, Glenton C, Dahm P, Garner P, Akl E, Alper B, Brignardello-Petersen R, Carrasco-Labra A, De Beer H, Hultcrantz M, Kuijpers T Meerpohl J, Morgan R, Mustafa R, Skoetz N, Sultan S, Wiysonge C, Guyatt G, Schünemann HJ. GRADE guidelines 26: Informative statements to communicate the findings of systematic reviews of interventions. Journal of clinical epidemiology. 2019 Nov 9.

Manuscript available here on publisher's site.

Tuesday, February 11, 2020

Don’t Sell Your Guideline Short – Remember to Report! (Part 1)

Contributed by Madelin Siedler, 2019/2020 U.S. GRADE Network Research Fellow

The development of a high-quality, evidence-based clinical guideline is no small feat. It requires significant time and effort from content experts, methodologists, and organizational staff, and typically takes one to two years or more from start to finish.

Given the effort and hours that go into guideline development, it’s all too easy - and all too common - for the reporting of the development process to significantly undersell a guideline’s quality. This matters because published analyses assessing the quality of guidelines will likely use only what is reported or referenced in the text of the guideline. In other words, guidelines that do not adequately report the methods used to develop their recommendations will be under-appraised in the published literature – and this could lead to a gross underestimation of a guideline-developing organization’s work as a whole.

Quality and Reporting Standards: A Brief Review

Over the past decade, a number of standard sets, reporting checklists, and appraisal tools have been published to assist guideline developers in the reporting of their methods and to provide ways for researchers to assess the quality of these guidelines. These standards and methods of appraisal include but are not limited to:
  • The Appraisal of Guidelines for Research and Evaluation (AGREE) II tool (2010)
  • the National Academy of Medicine (formerly the Institute of Medicine [IOM]) Standards for Trustworthy Clinical Practice Guidelines (2011)
  • the Guidelines International Network (G-I-N) Key Components of High-Quality and Trustworthy Guidelines (2012)
  • World Health Organization (WHO) Handbook for Guideline Development (2nd ed., 2014)
  • Reporting Items for practice Guidelines in HealThcare (RIGHT) Statement (2017)


Report, or it didn’t happen.

A guideline may be developed using the most water-tight, rigorous methods, but if these methods are not adequately described either in the text of the guideline or in a referenced external text, then an assessor will likely under-appraise the quality of a guideline. To ensure the most accurate appraisal of a guideline possible, guideline developers should consider the following helpful tips:
  • Create a guideline template including boilerplate text that meets as many reporting criteria as possible, such as a general description of the systematic review and recommendation development processes; competing interest statements for all involved authors and guideline panel members; a description of the method used to assess certainty of evidence and grade the strength of recommendations; and a clear table at the beginning of the document listing all clinical questions and resulting recommendations.
  • Maintain an up-to-date, in-depth description of the guideline development process on the website of the guideline-producing organization. Refer to this page specifically in the text of the guideline. This allows both guideline end-users and potential assessors to view the development process in depth without requiring too much space in the guideline document itself. 
  • When in doubt, refer it out. If there are supplemental texts to the guideline that include information related to the development process – such as an underlying systematic review or a list of authors’ conflict of interest disclosures – make sure these documents are clearly referenced in the guideline text and made easily accessible in the online version via hyperlinks. 
  • Don’t make assumptions. Even aspects of the development process that seem obvious, such as whether the guideline was externally reviewed, will likely not be credited in a published quality assessment unless they are explicitly mentioned. 
  • Always be specific. Do not make the end-user of a guideline have to guess who the guideline is for, the clinical questions driving the guideline, or the appropriate scenarios in which to employ the recommendations. Utilizing the PICO (Population, Intervention, Comparison, Outcome) format to explicitly describe the clinical questions and resulting recommendations is a failsafe way to ensure your guideline is specific enough to be useful. 
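As a small illustration of the last tip, a PICO-structured clinical question can be represented so that no component is left implicit (the class and example values below are hypothetical, not from any published guideline):

```python
from dataclasses import dataclass

@dataclass
class PICOQuestion:
    """A clinical question in PICO format: every component is explicit,
    so end-users never have to guess scope or comparators."""
    population: str
    intervention: str
    comparison: str
    outcome: str

    def as_question(self):
        return (f"In {self.population}, does {self.intervention}, compared "
                f"with {self.comparison}, affect {self.outcome}?")

# Hypothetical example:
q = PICOQuestion("adults with type 2 diabetes", "metformin", "placebo",
                 "HbA1c levels")
print(q.as_question())
```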


Stay tuned for Part II where we provide a list of commonly overlooked items in published guidelines and discuss how to instantly improve the quality assessment of a guideline.

Monday, February 3, 2020

Research Shorts: From test accuracy to patient-important outcomes and recommendations

Contributed by Madelin Siedler, 2019/2020 U.S. GRADE Network Research Fellow

The potential risks and benefits of a screening or diagnostic testing strategy extend beyond the immediate impact and accuracy of the test itself. The result of testing will determine the available next steps and options for follow-up and management, and therefore will affect various patient-important outcomes in addition to potential resource utilization and equity considerations. These downstream consequences, and the certainty of the evidence for them, need to be considered when formulating recommendations surrounding testing. In a July 2019 paper published as part 22 of the Journal of Clinical Epidemiology’s GRADE guidelines series, Schünemann and colleagues provide suggestions for assessing certainty of evidence and determining recommendations for diagnostic tests and strategies.

While a collection of randomized controlled trial evidence examining the downstream consequences of various testing strategies is ideal in this scenario, such data are sparse. In lieu of this, guideline authors should develop a framework that includes each possible testing and follow-up treatment scenario, starting with the test in question and ending with patient-important outcomes.


H.J. Schünemann et al. / Journal of Clinical Epidemiology 111 (2019) 69–82

As seen in this USPSTF sample framework, evidence begins with accuracy studies and ends with patient-important end-points.

This will allow the panel to visually link all relevant existing data together and develop clinical questions that are answerable with the evidence at hand. Data on the accuracy of a given test will help inform the expected number of false negatives and positives, which would then lead to potentially important downstream consequences - such as anxiety or a missed diagnosis - in addition to the effects of treating a diagnosed condition. The estimates of these beneficial and harmful potential outcomes should ideally come from a systematic review of evidence which can then be assessed for certainty. 
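To make the arithmetic behind this step concrete, the sketch below shows how a test's accuracy translates into expected downstream counts per 1,000 patients tested. The function, numbers, and scenario are hypothetical illustrations, not taken from the paper.

```python
# Sketch: expected true/false positives and negatives per 1,000 patients
# tested, given a test's sensitivity, specificity, and the prevalence of
# disease in the tested population. All values are hypothetical.

def expected_counts(sensitivity, specificity, prevalence, n=1000):
    diseased = prevalence * n
    healthy = n - diseased
    return {
        "true_positives":  round(sensitivity * diseased),
        "false_negatives": round((1 - sensitivity) * diseased),  # missed diagnoses
        "true_negatives":  round(specificity * healthy),
        "false_positives": round((1 - specificity) * healthy),   # e.g., anxiety, unneeded work-up
    }

# Hypothetical test: 90% sensitive, 80% specific, 10% prevalence
print(expected_counts(0.90, 0.80, 0.10))
# → 90 true positives, 10 false negatives, 720 true negatives, 180 false positives
```

These expected counts are what the panel would then link to the downstream consequences in the framework, such as the harms of a missed diagnosis or of unnecessary follow-up testing.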

H.J. Schünemann et al. / Journal of Clinical Epidemiology 111 (2019) 69–82

The authors suggest providing one overall rating of the quality of evidence that takes into account the certainty of the diagnostic, prognostic, and management data that are available. Guideline panels should determine which outcomes of these bodies of evidence are critical and ascribe an overall rating based on the lowest level of certainty of the critical outcomes. 
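That "lowest of the critical outcomes" rule is simple enough to sketch in code. The outcome names and ratings below are invented for illustration; only the minimum-across-critical-outcomes logic reflects the guidance described above.

```python
# Sketch: overall certainty as the lowest rating among critical outcomes.
# Outcome data are hypothetical, for illustration only.

# Ordered GRADE certainty levels, lowest first
LEVELS = ["very low", "low", "moderate", "high"]

def overall_certainty(outcomes):
    """outcomes: list of dicts with 'name', 'certainty', and 'critical' keys."""
    critical = [o for o in outcomes if o["critical"]]
    if not critical:
        raise ValueError("At least one outcome must be rated critical.")
    # Overall rating = minimum certainty across the critical outcomes
    return min((o["certainty"] for o in critical), key=LEVELS.index)

outcomes = [
    {"name": "mortality",                 "certainty": "moderate", "critical": True},
    {"name": "procedure-related anxiety", "certainty": "low",      "critical": True},
    {"name": "cost",                      "certainty": "very low", "critical": False},
]

print(overall_certainty(outcomes))  # prints "low"
```

Note that the "very low" rating for cost does not drag the overall rating down, because the panel in this hypothetical did not judge cost to be a critical outcome.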


Schünemann HJ, Mustafa RA, Brozek J, Santesso N, Bossuyt PM, Steingart KR, Leeflang M, Lange S, Trenti T, Langendam M, Scholten R. GRADE guidelines: 22. The GRADE approach for tests and strategies—from test accuracy to patient-important outcomes and recommendations. Journal of clinical epidemiology. 2019 Jul 1;111:69-82.

Manuscript available here on publisher's site.

Monday, January 20, 2020

Research Shorts: Assessing the certainty of evidence in the importance of outcomes or values and preferences

Contributed by Madelin Siedler, 2019/2020 U.S. GRADE Network Research Fellow

The rating of outcomes in terms of their importance is a key aspect of GRADE guideline development. So is, of course, the rating of the certainty of evidence that will inform clinical decision-making. However, it is often difficult to rate the certainty of evidence regarding the importance of outcomes – assuming there is any evidence to draw from at all. In their July 2019 article published in the Journal of Clinical Epidemiology, Zhang and colleagues describe how to assess the certainty of a body of evidence used to determine the relative importance of outcomes.



The GRADE domains that present the most challenges when rating the certainty of this evidence are inconsistency and imprecision. Assuming there is more than one study, assessment of inconsistency should include judging the amount of variance across studies’ reported importance of outcomes, exploring potential sources of this inconsistency (such as differences in populations or instruments used), and rating down when the inconsistency is not explained by these factors. Imprecision should first take sample size into consideration; in fact, in cases where no quantitative synthesis is available, sample size may be the only consideration. In other cases, assuming the information size meets a pre-defined threshold, the evidence may still be rated down if the confidence intervals around the estimates of relative importance cross a pre-defined decision-making threshold.


Y. Zhang et al. (2019)/Journal of Clinical Epidemiology

The authors warn against attempts to rate the certainty of evidence in the variability of outcome importance – in other words, how much the perceived importance of any outcome varies from one individual to the next. If both inconsistency and imprecision are ruled out as potential sources of observed variance, then true variability may exist. In these cases, guideline panels should consider the formation of a conditional recommendation based on differences in values and preferences.

The article also provides guidance for assessing publication bias and rating up.


Zhang Y, Coello PA, Guyatt GH, Yepes-Nuñez JJ, Akl EA, Hazlewood G, Pardo-Hernandez H, Etxeandia-Ikobaltzeta I, Qaseem A, Williams Jr JW, Tugwell P. GRADE guidelines: 20. Assessing the certainty of evidence in the importance of outcomes or values and preferences—inconsistency, imprecision, and other domains. Journal of clinical epidemiology. 2019 Jul 1;111:83-93.

Manuscript available here on publisher's site.

Sunday, November 3, 2019

Fall 2019 - Scholarship recipients

Contributed by Madelin Siedler, 2019/2020 U.S. GRADE Network Research Fellow

The Eleventh GRADE Guideline Development Workshop was held in Orlando, Florida, this past September. This workshop welcomed 52 participants, including several international attendees from Canada and Brazil. Among these participants were three recipients of a scholarship provided by the Evidence Foundation, which covers the cost of registration. Coming from diverse backgrounds ranging from organizational work to evidence synthesis to policymaking, scholars Faduma Gure, Eric Linskens, and Christian Kershaw presented their proposals for new innovations or opportunities for improving the application and implementation of evidence-based medicine.


Faduma Gure, MSc, a knowledge translation and research specialist for the Association of Ontario Midwives, discussed the challenges of incorporating client perspectives into midwifery guidelines for the organization, which utilizes the GRADE approach. Pregnancy through the post-partum period is often a tumultuous time filled with decision-making, Gure explained, and more can be done to better understand the values and preferences of midwifery clients and to employ an equity lens when formulating recommendations. Gure proposed a solution that includes the development of an equity advisory group consisting of key stakeholders representative of Ontario’s population of midwifery clients. The organization could then involve these stakeholders through the entire guideline development process - from the initial setting of research priorities to the ultimate formulation of recommendations - and elicit important feedback about patient values and preferences as well as the potential impacts of a guideline on various communities.

Eric Linskens, BSc, serves the Minneapolis Veterans Affairs Evidence Synthesis Program and the Minnesota Agency for Healthcare Research and Quality (AHRQ) Evidence-based Practice Center. Linskens presented his current work applying the GRADE approach to existing systematic reviews which did not originally use GRADE. Linskens discussed some of the challenges that his team has faced as part of the initiative, as well as innovative solutions to these issues. For instance, a systematic review conducted by AHRQ may automatically rate down for inconsistency due simply to the inclusion of a sole study, whereas in the GRADE approach, this would not be the case. Additionally, an existing review may break down one clinical question into multiple smaller analyses of comparators or sub-populations, whereas it would be more clinically relevant to use these data to create one larger recommendation. To best solve these issues, Linskens noted, it is important to consider the end-user of any given review or guideline so that their needs can be best met. Additionally, transparently reporting all judgments around the analyses is key. Regarding his time at the workshop, Linskens said, “[i]t was very helpful to work through examples with the GRADE workshop facilitators in small group sessions. They answered our questions as they came up.”

Christian Kershaw, PhD, is a molecular neuroscientist who now works as a health policy analyst for CGS, a Medicare fee-for-service contractor. Dr. Kershaw used her personal experience transitioning from bench science to policymaking to inspire her presentation on the utility of cross-functional teams in medicine and healthcare policy. To develop a cross-functional team, Dr. Kershaw explained, it is best to identify a problem that would best be solved by a group of individuals with heterogeneous skills and backgrounds who would each uniquely serve a common goal or purpose. As an example, Kershaw discussed the development of a team to standardize the way information is used to form coverage decisions as part of the 21st Century Cures Act. The team is composed of a medical doctor to understand the need for and content of the policies; an outreach and education specialist to understand their legal implications; and a basic research scientist to compile and assess the information. Leveraging individual team members’ strengths and encouraging innovation are keys to success when working in a cross-functional team. “I was impressed with the versatility of the GRADE framework,” Kershaw noted. “It was very informative to learn all of the different ways that the conference attendees were using GRADE to suit their projects.”

If interested in applying for a scholarship to future GRADE workshops, more details can be found here: https://evidencefoundation.org/scholarships.html. Please note the deadline for applications to our next workshop in Phoenix, AZ will be December 4, 2019.

Wednesday, January 30, 2019

Stating the “Obvious”: A Primer on Good Practice Statements in GRADE Guidelines

Contributed by Madelin Siedler, 2018/2019 U.S. GRADE Network Research Fellow



One of the benefits of the GRADE approach is that it provides a framework for the development of evidence-based recommendations that are clear and actionable for practicing clinicians even when only lower-quality evidence is available. However, in some particular instances, caution is warranted when developing a recommendation based on low-quality evidence or inference. “Good practice statements” are one such instance. 

The term “good practice statement” is sometimes used interchangeably with “motherhood statement.” In either case, the practice being recommended is usually something already commonly accepted as beneficial or practical advice – it could even be seen as being as irrefutably “good” as motherhood and apple pie (hence the term). The nature of these statements is such that the action is seen as so obviously beneficial that it would be unduly onerous to conduct a review to demonstrate its efficacy.

An example of a good practice statement is the first recommendation from the American Gastroenterological Association (AGA)’s 2015 guideline on the management of asymptomatic pancreatic cysts, which reads, “The AGA recommends that before starting any pancreatic cyst surveillance program, patients should have a clear understanding of programmatic risks and benefits” (Vege et al., 2015).

How to Spot a Good Practice Statement
An easy way to identify a good practice statement is to restate the recommendation as its inverse: for instance, “patients should not have a clear understanding of programmatic risks and benefits.” If this “unstated alternative” is absurd or clearly does not conform to ethical norms, the original statement is likely a good practice statement (Guyatt et al., 2016).

The Problem with Good Practice Statements
The GRADE Working Group recommends that good practice statements be used sparingly, if at all. Because good practice guidance is typically based on several linked sources of indirect evidence, there is no way to tell whether the benefits of the proposed action are truly as obvious or incontestable as they seem. And if they are that obvious (if the inverse of the proposed recommendation would be absurd or unethical), then stating them is likely unnecessary and can dilute the strength of the guideline as a whole. 

Sometimes, good practice statements may even appear as graded recommendations in a guideline (a decision not recommended by the GRADE Working Group for the reasons above). In such cases, the guideline authors may be tempted to make a strong recommendation based on low-quality evidence, which should ideally be a rare occurrence based on well-defined criteria (Guyatt et al., 2015). 

Practical Advice for Dealing with Good Practice Statements
The GRADE Working Group recommends using the following checklist to determine whether a good practice statement is warranted:
  1. Is the statement clear and actionable? 
  2. Is the message really necessary with regard to actual health practice? 
  3. After consideration of all relevant health outcomes and potential downstream consequences, will implementing the good practice statement result in large net positive consequences? 
  4. Is collecting and summarizing the evidence a poor use of a guideline panel’s limited time and energy? 
  5. Is there a well-documented clear and explicit rationale connecting the indirect evidence?
If the answer is 'yes' to all five questions, a good practice statement may warrant inclusion in a guideline document. When done correctly, good practice statements should appear as ungraded recommendations, meaning no formal rating of the quality of evidence or strength of recommendation is given (Guyatt et al., 2015).

However, many potential good practice statements will be eliminated through the use of this checklist. For instance, careful consideration of Question #3 could lead the panel to realize that the assumed net positive of a specific action may not be so obvious after all. In this case, the guideline panel should consider whether a thorough review of the evidence should be conducted and formal grading methods applied.

Tuesday, January 22, 2019

Research Shorts: When continuous outcomes are measured using different scales

Contributed by M. Hassan Murad, MD

Outcomes of great importance to patients, such as quality of life and severity of anxiety or depression, are often measured using different scales. When an outcome is measured using several scales across trials, it requires standardization to be pooled in a meta-analysis.


Common methods of standardization include using the standardized mean difference (SMD), converting continuous data to binary relative and absolute association measures, the minimally important difference (MID), the ratio of means, and transforming standardized effects back to original scales. The underlying assumption in all these methods is that the different scales measure the same construct. This paper, published in The BMJ, describes these methods and suggests approaches for interpretation.
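As a minimal sketch of the first of these methods, the code below computes a standardized mean difference (Hedges' g) for two hypothetical trials that measured the same construct on different scales, expressing both effects in standard-deviation units so they could be pooled. All data are invented for illustration and are not from the paper.

```python
import math

def hedges_g(m1, sd1, n1, m2, sd2, n2):
    """Standardized mean difference with Hedges' small-sample correction."""
    # Pooled standard deviation across the two trial arms
    sp = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
    d = (m1 - m2) / sp                   # Cohen's d
    j = 1 - 3 / (4 * (n1 + n2) - 9)      # small-sample correction factor
    return d * j

# Two hypothetical trials using different symptom-severity scales
# (intervention mean, SD, n vs. control mean, SD, n; lower = better)
g_trial1 = hedges_g(12.0, 5.0, 40, 15.0, 5.5, 42)
g_trial2 = hedges_g(20.0, 8.0, 55, 25.0, 8.5, 53)

print(f"{g_trial1:.2f} {g_trial2:.2f}")  # prints "-0.56 -0.60"
```

Once both effects are on the same standardized scale, they can be combined in a meta-analysis; the paper's other approaches (MID units, ratio of means, back-transformation) aim to make the resulting pooled estimate easier to interpret clinically.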


Reference: Murad MH, Wang Z, Chu H, Lin L. When continuous outcomes are measured using different scales: guide for meta-analysis and interpretation. BMJ 2019;364:k4817. https://www.bmj.com/content/364/bmj.k4817 

Wednesday, January 16, 2019

Fall 2018 Scholarship Recipients

Contributed by Madelin Siedler, 2018/2019 U.S. GRADE Network Research Fellow

We were pleased to support the participation of three research scholars at our recent GRADE guideline development workshop held in Silver Spring, Maryland, October 17-19, 2018. By providing complimentary registration, we hope to have deepened these scholars' understanding and application of the GRADE approach. As part of the U.S. GRADE Network/Evidence Foundation scholarship, recipients presented to fellow workshop participants about their interests and current endeavors in the field of guideline development.

Scholarship recipients: Olivia Magwood, Mohamad Kalot, and Mohammed Alkhatib


A few details about our Fall 2018 scholarship recipients:

Mohamad Kalot, MD, a postdoctoral research fellow at University of Kansas Medical Center, Kansas City, Kansas, applied for the scholarship due to his interest in improving quality and decreasing disparities in healthcare through the development of evidence-based guidelines. Dr. Kalot presented on his current work in the development of guidelines for the management of rare diseases. Dr. Kalot explained how the development of such guidelines presents a unique challenge in that there is often a dearth of research on these populations, which can affect the directness of evidence among other factors.

“Since an important part of my conducted reviews and research deals with developing guidelines for rare diseases, I would ultimately face specific challenges in the process of assessing the certainty of evidence in my work, and I’m interested in exploring innovative methods to deal with these challenges,” said Kalot. “The GRADE workshop in Silver Spring didn’t only help me find solutions for my challenges, it made me see the research methodology world from a different, deeper and more practical perspective - especially with the very helpful tools [RevMan and GRADEpro Guideline Development Tool] that we learned about in the small groups sessions and the discussions about rating up and rating down the quality of evidence.”

Olivia Magwood, MPH, attended from the Bruyère Research Institute in Ottawa, Ontario, where she has participated in the development of three national and international guidelines. Ms. Magwood presented on the development of the FACE (Feasibility, Acceptability, Cost, and Equity) Framework Stakeholder Survey. The FACE Stakeholder Survey aims to identify cognitive biases that may influence guideline development, as well as to inform the quality and impact of evidence-based recommendations and improve their uptake among various stakeholder groups.

“For me, the GRADE workshop highlighted the importance of building my network and having the right people on your team throughout the guideline development process,” explained Magwood. “This includes involving methodologists early and throughout guideline development, as well as emphasizing patient perspectives, especially while deciding on patient-important outcomes. Guideline development is truly a collaborative process that benefits from multi-stakeholder engagement.”

Mohammed Alkhatib, MD, a postdoctoral research fellow at University of Kansas Medical Center, Kansas City, Kansas, applied for the scholarship due to his interest in examining and refining the adoption, adaptation and de novo development (or “adolopment,” for short) of guidelines to be used in lower-resource settings. Dr. Alkhatib presented on the importance of this approach, as well as the unique challenges of tailoring the development of clinical recommendations to settings with limited resources, such as developing nations and areas of conflict.

“It was an exceptional experience for me to present my proposal in front of high-level scholars of guideline development and to listen to their positive feedback,” said Alkhatib. “[The] GRADE workshop was very helpful for me at the level of interpretation of [systematic reviews] and how to judge strengths and weaknesses using GRADEpro.”

If interested in applying for a scholarship to future GRADE workshops, more details can be found here: https://evidencefoundation.org/scholarships.html. Please note the deadline for applications to our next workshop in Denver, Colorado will be January 1, 2019.


Monday, December 3, 2018

Research Shorts: Surrogate endpoints using the example of hepatitis C virus

Contributed by Claudia Dobler, MD, PhD
2018 U.S. GRADE Workshop Scholarship Recipient

Surrogate endpoints (for example, laboratory or imaging results) are commonly used in clinical trials, as they require fewer participants and shorter follow-up than trials that use clinically important outcome measures such as mortality. When evidence for an intervention is almost exclusively based on trials that used surrogate outcomes, it is challenging for decision makers to determine the appropriateness of the use and reimbursement of that intervention. The common tendency in evidence-based medicine is to view results based on surrogate endpoints as less certain than results based on long-term, final patient-important outcomes. The authors of this paper use the contemporary and highly debated example of the surrogate endpoint ‘sustained viral response’ (i.e., viral eradication considered to represent successful treatment) in patients treated for chronic hepatitis C virus infection to demonstrate how the validity of a surrogate endpoint can be critically appraised to assess the trustworthiness of the evidence and the implications for decision-making. They outline how the GRADE system for determining the certainty in the evidence can be used in situations where decisions for clinical practice and health policy have to be based on evidence that mainly comes from trials with indirect outcome measures.


Beyond assessing the quality of the evidence, potential benefits and harms of the intervention need to be weighed against each other and factors such as patient values, impact on healthcare equity, acceptability by patients and feasibility of the intervention need to be considered. The authors conclude that considering all these factors, a conditional recommendation for direct acting antiviral agents to treat chronic hepatitis C virus infection may be appropriate.


Reference: Dobler CC, Morgan RL, Falck-Ytter Y, Montori VM, Murad MH. Assessing the validity of surrogate endpoints in the context of a controversy about the measurement of effectiveness of hepatitis C virus treatment. BMJ Evidence-Based Medicine 2018;23(2):50-53. https://ebm.bmj.com/content/22/6/199