Wednesday, July 1, 2020

A Not-So-Non-Event?: New Systematic Review Finds Exclusion of Studies with No Events from a Meta-Analysis Can Affect Direction and Statistical Significance of Findings

Studies with no events in either arm have traditionally been considered non-informative in a meta-analytic context and are therefore routinely excluded from the analysis. A new systematic review of 442 such meta-analyses, however, reports that this practice may actually affect the resulting conclusions.

In the July 2020 issue of the Journal of Clinical Epidemiology, Xu and colleagues report their study of meta-analyses of binary outcomes in which at least one included study had no events in either arm. The authors reanalyzed the data from 442 such meta-analyses identified in the Cochrane Database of Systematic Reviews, using modeling to determine the effect of reincorporating the excluded studies.

The authors found that in 8 (1.8%) of the 442 meta-analyses, inclusion of the previously excluded studies changed the direction of the pooled odds ratio (“direction flipping”). In 12 (2.7%) of the meta-analyses, the pooled odds ratio (OR) changed by more than the predetermined threshold of 0.2. Additionally, in 41 (9.3%) of the meta-analyses, the statistical significance of the findings changed at the conventional p = 0.05 threshold (“significance flipping”). In most of these 41 meta-analyses, the excluded (“non-event”) studies made up between 5 and 30% of the total sample size. About half of these changes widened the confidence interval, while in the other half, incorporating the non-event studies narrowed it.

The figure above from Xu et al. shows the proportion of studies reporting no events within the meta-analyses that showed a substantial change in p value when these studies were included. The proportion of the total sample tended to cluster between 5 and 30%.

Post hoc simulation studies confirmed the robustness of these findings and also showed that excluding studies with no events preferentially affected the pooled ORs of meta-analyses that found no effect (OR near 1), whereas a large magnitude of effect was protective against such changes. The opposite pattern was found for the resulting p values: large magnitudes of effect were more likely to be affected, whereas conclusions of no effect were protected.
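
To see concretely how reincorporating double-zero studies can shift a pooled result, the sketch below compares a pooled OR and p value computed with and without such studies. This is not the authors' reanalysis model; it is a minimal Python illustration using a 0.5 continuity correction and fixed-effect inverse-variance pooling, and the study data are invented for illustration.

```python
"""Minimal sketch (not the authors' model): compare a pooled odds ratio
computed with and without double-zero studies, using a 0.5 continuity
correction and fixed-effect inverse-variance pooling of log odds ratios."""

import math
from statistics import NormalDist

# Each study: (events_treatment, n_treatment, events_control, n_control)
# These counts are made up purely for illustration.
studies = [
    (5, 100, 11, 100),
    (2, 150, 6, 150),
    (0, 80, 0, 80),   # double-zero study, conventionally excluded
    (0, 60, 0, 60),   # double-zero study, conventionally excluded
]

def pooled_or(data, cc=0.5):
    """Fixed-effect inverse-variance pooled OR with continuity correction."""
    weights, wlogs = 0.0, 0.0
    for a, n1, c, n2 in data:
        b, d = n1 - a, n2 - c
        if 0 in (a, b, c, d):          # apply correction only when a cell is zero
            a, b, c, d = a + cc, b + cc, c + cc, d + cc
        log_or = math.log((a * d) / (b * c))
        var = 1 / a + 1 / b + 1 / c + 1 / d
        weights += 1 / var
        wlogs += log_or / var
    pooled_log = wlogs / weights
    se = math.sqrt(1 / weights)
    p = 2 * (1 - NormalDist().cdf(abs(pooled_log / se)))
    return math.exp(pooled_log), p

or_excl, p_excl = pooled_or([s for s in studies if s[0] + s[2] > 0])
or_incl, p_incl = pooled_or(studies)
print(f"Excluding double-zero studies: OR={or_excl:.2f}, p={p_excl:.3f}")
print(f"Including double-zero studies: OR={or_incl:.2f}, p={p_incl:.3f}")
print("Direction flipped:", (or_excl - 1) * (or_incl - 1) < 0)
print("Significance flipped:", (p_excl < 0.05) != (p_incl < 0.05))
```

With these invented data, the pooled OR stays below 1 in both analyses, but the p value crosses the 0.05 threshold once the double-zero studies are added back, mirroring the "significance flipping" described above.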

In sum, though a common practice in meta-analysis, the exclusion of studies with no events in either arm may affect the direction, magnitude, or statistical significance of the resulting conclusions in a small but non-negligible number of analyses.

Xu, C., Li, L., Lin, L., Chu, H., Thabane, L., Zou, K., & Sun, X. Exclusion of studies with no events in both arms in meta-analysis impacted the conclusions. J Clin Epidemiol, 2020; 123: 91-99.

Manuscript available from the publisher's website here. 

Friday, June 26, 2020

CONSORTing with Incorrect Reporting?: Most Publications Aren’t Using Reporting Guidelines Appropriately, New Systematic Review Finds


Reporting guidelines such as PRISMA for systematic reviews and meta-analyses and CONSORT for randomized controlled trials are often touted as a way to improve the thoroughness and transparency of reporting in academic research. However, although these guidelines are intended to guide the reporting of research, a new systematic review of a random sample of different publication types found that they were frequently cited incorrectly: as a way of guiding the design and conduct of the research itself, as a means of assessing the quality of published research, or for an unclear purpose.

In the review published earlier this month, Caulley and colleagues worked with an experienced librarian to devise a systematic search strategy that would pick up any publication citing one of four major reporting guideline documents from inception to 2018: ARRIVE (used in in vivo animal research), CHEERS (used in health economic evaluations), CONSORT (used in randomized controlled trials) and PRISMA (used in systematic reviews and meta-analyses). Then, a random sample of 50 publications citing each guideline was reviewed independently by two authors to assess how the reporting guideline was cited.

Overall, only 39% of the 200 reviewed items correctly stated that the guidelines were followed in the reporting of the study, whereas an additional 41% incorrectly cited the guidelines, usually by stating that they informed the design or conduct of the research. Finally, in 20% of the reviewed items, the intended purpose of the cited reporting guidelines was unclear.

Examples of appropriate, inappropriate, and unclear use of reporting guidelines provided by Caulley et al.
Across publication types, RCTs were the most likely to appropriately cite the use of CONSORT guidelines (64%), versus 42% of economic evaluations correctly citing CHEERS, 28% of systematic reviews and meta-analyses appropriately discussing the use of PRISMA, and just 22% of in vivo animal research studies correctly citing ARRIVE.


In addition, the appropriate use of the reporting guidelines did not appear to increase as time elapsed since the publication of those guidelines.

The authors suggest that improved education about the appropriate use of these guidelines, such as the web-based interventions and tools available to those looking to use CONSORT, may improve their correct application in future publications.

Caulley, L., Catalá-López, F., Whelan, J., Khoury, M., Ferraro, J., Cheng, W., ... & Moher, D. Reporting guidelines of health research studies are frequently used inappropriately. J Clin Epidemiol, 2020; 122: 87-94.

Manuscript available from the publisher's website here. 

Tuesday, June 23, 2020

Need for Speed: Documenting the Two-Week Systematic Review

In a recent post, we summarized a 2017 article describing the ways in which automation, machine learning, and crowdsourcing can be used to increase the efficiency of systematic reviews, with a specific focus on making living systematic reviews more feasible.

In a new publication in the May 2020 edition of the Journal of Clinical Epidemiology, Clark and colleagues incorporated automation in an attempt to complete a systematic review in no more than two weeks from search design to manuscript submission, for a moderately sized search yielding 1,381 deduplicated records and eight ultimately included studies.

Spoiler alert: they did it. (In just 12 calendar days, to be exact).

Systematic Review, but Make it Streamlined

Clark et al. utilized some form of computer-assisted automation at almost every point in the project, including:
  • Using the SRA word frequency analyzer to identify key terms that would be the most useful additions to a search strategy (a minimal sketch of this kind of analysis appears after this list)
  • Using hotkeys (custom keystroke shortcuts) within the SRA Helper tool to more quickly screen items and search pre-specified databases for full texts
  • Using RobotReviewer to assist in risk of bias evaluation by searching for certain key phrases within each document
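
To illustrate the word-frequency step, the sketch below counts terms across a handful of known-relevant records to suggest candidate search terms. It is a toy Python illustration, not a reimplementation of the SRA tool, and the seed records and stopword list are placeholders.

```python
"""Minimal sketch of word-frequency analysis for search-term scoping.
Inspired by, but not a reimplementation of, the SRA word frequency tool."""

import re
from collections import Counter

# Titles/abstracts of a few known-relevant records (placeholder text)
seed_records = [
    "Automation tools accelerated citation screening in a systematic review",
    "A rapid systematic review using machine learning for citation screening",
    "Crowdsourced citation screening and automation in evidence synthesis",
]

STOPWORDS = {"a", "an", "and", "the", "of", "in", "for", "using", "to"}

def term_frequencies(texts):
    """Count lower-cased word frequencies, ignoring stopwords and short tokens."""
    counts = Counter()
    for text in texts:
        for token in re.findall(r"[a-z]+", text.lower()):
            if token not in STOPWORDS and len(token) > 3:
                counts[token] += 1
    return counts

# Frequently occurring terms are candidates for the Boolean search strategy
for term, n in term_frequencies(seed_records).most_common(10):
    print(f"{term:15s} {n}")
```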

However, machines were only part of the solution. The authors also note the decidedly more human-based solutions that allowed them to proceed at an efficient clip, such as:
  • Daily, focused meetings between team members
  • Blocking off “protected time” for each team member to devote to the project
  • Planning for deliberation periods, such as resolution of screening conflicts, to occur immediately after screening, reducing the time and energy spent on “mental reload” and re-reviewing one’s previous decisions for context


All told, the final accepted version of the manuscript took 71 person-hours to complete – a far cry from a recently published average of 881 person-hours among conventionally conducted reviews.

Clark and colleagues discuss key facilitators and barriers to their approach as well as provide suggestions for technological tools to further improve the efficiency of SR production.

Clark, J., Glasziou, P., Del Mar, C., Bannach-Brown, A., Stehlik, P., & Scott, A.M. A full systematic review was completed in 2 weeks using automation tools: A case study. J Clin Epidemiol, 2020; 121: 81-90.

Manuscript available from the publisher's website here.

Thursday, June 18, 2020

It’s Alive! Pt. III: From Living Review to Living Recommendations

In recent posts, we’ve discussed how living systematic reviews (LSRs) can help improve the currency of our understanding of the evidence, as well as the efficiency with which the evidence is identified and synthesized through novel crowdsourcing and machine learning techniques. In the fourth and final installment of the 2017 series on LSRs, Akl and colleagues apply the LSR approach to the concept of a living clinical practice guideline.

As the figure below from the paper demonstrates, while simply updating an entire guideline more frequently (Panel B) reduces the number of out-of-date recommendations (symbolized by red stars) at any given time, it comes with a serious trade-off: namely, the high amount of effort and time required to continuously update the entire guideline. Turning certain recommendations into "living" models helps solve this dilemma between currency and efficiency.



Rather than a full update of an entire guideline and all of the recommendations therein, a living guideline uses each recommendation as a separate unit of update. Recommendations that are eligible to make the transition from “traditionally updated” to “living” include those that are a current priority for healthcare decision-making, for which the emergence of new evidence may change clinical practice, and for which new evidence is being generated at a quick rate.

The Living Guideline Starter Pack

Each step of a recommendation’s formation must make the transition to “living,” including:
  • A living systematic review
  • Living summary tables, such as Evidence Profiles and Evidence-to-Decision tables
    • Online collaborative table-generating software such as GRADEpro can be used to keep these up to date as newly relevant evidence emerges
  • A living guideline panel who can remain “on-call” to contribute to updates of recommendations with relatively short notice when warranted
  • A living pool of peer-reviewers who can review and provide feedback on updates with a quick turnaround time
  • A living publication platform, such as an online version that links back to archived versions and “pushes” new versions to practice tools at the point of care.
Additional Resources
Further information and support for the development of LSRs, including updated official guidance, is provided on the Cochrane website.

Akl, E.A., Meerpohl, J.J., Elliott, J., Kahale, L.A., Schünemann, H.J., and the Living Systematic Review Network. Living systematic reviews: 4. Living guideline recommendations. J Clin Epidemiol, 2017; 91: 47-53.

Manuscript available from the publisher's website here. 

Monday, June 15, 2020

It’s Alive! Pt. II: Combining Human and Machine Effort in Living Systematic Reviews

Systematic review development is known to be a labor-intensive endeavor that requires a team of researchers dedicated to the task. The development of a living systematic review (LSR) that is continually updated as newly relevant evidence becomes available presents additional challenges. However, as Thomas and colleagues write in the second installment of the 2017 series on LSRs in the Journal of Clinical Epidemiology, we can make the process quicker, easier, and more efficient by harnessing the power of machine learning and “microtasks.”

Suggestions for improvements in efficiency can be categorized as either automation (incorporation of machine learning/replacement of human effort) or crowdsourcing (distribution of human effort across a broader base of individuals).

A diagram from Thomas et al. (2017) describes the "push" model of evidence identification that can help keep Living Systematic Reviews current without the need for repeated human-led searches.

From soup to nuts, opportunities for the incorporation of machine learning into the LSR development process include:


  • Continuous, automatic searches that “push” new potentially relevant studies out to human reviewers
  • Exclusion of ineligible citations through automatic text classification, which can reduce the number of items requiring human screening while retaining over 99% sensitivity (see the sketch after this list)
  • Crowdsourcing of study identification and "microtask" screening efforts such as Cochrane Crowd, which at the time of this blog’s writing had resulted in over 4 million screening decisions from over 17,000 contributors 
  • Automated retrieval of full text versions of included documents
  • Machine-based extraction of relevant data, graphs and tables from included documents
  • Machine-assisted risk of bias assessment
  • Template-based reporting of important items
  • Statistical thresholds that flag when a change of conclusions may be warranted
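
As a rough illustration of the text-classification idea, the sketch below trains a simple classifier on already-screened records and chooses a score threshold aimed at preserving sensitivity, so that only low-scoring records are auto-excluded. This is an assumed workflow written in Python with scikit-learn and placeholder data, not the classifier used by Cochrane or the authors; in practice the threshold would be calibrated on large held-out datasets.

```python
"""Minimal sketch (assumed workflow): auto-exclude only records that score
below a threshold chosen to keep sensitivity for known includes high."""

import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Titles/abstracts already screened by humans (1 = include, 0 = exclude)
texts = [
    "randomized controlled trial of drug A for condition X",
    "cohort study of risk factors for condition X",
    "randomized trial comparing drug A and placebo",
    "editorial commentary on clinical guidelines",
    "case report of a rare adverse event",
    "double blind randomized study of drug A dosing",
]
labels = np.array([1, 0, 1, 0, 0, 1])

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(texts)
model = LogisticRegression().fit(X, labels)

# Pick the largest threshold that still captures all known includes, so that
# automatic exclusion below the threshold sacrifices as little sensitivity
# as possible (a real pipeline would calibrate this on held-out data).
scores = model.predict_proba(X)[:, 1]
threshold = scores[labels == 1].min()

new_records = [
    "randomised trial of drug A in adults",
    "narrative review of condition X",
]
new_scores = model.predict_proba(vectorizer.transform(new_records))[:, 1]
for record, score in zip(new_records, new_scores):
    decision = "needs human screening" if score >= threshold else "auto-exclude"
    print(f"{score:.2f}  {decision}  {record}")
```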
As technology in this field progresses, the traditionally duplicated stages of screening and data extraction may even be taken on by a computer-human pair, combining the ease and efficiency of automation with the “human touch” and high-level discernment that algorithms still lack.

Thomas, J.,  Noel-Storr, A., Marshall, I., Wallace, B., McDonald, S., Mavergames, C... & the Living Systematic Review Network. Living systematic reviews: 2. Combining human and machine effort. J Clin Epidemiol, 2017; 91: 31-37. 

Manuscript available from publisher's website here. 

Wednesday, June 10, 2020

It’s Alive! Pt. I: An Introduction to Living Systematic Reviews

As research output continues to rise, the systematic reviews charged with comprehensively identifying and synthesizing that evidence become out of date more quickly. In addition, the formation of a systematic review team can be a lengthy process, and institutional memory of the project is lost when teams are disbanded after publication.

One solution to this problem is the concept of a living systematic review, or LSR. In the first installment of a 2017 series in the Journal of Clinical Epidemiology, Elliott and colleagues introduce the concept of an LSR and provide general guidance on their format and production.

What is a Living Systematic Review (LSR)?
An LSR has a few key components:
  • It is based on a regularly updated search, run with an explicit and pre-established frequency (at least once every six months), to identify any potentially relevant recent publications.
  • It uses standard systematic review methodology (in contrast to a rapid review).
  • It is most useful for specific topics:
    • that are of high importance to decision-making,
    • for which the certainty of evidence is low or very low (meaning our estimate of the effect may well change with the incorporation of new evidence), and
    • for which new evidence is being generated often.


A figure from Elliott et al. (2017) provides an overview of the LSR development process, from protocol to regular searching and screening and incorporation and publication of new evidence.


LSRs from End to End
An LSR can either be started from scratch with the intention of regular screening and updating of evidence – in which case the protocol should specify these planned methods – or based upon an existing up-to-date systematic review, in which case the protocol should be amended to reflect these changes.

Because they are continually updated, LSRs require publication on an online platform with linking mechanisms (such as CrossRef) or explicit versioning (such as the Cochrane Database) so that new versions can appear as soon as new evidence is incorporated.

When the certainty of evidence reaches a higher level, or if the generation of new evidence substantially slows, an LSR may be discontinued in favor of traditional approaches to updating.

Additional Resources
Further information and support for the development of LSRs, including updated official guidance, is provided on the Cochrane website.

Elliott, J.H., Synnot, A., Turner, T., Simmonds, M., Akl, E.A., McDonald, S., ... & Thomas, J. Living systematic review: 1. Introduction - the why, what, when, and how. J Clin Epidemiol, 2017; 91: 23-30.

Manuscript available from the publisher's website here.


Friday, June 5, 2020

Research Revisited: 2014’s “Guidelines 2.0: Systematic Development of a Comprehensive Checklist for a Successful Guideline Enterprise”

While several checklists for the development and appraisal of specific guidelines had been published by 2014, no thorough, systematically developed resource yet existed to inform the actual day-to-day operations of a guideline development program. Noticing this need, Schünemann and colleagues pooled their professional experiences and contacts in the field, in addition to conducting a systematic search for self-styled “guidelines for guidelines” and other guideline development handbooks, manuals, and protocols. The reviewers extracted, in duplicate, the key stages and processes of guideline development from each of these documents and compiled them.

The result was the G-I-N/McMaster Guideline Development Checklist: an 18-topic, 146-item soup-to-nuts comprehensive manual spanning each part and process of a guideline development program, from budgeting and planning for a program to the development of actual guidelines to their dissemination, implementation, evaluation, and updating.
An overview of the steps and parties involved in the G-I-N/McMaster guideline development checklist.

The checklist also provides hyperlinks to tried-and-true online resources for many of these aspects, such as tips for funding a guideline program, tools for project management, topic selection criteria, and guides for patient and caregiver representatives.

Schünemann HJ, Wiercioch W, Etxeandia I, Falavigna M, Santesso N, Mustafa R, Ventresca M, et al. Guidelines 2.0: Systematic development of a comprehensive checklist for a successful guideline enterprise. CMAJ, 2014; 186(3): E123-E142.

Manuscript available for free here.