
Sunday, March 24, 2024

Three's a Crowd: How to Deal with More than Two Arms in a Meta-Analysis

It is not uncommon to come across the following scenario: when conducting a meta-analysis between two arms (e.g., an active therapy vs. a placebo), the meta-analyst includes a study that actually included two active arms (e.g., two different doses of the same experimental drug vs. placebo, two different routes of administration, etc.) Let's say that both of these arms were relevant to the clinical question. How should meta-analysis be undertaken in this case? A new tutorial article published in Cochrane Evidence Synthesis and Methods gives a primer on how to approach this common conundrum. 

Including the study twice in the forest plot – for instance, with one dose versus placebo and the other versus the same placebo group – is statistically problematic. It leads to a "unit of analysis" error by essentially "double-counting" the participants in the control group and violating the assumption that every individual participant is only counted once. (Aside: this is also a common error in meta-analyses combining multiple similar outcomes – e.g., including the handgrip strength of both the dominant and non-dominant hand in the same forest plot – and risks committing the same violation unless advanced multi-level statistical techniques are used to account for this). 

This leaves two basic options for including the data from more than two study arms in the same forest plot: combining interventions that are similar, or splitting the control group in half. For instance, suppose the three groups in question are as follows (assuming a dichotomous outcome):

  • Experimental group A: 50 participants, 45 of whom had the event.
  • Experimental group B: 50 participants, 41 of whom had the event.
  • Control group: 50 participants, 22 of whom had the event.
These two experimental groups can either be combined (100 participants, 86 of whom had the event) and compared to the control group as-is, or they can be split out and compared (on separate lines of the forest plot) to half of the control group on each line:
  • Experimental group A (45 out of 50) versus control (11 out of 25)
  • Experimental group B (41 out of 50) versus control (11 out of 25).
Both approaches will yield very similar pooled results.
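As a quick illustration using the counts above, both approaches can be compared directly. This is a minimal sketch (not from the tutorial itself) that computes risk ratios only; a real meta-analysis would also weight and pool estimates across studies.

```python
# Risk ratio for a single two-arm comparison with a dichotomous outcome.
def risk_ratio(events_exp, n_exp, events_ctrl, n_ctrl):
    return (events_exp / n_exp) / (events_ctrl / n_ctrl)

# Option 1: combine experimental arms A and B, compare with the full control arm.
combined = risk_ratio(45 + 41, 50 + 50, 22, 50)   # (86/100) / (22/50)

# Option 2: each experimental arm vs. half of the control group,
# entered on separate lines of the forest plot.
split_a = risk_ratio(45, 50, 11, 25)              # (45/50) / (11/25)
split_b = risk_ratio(41, 50, 11, 25)              # (41/50) / (11/25)
```

With equal-sized experimental arms, the simple average of the two split-line risk ratios equals the combined-arm risk ratio, which is why both approaches yield very similar pooled results.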

In the case of a continuous outcome, the same general approaches can be applied. However, if pooling two or more arms together, a pooled mean and SD will need to be calculated using the formulae for combining groups provided in the Cochrane Handbook.
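The Handbook's formulae for combining two groups can be sketched as follows (a minimal implementation of the standard combining-groups equations, not code from the tutorial):

```python
import math

def pool_two_arms(n1, m1, sd1, n2, m2, sd2):
    """Combine two arms' summary statistics into a single N, mean, and SD,
    using the standard combining-groups formulae (as in the Cochrane Handbook):
      N  = N1 + N2
      M  = (N1*M1 + N2*M2) / N
      SD = sqrt( [ (N1-1)*SD1^2 + (N2-1)*SD2^2
                   + (N1*N2/N)*(M1 - M2)^2 ] / (N - 1) )
    """
    n = n1 + n2
    mean = (n1 * m1 + n2 * m2) / n
    var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2
           + (n1 * n2 / n) * (m1 - m2)**2) / (n - 1)
    return n, mean, math.sqrt(var)
```

Note that the pooled SD is not simply an average of the two SDs: the final term widens the pooled SD to reflect the difference between the two arms' means.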

Reference: Axon, E., Dwan, K., & Richardson, R. (2023) Multiarm studies and how to handle them in a meta-analysis: A tutorial. Cochrane Evidence Synthesis and Methods. Available at publisher's website here.



Wednesday, September 21, 2022

Standardized Mean Difference Estimates Can Vary Widely Depending on the Methods Used, New Review Finds

In meta-analyses of outcomes that utilize multiple scales of measurement, a standardized mean difference (SMD) may be used. Randomized controlled trials may also use SMDs to help interpret the effect size for readers. Most commonly, the SMD reports the effect size with Cohen's d, a metric of how many standard deviations are contained in the mean difference within or between groups (e.g., an intervention caused the outcome to increase or decrease by x number of standard deviations, or the two groups were x number of standard deviations different from one another with regards to the outcome). This is typically done by dividing the difference between groups, or from pretest to posttest in a single group, by some form of standard deviation (e.g., pooled standard deviation at baseline, posttest, or the standard deviation of change scores). Cohen's d is often utilized because a general rule of interpretation has been suggested: 0.2 is a small effect, 0.5 is a medium-sized effect, and 0.8 is large.

However, there are multiple ways to approach the calculation of SMDs, and these may result in varying interpretations of the size of the effect. To further investigate this, Luo and colleagues recently published a review of 161 articles using SMDs and the way they can be calculated. Of the 161 randomized controlled trials published since 2000 and reporting outcomes with some form of SMD, the authors calculated potential between-group SMDs using reported data and up to seven different methodological approaches.

Some studies reported more than one type of SMD, meaning that 171 total SMD approaches were reported across the 161 studies. Of these, 34 (19.9%) did not describe the chosen method at all, 84 (49.1%) reported the method but in insufficient detail, and 53 (31.0%) reported the approach in sufficient detail. The confidence interval was reported for only 52 (30.4%) of SMDs. Of the 161 individual articles, the rule for interpretation was clearly stated in only 28 (17.4%). 

The most common method of calculating SMD was using a standard deviation of baseline scores, seen in 70 (40.9%) of studies. Meanwhile, 30 (17.5%) used posttest standard deviations and 43 (25.1%) used the standard deviation of change scores.
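The practical consequence of this methodological choice is easy to demonstrate with hypothetical numbers (these are illustrative only, not data from the review): the same raw mean difference yields very different SMDs depending on which standard deviation sits in the denominator.

```python
# Illustrative example: one raw mean difference, three candidate SDs.
md = 1.2
sds = {
    "baseline SD": 3.0,       # SD of scores at baseline
    "posttest SD": 2.0,       # SD of scores at follow-up
    "change-score SD": 1.5,   # SD of within-participant change scores
}

smds = {name: md / sd for name, sd in sds.items()}
# Under Cohen's rule of thumb these land at 0.4 ("small"),
# 0.6 ("medium"), and 0.8 ("large") - all from identical raw data.
```

Change-score SDs are often the smallest of the three (scores are correlated within participants), which is why SMDs standardized on change scores tend to look the most impressive.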

Figure displaying the variability of SMD estimates across the 161 included studies.

Across all the potential ways to calculate the SMD, the median article's estimates varied by 0.3 - which could be the difference between a "small" and a "moderate," or between a "moderate" and a "large," effect size under Cohen's suggested rule of thumb. The studies with the largest variation tended to have smaller sample sizes and larger reported effect sizes.

This work raises an important point, which is that while no one method for the calculation of SMDs is considered superior to another, if calculation approaches are not prespecified by researchers, different methods could be tried until the most impressive effect size is reached. To help prevent these issues, the authors suggest prespecifying the analytical approach and reporting SMDs together with raw mean differences and standard deviations to further aid interpretation and provide context. 

Luo, Y., Funada, S., Yoshida, K., et al. (2022). Large variation existed in standardized mean difference estimates using different calculation methods in clinical trials. J Clin Epidemiol 149: 89-97. Manuscript available at the publisher's website here.










Tuesday, May 31, 2022

What is a Scoping Review, Exactly? JBI Provides a Formal Definition in New Publication

A scoping review, by any other name, would be as broad...

Multiple definitions of "scoping review" have been used in the literature and, according to a new paper by Munn and colleagues, the use of these reviews is increasingly common. Therefore, the Joanna Briggs Institute recently released a definition of the term as well as guidance on its proper application in evidence synthesis.

The paper, published in April, formally defines a scoping review as "a type of evidence synthesis that aims to systematically identify and map the breadth of evidence available on a particular topic, field, concept, or issue, often irrespective of source (ie, primary research, reviews, non-empirical evidence) within or across particular contexts." Further, the paper details, scoping reviews "can clarify key concepts/definitions in the literature and identify key characteristics or factors related to a concept, including those related to methodological research."


Scoping reviews are similar to other types of articles within the broader family of evidence syntheses, and should ideally include important characteristics such as the use of pre-specified protocols, question(s), and inclusion/exclusion criteria; a comprehensive search; more than one author; and adherence to guidelines such as the PRISMA statement.

Because the main purpose of a scoping review is to explore and describe the breadth of evidence on a topic, pre-specified questions are usually broader in scope, and the evidence base often includes multiple types of evidence based on what is available. Beyond simply mapping the existing literature, however, scoping reviews may also be used to identify or clarify key concepts used in the field or to examine how research is typically conducted in the area.

Munn, Z, Pollock, D., Khalil, H., et al. (2022). What are scoping reviews? Providing a formal definition of scoping reviews as a type of evidence synthesis. JBI Evid Synth 20(4): 950-952. Manuscript available at publisher's website here. 








Monday, April 11, 2022

It's Alive! Pt. IV: Results from a Trainee Living Systematic Review Experience

Living systematic reviews (LSRs) continue to be a topic of interest among systematic review and guideline developers, as evidenced by our previous posts on the topic here, here, and here. While automation and machine learning have begun to help facilitate the generally time- and resource-intensive process of keeping evidence syntheses perpetually up-to-date, some aspects of LSR development still require the human touch. Now, a recently published mixed-methods study discusses the successes and challenges of utilizing a crowdsourcing approach to keep the LSR wheels turning.

The article describes the process of involving trainees in the development of a living systematic review and network meta-analysis (NMA) on drug therapy for rheumatoid arthritis. In their report, the authors posit that evidence-based medicine is a key pillar of learning for trainees, but that they may learn better through an experiential rather than a purely didactic approach; providing the opportunity to participate in a real-life systematic review may provide this experiential learning. 

In short, the team first applied machine learning to an initial database of records to identify randomized controlled trials, which were then further assessed via a crowdsourcing platform, Cochrane Crowd. Next, trainees ranging from undergraduate students to practicing rheumatologists and researchers, recruited through Canadian and Australian rheumatology mailing lists, further assessed articles for eligibility and extracted data from included articles. 

Training included a mix of online webinars, one-on-one trainings, and handbook provisions. Conflicting results were further assessed by an expert member of the team. The authors then elicited both quantitative and qualitative feedback about the trainees' experiences of taking part in the project through a combination of electronic survey and one-on-one interviews. 

Overall, the 21 trainees surveyed rated their training as adequate and their experience as generally positive. Respondents specifically listed a better understanding of PICO criteria, familiarity with outcome measures used in rheumatology, and the assessment of studies' risk of bias as the greatest learning benefits obtained. 

Of the 16 who participated in follow-up interviews, the majority (94%) described a practical and enjoyable experience. Of particular positive regard was the use of task segmentation throughout the project, during which specific tasks (i.e., eligibility assessment versus data extraction) could be "batch-processed," allowing trainees to match the specific time and focus demands to the selected task at hand. Trainees also communicated an appreciation for the international collaboration involved in the review as well as the feeling of meaningfully contributing to the project. 

Notable challenges included issues related to the clarity of communication regarding deadlines and expectations, as well as technical glitches experienced through the platforms used for screening and extraction. Though task segmentation was seen as a benefit, it also included drawbacks: namely, the risk of more repetitive tasks such as eligibility assessment becoming tedious while others that require more focus (i.e., data extraction) may be difficult to integrate into an already-busy daily schedule. To address these issues, the authors suggest improving communications to include regular, frequent updates and deadline reminders, working through technological glitches, and carefully matching tasks to the specific skillsets and availabilities of each trainee.

Lee, C., Thomas, M., Ejaredar, M., Kassam, A., Whittle, S.L., Buchbinder, R., ... & Hazlewood, G.S. (2022). Crowdsourcing trainees in a living systematic review provided valuable experiential learning opportunities: A mixed-methods study. J Clin Epidemiol (in-press). Manuscript available at the publisher's website here.











Tuesday, May 4, 2021

Restricting Systematic Search to English-only is a Viable Shortcut in Most, but Perhaps Not All Topics in Medicine

In the limitations sections of systematic reviews on any topic, it is not uncommon for the authors to discuss how language limitations within their search may have restricted the breadth of evidence presented. For instance, if the reviewers speak only English, the review is likely limited to publications and journals in that language. But how much of a difference does such a limitation make in terms of the overall conclusions of a systematic review? According to a new paper in the Journal of Clinical Epidemiology, probably not much - but it may depend on the specific topic of medicine under investigation.

While other methods reviews have previously examined this question, Dobrescu and colleagues extended the range of topics to include systematic reviews within the realm of complementary and alternative medicine, yielding four reviews unexamined by prior methods studies. Specifically, the authors looked for methods reviews comparing literature searches restricted to English-only versus unrestricted searches, and whose primary outcomes compared differences in treatment effect estimates, certainty of evidence ratings, or conclusions based on the language restrictions enforced. 

The search yielded eight studies investigating the impact of language restrictions in anywhere from 9 to 147 systematic reviews in medicine. Overall, the exclusion of non-English articles had a greater impact on estimates of treatment effects and the statistical significance of findings in reviews of complementary and alternative medicine versus conventional medicine topics. Most commonly, the exclusion of non-English studies led to a loss of statistical significance in these topic areas.

Overall, the methods studies examined found that the exclusion of non-English studies of conventional medicine topics led to small to moderate changes in the estimate of effect; however, exclusion of non-English studies shrank the observed effect size in complementary and alternative medicine topics by 63 percent. Two studies examined whether language restriction influenced authors' overall conclusions, generally finding no effect.

The figure above shows the frequency of languages of the excluded reviews examined.

The authors conclude that when it comes to systematic reviews of conventional medicine topics, their findings are in line with those of previous methods studies which demonstrate little to no effect of language restrictions and suggest that restricting a search to English-only should not greatly impact the findings or conclusions of a review. However, the effect appears greater in the realm of complementary and alternative medicine, perhaps due to the greater proportion of non-English studies published in this field. Thus, systematic reviewers attempting to synthesize the evidence on an alternative medicine topic should be cognizant of their choices regarding language restriction and the potential implications they may have on their ultimate findings.

Dobrescu A, Nussbaumer SB, Klerings I et al. (2021). Restricting evidence syntheses of interventions to English-language publications is a viable methodological shortcut for most medical topics: A systematic review: Excluding English-language publications a valid shortcut. J Clin Epidemiol, epub ahead of print.

Manuscript available from publisher's website here. 


















Wednesday, April 21, 2021

In Studies of Patients at High Risk of Death, More Explicit Reporting of Functional Outcomes is Needed

Randomized controlled trials examining the effects of an intervention in patients with a high risk of death will often also include functional outcomes - such as quality of life, cognition, or physical disability. However, the death of patients before these outcomes can be assessed (also known as "truncation due to death") can confound the results of a "survivors-only" analysis, especially if mortality rates are higher in certain groups than others. 

A new methodology review of studies published within 5 high-impact general medical journals from 2014 to 2019 provides insight into this phenomenon and suggestions for improving how functional outcomes are handled. To be eligible for the review, a study needed to be a randomized controlled trial (RCT) with a mortality rate of at least 10% in one arm and to report at least one functional outcome in addition to mortality. The authors recorded the outcomes analyzed, the type of statistical analyses used, and the sample population of each of the 434 included studies. For most (351, or 79%) of these, function was a secondary outcome, while it was a primary outcome for 91 (21%) of them.

Only one-quarter (25%) of the functional outcomes within the studies that examined them as secondary outcomes used an approach that included all randomized patients (intention-to-treat); for the studies for which functional outcomes were the primary outcomes analyzed, this proportion was 60%.


The authors provide suggestions for best ways to handle and report data in these studies:
  • In the methods rather than only in tables or supplementary material, explicitly state the sample population from which the functional outcomes were drawn, whether it's survivors-only or another type of analysis.
  • If a survivors-only analysis is used, the authors should report the baseline characteristics between the groups analyzed and transparently discuss this as a limitation within the discussion section.
  • If all randomized participants are analyzed regardless of mortality, authors should report the assumptions upon which these analyses are based; for instance, if death is one outcome ranked among others in a worst-rank analysis, the justification for the ranking of outcomes should be discussed in the methods, and the implications of these decisions included in the discussion section. 
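To make the worst-rank idea concrete, here is a small hypothetical sketch (not the review authors' code): patients who die before the functional outcome can be assessed are assigned a score worse than any observed value, so that all randomized patients remain in a rank-based analysis.

```python
def worst_rank_scores(values, died, worst=float("-inf")):
    """Replace outcomes missing due to death with a worst-possible score.

    values: functional outcome per patient (None if died before assessment)
    died:   parallel list of booleans
    """
    return [worst if d else v for v, d in zip(values, died)]

def rank(scores):
    """1-based mid-ranks (ties averaged), as used in rank-based tests."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # extend j over any tied values
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in order[i:j + 1]:
            ranks[k] = avg
        i = j + 1
    return ranks
```

The key assumption being made explicit here is that death is ranked worse than any level of surviving function; as the authors note, that justification belongs in the methods, and its implications in the discussion.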
Colantuoni E, Li X, Hashem MD et al. (2021). A structured methodology review showed analyses of functional outcomes are frequently limited to "survivors only" in trials enrolling patients at high risk of death. J Clin Epidemiol (e-pub ahead of print).

Manuscript available here.

Wednesday, March 3, 2021

Dealing with Zero-Events Studies in Meta-analysis: There's a Better Way than Throwing it Away!

When meta-analyzing data from studies examining the incidence of rare events - or from studies with a small sample size or short follow-up period - it is not uncommon to come across a study with 0 events of the outcome of interest. In fact, approximately one-third of a random sample of 500 Cochrane reviews contained at least one zero-events study.

Zero-events studies are typically categorized as single-arm (there are 0 events reported in just one group) or double-arm (there are 0 events reported in both groups). While some software automatically discards double-arm zero-events studies from a meta-analysis, this is not ideal, because these data still add useful information with regard to the overall effect of an intervention. Ideally, meta-analyses could include a pooled event count that may be zero in one arm, both arms, or neither, with various single-arm and double-arm zero-events studies potentially contributing to this final effect. Thus, in a recently published article, Xu and colleagues propose a more detailed framework for approaching zero-events studies in the context of a meta-analysis. 

The authors describe six classifications as follows, with the degree of difficulty when meta-analyzing generally increasing from 1 to 6:

1) MA-SZ: meta-analysis contains zero-events only occurring in single arms, no double-arm-zero-events studies are included, and the total events count in neither arm is zero;

2) MA-MZ: meta-analysis contains zero-events occurring in both single and double arms, and the total events count in neither arm is zero;

3) MA-DZ: meta-analysis contains zero-events only occurring in double arms, and the total events count in neither arm is zero;

4) MA-CSZ: meta-analysis contains zero-events occurring in single arms, and no double-arm-zero-events studies are included, while the total events count in one of the arms is zero;

5) MA-CMZ: meta-analysis contains zero-events occurring in both single arm and double arms, while the total events count in one of the arms is zero;

6) MA-CDZ: meta-analysis only includes double-arm-zero-events studies, while the total events count in both arms are zero


The authors examined data from the Cochrane Database of Systematic Reviews (CDSR), including any review published between January 2003 - May 2018 and meta-analyzing at least two studies. Of the 61,090 reviews identified with binary outcomes, 21,288 (34.85%) contained at least one zero-events study. In a great majority (90.7%) of these, the total event count was greater than zero for both arms and the meta-analysis only included single-arm rather than double-arm zero-events studies. Second most common (6.21%) was the MA-CSZ, in which the total event count includes one arm with zero events, and the zero-events studies included are only single-arm. All others of the four remaining categories each made up less than 1.5% of the whole.

The authors propose that those looking to meta-analyze studies that include zero events first categorize their specific subtype, and then work through one of the suggested methods in the figure below. Finally, a sensitivity analysis using an alternative method should be performed to determine the robustness of the results.
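The six classifications above can be expressed as a simple decision rule over the per-study event counts. This is an illustrative sketch of that categorization logic (my reading of the framework, not the authors' code), for a meta-analysis with two arms per study:

```python
def classify_zero_events(studies):
    """Classify a two-arm meta-analysis into one of the six MA-* subtypes.

    studies: list of (events_arm1, events_arm2) tuples, one per study.
    Returns the subtype label, or None if no study has a zero-events arm.
    """
    # zero events in exactly one arm of a study
    single = any((a == 0) != (b == 0) for a, b in studies)
    # zero events in both arms of a study
    double = any(a == 0 and b == 0 for a, b in studies)
    total1 = sum(a for a, _ in studies)
    total2 = sum(b for _, b in studies)
    zero_totals = (total1 == 0) + (total2 == 0)  # how many arm totals are zero

    if not (single or double):
        return None
    if zero_totals == 0:                      # neither arm's total is zero
        if single and double:
            return "MA-MZ"
        return "MA-SZ" if single else "MA-DZ"
    if zero_totals == 1:                      # one arm's total is zero
        return "MA-CMZ" if (single and double) else "MA-CSZ"
    return "MA-CDZ"                           # both arm totals are zero
```

For example, a meta-analysis of (0, 5) and (3, 4) is MA-SZ, while one consisting only of (0, 0) studies is MA-CDZ, the most difficult case to handle.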


Xu C, Furuya-Kanamori L, Zorzela L, Lin L, and Vohra S. (2021). A proposed framework to guide evidence synthesis practice for meta-analysis with zero-events studies. J Clin Epidemiol, in-press.
Manuscript available from the publisher's website here









Friday, February 12, 2021

Common Challenges Faced by Scoping Reviewers and Ways to Solve Them

Scoping reviews provide an avenue for the exploration, description, and dissemination of a body of evidence before a more systematic review is undertaken. As such, they can help clarify how research on a certain topic has been defined and conducted, in addition to identifying common issues and knowledge gaps - all of which can go on to inform a more effective approach to systematically reviewing the literature.

The Joanna Briggs Institute (JBI) has provided guidance on the conduct of scoping reviews since 2013. While developing the latest version published in 2020, the group identified the most common challenges and posed some solutions for those looking to develop a scoping review.

Key challenges included:

  • a lack of people trained in methodology unique to scoping reviews (helpful resources can be found on the JBI Global page and elsewhere).
  • how to decide when a scoping review is appropriate (hint: they should never be done in lieu of a systematic review if the intention is to provide recommendations)
  • deciding which type of review is most appropriate (this online tool can help)
  • knowing how much and what type of data to extract - for instance, making determinations between "mapping" of concepts around particular areas, populations, or methodologies and conducting a qualitative thematic analysis
  • reporting results effectively, such as with an evidence gap map
  • resisting the urge to overstate conclusions and provide recommendations for practice
  • a lack of editors and peer reviewers adequately trained to critically revise scoping reviews (the PRISMA extension for scoping reviews - PRISMA ScR - provides a checklist for proper conduct and reporting).

Khalil H., Peters M.D.J., Tricco A.C., et al. (2021). Conducting a high quality scoping review: Challenges and solutions. J Clin Epidemiol 130:156-160.

Manuscript available from publisher's website here.


















Tuesday, December 8, 2020

No Single Definition of a Rapid Review Exists, but Several Common Themes Emerge

"The only consensus around [a rapid review] definition," write Hamel and colleagues in a review published in the January 2021 issue of the Journal of Clinical Epidemiology, "is that a formal definition does not exist."

In their new review, Hamel et al. sifted through 216 rapid reviews and 90 methodological articles published between 2017 and 2019 to better understand the existing definitions and use of the term "rapid review," identifying eight common themes among them all.


The figure below from the publication shows the relative usage of these themes throughout the relevant identified articles.



In summary of all definitions examined in the review, the authors suggest the following broad definition of a rapid review: "a form of knowledge synthesis that accelerates the process of conducting a traditional systematic review through streamlining or omitting a variety of methods to produce evidence in a resource-efficient manner."

To complicate matters further, Hamel and colleagues also found that reviews meeting these general criteria may not always go by the term "rapid." For instance, the term "restricted review" fits many of these same parameters, but is not necessarily defined by the amount of time from inception to publication. However, the lack of an agreed-upon definition of a "rapid review" may ultimately hamper authors and potential end-users of these products, as the accepted legitimacy of such reviews may depend upon a common understanding of their standards and methodological frameworks. In addition, the range of rigor and specific protocols continues to vary widely between products labeled as "rapid reviews." Until there is broader consensus on the definition of a rapid review and what, exactly, it entails, this working definition and its associated themes provide insight into the current state of the art.

Check out our related post on the two-week systematic review here.

Hamel C, Michaud A, Thuku M, Skidmore B, Stevens A, Nussbaumer-Streit B, and Garritty C. (2020). Defining rapid reviews: A systematic scoping review and thematic analysis of definitions and defining characteristics of rapid reviews. J Clin Epidemiol 129: 74-85.

Manuscript available from the publisher's website here







Tuesday, October 6, 2020

New Study Examines the Impact of Abbreviated vs. Comprehensive Search Strategies on Resulting Effect Estimates

It's common practice - indeed, it's widely recommended - that systematic reviewers search multiple databases in addition to alternative sources of data such as the grey literature to ensure that no relevant studies are left out of analysis. However, meta-research on whether this theory holds up in practice is mainly limited to examinations of recall - in other words, reporting how many potentially relevant studies are picked up by an abbreviated search method as opposed to one that's more extensive. What's missing from this body of research, write Ewald and colleagues in a newly published study, is that recall studies compare items retrieved in absolute terms without considering the final weight or importance of each individual study - variables which will ultimately affect the direction, magnitude, and precision of the resulting effect estimate. Since larger studies with more cachet are likely to have the greatest impact on the final estimate and certainty of evidence - and these studies are more likely to be picked up in even an abbreviated search - the added value of utilizing more extensive search strategies on a meta-analysis is left unclear.

To examine the impact of the extensiveness of a search strategy on resulting findings and certainty of evidence, the authors randomly selected 60 Cochrane reviews from a range of disciplines for which certainty of evidence assessments and summaries of findings were available. Thirteen reviews did not report at least one binary outcome, leaving a total of 47 for analysis. They then replicated these reviews' search strategies in addition to conducting 14 abbreviated searches for each review, such as limiting to a single database (e.g., MEDLINE only) or a combination of just two or three (e.g., MEDLINE and Embase). Finally, meta-analyses were replicated for each of these scenarios, leaving out studies that would not have been picked up in the various abbreviated search strategies. 

Searching only one database led to a loss of at least one trial in half of the reviews, and a loss of two trials in one-quarter of them. As may be expected, the use of additional databases reduced the loss of information. Overall, however, the direction and significance of the resulting effect estimates remained unchanged in a majority of the cases, as shown in Figure 1 from the paper, below.


The use of abbreviated searches did, however, introduce some amount of imprecision, typically increasing standard error by around 1.02 to 1.06-fold. The inclusion of multiple versus a single database did not clearly appear to improve precision compared to a comprehensive search.

The authors note that these findings are particularly applicable to authors of potential rapid reviews and guidelines, where a consideration of trade-offs between speed and thoroughness is of great importance. Rapid reviewers should be aware that limiting search strategy may change the direction of an effect estimate or render an effect estimate uncalculable in up to one in seven instances, but this should be weighed against the benefits of a quicker time to the dissemination of findings, especially during emergent health crises where time is of the essence.

Ewald, H., Klerings, I., Wagner, G., Heise, T.L., Dobrescu, A.I., Armijo-Olivo, S., ... & Hemkens, L.G. (2020). Abbreviated and comprehensive literature searches led to identical or very similar effect estimates: A meta-epidemiological study. J Clin Epidemiol 128:1-12.

Manuscript available from publisher's website here.  















Thursday, September 24, 2020

Pre-Print of PRISMA 2020 Updated Reporting Guidelines Released

Since their publication in 2009, the PRISMA guidelines have become the standard for reporting in systematic reviews and meta-analyses. Now, 11 years later, the PRISMA checklist has received a facelift for 2020 that incorporates the methodological advances that have taken place over the intervening years.

In a recently released pre-print, Page and colleagues describe their approach to designing the new and improved PRISMA. Sixty reporting documents were reviewed to identify any new items deserving of consideration and 110 systematic review methodologists and journal editors were surveyed for feedback. The new PRISMA 2020 draft was then developed based on discussion at an in-person meeting and iteratively revised based on co-author input and a sample of 15 experts.



The result is an expanded, 27-item checklist replete with elaboration of the purpose for each item, a sub-checklist specifically for reporting within the abstract, and revised flow diagram templates for both original and updated systematic reviews. Here are some of the major changes and additions to be aware of:

  • Recommendation to present search strategies for all databases instead of just one.
  • Recommendation that authors list "near-misses," or studies that met many but not all inclusion criteria, in the results section.
  • Recommendation to assess certainty of synthesized evidence.
  • New item for declaration of Conflicts of Interest.
  • New item to indicate whether data, analytic code, or other materials have been made publicly available.
Page, M., McKenzie, J., Bossuyt, P., Boutron, I., Hoffmann, T., Mulrow, C., ... & Moher, D. (2020). The PRISMA 2020 Statement: An updated guideline for reporting systematic reviews. 

Pre-print available from MetaArXiv here. 

Wednesday, September 2, 2020

A New Tool for Assessing the Credibility of Effect Modification Cometh: Introducing the ICEMAN

Effect modification goes by many other names: “subgroup effect,” “statistical interaction,” and “moderation,” to name a few. Regardless of what it’s called, the existence of effect modification in the context of an individual study means that the effect of an intervention varies between individuals based on an attribute such as age, sex, or severity of underlying disease. Similarly, a systematic review may aim to identify effect modification between individual studies based on their setting, year of publication, or methodological differences (often called a “subgroup analysis”).

As many as one-quarter of randomized controlled trials (RCTs) and meta-analyses examine their findings for potential evidence of effect modification, according to a paper by Schandelmaier and colleagues published in the latest edition of CMAJ. However, it is not uncommon for claims of effect modification to be later proved spurious, which may negatively affect the quality of care in those subgroups of patients. Potential sources of these claims range from simple random chance to issues with selective reporting and misguided application of statistical analyses.
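One reason chance alone can produce apparent effect modification is that the difference between two subgroup estimates is itself a noisy quantity. As a minimal sketch (this is the standard z-test for interaction, not part of ICEMAN, and the numbers below are hypothetical):

```python
import math

def interaction_z(est1, se1, est2, se2):
    """Two-sided z-test for the difference between two independent
    subgroup estimates on an additive scale (e.g., log odds ratios)."""
    diff = est1 - est2
    se_diff = math.sqrt(se1**2 + se2**2)
    z = diff / se_diff
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return z, p

# Hypothetical subgroups: log OR of -0.5 (SE 0.15) in one subgroup
# versus -0.1 (SE 0.18) in the other.
z, p = interaction_z(-0.5, 0.15, -0.1, 0.18)  # roughly z ≈ -1.7, p ≈ 0.09
```

A seemingly large gap between subgroup estimates can thus fall well short of conventional significance, which is exactly why structured credibility assessment is needed before acting on a subgroup claim.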


Click to enlarge.


In “Development of the Instrument to assess the Credibility of Effect Modification Analyses (ICEMAN) in randomized controlled trials and meta-analyses,” the authors present a novel tool for evaluating the credibility of a potential effect modifier. While several sets of criteria have been developed for this purpose in the past, ICEMAN is the first to be based on a rigorous development process and refined through formal user testing.

 

First, the authors conducted a systematic survey of the literature to ensure a comprehensive understanding of previously proposed criteria for evaluating effect modification; thirty sets were identified, none of which adequately reflected the authors’ conceptual framework. Second, an expert panel of 15 members was selected randomly from a list of 40 candidates identified through the systematic survey. These experts pared an initial list of 36 candidate criteria down to 20 required and eight optional items. Finally, after developing a manual for its use, the authors tested the instrument, using a semi-structured interview technique, among a diverse group of 17 potential users, including Cochrane review authors, RCT authors, and journal editors.


Schandelmaier, S., Briel, M., Varadhan, R., Schmid, C.H., Devasenapathy, N., Hayward, R.A., Gagnier, J., ... & Guyatt, G.H. (2020). Development of the Instrument to assess the Credibility of Effect Modification Analyses (ICEMAN) in randomized controlled trials and meta-analyses. CMAJ 192:E901-906.


Manuscript available at the publisher's website here.

Monday, July 20, 2020

New Review Provides Insight into Unique Challenges of Continuous and TTE Outcomes, Potential Solutions

Health Technology Assessments (HTAs) and guidelines often meta-analyze non-binary outcomes, such as continuous and time-to-event outcomes, in order to elucidate the observed effect of a health intervention. However, these types of outcomes may require more sophisticated analysis and modeling techniques, making them more difficult to synthesize for authors with limited statistical knowledge or resources.

A newly published review by Freeman and colleagues aimed to describe the use and presentation of these outcomes, and in doing so, identify potential challenges and facilitators to improving their application in future publications. The study analyzed a total of 25 technology appraisals and 15 guidelines from the UK’s National Institute for Health and Care Excellence (NICE) and seven HTA reports from the National Institute for Health Research (NIHR), for a total of 47 documents using meta-analyses (MA), network meta-analyses (NMA), or a combination of the two.

About half (51%) of the items reported at least one continuous outcome, while just over half (55%) reported at least one time-to-event outcome. Continuous outcomes were most commonly presented as a mean difference (MD). The most commonly used time-to-event outcomes were overall and progression-free survival, presented as hazard ratios. Notably, no articles reported the methods used to handle multiplicity of either continuous or time-to-event outcomes. The existence of multiple time-points was largely handled by presenting a separate meta-analysis for each relevant time-point.
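As a concrete illustration of the hazard-ratio case: pooling is conventionally done on the log scale with inverse-variance weights. Below is a minimal fixed-effect sketch using hypothetical trial data (the actual methods in the reviewed documents varied, and a random-effects model is often more appropriate in practice):

```python
import math

# Hypothetical trials: (hazard ratio, lower 95% CI, upper 95% CI).
trials = [(0.78, 0.62, 0.98), (0.85, 0.70, 1.03), (0.72, 0.55, 0.94)]

log_hrs, weights = [], []
for hr, lo, hi in trials:
    se = (math.log(hi) - math.log(lo)) / (2 * 1.96)  # back-calculate SE from the CI
    log_hrs.append(math.log(hr))
    weights.append(1.0 / se**2)  # inverse-variance weight

pooled_log = sum(w * y for w, y in zip(weights, log_hrs)) / sum(weights)
pooled_se = math.sqrt(1.0 / sum(weights))
pooled_hr = math.exp(pooled_log)
ci = (math.exp(pooled_log - 1.96 * pooled_se),
      math.exp(pooled_log + 1.96 * pooled_se))
```

Note that everything happens on the log scale and is only exponentiated back to a hazard ratio at the end; this is also why a separate analysis is needed for each time-point when proportional hazards cannot be assumed.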

Most of the analyzed documents provided a decision model based on continuous or time-to-event outcomes, but many of these models drew on the results of only a single trial, despite the fact that meta-analyses had been undertaken.

Reporting of Decision Models Across Publications Using Continuous Outcomes.
Reporting of Decision Models Across Publications Using Time-to-Event Outcomes.

The authors present a list of the key challenges faced by authors of meta-analyses using these outcomes: continuous outcomes reported on different scales, multiplicity arising from related outcomes within the same study or from multiple time-points, and nonproportional hazards (hazards that change over time) in time-to-event outcomes. They present the following suggestions for better managing these issues:
  • Increased availability of statistical expertise on MA and NMA teams.
  • Development of user-friendly software that allows users to approach more complex statistical techniques – for instance, those that allow multiple outcomes from the same study to be analyzed simultaneously – with the same ease and accessibility as point-and-click software such as RevMan.
  • Increased reporting of outcomes within individual trials, as well as the reporting of individual patient data by trial authors.
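For the first of the key challenges – continuous outcomes reported on different scales – a common remedy is to convert each trial's result to a standardized mean difference before pooling. A minimal sketch with hypothetical data, using Hedges' g (the small-sample-corrected variant):

```python
import math

def hedges_g(m1, sd1, n1, m2, sd2, n2):
    """Standardized mean difference with Hedges' small-sample correction."""
    df = n1 + n2 - 2
    s_pooled = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df)
    d = (m1 - m2) / s_pooled  # Cohen's d
    j = 1 - 3 / (4 * df - 1)  # small-sample correction factor
    return j * d

# Two hypothetical pain trials reported on different scales:
g1 = hedges_g(3.1, 1.8, 40, 4.0, 1.9, 42)      # 0-10 visual analogue scale
g2 = hedges_g(32.0, 17.5, 55, 41.0, 18.2, 53)  # 0-100 numeric rating scale
# Both land near -0.5 despite the different raw scales,
# so they can sit on the same forest plot.
```

Dividing by the pooled standard deviation expresses each effect in standard-deviation units, which is what makes the two scales comparable, at the cost of a less intuitive unit for clinicians.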
Freeman, S.C., Sutton, A.J., & Cooper, N.J. (2020). Update of methodological advances for synthesis of continuous and time-to-event outcomes would maximize use of evidence base. J Clin Epidemiol 124:94-105.

Manuscript available from publisher's website here. 


Friday, May 1, 2020

Grey Matters: An Introduction to the Grey Literature and Where to Find It

Within the methods section of many a systematic review, it is common to come across the term "grey literature." Put plainly, grey literature comprises pieces of evidence that are not formally published in a book or peer-reviewed journal article. 

Examples of grey literature that can be valuable to a systematic review include:
  • conference abstracts and proceedings
  • clinical study reports
  • dissertations and theses
  • journal preprints

Searching the "grey lit" has several important benefits:
  • It expands the reach of a systematic review beyond the scope of the databases mined by a search, increasing the chance of finding pieces of evidence that may be helpful to the final synthesis of data.
  • It helps reduce the impact of potential publication bias on the findings of a review.
  • It keeps the review current by including upcoming data from recent conferences, doctoral work, and other yet-to-be-published sources.

Ideally, a search of the grey literature should be used in tandem with other forms of hand-searching, including searching the citations of included articles and of well-known reviews on similar topics.

Where to Find Grey Literature

Below are some resources that list helpful links for exploring the grey literature: