Recently, we reviewed a paper describing how evidence on the downstream consequences of screening can be linked to evidence of test accuracy through formal and informal modeling. A new paper published earlier this year provides guidance on evaluating the certainty of evidence for diagnostic accuracy. The resulting judgment communicates our certainty that a test’s true accuracy lies within a given range.
Ranges for determining the certainty of evidence of test accuracy may be either fully or partially contextualized (meaning the range takes into account some or all of the possible effects of a test strategy, and is based on a value judgment of the relative importance of outcomes) or non-contextualized (meaning the range only takes into account the accuracy of the test, without consideration of the relative implications of false positives or false negatives).
Non-contextualized judgments assume that outside of differences in accuracy, everything else about two test strategies will have the same impact on outcomes; thus, certainty of evidence is judged based solely on the accuracy data. Contextualized judgments, on the other hand, also take into account the downstream consequences of a test’s accuracy – particularly the potential effects of false positives or negatives. Typically, non-contextualized or partially contextualized ratings are used in systematic reviews or health technology assessments (HTAs), whereas fully contextualized ratings should be used in the formation of guideline recommendations.
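To make the idea of downstream consequences concrete, accuracy estimates can be translated into the expected numbers of true and false results per 1,000 people tested. The sketch below is only an illustration: the function name `results_per_1000` and the sensitivity, specificity, and prevalence values are hypothetical and are not taken from the paper.

```python
def results_per_1000(sensitivity, specificity, prevalence):
    """Expected test results per 1,000 people tested (hypothetical helper)."""
    diseased = 1000 * prevalence          # people who have the condition
    healthy = 1000 - diseased             # people who do not
    true_pos = diseased * sensitivity     # correctly detected cases
    true_neg = healthy * specificity      # correctly reassured non-cases
    return {
        "true_positives": round(true_pos, 1),
        "false_negatives": round(diseased - true_pos, 1),
        "false_positives": round(healthy - true_neg, 1),
        "true_negatives": round(true_neg, 1),
    }


# Illustrative inputs only: 90% sensitivity, 80% specificity, 10% prevalence.
example = results_per_1000(0.90, 0.80, 0.10)
print(example)
```

A contextualized judgment would go on to weigh what happens to the people in each of these groups, whereas a non-contextualized judgment stops at the accuracy estimates themselves.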
Sources of ranges for test accuracy with varying levels of contextualization include:
· Non-contextualized (systematic review or HTA)
o Confidence interval: certainty that the true sensitivity or specificity lies within the confidence interval(s) of the tests
- Imprecision is not rated separately, because the confidence interval itself defines the range
o Direction of effect: certainty that there is a true difference in sensitivity or specificity between two test strategies
- Requires a determination of what would make a meaningful difference in accuracy
· Partially contextualized (systematic review or HTA)
o Specified magnitude: determines whether a difference in accuracy between tests is trivial, small, moderate, or large.
- The acceptable magnitude of difference will be based at least partially on the importance of the downstream consequences of false positives and negatives
Example of a partly contextualized diagram of downstream consequences of screening for cervical dysplasia using a screen-treat strategy.
· Fully contextualized (guideline recommendations)
o Rates the certainty of a test’s sensitivity and specificity based on whether the overall balance between benefits and harms would differ from one end of the range to the other.
- Ranges are determined by first considering all important and critical downstream consequences of testing.
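The first two kinds of ranges in the list above can be sketched in a few lines. This is a rough illustration under stated assumptions, not the paper's method: the Wilson score interval is just one common way to compute a confidence interval for a proportion, and the helper names and the cutoffs (0.02, 0.05, 0.10) separating trivial, small, moderate, and large differences are hypothetical placeholders. In practice those cutoffs would come from a judgment about the consequences of false positives and negatives.

```python
from math import sqrt


def wilson_interval(events, total, z=1.96):
    """Wilson score interval for a proportion such as sensitivity."""
    p = events / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half


def bands_spanned(low, high, cutoffs=(0.02, 0.05, 0.10)):
    """Magnitude bands touched by an interval for a difference in accuracy.

    The cutoffs are hypothetical. Only the size of the difference is
    considered here, not which test strategy it favours.
    """
    labels = ["trivial", "small", "moderate", "large"]

    def band(x):
        # Count how many cutoffs the absolute difference reaches.
        return sum(abs(x) >= c for c in cutoffs)

    lo_band, hi_band = sorted((band(low), band(high)))
    if low < 0 < high:
        lo_band = 0  # an interval crossing zero includes a trivial difference
    return labels[lo_band:hi_band + 1]


# Non-contextualized: the interval itself is the range (90 of 100 cases detected).
sens_low, sens_high = wilson_interval(90, 100)
print(round(sens_low, 3), round(sens_high, 3))

# Partially contextualized: does the interval sit in one band or span several?
print(bands_spanned(0.06, 0.08))
print(bands_spanned(0.03, 0.12))
```

An interval that falls within a single band supports a more certain statement about the magnitude of the difference than one that spans several bands.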
Hultcrantz M, Mustafa RA, Leeflang MMG, Lavergne V, Estrada-Orozco K, Ansari MT, Izcovich A, et al. Defining ranges for certainty ratings of diagnostic accuracy: A GRADE concept paper. J Clin Epidemiol. 2020;117:138-148.
Manuscript available on the publisher's site.