Editor,—Dr Collinson suggests that it is time that cardiologists use the ROC (receiver operating characteristic) curve and that it “avoids the pitfalls of sensitivity and specificity”.1 While the ROC curve is undoubtedly useful for describing the performance of a test and for comparing tests, I find the claim a little surprising, as the ROC curve is simply a series of sensitivities and specificities, obtained as the cut off sweeps from minimum sensitivity to minimum specificity.
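For readers who wish to see this construction concretely, the sweep can be sketched in a few lines of Python. The marker values and disease labels below are invented for illustration only; they are not taken from Collinson’s data.

```python
# Sketch: a ROC curve is nothing more than the set of
# (1 - specificity, sensitivity) pairs produced by sweeping the
# decision cut-off across all observed test values.

def roc_points(values, labels):
    """Return (1 - specificity, sensitivity) for each candidate cut-off.

    values: test results (higher = more suggestive of disease)
    labels: 1 for diseased, 0 for healthy
    """
    pos = sum(labels)
    neg = len(labels) - pos
    points = []
    for cut in sorted(set(values)):
        tp = sum(1 for v, y in zip(values, labels) if v >= cut and y == 1)
        fp = sum(1 for v, y in zip(values, labels) if v >= cut and y == 0)
        sens = tp / pos           # sensitivity at this cut-off
        spec = 1 - fp / neg       # specificity at this cut-off
        points.append((1 - spec, sens))
    return points

# Hypothetical data: eight patients, four diseased.
values = [2, 3, 4, 5, 6, 7, 8, 9]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
for fpr, sens in roc_points(values, labels):
    print(f"1-spec={fpr:.2f}  sens={sens:.2f}")
```

At the lowest cut-off every patient tests positive (sensitivity 1, specificity 0); at the highest, almost none do — exactly the family of sensitivity–specificity pairs described above.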
Second, it is recommended that the “point of maximum curvature” is chosen as the optimum trade off between sensitivity and specificity. This is true only if the costs of false positives and false negatives are equal, which is by no means always the case. Next, the point of maximum curvature needs to be judged: in Collinson’s fig 1 (for creatine kinase (CK) isoenzyme MB) the curve turns quite sharply at approximately (0.05, 0.87) and again at (0.17, 0.98), but between the two points the slope is fairly constant. The closest the ROC curve gets to (0, 1), the top left hand corner, is approximately (0.15, 0.95). Depending on the relative importance for clinical decision making of sensitivity and specificity, one could choose any of these three points. These then need converting back, via table 2, to CKMB cut offs of approximately 12 (sensitivity more important), 16 (sensitivity and specificity of equal importance), and 26 (specificity more important). For myoglobin the range of optima (Collinson’s fig 2) is wide, from approximately (0.12, 0.64) at the first shoulder to (0.55, 0.94) at the second.
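The dependence of the “optimum” on the relative costs can be made explicit with a small sketch. The three candidate points and cut-offs below are entirely hypothetical (they are not read from Collinson’s figures), and the weights w are likewise illustrative assumptions.

```python
import math

# Hypothetical ROC points as (1 - specificity, sensitivity, cut-off).
points = [(0.10, 0.80, 30), (0.18, 0.92, 20), (0.40, 0.97, 10)]

# Criterion 1: the point closest to the top left corner (0, 1),
# which implicitly treats false positives and false negatives
# as equally costly.
closest = min(points, key=lambda p: math.hypot(p[0], 1 - p[1]))

# Criterion 2: a weighted Youden-style index, w * sensitivity - (1 - specificity);
# w > 1 penalises missed cases more heavily, w < 1 penalises false alarms.
def weighted_youden(p, w):
    fpr, sens, _ = p
    return w * sens - fpr

favour_sens = max(points, key=lambda p: weighted_youden(p, w=5.0))
favour_spec = max(points, key=lambda p: weighted_youden(p, w=0.3))

print("equal costs       -> cut off", closest[2])
print("sensitivity vital -> cut off", favour_sens[2])
print("specificity vital -> cut off", favour_spec[2])
```

Changing only the assumed cost ratio moves the chosen cut-off across the whole range of candidates, which is the substance of the objection above.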
Finally, as the ROC curve is simply sensitivity plotted against specificity (a series of sensitivity–specificity pairs), it is difficult to see how it “minimises the prevalence problem”. Sensitivity and specificity are features of the test itself (and the ROC curve helps in the choice of cut off), but the predictive values (positive and negative) depend on prevalence.
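This last point can be verified directly from Bayes’ theorem: with sensitivity and specificity held fixed, the predictive values still shift with prevalence. The sensitivity, specificity, and prevalences below are illustrative assumptions, not figures from the article.

```python
# Sketch: predictive values from sensitivity, specificity, and
# prevalence via Bayes' theorem.

def predictive_values(sens, spec, prev):
    """Return (positive predictive value, negative predictive value)."""
    ppv = sens * prev / (sens * prev + (1 - spec) * (1 - prev))
    npv = spec * (1 - prev) / (spec * (1 - prev) + (1 - sens) * prev)
    return ppv, npv

# The same hypothetical test (sensitivity 0.95, specificity 0.90)
# in populations of differing prevalence.
for prev in (0.02, 0.20, 0.50):
    ppv, npv = predictive_values(sens=0.95, spec=0.90, prev=prev)
    print(f"prevalence={prev:.2f}  PPV={ppv:.2f}  NPV={npv:.2f}")
```

At 2% prevalence most positive results are false positives despite the test’s good sensitivity and specificity; at 50% prevalence the same test yields a high positive predictive value.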
This letter was shown to the authors, who reply as follows:
Dr West has read my article with a distinct lack of enthusiasm for the ROC curve. Clearly he prefers sensitivity and specificity and regards my brief (and illustrative) article as the definitive statement on the subject. This, while flattering, is clearly not the case and deserves some comment.
His opening statement misquotes the last paragraph, where I said “largely avoids the pitfalls of sensitivity and specificity”. If Dr West is of the view that a single sensitivity and specificity calculation is better than a ROC curve then I must disagree. A ROC curve is much better than a single sensitivity and specificity calculation, in which the cut off can be arbitrarily selected to maximise one (apparently) desirable parameter, largely for the reasons he illustrates.
With regard to the second paragraph, Dr West makes some excellent points, which well illustrate the sensitivity and specificity problem. There is a need for caution in his interpretation of a dataset chosen simply to illustrate what a ROC curve is and how it is derived. The issue of the “cost” of false positives versus false negatives is of great significance to any clinical diagnostic test, but in routine clinical practice in real patient groups (as opposed to population based studies) the objective is to maximise both sensitivity and specificity for individual patient diagnosis. The points that he raises are more fully discussed in the excellent review paper by Hendersen.1-1
In respect of his final point, I would reiterate the last paragraph of the article: ROC curves are better than single sensitivity–specificity calculations but cannot abolish the prevalence problem. In that I concur with Dr West.