|Title: ||Influencing clinicians and healthcare managers: can ROC be more persuasive?|
|Authors: ||Taylor-Phillips, S.|
Wallis, Matthew G.
Gale, Alastair G.
|Keywords: ||ROC methodology|
Observer performance evaluation
|Issue Date: ||2010|
|Publisher: ||© 2010 Society of Photo-Optical Instrumentation Engineers|
|Citation: ||TAYLOR-PHILLIPS, S. ... et al., 2010. Influencing clinicians and healthcare managers: can ROC be more persuasive? IN: Medical Imaging 2010: Image Perception, Observer Performance, and Technology Assessment, edited by David J. Manning, Craig K. Abbey, Proc. SPIE 7627, 76270X (2010).|
|Abstract: ||Receiver Operating Characteristic analysis provides a reliable and cost-effective performance measurement tool, without
using full clinical trials. However, when ROC analysis shows that performance is statistically superior in one condition
than in another, it is difficult to relate this result to effects in practice, or even to determine whether it is clinically
significant. In this paper we present two concurrent analyses: using ROC methods alongside single threshold recall rate
data, and suggest that reporting both provides complementary data. Four mammographers read 160 difficult cases (41%
malignant) twice, with and without prior mammograms. Lesion location and probability of malignancy were reported for
each case and analyzed using JAFROC. Concurrently each participant chose recall or return to screen for each case.
JAFROC analysis showed that the presence of prior mammograms improved performance (p<.05). Single threshold data
showed a trend towards a 26% increase in the number of false positive recalls without prior mammograms (p=.056). If
this trend were present throughout the NHS Breast Screening Programme then discarding prior mammograms would
correspond to an increase in recall rate from 4.6% to 5.3%, and 12,414 extra women recalled annually for assessment.
Whilst ROC methods account for all possible thresholds of recall and have higher power, providing a single threshold
example of false positive, false negative, and recall rates when reporting results could be more influential for clinicians.
This paper discusses whether this is a useful additional method of presenting data, or whether it is misleading and
|Description: ||Copyright 2010 Society of Photo-Optical Instrumentation Engineers. One print or electronic copy may be made for personal use only. Systematic electronic or print reproduction and distribution, duplication of any material in this paper for a fee or for commercial purposes, or modification of the content of the paper are prohibited. This paper can also be found at: http://dx.doi.org/10.1117/12.843784|
|Appears in Collections:||Conference Papers and Presentations (Computer Science)|