Overview of Inference about ROC Curve in Medical Diagnosis

doi:10.15406/bbij.2014.01.00013

eISSN: 2378-315X

Biometrics & Biostatistics International Journal

Editorial Volume 1 Issue 3

Jingjing Yin

Department of Biostatistics, Georgia Southern University, USA

Correspondence: Jingjing Yin, Department of Biostatistics, Jiann-Ping Hsu College of Public Health, Georgia Southern University, Hendricks Hall 1007, P.O. Box 8015, Statesboro, GA 30460-8015, USA

Received: November 29, 2014 | Published: December 1, 2014

Citation: Yin J. Overview of Inference about ROC Curve in Medical Diagnosis. Biom Biostat Int J. 2014;1(3):61‒62. DOI: 10.15406/bbij.2014.01.00013

Editorial

Medical diagnosis aims to identify diseased individuals by performing a diagnostic test based on measurements of one or more biomarkers. Biomarkers are measured on either a discrete or a continuous scale, and continuous biomarkers are used more often in medical practice. This article introduces the most popular tool for evaluating continuous biomarkers: the Receiver Operating Characteristic (ROC) curve.

For diagnostic tests with binary disease status, each subject is categorized as either healthy or diseased. A perfectly accurate diagnostic test would classify all truly diseased individuals as diseased and all healthy individuals as non-diseased. However, this rarely happens because the biomarker distributions of the diseased and healthy populations usually overlap. There are two types of diagnostic error: a false negative (FN), which occurs when a diseased individual is classified as healthy, and a false positive (FP), which occurs when a healthy individual is classified as diseased. Correctly identifying a diseased subject as diseased is called a true positive (TP), and correctly identifying a healthy subject as non-diseased is called a true negative (TN). The proportion of diseased subjects who test positive, the true positive rate (TPR), is commonly referred to as “sensitivity”, and the proportion of healthy subjects who test negative, the true negative rate (TNR), as “specificity”. Sensitivity and specificity characterize diagnostic accuracy in the diseased and healthy populations, respectively.
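
As a small illustration of these definitions, the sketch below computes sensitivity and specificity from the four classification counts. The function name and the example counts are hypothetical and chosen only for illustration.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from the four classification counts."""
    n_diseased = tp + fn   # all truly diseased subjects
    n_healthy = tn + fp    # all truly healthy subjects
    if n_diseased == 0 or n_healthy == 0:
        raise ValueError("need at least one diseased and one healthy subject")
    sensitivity = tp / n_diseased   # true positive rate (TPR)
    specificity = tn / n_healthy    # true negative rate (TNR)
    return sensitivity, specificity


# Hypothetical counts: 100 diseased and 100 healthy subjects.
sens, spec = sensitivity_specificity(tp=80, fn=20, tn=90, fp=10)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

With these counts, sensitivity is 80/100 and specificity is 90/100; each rate is computed within its own population, which is why neither depends on disease prevalence in the sample.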

To construct a diagnostic test for binary disease status from a continuous biomarker, a diagnostic threshold is needed. At a pre-specified threshold value, the pair of sensitivity and specificity is computed to evaluate test performance. Assuming that larger biomarker values indicate disease, sensitivity increases and specificity decreases as the threshold is lowered. Therefore, a compromise between sensitivity and specificity is necessary when assessing the discriminatory accuracy of the test. A popular way to evaluate test performance over all possible threshold values is a graphical summary of diagnostic accuracy: plotting the pairs (1-specificity, sensitivity) for all possible threshold values to form a curve. This curve is known as the Receiver Operating Characteristic (ROC) curve. The ROC curve and its associated summary statistics are very useful in diagnostic medicine for evaluating the discriminatory ability of biomarkers or diagnostic tests with continuous measurements. Extensive statistical research has been done in this field, and several reviews of statistical methods involving ROC curves are available.^1–4
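
The construction described above can be sketched as an empirical ROC curve. This is a minimal illustration, not a method from the cited reviews; it assumes that larger biomarker values indicate disease and that a subject tests positive when the value is at or above the threshold. The function name and the sample values are hypothetical.

```python
def empirical_roc(healthy, diseased):
    """Return empirical ROC points (1 - specificity, sensitivity).

    Thresholds are swept from the largest observed value down to the
    smallest, so the points run from (0, 0) to (1, 1).
    """
    thresholds = sorted(set(healthy) | set(diseased), reverse=True)
    points = [(0.0, 0.0)]  # threshold above every observation: nobody tests positive
    for c in thresholds:
        sensitivity = sum(x >= c for x in diseased) / len(diseased)
        false_positive_rate = sum(x >= c for x in healthy) / len(healthy)
        points.append((false_positive_rate, sensitivity))
    return points


# Hypothetical biomarker measurements for two small samples.
healthy = [1.0, 2.0, 3.0, 4.0]
diseased = [2.5, 3.5, 4.5, 5.5]
for fpr, tpr in empirical_roc(healthy, diseased):
    print(f"1 - specificity = {fpr:.2f}, sensitivity = {tpr:.2f}")
```

Lowering the threshold can only add positive calls in both groups, so both coordinates are non-decreasing along the returned list, which reflects the trade-off between sensitivity and specificity.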

The ROC curve can be expressed in two ways: as a point set or as a function. It can be viewed as the set of points (false positive rate, sensitivity) obtained at each diagnostic threshold value. Alternatively, it can be viewed as a function of the false positive rate (i.e. 1-specificity). The second expression is used more often, and it amounts to regarding sensitivity as a function of 1-specificity. Therefore, the confidence interval (CI) for the ROC curve is the same as the CI of sensitivity at a given value of specificity.^5–9 Other situations require inference on the whole ROC curve or on part of it; in most cases the range of high specificity (e.g. 80% to 95%) is of primary concern. It is therefore also of interest to construct a confidence band (CB) for the whole ROC curve or for the portion over a given range of specificity.^10–15 The CI of the ROC curve differs from the CB: the CI gives a likely range of sensitivity at a fixed value of specificity, while the CB gives a curved strip that covers the whole ROC curve, or the partial ROC curve over a given range of specificity, and maintains the type I error rate simultaneously for all values of specificity in that range.
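
To make the pointwise CI concrete, the following sketch estimates sensitivity at a fixed specificity and attaches a percentile bootstrap interval. The bootstrap is used here only as one simple, generic option and is not claimed to be the method of the cited references. The function names, the simulated data and the choices of 2000 resamples and a 95% level are assumptions made for illustration; larger values are again assumed to indicate disease.

```python
import math
import random


def sensitivity_at_specificity(healthy, diseased, specificity):
    """Empirical sensitivity at the threshold attaining the target specificity."""
    ordered = sorted(healthy)
    # Smallest order statistic c with at least `specificity` of healthy values <= c.
    k = math.ceil(specificity * len(ordered) - 1e-9)
    k = min(max(k, 1), len(ordered))
    threshold = ordered[k - 1]
    # A subject tests positive when the value exceeds the threshold.
    return sum(x > threshold for x in diseased) / len(diseased)


def bootstrap_ci(healthy, diseased, specificity, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap CI for sensitivity at a fixed specificity."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        # Resample each group separately, with replacement.
        h = rng.choices(healthy, k=len(healthy))
        d = rng.choices(diseased, k=len(diseased))
        estimates.append(sensitivity_at_specificity(h, d, specificity))
    estimates.sort()
    alpha = 1.0 - level
    lower = estimates[int(alpha / 2 * n_boot)]
    upper = estimates[min(int((1 - alpha / 2) * n_boot), n_boot - 1)]
    return lower, upper


# Simulated (hypothetical) data: healthy ~ N(0, 1), diseased ~ N(1.5, 1).
rng = random.Random(1)
healthy = [rng.gauss(0.0, 1.0) for _ in range(50)]
diseased = [rng.gauss(1.5, 1.0) for _ in range(50)]
estimate = sensitivity_at_specificity(healthy, diseased, 0.90)
lower, upper = bootstrap_ci(healthy, diseased, 0.90)
print(f"sensitivity at 90% specificity: {estimate:.2f} ({lower:.2f}, {upper:.2f})")
```

Because the threshold is re-estimated in every resample, the interval reflects uncertainty in both the threshold and the sensitivity. It is a pointwise interval at a single specificity; repeating it over a grid of specificities would not by itself give a band with simultaneous coverage.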

When the ROC curve is considered as a point set of sensitivity and specificity, and a value of the diagnostic threshold is given or estimated, we can also construct a confidence region (CR) for sensitivity and specificity.^16–17 The CR and the CI of the ROC curve may be confused: the CI of the ROC curve gives a range of plausible values of sensitivity at a fixed value of specificity, while the CR of (sensitivity, specificity) at a given diagnostic threshold defines an elliptical area that is likely to cover the true values of (sensitivity, specificity). Similarly, an analogue of the CB based on the CR of (sensitivity, specificity) would be a tube-like volume linking an infinite number of elliptical areas together, which maintains a specified type I error rate simultaneously over a given range of threshold values. Hence, for inference about the whole or partial ROC curve, a confidence volume around the sample ROC curve is an alternative to the CB of the ROC curve.
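
One simple way to obtain an elliptical region of this kind is a Wald-type ellipse, sketched below. It is an illustrative assumption, not necessarily the construction used in the cited references: it treats the threshold as fixed, the diseased and healthy samples as independent, and both estimated proportions as approximately normal. The function names and the example numbers are hypothetical.

```python
import math


def confidence_ellipse(sens_hat, n_diseased, spec_hat, n_healthy, level=0.95):
    """Semi-axes of a Wald-type confidence ellipse for (sensitivity, specificity)."""
    var_sens = sens_hat * (1.0 - sens_hat) / n_diseased
    var_spec = spec_hat * (1.0 - spec_hat) / n_healthy
    if var_sens == 0.0 or var_spec == 0.0:
        raise ValueError("estimates of exactly 0 or 1 give a degenerate ellipse")
    # Quantile of the chi-square distribution with 2 degrees of freedom,
    # which has the closed form -2 * ln(1 - level).
    radius_sq = -2.0 * math.log(1.0 - level)
    return math.sqrt(radius_sq * var_sens), math.sqrt(radius_sq * var_spec)


def in_region(sens, spec, sens_hat, n_diseased, spec_hat, n_healthy, level=0.95):
    """Check whether a candidate (sensitivity, specificity) lies inside the ellipse."""
    axis_sens, axis_spec = confidence_ellipse(
        sens_hat, n_diseased, spec_hat, n_healthy, level
    )
    distance = ((sens - sens_hat) / axis_sens) ** 2 + ((spec - spec_hat) / axis_spec) ** 2
    return distance <= 1.0


# Hypothetical estimates at one threshold, from 100 diseased and 100 healthy subjects.
axes = confidence_ellipse(0.80, 100, 0.90, 100)
print(f"semi-axes: sensitivity {axes[0]:.3f}, specificity {axes[1]:.3f}")
print(in_region(0.75, 0.92, 0.80, 100, 0.90, 100))
```

Unlike the CI, which varies sensitivity alone at a fixed specificity, the ellipse treats both coordinates as uncertain at the same threshold. Under the independence assumption its axes are parallel to the sensitivity and specificity axes.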