OVERVIEW OF DIAGNOSTIC TESTS

Biostatistics & Hands-on Practices on Medical Data Using SPSS, ILBS

SUMAN KUMAR

18 August 2017

Example dataset

Acute Subarachnoid Hemorrhage

gos6 outcome gender age wfns s100b ndka
1 5 Good Female 42 1 -2.040 1.102
2 5 Good Female 37 1 -1.966 2.145
3 5 Good Female 42 1 -2.303 2.091
4 5 Good Female 27 1 -3.219 2.344
5 1 Poor Female 42 3 -2.040 2.856
6 1 Poor Male 48 2 -2.303 2.546

(The s100b and ndka values appear to be log-transformed.)

Overview of Diagnostic Tests

Classes

Diagnostic Tests as Classifiers

Diagnostic Tests change the Probability Distribution


\[ P(Class \mid Test) = \frac{P(Test \mid Class) \times P(Class)}{P(Test)} \]

This is Bayes' Theorem, the crux of all diagnostic tests.
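The Bayes update can be written directly in terms of sensitivity, specificity, and the pretest probability. A minimal Python sketch (the slides use R, so this is purely illustrative; the sens/spec values are those for s100b at cut-off -1, and the pretest probability 41/113 is the Poor column total of the crosstab):

```python
def post_test_probability(pretest_prob, sensitivity, specificity, test_positive=True):
    """Bayes' theorem for a binary test: update P(disease) after the result."""
    if test_positive:
        # P(D | T+) = sens*p / (sens*p + (1-spec)*(1-p))
        num = sensitivity * pretest_prob
        den = num + (1 - specificity) * (1 - pretest_prob)
    else:
        # P(D | T-) = (1-sens)*p / ((1-sens)*p + spec*(1-p))
        num = (1 - sensitivity) * pretest_prob
        den = num + specificity * (1 - pretest_prob)
    return num / den

# s100b at cut-off -1: sens = 0.4146, spec = 0.875; pretest P(Poor) = 41/113
p_after_positive = post_test_probability(41 / 113, 0.4146, 0.875)
p_after_negative = post_test_probability(41 / 113, 0.4146, 0.875, test_positive=False)
```

A positive test raises the probability of a Poor outcome from about 0.36 to about 0.65, while a negative test lowers it to about 0.28.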

Chain of Diagnostic Tests

In real life, we apply a chain of diagnostic tests to assign a particular class to a patient.
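Because Bayes' theorem multiplies odds by a likelihood ratio, a chain of (conditionally independent) tests simply multiplies the running odds by each test's LR in turn. A small Python sketch with hypothetical likelihood ratios (the three LR values and the 36% pretest probability are made up for illustration):

```python
def odds(p):
    """Convert a probability to odds."""
    return p / (1 - p)

def prob(o):
    """Convert odds back to a probability."""
    return o / (1 + o)

# Hypothetical likelihood ratios of three sequential tests, each positive:
lrs = [3.3, 1.4, 2.0]

o = odds(0.36)          # illustrative pretest probability of 36%
for lr in lrs:
    o *= lr             # each test's LR multiplies the running odds
posttest = prob(o)
```

Each test's posttest odds become the next test's pretest odds, which is exactly how a diagnostic work-up accumulates evidence for a class.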

Stages of development of Diagnostic Tests

Diagnostic Test Performance

Description

s100b outcome
1 -2.040 Good
2 -1.966 Good
3 -2.303 Good
4 -3.219 Good
5 -2.040 Poor
6 -2.303 Poor
7 -0.755 Good
8 -1.833 Poor
9 -1.715 Good
10 -2.303 Good

Graphical Representation

Higher values of s100b are associated with Poor class

Decision boundary

Whichever boundary we choose, the diagnostic test commits mistakes: the Good and Poor distributions overlap, so some patients fall on the wrong side of the cut-off.

How to choose the best decision boundary (cut-off)?

Let us take the cut-off as -1

pred_outcome Good Poor
1 Pred_Good 63 24
2 Pred_Poor 9 17

(Rows are the predicted class; columns are the observed class.)
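From this crosstab, with Poor as the positive class, sensitivity and specificity follow directly. A minimal Python sketch using the four cell counts from the table above:

```python
# Crosstab at cut-off -1 (rows = predicted, columns = observed):
#              Good  Poor
# Pred_Good     63    24
# Pred_Poor      9    17
tn, fn = 63, 24   # predicted Good: true negatives, false negatives
fp, tp = 9, 17    # predicted Poor: false positives, true positives

sensitivity = tp / (tp + fn)   # 17 / 41: fraction of Poor correctly flagged
specificity = tn / (tn + fp)   # 63 / 72: fraction of Good correctly cleared
```

This reproduces the values used throughout the slides: sensitivity ≈ 0.4146 and specificity = 0.875.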

Understanding Crosstabulation

Combining performances across both classes

Relevance of Likelihood Ratios



\[ LR_{+} = \frac{Sensitivity}{1 - Specificity} \qquad LR_{-} = \frac{1 - Sensitivity}{Specificity} \]

Posttest odds of being in Class 1, given the test is positive for Class 1 = Pretest odds of being in Class 1 x Positive LR

Posttest odds of being in Class 1, given the test is negative for Class 1 = Pretest odds of being in Class 1 x Negative LR
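These likelihood ratios can be computed from sensitivity and specificity alone. A short Python sketch with the s100b values at cut-off -1 (pretest odds 41/72 are the Poor vs Good column totals of the crosstab):

```python
sens, spec = 0.4146, 0.875        # s100b at cut-off -1 (from the slide)

pos_lr = sens / (1 - spec)        # how much a positive test multiplies the odds
neg_lr = (1 - sens) / spec        # how much a negative test multiplies the odds

pretest_odds = 41 / 72            # Poor vs Good in the crosstab totals
post_odds_pos = pretest_odds * pos_lr   # odds of Poor after a positive test
post_odds_neg = pretest_odds * neg_lr   # odds of Poor after a negative test
```

This reproduces the pos_lr ≈ 3.32 and neg_lr ≈ 0.67 reported for s100b in the comparison table later in the slides.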

Combining performance of Diagnostic Test across multiple Cut Offs

Receiver Operating Characteristic (ROC)

cut_offs sensitivity specificity
1 -3.3627 0.9756 0
2 -3.1073 0.9756 0.0694
3 -2.9046 0.9756 0.1111
4 -2.1638 0.7561 0.5417
5 -2.0802 0.7317 0.5417
6 -2.0032 0.6829 0.5833
7 -1.0087 0.4146 0.875
8 -0.9296 0.4146 0.8889
9 -0.8678 0.3902 0.8889
10 -0.0958 0.0488 1
11 0.3434 0.0244 1
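Each row of this table is just the (sensitivity, specificity) pair obtained by sweeping the cut-off over the observed values. A minimal Python sketch of that sweep, applied to the 10-row s100b excerpt from the Description slide (illustrative only; the table above comes from the full dataset):

```python
def roc_points(scores, labels, positive="Poor"):
    """(cut-off, sensitivity, specificity) for the rule 'score > c => positive'."""
    pos = [s for s, l in zip(scores, labels) if l == positive]
    neg = [s for s, l in zip(scores, labels) if l != positive]
    points = []
    for c in sorted(set(scores)):
        sens = sum(s > c for s in pos) / len(pos)    # positives above the cut-off
        spec = sum(s <= c for s in neg) / len(neg)   # negatives at or below it
        points.append((c, sens, spec))
    return points

# The 10 s100b values and outcomes shown on the Description slide:
scores = [-2.040, -1.966, -2.303, -3.219, -2.040,
          -2.303, -0.755, -1.833, -1.715, -2.303]
labels = ["Good", "Good", "Good", "Good", "Poor",
          "Poor", "Good", "Poor", "Good", "Good"]
points = roc_points(scores, labels)
```

Plotting sensitivity against 1 - specificity over all these cut-offs traces out the ROC curve.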

Uses of ROC: Area under ROC

Meaning of AUROC

outcome predicted_prob
1 Good 0.2847
2 Good 0.3019
3 Good 0.2288
4 Good 0.0961
5 Poor 0.2847
6 Poor 0.2288
7 Good 0.6267
8 Poor 0.3343
9 Good 0.3643
10 Good 0.2288
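The AUROC has a direct probabilistic meaning: it is the probability that a randomly chosen Poor patient receives a higher predicted probability than a randomly chosen Good patient (ties counted as one half). A Python sketch of that pairwise comparison on the 10-row excerpt above (too small to be representative: on this excerpt the estimate happens to be exactly 0.5, whereas the full data give a clearly better-than-chance AUC):

```python
# Predicted probabilities and outcomes from the table above:
probs = [0.2847, 0.3019, 0.2288, 0.0961, 0.2847,
         0.2288, 0.6267, 0.3343, 0.3643, 0.2288]
labels = ["Good", "Good", "Good", "Good", "Poor",
          "Poor", "Good", "Poor", "Good", "Good"]

pos = [p for p, l in zip(probs, labels) if l == "Poor"]
neg = [p for p, l in zip(probs, labels) if l == "Good"]

# Concordance over all Poor/Good pairs; ties score 1/2.
wins = sum(1.0 if p > g else 0.5 if p == g else 0.0
           for p in pos for g in neg)
auroc = wins / (len(pos) * len(neg))   # 10.5 / 21 = 0.5 on this excerpt
```

An AUROC of 0.5 is chance-level discrimination; 1.0 is perfect separation of the two classes.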

Uses of ROC: Finding Cut offs

ROC curves should never be used to find cut-offs: the best cut-off depends on the disease prevalence and on the relative costs of a false positive versus a false negative, none of which appear on the ROC curve.

Comparing two diagnostic tests

Comparing diagnostic performance when cut offs are predetermined

var sensitivity specificity pos_lr neg_lr
1 s100b 0.4146 0.875 3.3171 0.669
2 ndka 0.6098 0.5556 1.372 0.7024

Comparing overall diagnostic performance

## 
##  DeLong's test for two correlated ROC curves
## 
## data:  roc_s100b and roc_ndka
## Z = 1.3908, p-value = 0.1643
## alternative hypothesis: true difference in AUC is not equal to 0
## sample estimates:
## AUC of roc1 AUC of roc2 
##   0.7313686   0.6119580

Reliability Analysis

Concept

Categorical Measurement

Continuous Measurement: Intraclass Correlation

Decomposition of unbiased measurement

\[ Val_{obs} = Val_{true} + Error \]

Decomposition of Error

\[ Error = Error_{rater} + Error_{instrument} + Error_{unexplainable} \]

Decomposition of Variance

\[ Var_{total} = Var_{between-subject} + Var_{between-rater} + Var_{rest} \]

Definition of ICC

\[ ICC = \frac{Var_{between-subject}}{Var_{between-subject} + Var_{between-rater} + Var_{rest}} \]

rater1 rater2 rater3
1 3 3 2
2 3 6 1
3 3 4 4
4 4 6 4
5 5 2 3
6 5 4 2
7 2 2 1
8 3 4 6
9 5 3 1
10 2 3 1

Calculating ICC

##  Single Score Intraclass Correlation
## 
##    Model: twoway 
##    Type : agreement 
## 
##    Subjects = 20 
##      Raters = 3 
##    ICC(A,1) = 0.198
## 
##  F-Test, H0: r0 = 0 ; H1: r0 > 0 
##  F(19,39.7) = 1.83 , p = 0.0543 
## 
##  95%-Confidence Interval for ICC Population Values:
##   -0.039 < ICC < 0.494
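The two-way agreement, single-rater ICC (McGraw and Wong's ICC(A,1), which the R output above reports) can be computed from the mean squares of a two-way layout. A Python sketch on the 10 subjects shown in the table (the R output used all 20 subjects, so the value here differs from the slide's 0.198):

```python
# Ratings from the slide: 10 subjects (rows) x 3 raters (columns)
x = [[3, 3, 2], [3, 6, 1], [3, 4, 4], [4, 6, 4], [5, 2, 3],
     [5, 4, 2], [2, 2, 1], [3, 4, 6], [5, 3, 1], [2, 3, 1]]
n, k = len(x), len(x[0])

grand = sum(map(sum, x)) / (n * k)
row_means = [sum(r) / k for r in x]                     # per-subject means
col_means = [sum(r[j] for r in x) / n for j in range(k)]  # per-rater means

ms_rows = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)  # between-subject
ms_cols = n * sum((m - grand) ** 2 for m in col_means) / (k - 1)  # between-rater
sse = sum((x[i][j] - row_means[i] - col_means[j] + grand) ** 2
          for i in range(n) for j in range(k))
mse = sse / ((n - 1) * (k - 1))                                   # residual

# ICC(A,1): two-way model, absolute agreement, single rater
icc_a1 = (ms_rows - mse) / (ms_rows + (k - 1) * mse + (k / n) * (ms_cols - mse))
```

The structure mirrors the variance decomposition on the previous slides: between-subject variance in the numerator, with rater and residual variance added in the denominator.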

Continuous Measurement: 2 measurers, difference in measurements

old new
1 1.1019 1.1015
2 2.1448 2.2711
3 2.0906 2.1748
4 2.3437 2.6211
5 2.8565 2.9799
6 2.5455 2.7093
7 1.7918 1.7754
8 2.5802 2.8231
9 2.7434 2.6784
10 1.7934 1.8215

Incorrect use of correlation coefficient

## 
##  Pearson's product-moment correlation
## 
## data:  ndka_df$old and ndka_df$new
## t = 36.283, df = 111, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.9428688 0.9725325
## sample estimates:
##       cor 
## 0.9603319

Linear correlation does not imply agreement: two methods can be almost perfectly correlated yet systematically disagree, for example when one consistently reads higher than the other.

Bland and Altman Method: Analysis of Difference

Bland and Altman Method: Analysis of Difference (contd)

## 
##  Shapiro-Wilk normality test
## 
## data:  ba_df$diffs
## W = 0.98909, p-value = 0.5018
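Once the differences are acceptably normal (as the Shapiro-Wilk test above suggests), the Bland-Altman summary is the mean difference (bias) and the 95% limits of agreement, bias ± 1.96 SD. A Python sketch on the 10 old/new pairs shown above (the slides' full analysis uses 113 pairs, so these numbers are illustrative):

```python
import statistics

# The 10 old/new measurement pairs from the table above:
old = [1.1019, 2.1448, 2.0906, 2.3437, 2.8565,
       2.5455, 1.7918, 2.5802, 2.7434, 1.7934]
new = [1.1015, 2.2711, 2.1748, 2.6211, 2.9799,
       2.7093, 1.7754, 2.8231, 2.6784, 1.8215]

diffs = [o - n for o, n in zip(old, new)]
bias = statistics.mean(diffs)        # systematic difference between methods
sd = statistics.stdev(diffs)         # sample SD of the differences
loa_low = bias - 1.96 * sd           # lower 95% limit of agreement
loa_high = bias + 1.96 * sd          # upper 95% limit of agreement
```

The Bland-Altman plot then shows each pair's difference against its mean, with horizontal lines at the bias and the two limits of agreement.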

Bland and Altman Method: Analysis of Difference (contd)

Bland and Altman Method: Analysis of Difference (contd)

Thank you