% Spectral Indices of Mammogram Images Predictive of Breast Cancer (BC)
%
% The collection of digitized mammograms was obtained from the University of South
% Florida's Digital Database for Screening Mammography (DDSM).
% Images from this database are coupled with cancer status verified through biopsy.
% For every image, the slope of its wavelet spectrum was calculated (Hamilton et al., 2011),
% and the corresponding cancer status was recorded.
% Only the craniocaudal projection images were used: the right breast image for all normal cases,
% and the cancerous breast (right or left) image for cancer cases.
% There were 105 normal (benign) cases, and 72 cancer cases considered.
%
% The data set {\tt sslopesstatus.dat} contains two columns: slope of the spectra and
% BC status. The goal is to propose and evaluate a test for BC based only on the slope
% of mammogram wavelet spectra.
%
% (a) Find AUC. How would you grade this test?
%
% (b) Find Youden Index (YI - maximal distance of ROC from the 45$^\circ$ line).
%
% (c) What threshold for the slope would you suggest so that mammograms with slopes
% exceeding this threshold are considered positive for BC? Assume that the errors
% of misclassification are equally bad.
%
% (d) What are the sensitivity/specificity of the test at the threshold suggested in (c)?
%
% Hamilton, E. K., Jeon, S., Ramirez Cobo, P., Lee, K. S.,
% and Vidakovic, B. (2011). Diagnostic Classification of Digital
% Mammograms by Wavelet-Based Spectral Tools: A Comparative Study.
% Proceedings of the 2011 IEEE International Conference on Bioinformatics
% and Biomedicine (BIBM), November 12-15, 2011, Atlanta, GA.
%===========================================
clear all
close all
lw=2;
load 'C:\BESTAT\ROC\ROCdat\slopesmammo.dat'
% 2.1244525e+000 1.0000000e+000
% 1.6462179e+000 1.0000000e+000
% 1.7326839e+000 1.0000000e+000
% ...
%slopesmammo(:,1); % the spectral slopes for mammogram images
%slopesmammo(:,2); % cancer status 0 - absent; 1 - present
[d4, i4] = sort(slopesmammo(:,1));
% d4 holds the slopes in increasing order; i4 holds their original indices.
sslopesstatus = [d4 slopesmammo(i4,2) 1-slopesmammo(i4,2)];
% column 1: d4 (slopes in increasing order)
% column 2: slopesmammo(i4,2) (indicator of cases)
% column 3: 1-slopesmammo(i4,2) (indicator of noncases)
cumultruepos = cumsum(sslopesstatus(:,2)); %counting/summing 1's in cases
cumulfalsepos = cumsum(sslopesstatus(:,3)); %counting/summing 1's in controls
totalcancer = cumultruepos(end);
totalcontrol = cumulfalsepos(end);
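% A small sanity check, sketched here as an addition to the original script:
% the header states 72 cancer and 105 normal cases, so the totals can be
% compared with those counts. A warning (rather than an error) is used so
% the script keeps running if the file on disk differs from the description.
if totalcancer ~= 72 || totalcontrol ~= 105
    warning('Counts differ from the header: %d cancer, %d control.', ...
        totalcancer, totalcontrol)
end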
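% Because the slopes are sorted increasingly, cumultruepos(k) and cumulfalsepos(k)
% count the cases and controls with slope <= d4(k). They are the true positives
% and false positives when the threshold is d4(k), taken from
% sslopesstatus(:,1) = d4, and an image is called positive if its slope
% is at or below that threshold.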
seth = cumultruepos/totalcancer; %sensitivity at the threshold set by d4
cspth = cumulfalsepos/totalcontrol; %1-specificity at the threshold set by d4
figure(1)
%ROC is a plot of sensitivity against (1-specificity)
plot(cspth,seth,'k-','linewidth',lw)
hold on
fill([cspth' 1 1 0],[seth' 1 0 0],'y')
plot([0 1],[0 1],'k--','LineWidth', lw)
xlabel('1 - Specificity')
ylabel('Sensitivity')
text('Interpreter','latex',...
'String','$\fbox{Area = 0.8820}$',...
'Position',[0.20 .65],...
'FontSize',17)
%add (0,0) and (1,1) to ROC
auc([0 cspth' 1],[0 seth' 1]) %0.8820
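% A hedged cross-check, added for illustration: auc as called above is not
% part of base MATLAB and is assumed to be a user-supplied helper on the path.
% The same area can be sketched with the built-in trapz (trapezoidal rule)
% on the ROC points extended by (0,0) and (1,1). If auc also uses the
% trapezoidal rule, the two values should be close.
% The names auc_trapz, edges, grades, and grade are introduced here.
auc_trapz = trapz([0 cspth' 1], [0 seth' 1])
% For part (a), one commonly quoted rule of thumb (an assumption here, not
% taken from the data) grades a test by its AUC:
% 0.9-1 excellent, 0.8-0.9 good, 0.7-0.8 fair, 0.6-0.7 poor, below 0.6 fail.
edges  = [0.5 0.6 0.7 0.8 0.9];
grades = {'fail','poor','fair','good','excellent'};
grade  = grades{max(1, sum(auc_trapz >= edges))}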
figure(2)
plot(sslopesstatus(:,1),(seth-cspth)/sqrt(2),'linewidth',lw)
hold on
plot(1.9520,0,'o','LineWidth',2,...
'MarkerEdgeColor','k',...
'MarkerFaceColor','g',...
'MarkerSize',10)
text(1.9520,0.02,'Slope=1.9520')
xlabel('Slope')
ylabel('Youden index')
hold off
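% Youden index: maximal distance of the ROC curve from the 45-degree line
ydist = (seth-cspth)/sqrt(2); % distance from the diagonal at each threshold
yi = max(ydist) %0.4509
atmax = (ydist == yi); % threshold(s) at which the maximum is attained
% slope corresponding to YI
sslopesstatus(atmax, 1) %1.9521
% sensitivity/specificity at YI
seth(atmax) %0.8472
1 - cspth(atmax) %0.7905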
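% A hedged cross-check (not part of the original analysis): recompute the
% sensitivity and specificity directly from the raw data at the threshold
% that maximizes the distance from the diagonal. The names dchk, kchk, thr,
% slopes, status, sens_chk, and spec_chk are introduced here for illustration.
% The check assumes the rule implied by the cumulative sums above (an image
% is called positive when its slope is <= thr) and the 0/1 status coding.
dchk = (seth - cspth)/sqrt(2);
[~, kchk] = max(dchk);                 % index of the first maximum
thr = sslopesstatus(kchk, 1);          % threshold on the slope
slopes = slopesmammo(:,1);
status = slopesmammo(:,2);
% With no tied slopes these should agree with seth(kchk) and 1 - cspth(kchk).
sens_chk = mean(slopes(status == 1) <= thr)
spec_chk = mean(slopes(status == 0) >  thr)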