clear all
close all
disp('PBC Example')
lw = 3;
set(0, 'DefaultAxesFontSize', 16);
fs = 15;
msize = 10;
pbc=xlsread ('C:\Springer\Survival\Survivaldat\pbc.xls');
% load('C:\Springer\Survival\Survivaldat\pbcdata.dat')
casen = pbc(:,1); %case number 1-312
lived = pbc(:,2); %days lived (from registration to study date)
indicatord = pbc(:,3); %0 censored, 1 death
treatment = pbc(:,4); %1 - D-Penicillamine, 2 - Placebo
age = pbc(:,5); %age in years
gender = pbc(:,6); %0 male, 1 female
ascites= pbc(:,7); %0 no, 1 yes
hepatomegaly=pbc(:,8); %0 no, 1 yes
spiders = pbc(:,9); %0 no, 1 yes
edema = pbc(:,10); %0 no, 0.5 present/no therapy, 1 present/therapy given
bilirubin = pbc(:,11); %bilirubin [mg/dl]
cholesterol = pbc(:,12); %cholesterol [mg/dl]
albumin = pbc(:,13); %albumin [gm/dl]
ucopper =pbc(:,14); %urine copper [mg/day]
aphosp =pbc(:,15); %alkaline phosphatase [U/liter]
sgot = pbc(:,16); %SGOT [U/ml]
trig =pbc(:,17); %triglycerides [mg/dl]
platelet = pbc(:,18); %platelet count [#/mm^3]/1000
prothro = pbc(:,19); %prothrombin time [sec]
histage = pbc(:,20); %histologic stage [1,2,3,4]
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%To illustrate the Cox proportional hazards (CPH) model, we selected 4 predictors and
%formed a design matrix X as
X = [treatment age gender edema];
%lived is the lifetime; the censoring vector is 1-indicatord
%(coxphfit expects 1 for censored observations),
%and the baseline is taken to be the hazard when all covariates are 0.
[b,logL,H,stats] = coxphfit(X,lived,...
'censoring',1-indicatord,'baseline',0);
% H is a 2-column matrix that contains the cumulative hazard estimate.
% The first column of H contains the y values and the second column contains
% the estimated baseline cumulative hazard evaluated at y.
% We also selected two subjects from the study to
% calculate survival curves corresponding to their
% covariates.
X(100,:) %2.0000 51.4689 0 0
% placebo, male 51 y.o., no edema
X(275,:) %1.0000 38.3162 1.0000 0
% treatment D-Penicillamine, female 38 y.o., no edema
% First we find cumulative hazards at the mean values
% of the predictors as well as for subjects #100 and #275.
Hmean(:,2) = H(:,2) .* exp(mean(X)*b); %mean c haz
Hsubj100(:,2) = H(:,2) .* exp(X(100,:)*b); %subject #100
Hsubj275(:,2) = H(:,2) .* exp(X(275,:)*b); %subject #275
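% A quick check of the proportional hazards structure (a sketch, not part
% of the original analysis; relHaz is a name introduced here). Under the
% CPH model the ratio of the two subjects' cumulative hazards is the same
% at every time point and equals exp((x100 - x275)*b).
relHaz = exp((X(100,:) - X(275,:))*b); %hazard of subject #100 relative to #275
disp(relHaz)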
b
% 0.0831
% 0.0324
% -0.3940
% 2.2424
%
% Note that the positive treatment coefficient 0.0831 indicates that,
% with all other covariates held fixed, placebo (coded 2)
% increases risk relative to D-Penicillamine (coded 1).
% Age and edema also increase the risk,
% while the risk for female subjects is smaller.
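% A minimal sketch, not part of the original analysis: the coefficients can
% be read on the hazard-ratio scale by exponentiating them. The names HR and
% ciHR are introduced here for illustration; stats.se and stats.p are the
% standard errors and p-values returned by coxphfit. The 95% limits are
% Wald intervals and assume approximate normality of the estimates.
HR = exp(b); %hazard ratio per unit increase in each covariate
ciHR = exp([b - 1.96*stats.se, b + 1.96*stats.se]); %approximate 95% limits
disp([HR ciHR stats.p]) %columns: HR, lower, upper, p-value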
%Next, we find survival functions
%from cumulative hazards
Smean = exp(-Hmean(:,2));
Ssubj100 = exp(-Hsubj100(:,2));
Ssubj275 = exp(-Hsubj275(:,2));
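% A hedged sketch of one way to summarize these curves: the estimated
% median survival time, taken as the first time at which the step function
% drops to 0.5 or below. The names curves and medtimes are introduced here;
% an entry stays NaN if the curve never reaches 0.5 within follow-up.
curves = [Smean Ssubj100 Ssubj275];
medtimes = NaN(1, size(curves,2));
for j = 1:size(curves,2)
    idx = find(curves(:,j) <= 0.5, 1, 'first'); %first crossing of 0.5
    if ~isempty(idx)
        medtimes(j) = H(idx,1); %corresponding time in days
    end
end
disp(medtimes) %average subject, subject #100, subject #275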
%
stairs(H(:,1),Smean,'b-','linewidth',2)
hold on
stairs(H(:,1),Ssubj100,'k-','linewidth',2)
stairs(H(:,1),Ssubj275,'r-','linewidth',2)
xlabel('$t$ (days)','Interpreter','LaTeX')
ylabel('$\hat S(t)$','Interpreter','LaTeX')
legend('average subject','subject #100', 'subject #275', 'Location', 'SouthWest')
axis tight
print -depsc 'C:\Springer\Survival\Survivaleps\CoxPBC.eps'
% The variables in the PBC data set come from the Mayo Clinic trial
% in primary biliary cirrhosis (PBC) of the liver conducted between 1974 and
% 1984. A total of 424 PBC patients, referred to Mayo Clinic during
% that ten-year interval, met eligibility criteria for the randomized placebo
% controlled trial of the drug D-penicillamine. The 312 cases in the data
% set participated in the randomized trial.
% Variable  Description
% ___________________________________________________________________________
%
% N         Case number.
% X         The number of days between registration and the earlier of
%           death, liver transplantation, or study analysis time in July, 1986.
% D         1 if X is time to death, 0 if time to censoring.
% Z1        Treatment code, 1 = D-penicillamine, 2 = placebo.
% Z2        Age in years, calculated by dividing the number of days
%           between birth and study registration by 365.
% Z3        Sex, 0 = male, 1 = female.
% Z4        Presence of ascites, 0 = no, 1 = yes.
% Z5        Presence of hepatomegaly, 0 = no, 1 = yes.
% Z6        Presence of spiders, 0 = no, 1 = yes.
% Z7        Presence of edema, 0 = no edema and no diuretic therapy for
%           edema; 0.5 = edema present for which no diuretic therapy was
%           given, or edema resolved with diuretic therapy;
%           1 = edema despite diuretic therapy.
% Z8        Serum bilirubin, in mg/dl.
% Z9        Serum cholesterol, in mg/dl.
% Z10       Albumin, in gm/dl.
% Z11       Urine copper, in mg/day.
% Z12       Alkaline phosphatase, in U/liter.
% Z13       SGOT, in U/ml.
% Z14       Triglycerides, in mg/dl.
% Z15       Platelet count; coded value is the number of platelets
%           per cubic millimeter of blood divided by 1000.
% Z16       Prothrombin time, in seconds.
% Z17       Histologic stage of disease, graded 1, 2, 3, or 4.
% ___________________________________________________________________________
%
%
% STORY BEHIND THE DATA:
%
% Between January, 1974 and May, 1984, the Mayo Clinic conducted a
% double-blinded randomized trial in primary biliary cirrhosis of the liver
% (PBC), comparing the drug D-penicillamine (DPCA) with a placebo. There
% were 424 patients who met the eligibility criteria seen at the Clinic while
% the trial was open for patient registration. Both the treating physician and
% the patient agreed to participate in the randomized trial in 312 of the 424
% cases. The date of randomization and a large number of clinical, biochemical,
% serologic, and histologic parameters were recorded for each of the 312
% clinical trial patients. The data from the trial were analyzed in 1986 for
% presentation in the clinical literature. For that analysis, disease and
% survival status as of July, 1986, were recorded for as many patients as
% possible. By that date, 125 of the 312 patients had died, with only 11
% deaths not attributable to PBC. Eight patients had been lost to follow up, and 19
% had undergone liver transplantation.
%
% PBC is a rare but fatal chronic liver disease of unknown cause,
% with a prevalence of about 50 cases per million population. The primary
% pathologic event appears to be the destruction of interlobular bile ducts,
% which may be mediated by immunologic mechanisms. The data discussed here are
% important in two respects. First, controlled clinical trials are difficult to
% complete in rare diseases, and this case series of patients uniformly
% diagnosed, treated, and followed is the largest existing for PBC. The
% treatment comparison in this trial is more precise than in similar trials
% having fewer participants and avoids the bias that may arise in comparing
% a case series to historical controls. Second, the data present an
% opportunity to study the natural history of the disease. We will see that,
% despite the immunosuppressive properties of DPCA, there are no detectable
% differences between the distributions of survival times for the DPCA and
% placebo treatment groups. This suggests that these groups can be combined
% in studying the association between survival time from randomization and
% clinical and other measurements. In the early to mid 1980s, the rate of
% successful liver transplant increased substantially, and transplant has
% become an effective therapy for PBC. The Mayo Clinic data set is therefore
% one of the last allowing a study of the natural history of PBC in patients
% who were treated with only supportive care or its equivalent. The PBC data
% can be used to: estimate a survival distribution; test for differences
% between two groups; and estimate covariate effects via a regression
% model.
%