An example of analysing time-to-event data using flexible parametric excess hazard model. The R-package mexhaz

This is an example of analysis that can be done using the R-package mexhaz. This package allows to fit flexible hazard model in the framework of overall survival or net survival (i.e. excess hazard), with/wihout time-dependent effects for covariates, with/without non-linear effect and with/without random effect defined at the cluster level.

Start

The first step is to install the mexhaz package from the CRAN website. It needs to be done only the first time you’re using mexhaz, as the next time, mexhaz will be automatically pre-installed in your R session. To be able to use mexhaz in your R session, you need also to load the package by using the library() function.

Load the package

library(mexhaz)
## Le chargement a nécessité le package : survival
sessionInfo()
## R version 3.4.3 (2017-11-30)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 7 x64 (build 7601) Service Pack 1
## 
## Matrix products: default
## 
## locale:
## [1] LC_COLLATE=French_France.1252  LC_CTYPE=French_France.1252   
## [3] LC_MONETARY=French_France.1252 LC_NUMERIC=C                  
## [5] LC_TIME=French_France.1252    
## 
## attached base packages:
## [1] methods   stats     graphics  grDevices utils     datasets  base     
## 
## other attached packages:
## [1] mexhaz_1.3      survival_2.41-3
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_0.12.13      bookdown_0.6      lattice_0.20-35  
##  [4] digest_0.6.12     rprojroot_1.2     MASS_7.3-47      
##  [7] grid_3.4.3        backports_1.1.1   magrittr_1.5     
## [10] evaluate_0.10.1   blogdown_0.5      stringi_1.1.5    
## [13] Matrix_1.2-12     rmarkdown_1.6     splines_3.4.3    
## [16] statmod_1.4.30    tools_3.4.3       stringr_1.2.0    
## [19] numDeriv_2016.8-1 xfun_0.1          yaml_2.1.14      
## [22] compiler_3.4.3    htmltools_0.3.6   knitr_1.18
packageVersion("mexhaz")
## [1] '1.3'
oldPar <-par()

Load the data

For this illustration, we will use the “simulated” data of colon cancer patients diagnosed in England between 2000 and 2002, and follow-up to the 31/12/2007.

  • sex: 1 for man, 2 for woman
  • site: localisation of the cancer (Topography ICD-10)
  • cancer: cancer site
  • diagmdy: date of diagnosis
  • birthmdy: date of birth
  • dead: vital Status (0=Alive, 1=Death)
  • finmdy: date of last know vital status
  • ftime: survival Time in days (Follow-up maximum 7 years)
  • stage: stage at diagnosis
  • dep: deprivation group (5 categories)
  • ageout: age at the last know vital status (integer)
  • agediag: age at diagnosis (continuous)
  • year: year at the last know vital status
  • gor: Government Office Region where the patient lived at the date of diagnosis
  • id: identification number
  • agegrp: age at diagnosis in 5 categories
  • rate: Population (Expected) mortality rate at the time of the last known vital status

We start to do some data management to work on complete data, and to create some useful variables

mydata <- read.delim(file="C:/Users/Aurélien/Dropbox/WorkStat/Teaching/Granada/Modelling/Practical/fakecolon_data.dat", 
                     header = TRUE, sep = "\t", quote = "\"", dec = ".", fill = TRUE, comment.char = "")

temp <- mydata
temp <- temp[temp$dep!=9,]
temp <- temp[temp$sex==1,]

temp$timey <- temp$ftime/365.25

summary(temp)
##       sex         site             cancer           diagmdy     
##  Min.   :1   C187   :6547   Colorectal:17867   10apr2002:   32  
##  1st Qu.:1   C180   :3827                      17apr2001:   32  
##  Median :1   C189   :2177                      08dec2000:   29  
##  Mean   :1   C182   :1788                      23sep2002:   29  
##  3rd Qu.:1   C184   :1188                      03mar2002:   28  
##  Max.   :1   C186   : 818                      23may2002:   28  
##              (Other):1522                      (Other)  :17689  
##       birthmdy          dead              finmdy          ftime     
##  20jun1925:    8   Min.   :0.0000   31dec2007: 3484   Min.   :   1  
##  03may1932:    7   1st Qu.:0.0000   23nov2007:  101   1st Qu.: 322  
##  05mar1927:    7   Median :1.0000   16nov2007:   98   Median :1303  
##  07aug1930:    7   Mean   :0.6141   19dec2007:   93   Mean   :1325  
##  11jul1930:    7   3rd Qu.:1.0000   24nov2007:   92   3rd Qu.:2232  
##  01jan1922:    6   Max.   :1.0000   11dec2007:   91   Max.   :2920  
##  (Other)  :17825                    (Other)  :13908                 
##  stage         dep           ageout         agediag           year     
##  A:1738   Min.   :1.00   Min.   :18.46   Min.   :16.10   Min.   :2000  
##  B:6959   1st Qu.:2.00   1st Qu.:68.14   1st Qu.:64.25   1st Qu.:2002  
##  C:6360   Median :3.00   Median :75.79   Median :72.14   Median :2005  
##  D:2810   Mean   :2.96   Mean   :74.35   Mean   :70.73   Mean   :2004  
##           3rd Qu.:4.00   3rd Qu.:81.91   3rd Qu.:78.51   3rd Qu.:2007  
##           Max.   :5.00   Max.   :99.00   Max.   :99.09   Max.   :2007  
##                                                                        
##  gor             id            agegrp           rate         
##  M:17867   Min.   :    1   Min.   :15.00   Min.   :0.000444  
##            1st Qu.: 8266   1st Qu.:55.00   1st Qu.:0.020526  
##            Median :15885   Median :65.00   Median :0.045229  
##            Mean   :16354   Mean   :65.37   Mean   :0.062192  
##            3rd Qu.:23997   3rd Qu.:75.00   3rd Qu.:0.085530  
##            Max.   :34731   Max.   :85.00   Max.   :0.559850  
##                                                              
##      timey         
##  Min.   :0.002738  
##  1st Qu.:0.881588  
##  Median :3.567420  
##  Mean   :3.628704  
##  3rd Qu.:6.110883  
##  Max.   :7.994524  
## 
temp$agegrp <- rep("[15;45[")
temp$agegrp <- ifelse(temp$agediag>=45 & temp$agediag<55, "[45;55[", temp$agegrp)
temp$agegrp <- ifelse(temp$agediag>=55 & temp$agediag<65, "[55;65[", temp$agegrp)
temp$agegrp <- ifelse(temp$agediag>=65 & temp$agediag<75, "[65;75[", temp$agegrp)
temp$agegrp <- ifelse(temp$agediag>=75  , "[75;++[", temp$agegrp)
temp$agediagtrunc <- trunc(temp$agediag)
table(temp$agediagtrunc, temp$agegrp)
##     
##      [15;45[ [45;55[ [55;65[ [65;75[ [75;++[
##   16       1       0       0       0       0
##   18       1       0       0       0       0
##   19       1       0       0       0       0
##   20       3       0       0       0       0
##   21       5       0       0       0       0
##   22       5       0       0       0       0
##   23       1       0       0       0       0
##   24       1       0       0       0       0
##   25       2       0       0       0       0
##   26       1       0       0       0       0
##   27       5       0       0       0       0
##   28       1       0       0       0       0
##   29       4       0       0       0       0
##   30       4       0       0       0       0
##   31      13       0       0       0       0
##   32      11       0       0       0       0
##   33      12       0       0       0       0
##   34      19       0       0       0       0
##   35      15       0       0       0       0
##   36      15       0       0       0       0
##   37      23       0       0       0       0
##   38      30       0       0       0       0
##   39      27       0       0       0       0
##   40      30       0       0       0       0
##   41      27       0       0       0       0
##   42      40       0       0       0       0
##   43      43       0       0       0       0
##   44      61       0       0       0       0
##   45       0      62       0       0       0
##   46       0      53       0       0       0
##   47       0      62       0       0       0
##   48       0      86       0       0       0
##   49       0      81       0       0       0
##   50       0     101       0       0       0
##   51       0     135       0       0       0
##   52       0     160       0       0       0
##   53       0     177       0       0       0
##   54       0     214       0       0       0
##   55       0       0     228       0       0
##   56       0       0     254       0       0
##   57       0       0     247       0       0
##   58       0       0     300       0       0
##   59       0       0     293       0       0
##   60       0       0     321       0       0
##   61       0       0     358       0       0
##   62       0       0     370       0       0
##   63       0       0     449       0       0
##   64       0       0     468       0       0
##   65       0       0       0     501       0
##   66       0       0       0     530       0
##   67       0       0       0     532       0
##   68       0       0       0     516       0
##   69       0       0       0     615       0
##   70       0       0       0     663       0
##   71       0       0       0     672       0
##   72       0       0       0     664       0
##   73       0       0       0     712       0
##   74       0       0       0     677       0
##   75       0       0       0       0     730
##   76       0       0       0       0     765
##   77       0       0       0       0     692
##   78       0       0       0       0     643
##   79       0       0       0       0     646
##   80       0       0       0       0     630
##   81       0       0       0       0     548
##   82       0       0       0       0     422
##   83       0       0       0       0     316
##   84       0       0       0       0     315
##   85       0       0       0       0     300
##   86       0       0       0       0     234
##   87       0       0       0       0     206
##   88       0       0       0       0     166
##   89       0       0       0       0     114
##   90       0       0       0       0      71
##   91       0       0       0       0      59
##   92       0       0       0       0      35
##   93       0       0       0       0      28
##   94       0       0       0       0      23
##   95       0       0       0       0      11
##   96       0       0       0       0       6
##   98       0       0       0       0       4
##   99       0       0       0       0       1
quantile(temp$agediag, na.rm=T, probs = seq(0,1,0.1))
##       0%      10%      20%      30%      40%      50%      60%      70% 
## 16.10404 56.14565 62.11964 66.08515 69.41656 72.13689 74.72690 77.17509 
##      80%      90%     100% 
## 79.86311 83.32211 99.08556
temp$Iagegrp1545 <- ifelse(temp$agegrp=="[15;45[",1,0) 
temp$Iagegrp4555 <- ifelse(temp$agegrp=="[45;55[",1,0) 
temp$Iagegrp5565 <- ifelse(temp$agegrp=="[55;65[",1,0) 
temp$Iagegrp6575 <- ifelse(temp$agegrp=="[65;75[",1,0) 
temp$Iagegrp75pp <- ifelse(temp$agegrp=="[75;++[",1,0) 
# table(temp$Iagegrp1545, temp$agegrp)
# table(temp$Iagegrp4555, temp$agegrp)
# table(temp$Iagegrp5565, temp$agegrp)
# table(temp$Iagegrp6575, temp$agegrp)
# table(temp$Iagegrp75pp, temp$agegrp)

temp$agediagc=temp$agediag-70
temp$agediagc2 <- temp$agediagc^2 
temp$agediagc2plus <- (temp$agediagc-0)^2*(temp$agediagc>0)
temp$agediagc3 <- temp$agediagc^3 
temp$agediagc3plus <- (temp$agediagc-0)^3*(temp$agediagc>0)

temp$IstageA <- ifelse(temp$stage=="A",1,0) # table(temp$IstageA,temp$Dukesstage)
temp$IstageB <- ifelse(temp$stage=="B",1,0) # table(temp$IstageB,temp$Dukesstage)
temp$IstageC <- ifelse(temp$stage=="C",1,0) # table(temp$IstageC,temp$Dukesstage)
temp$IstageD <- ifelse(temp$stage=="D",1,0) # table(temp$IstageD,temp$Dukesstage)

temp$Idep1 <- ifelse(temp$dep==1,1,0) # table(temp$Idep1, temp$dep)
temp$Idep2 <- ifelse(temp$dep==2,1,0) # table(temp$Idep2, temp$dep)
temp$Idep3 <- ifelse(temp$dep==3,1,0) # table(temp$Idep3, temp$dep)
temp$Idep4 <- ifelse(temp$dep==4,1,0) # table(temp$Idep4, temp$dep)
temp$Idep5 <- ifelse(temp$dep==5,1,0) # table(temp$Idep5, temp$dep)

summary(temp$agediag)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   16.10   64.25   72.14   70.73   78.51   99.09
summary(temp$timey)
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## 0.002738 0.881588 3.567420 3.628704 6.110883 7.994524
table(temp$sex)
## 
##     1 
## 17867
table(temp$dep, temp$stage)
##    
##        A    B    C    D
##   1  341 1361 1249  548
##   2  383 1455 1375  580
##   3  344 1476 1308  554
##   4  372 1426 1329  575
##   5  298 1241 1099  553

First Model: Age analysed as categorical variable

We fit a model M1bs assuming a cubic B-spline function for the (log of the) baseline excess hazard (using arguments base="exp.bs" and degree=3), with one knot located at 1 year (using argument knots to specify the location of the knot(s)). The model M1bs assumes linear and proportional (i.e. constant over time) effects of the covariables dep, stage and agegrp, the latter variable corresponding to the age at diagnosis categorised in age-group. We use the age-group [65-75[ as the reference for the variable agegrp. The results given by the mexhaz function are detailed below. The excess mortality hazard fitted is of the form λE(t;x)=λ0(t)exp(i=24αistagei+i=25βidepi+i=1,i45γiagecati). We also fit a model M1ns assuming a cubic restricted spline function for the (log of the) baseline excess hazard (using arguments base="exp.ns" and degree=3), with three interior knots located at 1, 3 and 5 years (using argument knots to specify the location of the knot(s)). The excess mortality hazard fitted is of the form λE(t;x)=λ0ns(t)exp(i=24αistagei+i=25βidepi+i=1,i45γiagecati).

Estimation

# M1overal <- mexhaz(Surv(timey, dead)~Idep2+Idep3+Idep4+Idep5+IstageB+IstageC+IstageD+Iagegrp1545+Iagegrp4555+Iagegrp5565+Iagegrp75pp, 
#                 data=temp, base="exp.bs", degree=3, knots=c(1), verbose = 0)
# 
# initial <- round(M1overal$coeff,2)

M1bs <- mexhaz(Surv(timey, dead)~Idep2+Idep3+Idep4+Idep5+IstageB+IstageC+IstageD+Iagegrp1545+Iagegrp4555+Iagegrp5565+Iagegrp75pp,
             data=temp, base="exp.bs", degree=3, knots=c(1), verbose = 500, expected="rate") # , init=initial)
##  Eval    LogLik Time
##     0 -25865.62 0.15
## Param
##  [1] -1 -1 -1 -1 -1  0  0  0  0  0  0  0  0  0  0  0
## 
## iteration = 0
## Step:
##  [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## Parameter:
##  [1] -1 -1 -1 -1 -1  0  0  0  0  0  0  0  0  0  0  0
## Function Value
## [1] 25865.62
## Gradient:
##  [1]  1250.65358    60.29418   712.41550   826.00875   448.79197
##  [6]   390.35978   280.70016   149.20958    26.07932  2255.47749
## [11]  -149.00131 -1707.73004    66.95293   144.76583   537.96909
## [16]   -95.52193
## 
##  Eval    LogLik Time
##   500 -21874.37   19
## Param
##  [1] -2.1240 -1.8421 -0.7866 -3.7884 -3.2131 -0.0051  0.0664  0.1178
##  [9]  0.2143  0.8243  1.8281  3.1658 -0.2450 -0.1038 -0.1264  0.3309
## 
## iteration = 54
## Parameter:
##  [1] -2.141865427 -1.860586763 -0.715698489 -3.963560050 -2.954318210
##  [6] -0.006557109  0.065265093  0.116906733  0.212574231  0.845157214
## [11]  1.849196579  3.187384801 -0.240507554 -0.102819529 -0.126464263
## [16]  0.331999008
## Function Value
## [1] 21874.03
## Gradient:
##  [1] -0.0091159006 -0.0067789757 -0.0005456968  0.0003432783  0.0003854315
##  [6] -0.0014479156 -0.0002983143 -0.0003019522 -0.0062645995 -0.0011496013
## [11] -0.0045110863 -0.0031307722 -0.0003383320  0.0020154403 -0.0032850949
## [16] -0.0010295480
## 
## Gradient relatif proche de zéro.
## L'itération courante est probablement la solution.
## 
## Computation of the Hessian
## 
## Data
##  Name N.Obs.Tot N.Obs N.Events N.Clust
##  temp     17867 17867    10972       1
## 
## Details
##  Iter Eval   Base Nb.Leg Nb.Aghq Optim Method Code    LogLik Total.Time
##    54  823 exp.bs     20      10   nlm    ---    1 -21874.03      31.09
summary(M1bs)
## Call:
## mexhaz(formula = Surv(timey, dead) ~ Idep2 + Idep3 + Idep4 + 
##     Idep5 + IstageB + IstageC + IstageD + Iagegrp1545 + Iagegrp4555 + 
##     Iagegrp5565 + Iagegrp75pp, data = temp, expected = "rate", 
##     base = "exp.bs", degree = 3, knots = c(1), verbose = 500)
## 
## Coefficients:
##               Estimate     StdErr  t.value   p.value    
## Intercept   -2.1418654  0.1130108 -18.9527 < 2.2e-16 ***
## BS3.1       -1.8605868  0.0631538 -29.4612 < 2.2e-16 ***
## BS3.2       -0.7156985  0.1269160  -5.6391 1.735e-08 ***
## BS3.3       -3.9635600  0.2549669 -15.5454 < 2.2e-16 ***
## BS3.4       -2.9543182  0.3248457  -9.0945 < 2.2e-16 ***
## Idep2       -0.0065571  0.0396616  -0.1653 0.8686889    
## Idep3        0.0652651  0.0398074   1.6395 0.1011220    
## Idep4        0.1169067  0.0394431   2.9639 0.0030413 ** 
## Idep5        0.2125742  0.0402629   5.2797 1.309e-07 ***
## IstageB      0.8451572  0.1076694   7.8496 4.410e-15 ***
## IstageC      1.8491966  0.1050998  17.5947 < 2.2e-16 ***
## IstageD      3.1873848  0.1056762  30.1618 < 2.2e-16 ***
## Iagegrp1545 -0.2405076  0.0777560  -3.0931 0.0019838 ** 
## Iagegrp4555 -0.1028195  0.0493388  -2.0839 0.0371790 *  
## Iagegrp5565 -0.1264643  0.0354330  -3.5691 0.0003591 ***
## Iagegrp75pp  0.3319990  0.0300190  11.0596 < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Hazard ratios (for proportional effect variables):
##                Coef      HR CI.lower CI.upper
## Idep2       -0.0066  0.9935   0.9192   1.0738
## Idep3        0.0653  1.0674   0.9873   1.1541
## Idep4        0.1169  1.1240   1.0404   1.2144
## Idep5        0.2126  1.2369   1.1430   1.3384
## IstageB      0.8452  2.3283   1.8854   2.8754
## IstageC      1.8492  6.3547   5.1716   7.8084
## IstageD      3.1874 24.2250  19.6927  29.8004
## Iagegrp1545 -0.2405  0.7862   0.6751   0.9157
## Iagegrp4555 -0.1028  0.9023   0.8191   0.9939
## Iagegrp5565 -0.1265  0.8812   0.8221   0.9446
## Iagegrp75pp  0.3320  1.3938   1.3141   1.4782
## 
## log-likelihood: -21874.0252 (for 16 degree(s) of freedom)
## 
## number of observations: 17867, number of events: 10972
M1ns <- mexhaz(Surv(timey, dead)~Idep2+Idep3+Idep4+Idep5+IstageB+IstageC+IstageD+Iagegrp1545+Iagegrp4555+Iagegrp5565+Iagegrp75pp,
               data=temp, base="exp.ns", knots=c(1,3,5), verbose = 500, expected="rate")
##  Eval    LogLik Time
##     0 -27264.32 0.13
## Param
##  [1] -1 -1 -1 -1 -1  0  0  0  0  0  0  0  0  0  0  0
## 
## iteration = 0
## Step:
##  [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## Parameter:
##  [1] -1 -1 -1 -1 -1  0  0  0  0  0  0  0  0  0  0  0
## Function Value
## [1] 27264.32
## Gradient:
##  [1]  5954.3076  1571.3816   206.2587  2198.5810 -1038.6589  1438.2777
##  [7]  1251.4660  1077.5460   803.3186  4501.5014  1471.5020 -1489.3667
## [13]   200.4369   504.4960  1587.9870  1336.9654
## 
##  Eval    LogLik Time
##   500 -22107.88 25.7
## Param
##  [1] -2.7389 -0.6098 -2.4879 -3.7999 -0.9576 -0.0091  0.0615  0.1208
##  [9]  0.2180  0.8776  1.9023  3.2589 -0.2135 -0.0873 -0.1187  0.3418
## 
##  Eval    LogLik  Time
##  1000 -22105.34 51.38
## Param
##  [1] -2.6885 -0.6693 -2.3127 -4.0833 -1.5447 -0.0090  0.0617  0.1156
##  [9]  0.2153  0.8374  1.8610  3.2164 -0.2439 -0.1036 -0.1283  0.3354
## 
## iteration = 79
## Parameter:
##  [1] -2.688662355 -0.669166739 -2.312592180 -4.083298423 -1.544859949
##  [6] -0.009051424  0.061689040  0.115538601  0.215265243  0.837568881
## [11]  1.861086727  3.216506282 -0.243785120 -0.103578347 -0.128345652
## [16]  0.335432560
## Function Value
## [1] 22105.34
## Gradient:
##  [1] -0.0065259112  0.0066356733  0.0032830958 -0.0014183784  0.0008406966
##  [6]  0.0153704605 -0.0178843038  0.0021500455 -0.0064173946 -0.0018590072
## [11] -0.0041401827  0.0000000000  0.0010513759  0.0006330083 -0.0040090526
## [16] -0.0015861588
## 
## Gradient relatif proche de zéro.
## L'itération courante est probablement la solution.
## 
## Computation of the Hessian
## 
## Data
##  Name N.Obs.Tot N.Obs N.Events N.Clust
##  temp     17867 17867    10972       1
## 
## Details
##  Iter Eval   Base Nb.Leg Nb.Aghq Optim Method Code    LogLik Total.Time
##    79 1164 exp.ns     20      10   nlm    ---    1 -22105.34      61.17
summary(M1ns)
## Call:
## mexhaz(formula = Surv(timey, dead) ~ Idep2 + Idep3 + Idep4 + 
##     Idep5 + IstageB + IstageC + IstageD + Iagegrp1545 + Iagegrp4555 + 
##     Iagegrp5565 + Iagegrp75pp, data = temp, expected = "rate", 
##     base = "exp.ns", knots = c(1, 3, 5), verbose = 500)
## 
## Coefficients:
##               Estimate     StdErr  t.value   p.value    
## Intercept   -2.6886624  0.1134329 -23.7027 < 2.2e-16 ***
## NS3.1       -0.6691667  0.0711504  -9.4050 < 2.2e-16 ***
## NS3.2       -2.3125922  0.1386946 -16.6740 < 2.2e-16 ***
## NS3.3       -4.0832984  0.1742215 -23.4374 < 2.2e-16 ***
## NS3.4       -1.5448599  0.2964812  -5.2107 1.903e-07 ***
## Idep2       -0.0090514  0.0398388  -0.2272  0.820270    
## Idep3        0.0616890  0.0400005   1.5422  0.123041    
## Idep4        0.1155386  0.0396297   2.9155  0.003556 ** 
## Idep5        0.2152652  0.0404320   5.3241 1.027e-07 ***
## IstageB      0.8375689  0.1097932   7.6286 2.492e-14 ***
## IstageC      1.8610867  0.1070805  17.3803 < 2.2e-16 ***
## IstageD      3.2165063  0.1076268  29.8857 < 2.2e-16 ***
## Iagegrp1545 -0.2437851  0.0778114  -3.1330  0.001733 ** 
## Iagegrp4555 -0.1035783  0.0494208  -2.0958  0.036110 *  
## Iagegrp5565 -0.1283457  0.0355444  -3.6109  0.000306 ***
## Iagegrp75pp  0.3354326  0.0301965  11.1083 < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Hazard ratios (for proportional effect variables):
##                Coef      HR CI.lower CI.upper
## Idep2       -0.0091  0.9910   0.9165   1.0715
## Idep3        0.0617  1.0636   0.9834   1.1504
## Idep4        0.1155  1.1225   1.0386   1.2131
## Idep5        0.2153  1.2402   1.1457   1.3425
## IstageB      0.8376  2.3107   1.8633   2.8656
## IstageC      1.8611  6.4307   5.2132   7.9326
## IstageD      3.2165 24.9408  20.1973  30.7985
## Iagegrp1545 -0.2438  0.7837   0.6728   0.9128
## Iagegrp4555 -0.1036  0.9016   0.8184   0.9933
## Iagegrp5565 -0.1283  0.8795   0.8204   0.9430
## Iagegrp75pp  0.3354  1.3985   1.3182   1.4838
## 
## log-likelihood: -22105.3382 (for 16 degree(s) of freedom)
## 
## number of observations: 17867, number of events: 10972

Prediction

mytime <- seq(0.1,7,0.1)

predM1bsbase <- predict(M1bs, time.pts = mytime, 
                        data.val = data.frame(Idep2=0, Idep3=0, Idep4=0, Idep5=0,
                                              IstageB=0, IstageC=0, IstageD=0, 
                                              Iagegrp1545=0, Iagegrp4555=0, Iagegrp5565=0, Iagegrp75pp=0))

predM1bsagegrp4555 <- predict(M1bs, time.pts = mytime, 
                              data.val = data.frame(Idep2=0, Idep3=0, Idep4=0, Idep5=0,
                                                    IstageB=0, IstageC=0, IstageD=0, 
                                                    Iagegrp1545=0, Iagegrp4555=1, Iagegrp5565=0, Iagegrp75pp=0))

predM1bsagegrp75pp <- predict(M1bs, time.pts = mytime, 
                              data.val = data.frame(Idep2=0, Idep3=0, Idep4=0, Idep5=0,
                                                    IstageB=0, IstageC=0, IstageD=0, 
                                                    Iagegrp1545=0, Iagegrp4555=0, Iagegrp5565=0, Iagegrp75pp=1))

# Plot of the baseline excess mortality hazard and the corresponding net survival 
par(mfrow=c(1,2), oma = c(0, 0, 2, 0))
plot(predM1bsbase, which="hazard", xlim=c(0,5),ylim=c(0,0.1), 
     conf.int = T, main="Baseline excess \n mortality hazard", lwd=2)
axis(2,at=seq(0,0.2,by=0.02), tck=1,lty=8, lwd=0.1, labels=F, col="grey")

plot(predM1bsbase, which="surv", xlim=c(0,5), ylim=c(0.8,1), 
     conf.int = T, main="Net survival \n (Reference group)", lwd=2)
axis(2,at=seq(0,1,by=0.05), tck=1,lty=8, lwd=0.1, labels=F, col="grey")
mtext("Model 1 (B-Spline)", outer = TRUE, cex = 1.5)

predM1nsbasens <- predict(M1ns, time.pts = mytime, 
                          data.val = data.frame(Idep2=0, Idep3=0, Idep4=0, Idep5=0,
                                                IstageB=0, IstageC=0, IstageD=0, 
                                                Iagegrp1545=0, Iagegrp4555=0, Iagegrp5565=0, Iagegrp75pp=0))

predM1nsagegrp4555ns <- predict(M1ns, time.pts = mytime, 
                                data.val = data.frame(Idep2=0, Idep3=0, Idep4=0, Idep5=0,
                                                      IstageB=0, IstageC=0, IstageD=0, 
                                                      Iagegrp1545=0, Iagegrp4555=1, Iagegrp5565=0, Iagegrp75pp=0))

predM1nsagegrp75ppns <- predict(M1ns, time.pts = mytime, 
                                data.val = data.frame(Idep2=0, Idep3=0, Idep4=0, Idep5=0,
                                                      IstageB=0, IstageC=0, IstageD=0, 
                                                      Iagegrp1545=0, Iagegrp4555=0, Iagegrp5565=0, Iagegrp75pp=1))

# Plot of the baseline excess mortality hazard and the corresponding net survival 
par(mfrow=c(1,2), oma = c(0, 0, 2, 0))
plot(predM1nsbasens, which="hazard", xlim=c(0,5),ylim=c(0,0.1), 
     conf.int = T, main="Baseline excess \n mortality hazard", lwd=2)
axis(2,at=seq(0,0.2,by=0.02), tck=1,lty=8, lwd=0.1, labels=F, col="grey")

plot(predM1nsbasens, which="surv", xlim=c(0,5), ylim=c(0.8,1), 
     conf.int = T, main="Net survival \n (Reference group)", lwd=2)
axis(2,at=seq(0,1,by=0.05), tck=1,lty=8, lwd=0.1, labels=F, col="grey")
mtext("Model 1 (Restricted Cubic Spline)", outer = TRUE, cex = 1.5)

#re-initialise par() parameters
par(mfrow=c(1,1), oma=c(0,0,0,0))

# Plot of the excess mortality hazard for 3 age-groups
plot(predM1bsbase, which="hazard", xlim=c(0,5),ylim=c(0,0.1), conf.int = F, lwd=2,cex.axis=1.5, cex.lab=1.5)
points(predM1bsagegrp4555, which="hazard", col="red", conf.int = F, lwd=2)
points(predM1bsagegrp75pp, which="hazard", col="blue", conf.int = F, lwd=2)

axis(2,at=seq(0,0.2,by=0.02), tck=1,lty=8, lwd=0.1, labels=F, col="grey")
title("Excess mortality hazard for different age-groups", cex.main=1.5)
legend(0.5, 0.1, c("[65-75[, men, Stage A, Deprivation level 1 (the baseline)", 
                   "[45-55[, men, Stage A, Deprivation level 1", 
                   "[75-++[, men, Stage A, Deprivation level 1"),
       col=c("black", "red", "blue"), lty=c(1,2,4), lwd=3, cex=1.1)

# Plot of the net survival for 3 age-groups
plot(predM1bsbase, which="surv", xlim=c(0,5),ylim=c(0.78,1), conf.int = F, cex.axis=1.5, cex.lab=1.5, lwd=2)
points(predM1bsagegrp4555, which="surv", col="red", conf.int = F, lwd=2)
points(predM1bsagegrp75pp, which="surv", col="blue", conf.int = F, lwd=2)

axis(2,at=seq(0,1,by=0.05), tck=1,lty=8, lwd=0.1, labels=F, col="grey")
title("Net survival for different age-groups", cex.main=1.8)
legend(0.4,0.84,c("[65-75[, men, Stage A, Deprivation level 1 (the baseline)",
                  "[45-55[, men, Stage A, Deprivation level 1",
                  "[75-++[, men, Stage A, Deprivation level 1"),
       col=c("black", "red", "blue"), lty=c(1,2,4), lwd=3, cex=1.1)

# Plot of the EHR for deprivation
HRdep <- data.frame(Estimates=exp(M1bs$coefficients[c("Idep2", "Idep3", "Idep4", "Idep5")]))
HRdep$LowBound <- exp(M1bs$coefficients[c("Idep2", "Idep3", "Idep4", "Idep5")]-qnorm(0.975)*M1bs$std.errors[c("Idep2", "Idep3", "Idep4", "Idep5")])
HRdep$UppBound <- exp(M1bs$coefficients[c("Idep2", "Idep3", "Idep4", "Idep5")]+qnorm(0.975)*M1bs$std.errors[c("Idep2", "Idep3", "Idep4", "Idep5")])

plot(x=seq(1,5), c(1,HRdep$Estimates), xlim=c(1,5), ylim=c(0.5,1.5),
     xlab="Deprivation quintiles", ylab="Excess Hazard Ratio", pch=18, lwd=3, cex.lab=1.5, cex.axis=1.5, cex=2)
axis(2, label=F, at=seq(0.5,2,by=0.1), tck=1,lty=8, lwd=0.1, col="grey")
axis(2, label=F, at=1, tck=1,lty=1, lwd=0.4)
for (i in seq(2,5)){arrows(i,HRdep$LowBound[i-1],i,HRdep$UppBound[i-1],code=3,length=0.2,angle=90,lwd=3)}

# Plot of the EHR for stage 
HRstage <- data.frame(Estimates=exp(M1bs$coefficients[c("IstageB", "IstageC", "IstageD")]))
HRstage$LowBound <- exp(M1bs$coefficients[c("IstageB", "IstageC", "IstageD")]-qnorm(0.975)*M1bs$std.errors[c("IstageB", "IstageC", "IstageD")])
HRstage$UppBound <- exp(M1bs$coefficients[c("IstageB", "IstageC", "IstageD")]+qnorm(0.975)*M1bs$std.errors[c("IstageB", "IstageC", "IstageD")])

plot(seq(1,4), c(1, HRstage$Estimates), xlim=c(1,4), ylim=c(0.9,34),
     xlab="Stage", ylab="Excess Hazard Ratio (log scale)", pch=18, lwd=3, cex.lab=1.5, cex.axis=1.5, cex=2, 
     log="y", xaxt="n")
axis(side=1, at=seq(1,4), labels=c("Stage A", "Stage B", "Stage C", "Stage D"), cex.axis=1.5)
axis(2, label=F, at=c(1,2,seq(4,34,by=2)), tck=1,lty=8, lwd=0.1, col="grey")
axis(2, label=F, at=1, tck=1,lty=1, lwd=0.2)
for (i in seq(1,3)){arrows(i+1,HRstage$LowBound[i],i+1,HRstage$UppBound[i],code=3,length=0.2,angle=90,lwd=3)}

# Plot of the EHR for age-groups
HRageM1bs <- data.frame(Estimates=exp(M1bs$coefficients[c("Iagegrp1545", "Iagegrp4555", "Iagegrp5565", "Iagegrp75pp")]))
HRageM1bs$LowBound <- exp(M1bs$coefficients[c("Iagegrp1545", "Iagegrp4555", "Iagegrp5565", "Iagegrp75pp")]-qnorm(0.975)*M1bs$std.errors[c("Iagegrp1545", "Iagegrp4555", "Iagegrp5565", "Iagegrp75pp")])
HRageM1bs$UppBound <- exp(M1bs$coefficients[c("Iagegrp1545", "Iagegrp4555", "Iagegrp5565", "Iagegrp75pp")]+qnorm(0.975)*M1bs$std.errors[c("Iagegrp1545", "Iagegrp4555", "Iagegrp5565", "Iagegrp75pp")])

plot(0,0, type="n",xlab="Age at diagnosis", ylab="Excess Hazard Ratio ",
     xlim=c(15,99), ylim=c(0.4,2), lwd=3, cex.lab=1.5, cex.axis=1.5)
x.step<-c(15,45,55,65,75,99)
n.step<-length(x.step)
y.step<-c(HRageM1bs[,c("Estimates")][1:3], 1, HRageM1bs[,c("Estimates")][4])
for (j in 1:(n.step-1))
{lines(c(x.step[j],x.step[j+1]),c(y.step[j],y.step[j]), lwd=5 ,col=c("orange", "red", "purple", "black", "blue")[j])
}
axis(2, label=F, at=seq(0.5,1.9,by=0.1), tck=1,lty=8, lwd=0.1, col="grey")
axis(1, label=F, at=c(15,45,55,65,75,99), tck=1,lty=8, lwd=0.2, col="grey")
axis(2, label=F, at=1, tck=1,lty=1, lwd=0.2)
title("Excess hazard ratios for each age-group", cex.main=1.8)

Second Model: Age analysed assuming a linear effect

We fit a model M2bs assuming a cubic B-spline function for the (log of the) baseline excess hazard (arguments base="exp.bs" and degree=3), with one knot located at 1 year (argument knots to specify the location of the knot(s)). The model M2bs assumes proportional (i.e. constant over time) effects of the covariables dep and stage, and a linear and proportional effect of continuous age. The excess mortality hazard fitted is of the form λE(t;x)=λ0(t)exp(i=24αistagei+i=25βidepi+θagec). We also fit a model M2ns assuming a cubic restricted spline function for the (log of the) baseline excess hazard (using arguments base="exp.ns" and degree=3), with three interior knots located at 1, 3 and 5 years (using argument knots to specify the location of the knot(s)). The excess mortality hazard fitted is of the form λE(t;x)=λ0ns(t)exp(i=24αistagei+i=25βidepi+θagec).

Estimation

M2bs <- mexhaz(Surv(timey, dead)~Idep2+Idep3+Idep4+Idep5+IstageB+IstageC+IstageD+agediagc, 
               data=temp, base="exp.bs", degree=3, knots=c(1), verbose = 0,  expected="rate")
## iteration = 0
## Step:
##  [1] 0 0 0 0 0 0 0 0 0 0 0 0 0
## Parameter:
##  [1] -1 -1 -1 -1 -1  0  0  0  0  0  0  0  0
## Function Value
## [1] 25865.62
## Gradient:
##  [1]   1250.65358     60.29418    712.41550    826.00875    448.79197
##  [6]    390.35978    280.70016    149.20958     26.07932   2255.47749
## [11]   -149.00131  -1707.73004 -11952.96934
## 
## iteration = 136
## Parameter:
##  [1] -2.051158417 -1.864230192 -0.716370111 -3.969809881 -2.927287512
##  [6] -0.004077663  0.066438264  0.118533036  0.216144241  0.840944372
## [11]  1.840059212  3.183591187  0.017833676
## Function Value
## [1] 21880.73
## Gradient:
##  [1] -7.981297e-04  4.039531e-04 -4.001777e-05 -3.583168e-04 -9.569418e-05
##  [6]  4.438334e-04  2.386514e-03 -2.706656e-03 -6.511982e-04 -4.256435e-04
## [11] -2.669083e-04 -9.141824e-06 -7.919880e-03
## 
## Gradient relatif proche de zéro.
## L'itération courante est probablement la solution.
## 
## Computation of the Hessian
## 
## Data
##  Name N.Obs.Tot N.Obs N.Events N.Clust
##  temp     17867 17867    10972       1
## 
## Details
##  Iter Eval   Base Nb.Leg Nb.Aghq Optim Method Code    LogLik Total.Time
##   136 1684 exp.bs     20      10   nlm    ---    1 -21880.73      58.59
summary(M2bs)
## Call:
## mexhaz(formula = Surv(timey, dead) ~ Idep2 + Idep3 + Idep4 + 
##     Idep5 + IstageB + IstageC + IstageD + agediagc, data = temp, 
##     expected = "rate", base = "exp.bs", degree = 3, knots = c(1), 
##     verbose = 0)
## 
## Coefficients:
##             Estimate     StdErr  t.value   p.value    
## Intercept -2.0511584  0.1107754 -18.5164 < 2.2e-16 ***
## BS3.1     -1.8642302  0.0630470 -29.5689 < 2.2e-16 ***
## BS3.2     -0.7163701  0.1265172  -5.6622 1.517e-08 ***
## BS3.3     -3.9698099  0.2535376 -15.6577 < 2.2e-16 ***
## BS3.4     -2.9272875  0.3218373  -9.0956 < 2.2e-16 ***
## Idep2     -0.0040777  0.0396009  -0.1030  0.917989    
## Idep3      0.0664383  0.0397208   1.6726  0.094417 .  
## Idep4      0.1185330  0.0393795   3.0100  0.002616 ** 
## Idep5      0.2161442  0.0401820   5.3791 7.579e-08 ***
## IstageB    0.8409444  0.1069828   7.8606 4.041e-15 ***
## IstageC    1.8400592  0.1044068  17.6239 < 2.2e-16 ***
## IstageD    3.1835912  0.1049839  30.3246 < 2.2e-16 ***
## agediagc   0.0178337  0.0012109  14.7277 < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Hazard ratios (for proportional effect variables):
##             Coef      HR CI.lower CI.upper
## Idep2    -0.0041  0.9959   0.9215   1.0763
## Idep3     0.0664  1.0687   0.9886   1.1552
## Idep4     0.1185  1.1258   1.0422   1.2162
## Idep5     0.2161  1.2413   1.1473   1.3430
## IstageB   0.8409  2.3186   1.8800   2.8595
## IstageC   1.8401  6.2969   5.1316   7.7269
## IstageD   3.1836 24.1333  19.6448  29.6473
## agediagc  0.0178  1.0180   1.0156   1.0204
## 
## log-likelihood: -21880.7301 (for 13 degree(s) of freedom)
## 
## number of observations: 17867, number of events: 10972
M2ns <- mexhaz(Surv(timey, dead)~Idep2+Idep3+Idep4+Idep5+IstageB+IstageC+IstageD+agediagc, 
               data=temp, base="exp.ns", degree=3, knots=c(1, 3, 5), verbose = 0, expected="rate")
## iteration = 0
## Step:
##  [1] 0 0 0 0 0 0 0 0 0 0 0 0 0
## Parameter:
##  [1] -1 -1 -1 -1 -1  0  0  0  0  0  0  0  0
## Function Value
## [1] 27264.32
## Gradient:
##  [1]   5954.3076   1571.3816    206.2587   2198.5810  -1038.6589
##  [6]   1438.2777   1251.4660   1077.5460    803.3186   4501.5014
## [11]   1471.5020  -1489.3667 -17972.3525
## 
## iteration = 142
## Parameter:
##  [1] -2.597302235 -0.669652784 -2.313216320 -4.069920815 -1.514917628
##  [6] -0.006563437  0.062898402  0.117186398  0.218510063  0.833271022
## [11]  1.851551152  3.212173727  0.017925163
## Function Value
## [1] 22113.4
## Gradient:
##  [1] -0.0006961360 -0.0001273293  0.0001431151 -0.0006248149  0.0003313983
##  [6] -0.0001200533 -0.0002801244  0.0003165042 -0.0002728484 -0.0001055014
## [11] -0.0001611159 -0.0002763446  0.0108811946
## 
## Gradient relatif proche de zéro.
## L'itération courante est probablement la solution.
## 
## Computation of the Hessian
## 
## Data
##  Name N.Obs.Tot N.Obs N.Events N.Clust
##  temp     17867 17867    10972       1
## 
## Details
##  Iter Eval   Base Nb.Leg Nb.Aghq Optim Method Code   LogLik Total.Time
##   142 1762 exp.ns     20      10   nlm    ---    1 -22113.4      84.95
summary(M2ns)
## Call:
## mexhaz(formula = Surv(timey, dead) ~ Idep2 + Idep3 + Idep4 + 
##     Idep5 + IstageB + IstageC + IstageD + agediagc, data = temp, 
##     expected = "rate", base = "exp.ns", degree = 3, knots = c(1, 
##         3, 5), verbose = 0)
## 
## Coefficients:
##             Estimate     StdErr  t.value   p.value    
## Intercept -2.5973022  0.1111495 -23.3676 < 2.2e-16 ***
## NS3.1     -0.6696528  0.0709573  -9.4374 < 2.2e-16 ***
## NS3.2     -2.3132163  0.1378527 -16.7804 < 2.2e-16 ***
## NS3.3     -4.0699208  0.1726073 -23.5791 < 2.2e-16 ***
## NS3.4     -1.5149176  0.2933062  -5.1650 2.431e-07 ***
## Idep2     -0.0065634  0.0397768  -0.1650  0.868940    
## Idep3      0.0628984  0.0399127   1.5759  0.115066    
## Idep4      0.1171864  0.0395657   2.9618  0.003062 ** 
## Idep5      0.2185101  0.0403506   5.4153 6.198e-08 ***
## IstageB    0.8332710  0.1090335   7.6423 2.241e-14 ***
## IstageC    1.8515512  0.1063132  17.4160 < 2.2e-16 ***
## IstageD    3.2121737  0.1068601  30.0596 < 2.2e-16 ***
## agediagc   0.0179252  0.0012172  14.7271 < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Hazard ratios (for proportional effect variables):
##             Coef      HR CI.lower CI.upper
## Idep2    -0.0066  0.9935   0.9189   1.0740
## Idep3     0.0629  1.0649   0.9848   1.1516
## Idep4     0.1172  1.1243   1.0404   1.2150
## Idep5     0.2185  1.2442   1.1496   1.3466
## IstageB   0.8333  2.3008   1.8581   2.8491
## IstageC   1.8516  6.3697   5.1715   7.8455
## IstageD   3.2122 24.8330  20.1402  30.6193
## agediagc  0.0179  1.0181   1.0157   1.0205
## 
## log-likelihood: -22113.3956 (for 13 degree(s) of freedom)
## 
## number of observations: 17867, number of events: 10972

Prediction

# plot linear effect of age (on log scale) 
ageplot=(seq(15,90)-70)
binf.agediag=as.numeric(round(quantile(temp$agediag, na.rm=T, probs = c(0, 0.025, 0.25, 0.5, 0.75, 0.975, 1))[2],2))
bsup.agediag=as.numeric(round(quantile(temp$agediag, na.rm=T, probs = c(0, 0.025, 0.25, 0.5, 0.75, 0.975, 1))[6],2))


HRageM2bs <- exp(ageplot*M2bs$coefficients[c("agediagc")])
HRageM2ns <- exp(ageplot*M2ns$coefficients[c("agediagc")])
# Difference in EHR: 
round(HRageM2bs-HRageM2ns,3)
##  [1]  0.002  0.002  0.002  0.002  0.002  0.002  0.002  0.002  0.002  0.002
## [11]  0.002  0.002  0.002  0.002  0.002  0.002  0.002  0.002  0.002  0.002
## [21]  0.002  0.002  0.002  0.002  0.002  0.002  0.002  0.002  0.002  0.001
## [31]  0.001  0.001  0.001  0.001  0.001  0.001  0.001  0.001  0.001  0.001
## [41]  0.001  0.001  0.001  0.001  0.001  0.001  0.001  0.001  0.001  0.000
## [51]  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000
## [61] -0.001 -0.001 -0.001 -0.001 -0.001 -0.001 -0.001 -0.001 -0.002 -0.002
## [71] -0.002 -0.002 -0.002 -0.002 -0.002 -0.003
plot(ageplot+70, HRageM2bs, log="y", type="l",xlab="Age at diagnosis", ylab="Excess Hazard Ratio (log scale)",
     xlim=c(binf.agediag,bsup.agediag), ylim=c(0.4,2), lwd=3, cex.lab=1.5, cex.axis=1.5)
lines(ageplot+70, HRageM2ns, lwd=3, lty=8, col="blue")
axis(2, label=F, at=seq(0.4,2,by=0.1), tck=1,lty=8, lwd=0.1, col="grey")
axis(2, label=F, at=1, tck=1,lty=1, lwd=0.5)
axis(1, label=F, at=70, tck=1,lty=8, lwd=0.1)
legend(50, 2, c("Cubic B-spline", "Restricted Cubic Spline"),
       col=c("black", "blue"), lty=c(1,8), lwd=3, cex=1.3)

Third Model: Age analysed assuming a Non-linear effect

We fit a model M3bs assuming a cubic B-spline function for the (log of the) baseline excess hazard (arguments base="exp.bs" and degree=3), with one knot located at 1 year (argument knots to specify the location of the knot(s)). The model M3bs assumes proportional (i.e. constant over time) effects of the covariables dep and stage, and a non-linear and proportional effect of continuous age, by including 2 supplemental terms for age in the linear predictor (through the function f. The excess mortality hazard fitted is of the form λE(t;x)=λ0(t)exp(i=24αistagei+i=25βidepi+βaagec+f(agec)). We also fit this model assuming a restrcited cubic spline for the (log of the) baseline excess hazard: M3ns.

Estimation

M3bs <- mexhaz(Surv(timey, dead)~Idep2+Idep3+Idep4+Idep5+IstageB+IstageC+IstageD+agediagc+ agediagc2+ agediagc2plus, 
               data=temp, base="exp.bs", degree=3, knots=c(1), verbose = 0, expected="rate")
## iteration = 0
## Step:
##  [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## Parameter:
##  [1] -1 -1 -1 -1 -1  0  0  0  0  0  0  0  0  0  0
## Function Value
## [1] 25865.62
## Gradient:
##  [1]   1250.65358     60.29418    712.41550    826.00875    448.79197
##  [6]    390.35978    280.70016    149.20958     26.07932   2255.47749
## [11]   -149.00131  -1707.73004 -11952.96934 124622.70458 -40400.48729
## 
## iteration = 246
## Parameter:
##  [1] -2.1201514937 -1.8587898394 -0.7004712283 -3.9634807198 -2.9253303092
##  [6] -0.0094254953  0.0633182359  0.1129814002  0.2168466202  0.8221389681
## [11]  1.8261687421  3.1684398685  0.0153299290  0.0003192303  0.0009137206
## Function Value
## [1] 21840.06
## Gradient:
##  [1] -8.584038e-03 -1.800484e-03 -3.232526e-04 -1.393210e-03 -1.490719e-04
##  [6] -8.061761e-05 -9.102405e-04 -1.868466e-03 -6.879054e-04 -1.819099e-03
## [11] -3.072553e-03 -3.797555e-03 -4.119713e-01  4.919920e+02  2.181588e+01
## 
## Le dernier pas global n'a pas pu localiser un point plus bas que x.
## Soit x est un mimimum local approximatif de la fonction,
## soit la fonction est par trop non linéaire pour cet algorithme,
## soit steptol est trop large.
## 
## Computation of the Hessian
## 
## Data
##  Name N.Obs.Tot N.Obs N.Events N.Clust
##  temp     17867 17867    10972       1
## 
## Details
##  Iter Eval   Base Nb.Leg Nb.Aghq Optim Method Code    LogLik Total.Time
##   246 3367 exp.bs     20      10   nlm    ---    3 -21840.06      118.4
summary(M3bs)
## Call:
## mexhaz(formula = Surv(timey, dead) ~ Idep2 + Idep3 + Idep4 + 
##     Idep5 + IstageB + IstageC + IstageD + agediagc + agediagc2 + 
##     agediagc2plus, data = temp, expected = "rate", base = "exp.bs", 
##     degree = 3, knots = c(1), verbose = 0)
## 
## Coefficients:
##                  Estimate      StdErr  t.value   p.value    
## Intercept     -2.12015149  0.11060372 -19.1689 < 2.2e-16 ***
## BS3.1         -1.85878984  0.06284037 -29.5795 < 2.2e-16 ***
## BS3.2         -0.70047123  0.12676401  -5.5258 3.326e-08 ***
## BS3.3         -3.96348072  0.25464698 -15.5646 < 2.2e-16 ***
## BS3.4         -2.92533031  0.32447606  -9.0156 < 2.2e-16 ***
## Idep2         -0.00942550  0.03953574  -0.2384  0.811570    
## Idep3          0.06331824  0.03963821   1.5974  0.110193    
## Idep4          0.11298140  0.03932639   2.8729  0.004072 ** 
## Idep5          0.21684662  0.04014884   5.4011 6.709e-08 ***
## IstageB        0.82213897  0.10652390   7.7179 1.245e-14 ***
## IstageC        1.82616874  0.10393854  17.5697 < 2.2e-16 ***
## IstageD        3.16843987  0.10452984  30.3113 < 2.2e-16 ***
## agediagc       0.01532993  0.00271382   5.6488 1.640e-08 ***
## agediagc2      0.00031923  0.00010284   3.1042  0.001911 ** 
## agediagc2plus  0.00091372  0.00029399   3.1080  0.001887 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Hazard ratios (for proportional effect variables):
##                  Coef      HR CI.lower CI.upper
## Idep2         -0.0094  0.9906   0.9168   1.0704
## Idep3          0.0633  1.0654   0.9857   1.1514
## Idep4          0.1130  1.1196   1.0365   1.2093
## Idep5          0.2168  1.2422   1.1481   1.3439
## IstageB        0.8221  2.2754   1.8466   2.8037
## IstageC        1.8262  6.2100   5.0654   7.6133
## IstageD        3.1684 23.7704  19.3666  29.1755
## agediagc       0.0153  1.0154   1.0101   1.0209
## agediagc2      0.0003  1.0003   1.0001   1.0005
## agediagc2plus  0.0009  1.0009   1.0003   1.0015
## 
## log-likelihood: -21840.0601 (for 15 degree(s) of freedom)
## 
## number of observations: 17867, number of events: 10972
M3ns <- mexhaz(Surv(timey, dead)~Idep2+Idep3+Idep4+Idep5+IstageB+IstageC+IstageD+agediagc+ agediagc2+ agediagc2plus, 
               data=temp, base="exp.ns", degree=3, knots=c(1, 3, 5), verbose = 0, expected="rate")
## iteration = 0
## Step:
##  [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## Parameter:
##  [1] -1 -1 -1 -1 -1  0  0  0  0  0  0  0  0  0  0
## Function Value
## [1] 27264.32
## Gradient:
##  [1]   5954.3076   1571.3816    206.2587   2198.5810  -1038.6589
##  [6]   1438.2777   1251.4660   1077.5460    803.3186   4501.5014
## [11]   1471.5020  -1489.3667 -17972.3525 668084.6915 132240.5401
## 
## iteration = 251
## Parameter:
##  [1] -2.6616493634 -0.6580455548 -2.3148111378 -4.0675392208 -1.5140238567
##  [6] -0.0123443323  0.0596319177  0.1111507060  0.2186027457  0.8142118804
## [11]  1.8365222025  3.1957520438  0.0154776232  0.0003205005  0.0009259950
## Function Value
## [1] 22072.98
## Gradient:
##  [1] -9.567696e-06  3.637979e-06 -4.714828e-06 -8.943930e-06  0.000000e+00
##  [6] -7.275958e-06 -3.637979e-06 -3.637979e-06 -3.637979e-06 -3.637979e-06
## [11] -9.904533e-06 -4.553518e-06  4.365575e-05 -2.110028e-03 -4.001777e-04
## 
## Gradient relatif proche de zéro.
## L'itération courante est probablement la solution.
## 
## Computation of the Hessian
## 
## Data
##  Name N.Obs.Tot N.Obs N.Events N.Clust
##  temp     17867 17867    10972       1
## 
## Details
##  Iter Eval   Base Nb.Leg Nb.Aghq Optim Method Code    LogLik Total.Time
##   251 3330 exp.ns     20      10   nlm    ---    1 -22072.98     168.22
summary(M3ns)
## Call:
## mexhaz(formula = Surv(timey, dead) ~ Idep2 + Idep3 + Idep4 + 
##     Idep5 + IstageB + IstageC + IstageD + agediagc + agediagc2 + 
##     agediagc2plus, data = temp, expected = "rate", base = "exp.ns", 
##     degree = 3, knots = c(1, 3, 5), verbose = 0)
## 
## Coefficients:
##                  Estimate      StdErr  t.value   p.value    
## Intercept     -2.66164936  0.11090745 -23.9988 < 2.2e-16 ***
## NS3.1         -0.65804555  0.07110720  -9.2543 < 2.2e-16 ***
## NS3.2         -2.31481114  0.13874410 -16.6840 < 2.2e-16 ***
## NS3.3         -4.06753922  0.17378921 -23.4050 < 2.2e-16 ***
## NS3.4         -1.51402386  0.29561326  -5.1216 3.060e-07 ***
## Idep2         -0.01234433  0.03970346  -0.3109  0.755870    
## Idep3          0.05963192  0.03981868   1.4976  0.134258    
## Idep4          0.11115071  0.03950365   2.8137  0.004903 ** 
## Idep5          0.21860275  0.04030874   5.4232 5.930e-08 ***
## IstageB        0.81421188  0.10850816   7.5037 6.496e-14 ***
## IstageC        1.83652220  0.10578403  17.3611 < 2.2e-16 ***
## IstageD        3.19575204  0.10634660  30.0503 < 2.2e-16 ***
## agediagc       0.01547762  0.00273025   5.6689 1.459e-08 ***
## agediagc2      0.00032050  0.00010311   3.1085  0.001883 ** 
## agediagc2plus  0.00092600  0.00029742   3.1134  0.001852 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Hazard ratios (for proportional effect variables):
##                  Coef      HR CI.lower CI.upper
## Idep2         -0.0123  0.9877   0.9138   1.0677
## Idep3          0.0596  1.0614   0.9818   1.1476
## Idep4          0.1112  1.1176   1.0343   1.2075
## Idep5          0.2186  1.2443   1.1498   1.3466
## IstageB        0.8142  2.2574   1.8249   2.7924
## IstageC        1.8365  6.2747   5.0997   7.7204
## IstageD        3.1958 24.4285  19.8321  30.0903
## agediagc       0.0155  1.0156   1.0102   1.0210
## agediagc2      0.0003  1.0003   1.0001   1.0005
## agediagc2plus  0.0009  1.0009   1.0003   1.0015
## 
## log-likelihood: -22072.9753 (for 15 degree(s) of freedom)
## 
## number of observations: 17867, number of events: 10972

Prediction

HRageM3bs <- exp(ageplot*M3bs$coefficients[c("agediagc")] +
                   ageplot^2*M3bs$coefficients[c("agediagc2")] +
                   ageplot^2*M3bs$coefficients[c("agediagc2plus")]*(ageplot>0) )
HRageM3ns <- exp(ageplot*M3ns$coefficients[c("agediagc")] +
                   ageplot^2*M3ns$coefficients[c("agediagc2")] +
                   ageplot^2*M3ns$coefficients[c("agediagc2plus")]*(ageplot>0) )
# Difference in EHR: 
round(HRageM3bs-HRageM3ns,3)
##  [1]  0.005  0.005  0.005  0.005  0.004  0.004  0.004  0.004  0.004  0.004
## [11]  0.004  0.004  0.004  0.004  0.004  0.003  0.003  0.003  0.003  0.003
## [21]  0.003  0.003  0.003  0.003  0.003  0.003  0.003  0.003  0.003  0.002
## [31]  0.002  0.002  0.002  0.002  0.002  0.002  0.002  0.002  0.002  0.002
## [41]  0.002  0.002  0.001  0.001  0.001  0.001  0.001  0.001  0.001  0.001
## [51]  0.001  0.001  0.000  0.000  0.000  0.000  0.000  0.000 -0.001 -0.001
## [61] -0.001 -0.002 -0.002 -0.003 -0.003 -0.004 -0.004 -0.005 -0.006 -0.007
## [71] -0.009 -0.010 -0.012 -0.014 -0.016 -0.019
varcovnlin <- M3bs$vcov[c("agediagc", "agediagc2plus", "agediagc2plus"), c("agediagc", "agediagc2plus", "agediagc2plus")]

# Confidence Intervals of Nlin effect
vary=NULL
for (i in c(ageplot)){vary <-c(vary, c(i, i^2, (i>0)*i^2) %*% varcovnlin %*% c(i, i^2, (i>0)*i^2))}
yLB <- HRageM3bs*exp(-qnorm(0.975)*sqrt(vary))
yUB <- HRageM3bs*exp(qnorm(0.975)*sqrt(vary))


plot(ageplot+70, HRageM3bs, log="y", type="n",xlab="Age at diagnosis", ylab="Excess Hazard Ratio (log scale)", 
     xlim=c(binf.agediag,bsup.agediag), ylim=c(0.4,2), lwd=3, cex.lab=1.5, cex.axis=1.5, col="blue")
polygon(c(ageplot+70,rev(ageplot+70)),c(yUB,rev(yLB)), col=grey(0.8), border=F) 
lines(ageplot+70, HRageM3bs, lwd=3)
lines(ageplot+70, HRageM3ns, lwd=3, lty=8, col="blue")
axis(2, label=F, at=seq(0.4,2,by=0.1), tck=1,lty=8, lwd=0.1, col="grey")
axis(2, label=F, at=1, tck=1,lty=1, lwd=0.2)
axis(1, label=F, at=70, tck=1,lty=1, lwd=0.2)
legend(50, 2, c("Cubic B-spline", "Restricted Cubic Spline"),
       col=c("black", "blue"), lty=c(1,8), lwd=3, cex=1.3)

Comparison between first, second and third model

# Plot comparing the effect of age at diagnosis according to the different models: Categorical vs linear vs non linear

plot(ageplot+70, HRageM2bs, log="y", type="l",xlab="Age at diagnosis", ylab="Excess Hazard Ratio (log scale)",
     xlim=c(binf.agediag,bsup.agediag), ylim=c(0.4,2), lwd=3, cex.lab=1.5, cex.axis=1.5)

lines(ageplot+70, HRageM3bs, type="l", lwd=3, col="blue")

x.step<-c(15,45,55,65,75,90)
n.step<-length(x.step)
y.step<-c(HRageM1bs[,c("Estimates")][1:3], 1, HRageM1bs[,c("Estimates")][4])
for (j in 1:(n.step-1))
{lines(c(x.step[j],x.step[j+1]),c(y.step[j],y.step[j]),lwd=3,col="red")
}
axis(2, label=F, at=seq(0.4,2,by=0.1), tck=1,lty=8, lwd=0.1, col="grey")
axis(2, label=F, at=1, tck=1,lty=1, lwd=0.2)
axis(1, label=F, at=70, tck=1,lty=8, lwd=0.2)
legend(45,2,c("Effect of age with age-classes", "Linear effect of age", "Non-Linear effect of age"),
       col=c("red", "black", "blue"), lty=1, lwd=3,cex=1.1, bg="white")

Fourth Model: Age analysed assuming a Non-linear and a Time-dependent effect

We fit a model M4bs assuming a cubic B-spline function for the (log of the ) baseline excess hazard (arguments base="exp.bs" and degree=3), with one knot located at 1 year (argument knots to specify the location of the knot(s)). The model M4bs assumes proportional (i.e. constant over time) effects of the covariables dep and stage, and a non-linear and time-dependent effect of continuous age. The excess mortality hazard fitted is of the form λE(t;x)=λ0(t)exp(i=24αistagei+i=25βidepi+βa(t)agec+f(agec)). To fit this very complicated model, it’s useful to speed-up the optimisation process and to ensure reaching convergence to start by fitting a model for the overall hazard, and to use the parameters estimated from this model as initial values for the excess hazard regression model.

Estimation

M4bsoverall <- mexhaz(Surv(timey, dead)~Idep2+Idep3+Idep4+Idep5+IstageB+IstageC+IstageD+
                        agediagc+ agediagc2+ agediagc2plus+ nph(agediagc), 
                      data=temp, base="exp.bs", degree=3, knots=c(1), verbose=0)
## iteration = 0
## Step:
##  [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## Parameter:
##  [1] -1 -1 -1 -1 -1  0  0  0  0  0  0  0  0  0  0  0  0  0  0
## Function Value
## [1] 29263.43
## Gradient:
##  [1]   -1454.52510   -1009.66127      44.08982     412.48073     292.48885
##  [6]    -134.46506    -284.92816    -483.78169    -523.70191    1277.60311
## [11]   -1193.44837   -2196.44311  -34454.89916 -229928.59448 -374049.44891
## [16]   -8855.78410   -6096.80895   -5680.60689   -2912.08740
## 
## iteration = 282
## Parameter:
##  [1] -1.5007237982 -1.4781714121 -0.8149265313 -2.4986976124 -1.7847442126
##  [6]  0.0253381932  0.0951811483  0.1663045151  0.2726671179  0.3545031956
## [11]  1.0322656441  2.2815964098  0.0629500273  0.0006338815  0.0002258313
## [16] -0.0455737374 -0.0452616366  0.0131342679  0.0075273268
## Function Value
## [1] 24965.59
## Gradient:
##  [1]  2.666564e-05 -2.461135e-06 -3.637979e-06  1.019165e-05  6.115126e-06
##  [6] -4.729372e-05 -6.912160e-05  1.200533e-04  2.910383e-05 -2.546585e-05
## [11]  2.819413e-05  1.913386e-05  1.946319e-03 -8.618372e-03  2.314846e-02
## [16]  4.947651e-04  1.127773e-04  4.401954e-04  3.310561e-04
## 
## Gradient relatif proche de zéro.
## L'itération courante est probablement la solution.
## 
## Computation of the Hessian
## 
## Data
##  Name N.Obs.Tot N.Obs N.Events N.Clust
##  temp     17867 17867    10972       1
## 
## Details
##  Iter Eval   Base Nb.Leg Nb.Aghq Optim Method Code    LogLik Total.Time
##   282 4423 exp.bs     20      10   nlm    ---    1 -24965.59     267.34
initial <- round(M4bsoverall$coeff,2)

M4bs <- mexhaz(Surv(timey, dead)~Idep2+Idep3+Idep4+Idep5+IstageB+IstageC+IstageD+
                 agediagc+ agediagc2+ agediagc2plus+ nph(agediagc), 
               data=temp, base="exp.bs", degree=3, knots=c(1), verbose=0,  expected="rate", init=initial)
## iteration = 0
## Step:
##  [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## Parameter:
##  [1] -1.50 -1.48 -0.81 -2.50 -1.78  0.03  0.10  0.17  0.27  0.35  1.03
## [12]  2.28  0.06  0.00  0.00 -0.05 -0.05  0.01  0.01
## Function Value
## [1] 22169.74
## Gradient:
##  [1]   1309.1132    432.1932    431.9112    343.4226    153.0077
##  [6]    269.6434    279.1382    301.1285    252.0926    788.7615
## [11]    413.9940   -117.1938   8363.2069 -10558.9898  69174.5661
## [16]   3533.2166   2897.2204   1989.5366    811.7438
## 
## iteration = 256
## Parameter:
##  [1] -2.191920461 -1.692536222 -0.727363895 -3.733211392 -3.060946755
##  [6] -0.013513911  0.066841749  0.116197303  0.218633102  0.829855130
## [11]  1.810852104  3.142993552  0.064570026  0.000177123  0.000328597
## [16] -0.066693262 -0.056157804 -0.060386972 -0.065451875
## Function Value
## [1] 21689.32
## Gradient:
##  [1] -2.224028e-04 -7.522986e-05  1.455192e-05 -4.775003e-05 -1.212285e-04
##  [6] -3.201421e-04 -7.639755e-05 -1.273293e-04  3.274181e-04 -9.094947e-05
## [11] -8.839544e-05 -1.261662e-04 -1.342414e-03 -7.305061e-03 -1.066655e-02
## [16] -1.164153e-04 -7.057679e-04 -2.837623e-04 -2.437446e-04
## 
## Gradient relatif proche de zéro.
## L'itération courante est probablement la solution.
## 
## Computation of the Hessian
## 
## Data
##  Name N.Obs.Tot N.Obs N.Events N.Clust
##  temp     17867 17867    10972       1
## 
## Details
##  Iter Eval   Base Nb.Leg Nb.Aghq Optim Method Code    LogLik Total.Time
##   256 3778 exp.bs     20      10   nlm    ---    1 -21689.32     220.77
summary(M4bs)
## Call:
## mexhaz(formula = Surv(timey, dead) ~ Idep2 + Idep3 + Idep4 + 
##     Idep5 + IstageB + IstageC + IstageD + agediagc + agediagc2 + 
##     agediagc2plus + nph(agediagc), data = temp, expected = "rate", 
##     base = "exp.bs", degree = 3, knots = c(1), init = initial, 
##     verbose = 0)
## 
## Coefficients:
##                   Estimate      StdErr  t.value   p.value    
## Intercept      -2.19192046  0.10979383 -19.9640 < 2.2e-16 ***
## BS3.1          -1.69253622  0.06564902 -25.7816 < 2.2e-16 ***
## BS3.2          -0.72736390  0.13495625  -5.3896 7.150e-08 ***
## BS3.3          -3.73321139  0.28372563 -13.1578 < 2.2e-16 ***
## BS3.4          -3.06094675  0.39966416  -7.6588 1.972e-14 ***
## Idep2          -0.01351391  0.03970917  -0.3403  0.733618    
## Idep3           0.06684175  0.03976988   1.6807  0.092836 .  
## Idep4           0.11619730  0.03944915   2.9455  0.003229 ** 
## Idep5           0.21863310  0.04024356   5.4327 5.622e-08 ***
## IstageB         0.82985513  0.10508461   7.8970 3.021e-15 ***
## IstageC         1.81085210  0.10254793  17.6586 < 2.2e-16 ***
## IstageD         3.14299355  0.10307628  30.4919 < 2.2e-16 ***
## agediagc        0.06457003  0.00434106  14.8742 < 2.2e-16 ***
## agediagc2       0.00017712  0.00010463   1.6929  0.090489 .  
## agediagc2plus   0.00032860  0.00030008   1.0950  0.273525    
## agediagc*BS3.1 -0.06669326  0.00598991 -11.1343 < 2.2e-16 ***
## agediagc*BS3.2 -0.05615780  0.01137211  -4.9382 7.956e-07 ***
## agediagc*BS3.3 -0.06038697  0.02266408  -2.6644  0.007719 ** 
## agediagc*BS3.4 -0.06545187  0.02916930  -2.2439  0.024854 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Hazard ratios (for proportional effect variables):
##                  Coef      HR CI.lower CI.upper
## Idep2         -0.0135  0.9866   0.9127   1.0664
## Idep3          0.0668  1.0691   0.9890   1.1558
## Idep4          0.1162  1.1232   1.0396   1.2135
## Idep5          0.2186  1.2444   1.1500   1.3465
## IstageB        0.8299  2.2930   1.8662   2.8174
## IstageC        1.8109  6.1157   5.0021   7.4772
## IstageD        3.1430 23.1731  18.9339  28.3615
## agediagc       0.0646  1.0667   1.0577   1.0758
## agediagc2      0.0002  1.0002   1.0000   1.0004
## agediagc2plus  0.0003  1.0003   0.9997   1.0009
## 
## log-likelihood: -21689.3213 (for 19 degree(s) of freedom)
## 
## number of observations: 17867, number of events: 10972

Prediction

myagediag <- 50
myagediagc <- myagediag-70 
myagediagc2 <- myagediagc^2 
myagediagc2plus <- (myagediagc-0)^2*(myagediagc>0)
predM4bsage50 <- predict.mexhaz(M4bs, time.pts = mytime, 
                                data.val = data.frame(Idep2=0, Idep3=0, Idep4=0, Idep5=0,
                                                      IstageB=0, IstageC=0, IstageD=0, 
                                                      agediagc=myagediagc, agediagc2=myagediagc2, agediagc2plus=myagediagc2plus))
myagediag <- 60
myagediagc <- myagediag-70 
myagediagc2 <- myagediagc^2 
myagediagc2plus <- (myagediagc-0)^2*(myagediagc>0)
predM4bsage60 <- predict.mexhaz(M4bs, time.pts = mytime, 
                                data.val = data.frame(Idep2=0, Idep3=0, Idep4=0, Idep5=0,
                                                      IstageB=0, IstageC=0, IstageD=0, 
                                                      agediagc=myagediagc, agediagc2=myagediagc2, agediagc2plus=myagediagc2plus))
myagediag <- 70
myagediagc <- myagediag-70 
myagediagc2 <- myagediagc^2 
myagediagc2plus <- (myagediagc-0)^2*(myagediagc>0)
predM4bsage70 <- predict.mexhaz(M4bs, time.pts = mytime, 
                                data.val = data.frame(Idep2=0, Idep3=0, Idep4=0, Idep5=0,
                                                      IstageB=0, IstageC=0, IstageD=0, 
                                                      agediagc=myagediagc, agediagc2=myagediagc2, agediagc2plus=myagediagc2plus))

myagediag <- 80
myagediagc <- myagediag-70 
myagediagc2 <- myagediagc^2 
myagediagc2plus <- (myagediagc-0)^2*(myagediagc>0)
predM4bsage80 <- predict.mexhaz(M4bs, time.pts = mytime, 
                                data.val = data.frame(Idep2=0, Idep3=0, Idep4=0, Idep5=0,
                                                      IstageB=0, IstageC=0, IstageD=0, 
                                                      agediagc=myagediagc, agediagc2=myagediagc2, agediagc2plus=myagediagc2plus))
par(mar=c(5,4,4,2)+0.1)
plot(0,1, xlim=c(0,5),ylim=c(0,0.15), type="n",
     xlab="Time since diagnosis",ylab="Excess Hazard",cex.lab=1.5, cex.axis=1.5, cex=1.5)
points(predM4bsage50, which="hazard", lwd=3, col="green", conf.int = F)
points(predM4bsage60, which="hazard", lwd=3, col="orange",lty.pe=4, conf.int = F)
points(predM4bsage70, which="hazard", lwd=3, col="blue",lty.pe=2, conf.int = F)
points(predM4bsage80, which="hazard", lwd=3, col="red",lty.pe=6, conf.int = F)
axis(2,at=seq(0,0.25,by=0.05), tck=1,lty=8, lwd=0.1, labels=F, col="grey")
title("Estimates of the excess hazard for different ages", cex.main=1.5)
mtext("Men, Stage A, Deprivation level 1", cex=1.2)
legend(1,0.15,c("50 years old", "60 years old", "70 years old", "80 years old"),
       col=c("green", "orange", "blue", "red"), lty=c(1,4,2,6), lwd=3,cex=1.5, bg="white")

excratefixnlinnphage <- data.frame(Thetime=mytime)
rangeage <- seq(35,90,by=1)

for (agestudy in rangeage){
  myagediag <- agestudy
  myagediagc <- myagediag-70 
  myagediagc2 <- myagediagc^2 
  myagediagc2plus <- (myagediagc-0)^2*(myagediagc>0)
  predage <- predict.mexhaz(M4bs, time.pts = mytime, 
                              data.val = data.frame(Idep2=0, Idep3=0, Idep4=0, Idep5=0,
                                                    IstageB=0, IstageC=0, IstageD=0, 
                                                  agediagc=myagediagc, agediagc2=myagediagc2, agediagc2plus=myagediagc2plus))
  excratefixnlinnphage[,paste0("age",agestudy)] <- predage$results$hazard
  }


par(mar=c(5,4,4,2)+0.1)
plot(0, 0 , type="n",
     xlim=c(min(rangeage),max(rangeage)),ylim=c(0,2),
     xlab="Age at diagnosis",ylab="Excess hazard ratio",cex.lab=1.5, cex.axis=1.5, cex=1.5
)
lines(rangeage, excratefixnlinnphage[excratefixnlinnphage$Thetime=="0.3",][-1]/excratefixnlinnphage[excratefixnlinnphage$Thetime=="0.3",c("age70")], lwd=3, col="red")
lines(rangeage, excratefixnlinnphage[excratefixnlinnphage$Thetime=="5",][-1]/excratefixnlinnphage[excratefixnlinnphage$Thetime=="5",c("age70")], lwd=3, col="blue", lty=4)
axis(2,at=seq(0,2,by=0.2), tck=1,lty=8, lwd=0.1, labels=F, col="grey")
axis(2, label=F, at=1, tck=1,lty=1, lwd=0.2)
axis(1, label=F, at=70, tck=1,lty=8, lwd=0.2)
title("Estimated non-linear effect of age at different times", cex.main=1.5)
mtext("Men, Stage A, Deprivation level 1", cex=1.2)
legend(36,2,c("Non-Linear effect of age at 4 month", "Non-Linear effect of age at 5 years"),
       col=c("red", "blue"), lty=c(1,4), lwd=3, cex=1.3, bg="white")

par(mar=c(5,4,4,2)+0.1)
z <- excratefixnlinnphage[,-1] 
persp(x=rangeage, y=mytime, z = t(as.matrix(z)), xlab="Age at diagnosis",ylab="Time since diagnosis", zlab="Excess hazard", 
      theta = 120, phi = 20, expand = 0.5, col = "lightblue",
      xlim=c(min(rangeage),max(rangeage)), ylim=c(min(mytime),max(mytime)), shade = 0.75, ticktype = "detailed" )
title("Excess hazard according to \n time since diagnosis and age at diagnosis", cex.main=1.2)

Model comparison

After fitting the three models with different complexity of the association between age at diagnosis and the excess hazard regression model, the natural question is to assess which of those 3 models provide the better fit. The Akaike Information Criterion calculated below shows that the fourth model (non-linear and time-dependent effect of age) provides the better fit.

Akaike Information Criterion

#LOG-LIKELIHOOD
M2bs$loglik
## [1] -21880.73
M3bs$loglik
## [1] -21840.06
M4bs$loglik
## [1] -21689.32
# AKAIKE INFORMATION CRITERIA
-2 * M2bs$loglik + 2 * M2bs$n.par
## [1] 43787.46
-2 * M3bs$loglik + 2 * M3bs$n.par
## [1] 43710.12
-2 * M4bs$loglik + 2 * M4bs$n.par
## [1] 43416.64

Comparison of the excess mortality hazard and the net survival estimated from the 3 different models for a 80 years old patient

We can compare the estimated values of the excess hazard and the corresponding net survival obtained from the 3 different models for specific age.

myagediag <- 80
myagediagc <- myagediag-70 
myagediagc2 <- myagediagc^2 
myagediagc2plus <- (myagediagc-0)^2*(myagediagc>0)

predagelin <- predict.mexhaz(M2bs, time.pts = mytime, 
                           data.val = data.frame(Idep2=0, Idep3=0, Idep4=0, Idep5=0,
                                                 IstageB=0, IstageC=0, IstageD=0, 
                                                 agediagc=myagediagc))
predageNlin <- predict.mexhaz(M3bs, time.pts = mytime, 
                               data.val = data.frame(Idep2=0, Idep3=0, Idep4=0, Idep5=0,
                                                     IstageB=0, IstageC=0, IstageD=0, 
                                                     agediagc=myagediagc, agediagc2=myagediagc2, agediagc2plus=myagediagc2plus))
predageNlinTD <- predict.mexhaz(M4bs, time.pts = mytime, 
                               data.val = data.frame(Idep2=0, Idep3=0, Idep4=0, Idep5=0,
                                                     IstageB=0, IstageC=0, IstageD=0, 
                                                     agediagc=myagediagc, agediagc2=myagediagc2, agediagc2plus=myagediagc2plus))

par(mar=c(5,4,4,2)+0.1)
plot(predagelin, which="hazard", conf.int=F, lwd=2, lty.pe=2) 
points(predageNlin, which="hazard", col="blue", conf.int=F, lwd=2, lty.pe=4) 
points(predageNlinTD, which="hazard", col="green", conf.int=F, lwd=2) 
axis(2,at=seq(0,1,by=0.02), tck=1,lty=8, lwd=0.1, labels=F, col="grey")
title("Excess mortality hazard estimated from the different models", cex.main=1.2)
mtext("80 years old -  Men, Stage A, Deprivation level 1", cex=1.2)
legend(0.5,0.08,c("Linear effect of age", "Non-Linear effect of age", "Non-linear & time-dependent effect of age"),
       col=c("black", "blue", "green"), lty=c(2,4,1), lwd=3,cex=1.2, bg="white")

par(mar=c(5,4,4,2)+0.1)
plot(0, 0, type="n", xlim=c(0,5),ylim=c(0.80,1),
     xlab="Time since diagnosis",ylab="Net Survival",cex.lab=1.5, cex.axis=1.5, cex=1.5)
points(predagelin, which="surv", conf.int=F, lwd=2, lty.pe=2) 
points(predageNlin, which="surv", col="blue", conf.int=F, lwd=2, lty.pe=4) 
points(predageNlinTD, which="surv", col="green", conf.int=F, lwd=2) 

axis(2,at=seq(0,1,by=0.05), tck=1,lty=8, lwd=0.1, labels=F, col="grey")
title("Net survival estimated using the different models", cex.main=1.5)
mtext("80 years old -  Men, Stage A, Deprivation level 1", cex=1.2)
legend(0,0.87,c("Linear effect of age", "Non-Linear effect of age", "Non-linear & time-dependent effect of age"),
       col=c("black", "blue", "green"), lty=c(2,4,1), lwd=3,cex=1.2, bg="white")

Comparison of the excess mortality hazard and the net survival estimated from the 3 different models for a 55 years old patient

myagediag <- 55
myagediagc <- myagediag-70 
myagediagc2 <- myagediagc^2 
myagediagc2plus <- (myagediagc-0)^2*(myagediagc>0)

predagelin <- predict.mexhaz(M2bs, time.pts = mytime, 
                             data.val = data.frame(Idep2=0, Idep3=0, Idep4=0, Idep5=0,
                                                   IstageB=0, IstageC=0, IstageD=0, 
                                                   agediagc=myagediagc))
predageNlin <- predict.mexhaz(M3bs, time.pts = mytime, 
                              data.val = data.frame(Idep2=0, Idep3=0, Idep4=0, Idep5=0,
                                                    IstageB=0, IstageC=0, IstageD=0, 
                                                    agediagc=myagediagc, agediagc2=myagediagc2, agediagc2plus=myagediagc2plus))
predageNlinTD <- predict.mexhaz(M4bs, time.pts = mytime, 
                                data.val = data.frame(Idep2=0, Idep3=0, Idep4=0, Idep5=0,
                                                      IstageB=0, IstageC=0, IstageD=0, 
                                                      agediagc=myagediagc, agediagc2=myagediagc2, agediagc2plus=myagediagc2plus))

par(mar=c(5,4,4,2)+0.1)
plot(predagelin, which="hazard", conf.int=F, lwd=2, lty.pe=2) 
points(predageNlin, which="hazard", col="blue", conf.int=F, lwd=2, lty.pe=4) 
points(predageNlinTD, which="hazard", col="green", conf.int=F, lwd=2) 
axis(2,at=seq(0,1,by=0.02), tck=1,lty=8, lwd=0.1, labels=F, col="grey")
title("Excess mortality hazard estimated from the different models", cex.main=1.2)
mtext("55 years -  Men, Stage A, Deprivation level 1", cex=1.2)
legend(0.5,0.06,c("Linear effect of age", "Non-Linear effect of age", "Non-linear & time-dependent effect of age"),
       col=c("black", "blue", "green"), lty=c(2,4,1), lwd=3,cex=1.2, bg="white")

par(mar=c(5,4,4,2)+0.1)
plot(0, 0, type="n", xlim=c(0,5),ylim=c(0.80,1),
     xlab="Time since diagnosis",ylab="Net Survival",cex.lab=1.5, cex.axis=1.5, cex=1.5)
points(predagelin, which="surv", conf.int=F, lwd=2, lty.pe=2) 
points(predageNlin, which="surv", col="blue", conf.int=F, lwd=2, lty.pe=4) 
points(predageNlinTD, which="surv", col="green", conf.int=F, lwd=2) 

axis(2,at=seq(0,1,by=0.05), tck=1,lty=8, lwd=0.1, labels=F, col="grey")
title("Net survival estimated from the different models", cex.main=1.5)
mtext("55 years -  Men, Stage A, Deprivation level 1", cex=1.2)
legend(0,0.88,c("Linear effect of age", "Non-Linear effect of age", "Non-linear & time-dependent effect of age"),
       col=c("black", "blue", "green"), lty=c(2,4,1), lwd=3,cex=1.2, bg="white")