Extending eventglm

library(survival)
library(eventglm)

Pseudo observation computation modules

As of version 1.1.0, cumincglm and rmeanglm expect the argument model.censoring to be a function. This function is the workhorse that does the computation of the pseudo observations that are later used in the generalized linear model. A number of computation methods are built in as “modules” in the file called “pseudo-modules.R”. Let us take a look at an example:

eventglm::pseudo_independent
#> function (formula, time, cause = 1, data, type = c("cuminc", 
#>     "survival", "rmean"), formula.censoring = NULL, ipcw.method = NULL) 
#> {
#>     margformula <- update.formula(formula, . ~ 1)
#>     mr <- model.response(model.frame(margformula, data = data))
#>     stopifnot(attr(mr, "type") %in% c("right", "mright"))
#>     marginal.estimate <- survival::survfit(margformula, data = data)
#>     if (type == "cuminc") {
#>         POi <- get_pseudo_cuminc(marginal.estimate, time, cause, 
#>             mr)
#>     }
#>     else if (type == "survival") {
#>         if (marginal.estimate$type != "right") {
#>             stop("Survival estimand not available for outcome with censoring type", 
#>                 marginal.estimate$type)
#>         }
#>         POi <- 1 - get_pseudo_cuminc(marginal.estimate, time, 
#>             cause, mr)
#>     }
#>     else if (type == "rmean") {
#>         POi <- get_pseudo_rmean(marginal.estimate, time, cause, 
#>             mr)
#>     }
#>     POi
#> }
#> <bytecode: 0x0000000025aa9c80>
#> <environment: namespace:eventglm>

This function, and any pseudo observation module, must take the same named arguments (though they do not all have to be used), and return a vector of pseudo observations.

Custom computation functions

Example: Parametric pseudo observations

Let us see how to define a custom function for computation of pseudo observations. In this first example, we will fit a parametric survival model with survreg marginally and do jackknife leave one out estimates. This may be useful if there is interval censoring, for example.

pseudo_parametric <- function(formula, time, cause = 1, data,
                        type = c("cuminc", "survival", "rmean"),
                        formula.censoring = NULL, ipcw.method = NULL){

  margformula <- update.formula(formula, . ~ 1)
  mr <- model.response(model.frame(margformula, data = data))
  
  marginal.estimate <- survival::survreg(margformula, data = data, 
                                         dist = "weibull")

  theta <- pweibull(time, shape = 1 / marginal.estimate$scale, 
                    scale = exp(marginal.estimate$coefficients[1]))
  
  theta.i <- sapply(1:nrow(data), function(i) {
    
    me <- survival::survreg(margformula, data = data[-i, ], dist = "weibull")
    pweibull(time, shape = 1 / me$scale, 
                    scale = exp(me$coefficients[1]))
  
    
  })
  
  POi <- theta  + (nrow(data) - 1) * (theta - theta.i)
  POi

}

Now let us try it out by passing it to the cumincglm function and compare to the default independence estimator:

fitpara <- cumincglm(Surv(time, status) ~ rx + sex + age, time = 2500, 
                  model.censoring = pseudo_parametric, 
                  data = colon)

fitdef <- cumincglm(Surv(time, status) ~ rx + sex + age, time = 2500, 
                  model.censoring = "independent", 
                  data = colon)

knitr::kable(sapply(list(parametric = fitpara, default = fitdef), 
       coefficients))
parametric default
(Intercept) 0.5473823 0.4891055
rxLev -0.0216382 -0.0292873
rxLev+5FU -0.1488142 -0.1326516
sex 0.0008129 -0.0102263
age 0.0004233 0.0010047

You can also refer to the function with a string, omitting the “pseudo_” prefix, if you wish, e.g.,

fit1 <- cumincglm(Surv(time, status) ~ rx + sex + age, time = 2500, 
                  model.censoring = "parametric", 
                  data = colon)

Example 2: infinitesimal jackknife

When the survival package version 3.0 was released, it became possible to get the influence function values returned from some estimation functions. These efficient influence functions are used in the variance calculations, and they are related to pseudo observations. More information is available in the “pseudo.Rnw” vignette of the development version of survival. We can use this feature to create a custom function for infinitesimal jackknife pseudo observations:

pseudo_infjack <- function(formula, time, cause = 1, data,
                        type = c("cuminc", "survival", "rmean"),
                        formula.censoring = NULL, ipcw.method = NULL) {
  
  marginal.estimate2 <- survival::survfit(update.formula(formula, . ~ 1),
                                             data = data, influence = TRUE)

     tdex <- sapply(time, function(x) max(which(marginal.estimate2$time <= x)))

     pstate <- marginal.estimate2$surv[tdex]

     ## S(t) + (n)[S(t) -S_{-i}(t)]
     POi <- matrix(pstate, nrow = marginal.estimate2$n, ncol = length(time), byrow = TRUE) +
         (marginal.estimate2$n) *
         (marginal.estimate2$influence.surv[, tdex + 1])
     
     POi

}

Note that this computes pseudo observations for survival, rather than the cumulative incidence, so to compare we can use the survival = TRUE option. Now we try it out

fitinf <- cumincglm(Surv(time, status) ~ rx + sex + age, time = 2500, 
                  model.censoring = "infjack", 
                  data = colon)

fitdefsurv <- cumincglm(Surv(time, status) ~ rx + sex + age, time = 2500, 
                  survival = TRUE,
                  data = colon)


knitr::kable(sapply(list(infjack = fitinf, default = fitdefsurv), 
       coefficients))
infjack default
(Intercept) 0.5108264 0.5108945
rxLev 0.0292609 0.0292873
rxLev+5FU 0.1326361 0.1326516
sex 0.0102568 0.0102263
age -0.0010036 -0.0010047

Conclusion

This opens up a lot of possibilities for future extensions of eventglm, and will also make it easier to maintain. Try it out with different methods, and let us know what methods you are using or would like to use.