In this vignette, we briefly introduce how to simulate survival and
recurrent event data from stochastic process point of view with the
**reda** package. The core function named
`simEvent()`

provides an intuitive and flexible interface for
simulating survival and recurrent event times from one stochastic
process. Another function named `simEventData()`

is simply a
wrapper that calls `simEvent()`

internally and collects the
event times and the covariates of a given number of processes into a
data frame. Examples of generating random samples from common survival
and recurrent models are provided. In fact, the function
`simEvent()`

and `simEventData()`

may serve as the
building blocks for simulating multitype event data including multiple
type event data, recurrent events with termination, and competing risk
data. The details of function syntax and the objects produced are
available in the package manual and thus not covered in this
vignette.

We introduce the general form of hazard rate function considered and
implemented in the function `simEvent()`

, which can be
generalized to two classes called the relative risk model, and the
accelerated failure time model. The introduction is based on Section 2.2
and Section 2.3 of Kalbfleisch and Prentice
(2002). Other helpful references include Aalen, Borgan, and Gjessing (2008), Kleinbaum and Klein (2011), among others.

Let’s consider \(n\) stochastic processes with the baseline hazard rate function \(\rho(t)\) of time \(t\). For the stochastic process \(i\), \(i\in\{1,\ldots,n\}\), let \(\mathbf{z}_i=(z_{i1},\ldots,z_{ip})^{\top}\) denote the covariate vector of length \(p\), and \(\boldsymbol{\beta}=(\beta_1,\ldots,\beta_p)^{\top}\) denote the covariate coefficients.

Given the covariates \(\mathbf{z}_i\) (not including the intercept term), the intensity function of time \(t\) for the relative risk regression model can be specified as follows: \[\lambda(t \mid \mathbf{z}_i) = \rho(t)\,r(\mathbf{z}_i, \boldsymbol{\beta}),\] where \(r(\mathbf{z}_i, \boldsymbol{\beta})\) is the relative risk function. For the Cox model (Cox 1972) and the Andersen-Gill model (Andersen and Gill 1982), \(r(\mathbf{z}_i, \boldsymbol{\beta})= \exp\{\boldsymbol{\beta}^{\top}\mathbf{z}_i\}\). Other common choices include the linear relative risk function: \(r(\mathbf{z}_i, \boldsymbol{\beta}) = 1 + \boldsymbol{\beta}^{\top}\mathbf{z}_i\), and the excess relative risk function: \(r(\mathbf{z}_i, \boldsymbol{\beta}) = \prod_{j=1}^p(1 + \beta_j z_{ij})\), both of which, however, suffer from the drawback that the \(r(\mathbf{z}_i, \boldsymbol{\beta})\) is not necessarily positive. Therefore, the coefficient estimates must be restricted to guarantee that the relative risk function is positive for all possible covariates. The restriction disappears for the exponential relative risk function since it always returns positive values.

We may extend the model by considering a random frailty effect \(u_i\) (of expectation one) to account for heterogeneity between different processes or processes from different clusters. The intensity function becomes \[\lambda(t \mid \mathbf{z}_i, u_i) = u_i\,\rho(t)\,r(\mathbf{z}_i, \boldsymbol{\beta}).\] The common choices for distribution of the random frailty effect include Gamma distribution of mean one, and lognormal distribution of mean zero in logarithm scale.

Furthermore, both covariates and coefficients may be time-varying. So a general form of the intensity function for the relative risk model is given as follows: \[\lambda\bigl(t \mid \mathbf{z}_i(t), u_i\bigr) = u_i\,\rho(t)\,r\bigl(\mathbf{z}_i(t), \boldsymbol{\beta}(t)\bigr).\]

The relative risk models incorporate the covariates and their coefficients through a relative risk function multiplied by the baseline hazard rate, which provides an intuitive interpretation. However, we may consider a direct relationship between the covraiates \(\mathbf{z}\) (including the intercept term) and the time to failure \(T>0\), \(\log(T) = \alpha + \sigma W\), where \(W\) is a standardized random error variable with density function \(f_W(w)\), suvival function \(S_W(w)\) and hazard function \(\rho(w) = f_W(w) / S_W(w)\), \(\sigma\) represents the standard error, and \(\alpha = \boldsymbol{\beta}^{\top}\mathbf{z}\). We assume that \(W\) is independent of \(\boldsymbol{\beta}\) given the covariates \(\mathbf{z}\). Taking exponentiation gives \(T=\exp(\boldsymbol{\beta}^{\top}\mathbf{z})\exp(\sigma W)\) with density \(f_T(t)=\frac{1}{\sigma t}f_W\bigl((\log(t) - \alpha)/\sigma\bigr)\), survival function \(S_T(t) = S_W\bigl((\log(t) - \alpha)/\sigma\bigr)\), and hazard function \[\lambda(t \mid \mathbf{z}_i) = \frac{1}{\sigma t}\rho\bigl((\log(t) - \alpha)/\sigma\bigr).\] Let \(\lambda_{\mathbf{z}}=\exp(- \alpha)\) and \(v = 1 / \sigma\), we may rewrite the hazard function as follows: \[\lambda(t \mid \mathbf{z}_i) = \frac{v}{t}\rho\bigl(v\log(\lambda_{\mathbf{z}} t)\bigr).\] The resulting model is called accelerated failure time (AFT) model.

For example, suppose \(W\) follows standard logistic distribution with density \(f_W(w) = e^w / (1 + e^w) ^ 2\), survival function \(S_W(w) = 1 / (1 + e^w)\), and hazard function \(\rho(w) = e^w / (1 + e^w)\). This leads to log-logistic model of \(T\) with hazard function \[\lambda(t \mid \mathbf{z}_i) = \frac{v (t\lambda_{\mathbf{z}})^v/t}{1 + (t\lambda_{\mathbf{z}})^v}.\]

Similarly, we may further consider frailty factor, time-variant covariates, and time-varying covariate coefficients. So a general form of the intensity function may be given as follows: \[\lambda\bigl(t \mid \mathbf{z}_i(t), u_i\bigr) = \frac{u_i\,v}{t}\,\rho\bigl( v \log(t) - v\, \boldsymbol{\beta}(t)^{\top}\mathbf{z}_i(t)\bigr).\]

The function `simEvent()`

and `simEventData()`

allow users to specify an intensity function in an even more general
form given below. \[\lambda\bigl(t \mid
\mathbf{z}_i(t), u_i\bigr) = u_i\,\rho\bigl(t, \mathbf{z}_i(t),
\boldsymbol{\beta}(t)\bigr) \,r\bigl(\mathbf{z}_i(t),
\boldsymbol{\beta}(t)\bigr),\] where \(u_i\), \(\rho(\cdot)\), and \(r(\cdot)\) corresponds to the argument
`frailty`

, `rho`

, and `relativeRisk`

,
respectively.

The thinning method (Lewis and Shedler
1979) and the inversion method (Çınlar
1975) are implemented for sampling event times. It can be shown
that both methods achieve the given hazard rate. For function
`simEvent()`

, the thinning method is the default method when
the hazard rate function is bounded within follow-ups. Otherwise, the
inversion method will be used. We may specify the sampling method via
the argument `method`

in function `simEvent()`

and
`simEventData()`

.

```
library(reda) # attach reda package to the search path
packageVersion("reda") # check the package version
```

`## [1] '0.5.4'`

```
options(digits = 3) # set the number of significant digits to print
set.seed(123) # set random number seed
```

A homogeneous/stationary Poisson process (HPP) has a constant hazard
rate over time with the interarrival times (between two successive
arrivals/events) following exponential distribution. Two simple examples
of simulating a homogeneous Poisson process using
`simEvent()`

are given as follows:

```
## HPP from time 1 to 5 of intensity 1 without covariates
simEvent(rho = 1, origin = 1, endTime = 5)
```

```
## 'simEvent' S4 class object:
## [1] 1.84 2.42 3.75 3.78 3.84 4.15 4.47 4.61
```

```
## HPP from 0 to 10 of baseline hazard rate 0.5 with two covariates
simEvent(z = c(0.2, 0.5), zCoef = c(0.5, - 0.1), rho = 0.5, endTime = 10)
```

```
## 'simEvent' S4 class object:
## [1] 0.717 1.076 2.692 5.666 6.577 7.701
```

The function `simEventData()`

enable us to simulate
multiple processes and collect the simulated event times into a survival
or recurrent event data format.

```
## recurrent events from two processes with same covariates
simEventData(2, z = c(0.2, 0.5), zCoef = c(1, - 0.5), rho = 0.5, endTime = 5)
```

```
## ID time event origin X.1 X.2
## 1 1 3.38 1 0 0.2 0.5
## 2 1 5.00 0 0 0.2 0.5
## 3 2 1.84 1 0 0.2 0.5
## 4 2 5.00 0 0 0.2 0.5
```

In the example given above, the number of process is explicitly specified to be two. However, if it is not specified, it will be the number of rows of the covariate matrix. See the example given below.

```
## recurrent events from two processes
## with different time-invariant covariates and time origins
simEventData(z = cbind(rnorm(2), 0.5), zCoef = c(1, - 0.5),
rho = 0.2, origin = c(1, 0), endTime = c(10, 9))
```

```
## ID time event origin X.1 X.2
## 1 1 4.91 1 1 0.838 0.5
## 2 1 6.31 1 1 0.838 0.5
## 3 1 7.04 1 1 0.838 0.5
## 4 1 10.00 0 1 0.838 0.5
## 5 2 3.44 1 0 0.153 0.5
## 6 2 6.68 1 0 0.153 0.5
## 7 2 9.00 0 0 0.153 0.5
```

We can also simulate survival data by taking the first event of each
process. Setting `recurrent = FALSE`

in function
`simEvent()`

(or function `simEventData()`

) gives
us the survival time(s) or the right censoring time(s). In the example
given below, we specified `endTime = "rnorm"`

and
`arguments = list(endTime = list(mean = 10))`

for generating
a random censoring times from normal distribution with mean ten and unit
standard deviation. Also note that the specified `origin`

is
recycled for these ten processes.

```
## survival data by set 'recurrent = FALSE'
simEventData(z = cbind(rnorm(10), 1), zCoef = c(0.2, - 0.5), rho = 0.1,
origin = c(0, 1), endTime = stats::rnorm, recurrent = FALSE,
arguments = list(endTime = list(mean = 10)))
```

```
## ID time event origin X.1 X.2
## 1 1 9.11 0 0 -0.772 1
## 2 2 10.38 0 1 0.287 1
## 3 3 4.55 1 0 -1.221 1
## 4 4 10.39 0 1 0.435 1
## 5 5 9.55 1 0 0.800 1
## 6 6 8.97 0 1 -0.164 1
## 7 7 8.25 1 0 1.243 1
## 8 8 10.96 0 1 -0.934 1
## 9 9 10.08 0 0 0.394 1
## 10 10 11.13 0 1 0.404 1
```

In contrast to HPP, a nonhomogeneous Poisson process (NHPP) has a
time-varying hazard rate function. In that case, we may specify the
baseline hazard rate function `rhoFun()`

to be a function
object whose first argument represents the time variable. A quick
example is given below, where the baseline hazard rate function \(\rho(t) = \sin(t) + 1\).

```
rhoFun <- function(x, b = 1) (sin(x) + 1) * b
simEvent(rho = rhoFun)
```

```
## 'simEvent' S4 class object:
## [1] 1.15
```

As demonstrated in the last example for HPP, other possible arguments
of the function objects can be specified via the `arguments`

.
For example, that `arguments = list(rho = list(b = 0.5))`

specifies the baseline hazard rate function to be \(\rho(t) = 0.5(\sin(t) + 1)\).

```
simEventData(z = cbind(rexp(2), c(0, 1)), zCoef = c(0.1, - 0.5),
rho = rhoFun, arguments = list(rho = list(b = 0.5)))
```

```
## ID time event origin X.1 X.2
## 1 1 3.000 0 0 0.0687 0
## 2 2 0.715 1 0 0.5692 1
## 3 2 1.120 1 0 0.5692 1
## 4 2 2.208 1 0 0.5692 1
## 5 2 3.000 0 0 0.5692 1
```

In the Poisson process, the interarrival times between two successive arrivals (or events) follow exponential distribution independently. We may generalize the distribution of interarrival times and consider more general renewal processes.

In function `simEvent()`

(and
`simEventData()`

), we may specify the distribution of the
interarrival times via the argument `interarrival`

, which
takes the function `stats::rexp()`

for generating
interarrival times following exponential distribution by default. In
general, the argument `interarrival`

takes a function with at
least one argument named `rate`

for generating random
(nonnegative) interarrival times from a certain distribution at the
given arrival rate. A quick example of generating the interarrival times
following Gamma distribution of scale one is given as follows:

```
set.seed(123)
simEvent(interarrival = function(n, rate) rgamma(n, shape = 1 / rate))
```

```
## 'simEvent' S4 class object:
## [1] 0.182 1.878
```

If the specified function has an argument named `n`

, the
function `simEvent()`

will assume that the function can
generate `n`

number of random interarrival times at one time
and take advantage of the vectorization for a potentially better
performance. However, it is optional. The example given below produces
an equivalent result.

```
set.seed(123)
simEvent(interarrival = function(rate) rgamma(n = 1, shape = 1 / rate))
```

```
## 'simEvent' S4 class object:
## [1] 0.182 1.878
```

In practice, some covariates such as patients’ age, automobile’s
mileage may vary over time. The argument `z`

in the function
`simEvent()`

and `simEventData()`

may take a
function of time that returns a vector of covariates for generating
event times with the time-varying covariates. Let’s consider an example
of generating recurrent event times with three covariates, where two of
which are time-variant.

```
set.seed(123)
zFun1 <- function(time) cbind(time / 10 + 1, as.numeric(time > 1), 0.5)
simEventData(z = zFun1, zCoef = c(0.1, 0.5, - 0.5))
```

```
## ID time event origin X.1 X.2 X.3
## 1 1 0.971 1 0 1.10 0 0.5
## 2 1 1.880 1 0 1.19 1 0.5
## 3 1 1.902 1 0 1.19 1 0.5
## 4 1 1.940 1 0 1.19 1 0.5
## 5 1 2.156 1 0 1.22 1 0.5
## 6 1 2.371 1 0 1.24 1 0.5
## 7 1 2.471 1 0 1.25 1 0.5
## 8 1 3.000 0 0 1.30 1 0.5
```

In the example given above, the covariate vector is \(\mathbf{z}(t)=(0.1t+1, \boldsymbol{1}(t > 1),
0.5)^{\top}\). If the covariate function has more arguments, we
may specify them by a named list in `arguments`

. The example
given below produces the equivalent results.

```
set.seed(123)
zFun2 <- function(x, a, b) cbind(x / 10 + a, as.numeric(x > b), 0.5)
simEventData(z = zFun2, zCoef = c(0.1, 0.5, - 0.5),
arguments = list(z = list(a = 1, b = 1)))
```

```
## ID time event origin X.1 X.2 X.3
## 1 1 0.971 1 0 1.10 0 0.5
## 2 1 1.880 1 0 1.19 1 0.5
## 3 1 1.902 1 0 1.19 1 0.5
## 4 1 1.940 1 0 1.19 1 0.5
## 5 1 2.156 1 0 1.22 1 0.5
## 6 1 2.371 1 0 1.24 1 0.5
## 7 1 2.471 1 0 1.25 1 0.5
## 8 1 3.000 0 0 1.30 1 0.5
```

Notice that in the examples given above, if we generate event times
for more than one process, the time-varying covariate function will
remain the same for different processes, which may not be the case in
practice. A more realistic situation is that the time-variant covariate
functions are different among different processes but coming from a
common function family. Let’s consider the Stanford heart transplant
data (Crowley and Hu 1977) as an example
(the `heart`

data available in the **survival**
package). The covariate `transplant`

indicating whether the
patient has already received a heart transplant before time
`t`

is time-dependent and can be represented by a indicator
function family, \(\boldsymbol{1}(t >
b)\), where \(b\) is a known
parameter that may differ among patients. For a particular patient \(i\), \(b =
b_i\) is a known constant. In that case, we specify the function
parameters inside the `quote`

function as follows:

```
zFun3 <- function(time, a, b) cbind(time / 10 + a, as.numeric(time > b))
(simDat <- simEventData(nProcess = 3, z = zFun3, zCoef = c(- 0.1, 0.5),
arguments = list(z = list(a = quote(rpois(1, 10) / 10),
b = quote(runif(1, 1, 3))))))
```

```
## ID time event origin X.1 X.2
## 1 1 0.104 1 0 1.710 0
## 2 1 0.328 1 0 1.733 0
## 3 1 1.335 1 0 1.833 0
## 4 1 3.000 0 0 2.000 1
## 5 2 0.583 1 0 1.258 0
## 6 2 2.327 1 0 1.433 1
## 7 2 3.000 0 0 1.500 1
## 8 3 0.804 1 0 0.680 0
## 9 3 0.838 1 0 0.684 0
## 10 3 1.472 1 0 0.747 0
## 11 3 3.000 0 0 0.900 1
```

In the example given above, the covariate `X.2`

is
simulated from the indicator function famliy, \(\boldsymbol{1}(t > b)\), where parameter
\(b\) follows uniform distribution
between 1 and 3. Internally, the parameters specified in
`arguments`

were evaluated for each process. We may check the
values of the parameter `a`

from the generated covariate
`X.1`

for different processes as follows:

```
## check the values of parameter `a` for different processes
with(simDat, unique(cbind(ID, a = X.1 - time / 10)))
```

```
## ID a
## [1,] 1 1.7
## [2,] 2 1.2
## [3,] 3 0.6
## [4,] 3 0.6
```

The assumption of time-invariance on the covariate coefficients can
be hard to justify in practice. We may simulate event times with
time-varying covariate coefficients by specifying the argument
`zCoef`

to be a function of time that returns a vector of
coefficients at the input time point. How we may specify the argument
`zCoef`

for time-varying coefficients is very similar to the
way we may specify the argument `z`

for time-varying
covariates introduced in last section. For example, we may generate
event times with both covariates and their coefficients being
time-variant as follows:

```
zCoefFun <- function(time, shift) cbind(sqrt(time / 10), sin(time + shift), 0.1)
simEventData(z = zFun1, zCoef = zCoefFun,
arguments = list(zCoef = list(shift = 1)))
```

```
## ID time event origin X.1 X.2 X.3
## 1 1 1.12 1 0 1.11 1 0.5
## 2 1 1.34 1 0 1.13 1 0.5
## 3 1 2.31 1 0 1.23 1 0.5
## 4 1 2.43 1 0 1.24 1 0.5
## 5 1 3.00 0 0 1.30 1 0.5
```

As demonstrated in the example given above, we may similarly specify
the other arguments of the time-varying coefficient function via a named
list in `arguments`

.

Let’s consider frailty factors for individual processes first, where
each process \(i\) has its own frailty
effect \(w_i\). A popular choice of the
frailty distribution is the one-parameter gamma distribution of mean
one, which often leads to an explicit marginal likelihood in a
relatively simple expression. Similar to the argument `z`

,
`zCoef`

, and `rho`

, the argument
`frailty`

may take a function as input for simulating the
frailty effect. For example, we may simulate the recurrent event times
for one process with frailty factor following gamma(2, 0.5) via
`frailty = "rgamma"`

and
`arguments = list(frailty = list(shape = 2, scale = 0.5))`

as
follows:

```
set.seed(123)
simEventData(z = zFun1, zCoef = c(0.1, 0.5, - 0.5), frailty = stats::rgamma,
arguments = list(frailty = list(shape = 2, scale = 0.5)))
```

```
## ID time event origin X.1 X.2 X.3
## 1 1 0.135 1 0 1.01 0 0.5
## 2 1 0.620 1 0 1.06 0 0.5
## 3 1 1.102 1 0 1.11 1 0.5
## 4 1 1.324 1 0 1.13 1 0.5
## 5 1 3.000 0 0 1.30 1 0.5
```

The named list `list(shape = 2, scale = 0.5)`

was passed
to the function `rgamma`

(from the **stats**
package). The random number seed was reset so that we might compare the
results with the first example of time-variant covariates. Note that it
is users’ job to make sure the specified distribution of the frailty
factor has mean one (or makes sense in a certain way). The function
`simEvent()`

and `simEventData()`

only check the
sign of the simulated frailty effect. An error will be thrown out if the
generated frailty effect is not positive.

To demonstrate how to specify other distribution of the frailty factor, we simulate the recurrent event times for one process by three slightly different but equivalent approaches in the following examples, where the frailty effect follows log-normal distribution of mean one.

```
set.seed(123)
## use function `rlnorm` from the stats package
simEvent(z = zFun1, zCoef = c(0.1, 0.5, - 0.5), frailty = stats::rlnorm,
arguments = list(frailty = list(sdlog = 1)))
```

```
## 'simEvent' S4 class object:
## [1] 1.59 1.63 1.70 2.08 2.45 2.63
```

```
set.seed(123)
## use a customized function with argument `n` and `sdlog`
logNorm1 <- function(n, sdlog) exp(rnorm(n = n, mean = 0, sd = sdlog))
simEvent(z = zFun1, zCoef = c(0.1, 0.5, - 0.5), frailty = logNorm1,
arguments = list(frailty = list(sdlog = 1)))
```

```
## 'simEvent' S4 class object:
## [1] 1.59 1.63 1.70 2.08 2.45 2.63
```

```
set.seed(123)
## use a customized function with argument `sdlog` only
logNorm2 <- function(sdlog) exp(rnorm(n = 1, mean = 0, sd = sdlog))
simEvent(z = zFun1, zCoef = c(0.1, 0.5, - 0.5), frailty = logNorm2,
arguments = list(frailty = list(sdlog = 1)))
```

```
## 'simEvent' S4 class object:
## [1] 1.59 1.63 1.70 2.08 2.45 2.63
```

If the function specified for `frailty`

has an argument
named `n`

, that `n = 1`

will be specified
internally by the function `simEvent()`

, which is designed
for using the functions generating random numbers, such as
`rgamma()`

and `rlnorm()`

from the
**stats** package.

In this section, we present examples for generating event times with time-invariant covariates for several common parametric survival models.

The Weibull model is one of the most widely used parametric survival models. Assume the event times of the process \(i\), \(i\in\{1,\ldots,n\}\), follow Weibull model with hazard function \(h_i(t) = \lambda_i p t^{p-1}\), where \(p>0\) is the shape parameter, and \(\lambda_i\) can be reparametrized with regression coefficients. When \(p=1\), the Weibull model reduces to the exponential model, whose hazard rate is a constant over time.

One common reparametrization is \(\lambda_i = \exp(\beta_0 + \boldsymbol{\beta}^{\top}\mathbf{z}_i)\), which results in Weibull proportional hazard (PH) model. Let \(\lambda_0 = \exp(\beta_0)\) and we may rewrite the hazard function \(h_i(t) = \rho(t) \exp(\boldsymbol{\beta}^{\top}\mathbf{z}_i)\), where \(\rho(t) = \lambda_0 p t^{p-1}\) is the baseline hazard function. For example, we may simulate the survival data of ten processes from Weibull PH model as follows:

```
nProcess <- 10
rho_weibull_ph <- function(x, lambda, p) lambda * p * x ^ (p - 1)
simEventData(z = cbind(rnorm(nProcess), rbinom(nProcess, 1, 0.5)),
zCoef = c(0.5, 0.2), endTime = rnorm(nProcess, 10),
recurrent = FALSE, rho = rho_weibull_ph,
arguments = list(rho = list(lambda = 0.01, p = 2)))
```

```
## ID time event origin X.1 X.2
## 1 1 7.48 1 0 0.887 1
## 2 2 4.24 1 0 -0.151 1
## 3 3 6.81 1 0 0.330 0
## 4 4 10.59 1 0 -3.227 0
## 5 5 4.73 1 0 -0.772 1
## 6 6 10.28 0 0 0.287 0
## 7 7 6.93 1 0 -1.221 1
## 8 8 4.35 1 0 0.435 0
## 9 9 3.78 1 0 0.800 0
## 10 10 8.93 0 0 -0.164 1
```

The baseline hazard function of the gompertz model is \(\rho(t) = \lambda\exp(\alpha t)\), where \(\lambda > 0\) is the scale parameter and \(\alpha\) is the shape paramter. So the logrithm of the baseline hazard function is linear in time \(t\).

Similar to the example for the Weibull model given in last section, we may simulate the survival data of ten processes as follows:

```
rho_gompertz <- function(time, lambda, alpha) lambda * exp(alpha * time)
simEventData(z = cbind(rnorm(nProcess), rbinom(nProcess, 1, 0.5)),
zCoef = c(0.5, 0.2), endTime = rnorm(nProcess, 10),
recurrent = FALSE, rho = rho_gompertz,
arguments = list(rho = list(lambda = 0.1, alpha = 0.1)))
```

```
## ID time event origin X.1 X.2
## 1 1 8.95 0 0 -1.462 1
## 2 2 8.74 0 0 0.688 0
## 3 3 4.92 1 0 2.100 0
## 4 4 9.43 1 0 -1.287 1
## 5 5 5.41 1 0 0.788 1
## 6 6 3.35 1 0 0.769 0
## 7 7 7.29 1 0 0.332 0
## 8 8 4.48 1 0 -1.008 1
## 9 9 7.40 1 0 -0.119 1
## 10 10 9.78 0 0 -0.280 0
```

As discussed in Section accelerated failure time model, the hazard function of the log-logistic model is \[\lambda(t \mid \mathbf{z}_i) = \frac{p (t\lambda_{\mathbf{z}})^p/t}{1 + (t\lambda_{\mathbf{z}})^p}.\] So we may simulate the survival data of ten processes from the log-logistic model as follows:

```
rho_loglogistic <- function(time, z, zCoef, p) {
lambda <- 1 / parametrize(z, zCoef, FUN = "exponential")
lambda * p * (lambda * time) ^ (p - 1) / (1 + (lambda * time) ^ p)
}
simEventData(z = cbind(1, rnorm(nProcess), rbinom(nProcess, 1, 0.5)),
zCoef = c(0.3, 0.5, 0.2), end = rnorm(nProcess, 10),
recurrent = FALSE, relativeRisk = "none", rho = rho_loglogistic,
arguments = list(rho = list(p = 1.5)))
```

```
## ID time event origin X.1 X.2 X.3
## 1 1 0.206 1 0 1 0.9625 1
## 2 2 1.350 1 0 1 0.6843 0
## 3 3 1.972 1 0 1 -1.3953 1
## 4 4 3.480 1 0 1 0.8496 1
## 5 5 0.834 1 0 1 -0.4466 0
## 6 6 11.054 0 0 1 0.1748 1
## 7 7 0.916 1 0 1 0.0746 1
## 8 8 1.664 1 0 1 0.4282 1
## 9 9 3.238 1 0 1 0.0247 1
## 10 10 0.279 1 0 1 -1.6675 1
```

Notice that in the function `rho_loglogistic()`

for the
hazard function of the log-logistic model, we wrapped the
parametrization of the covariates and covariate coefficients with the
function `parametrize`

and specified that
`FUN = "exponential"`

. In addition, we specified
`relativeRisk = "none"`

when calling
`simEventData()`

for AFT models.

By following the discussion given in Section accelerated failure time model, it is not hard to obtain the hazard function of the log-normal model, \[\lambda(t \mid \mathbf{z}_i) = \frac{p}{t} \rho\bigl(p(\log(t) - \boldsymbol{\beta}^{\top}\mathbf{z}_i)\bigr),\] where \(\rho(w) = \phi(w) / \bigl(1 - \Phi(w)\bigr)\), \(\phi(w)\) and \(\Phi(w)\) is the density and cumulative distribution function of random variable following standard normal distribution.

The example of simulating survival times of ten processes from the log-normal model is given as follows:

```
rho_lognormal <- function(time, z, zCoef, p) {
foo <- function(x) dnorm(x) / pnorm(x, lower.tail = FALSE)
alpha <- parametrize(z, zCoef, FUN = "linear") - 1
w <- p * (log(time) - alpha)
foo(w) * p / time
}
simEventData(z = cbind(1, rnorm(nProcess), rbinom(nProcess, 1, 0.5)),
zCoef = c(0.3, 0.5, 0.2), end = rnorm(nProcess, 10),
recurrent = FALSE, relativeRisk = "none",
rho = rho_lognormal, method = "inversion",
arguments = list(rho = list(p = 0.5)))
```

```
## ID time event origin X.1 X.2 X.3
## 1 1 0.0907 1 0 1 0.433 1
## 2 2 3.7646 1 0 1 0.510 0
## 3 3 8.8247 0 0 1 1.196 1
## 4 4 0.1732 1 0 1 -1.084 0
## 5 5 0.2221 1 0 1 -1.145 1
## 6 6 8.8738 0 0 1 0.155 1
## 7 7 2.6047 1 0 1 -0.492 1
## 8 8 10.5684 0 0 1 2.367 1
## 9 9 0.1656 1 0 1 -1.434 1
## 10 10 0.2125 1 0 1 -1.611 0
```

Notice that the time origin was set to be zero by default. So the
time variable \(t\) in the denominator
of the hazard function \(\lambda(t \mid
\mathbf{z}_i)\), may result in undefined value when \(t=0\). Therefore, we specified
`method = "inversion"`

for the inversion sampling method for
the log-normal model.

Aalen, Odd, Ornulf Borgan, and Hakon Gjessing. 2008. *Survival and
Event History Analysis: A Process Point of View*.
Springer Science & Business Media.

Andersen, Per Kragh, and Richard David Gill. 1982.
“Cox’s Regression Model for Counting Processes:
A Large Sample Study.” *The Annals of
Statistics* 10 (4): 1100–1120.

Çınlar, Erhan. 1975. *Introduction to Stochastic Processes*.
Englewood Cliffs, NJ: Printice-Hall.

Cox, David R. 1972. “Regression Models and Life-Tables.”
*Journal of the Royal Statistical Society. Series B
(Methodological)* 34 (2): 187–220.

Crowley, John, and Marie Hu. 1977. “Covariance Analysis of Heart
Transplant Survival Data.” *Journal of the American
Statistical Association* 72 (357): 27–36.

Kalbfleisch, John D, and Ross L Prentice. 2002. *The Statistical
Analysis of Failure Time Data*. Vol. 360. John Wiley & Sons.

Kleinbaum, D G, and M Klein. 2011. *Survival Analysis: A
Self-Learning Text*. New York: Springer.

Lewis, P A, and G S Shedler. 1979. “Simulation of Nonhomogeneous
Poisson Processes by Thinning.” *Naval Research
Logistics Quarterly* 26 (3): 403–13.