README

Here is a trivial example. Everyone knows that a 3+3 design will escalate after it sees no toxicities in a cohort of three:

Having escalated through doses 1 and 2, here the design advocates treating another cohort at dose 3.

When it sees two-out-of-six toxicities at dose 3, it concludes that dose 3 is too toxic, dose 2 is the MTD, and the trial should stop.
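The three steps just described can be sketched as follows. This is a minimal sketch; the number of doses and the outcome strings are illustrative assumptions, chosen to match the narrative above.

```r
library(escalation)
library(magrittr)  # provides the %>% pipe

model <- get_three_plus_three(num_doses = 5)

# No toxicity in the first cohort of three at dose 1
fit1 <- model %>% fit('1NNN')
recommended_dose(fit1)

# One toxicity in three patients at dose 3
fit2 <- model %>% fit('1NNN 2NNN 3NNT')
recommended_dose(fit2)
continue(fit2)

# Two toxicities in six patients at dose 3
fit3 <- model %>% fit('1NNN 2NNN 3NNT 3NNT')
recommended_dose(fit3)
continue(fit3)
```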

escalation provides functions to use common dose-escalation methodologies like the continual reassessment method (CRM), the Bayesian optimal interval design (BOIN), the TPI suite of designs, efficacy-toxicity designs like EffTox or Wages & Tait, and (as we have seen) the perennial 3+3:

These functions create model fitting objects. Where possible, technical implementations are imported from existing R packages like dfcrm, trialr, and BOIN. Where no external implementation is available, however, methods are implemented natively in escalation.

These dose-finding approaches can then be augmented with extra behaviours to specialise the dose selection process. For example, we can add behaviours to prevent skipping doses, or to stop when we reach a certain sample size. escalation supports the following behaviours:

Each of these functions overrides the way doses are selected or when a design decides to stop the trial. The behaviours can be flexibly combined using the %>% pipe operator from magrittr, familiar from the tidyverse suite.

These models are then fit to trial outcomes to produce dose recommendations. No matter how the dose selection behaviours were combined, the resulting model fits support a standard interface. The two most important methods are recommended_dose() to get the current dose recommendation, and continue() to learn whether the design advocates continuing patient recruitment.
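As a minimal sketch of this workflow, the example below combines a CRM model with two behaviours and then queries the fit through the standard interface. The skeleton, target, sample size limit, and outcomes are illustrative assumptions.

```r
library(escalation)
library(magrittr)

skeleton <- c(0.05, 0.1, 0.25, 0.4, 0.6)

# Core model plus two behaviours, joined with the pipe
model <- get_dfcrm(skeleton = skeleton, target = 0.25) %>%
  dont_skip_doses(when_escalating = TRUE) %>%
  stop_at_n(n = 12)

fit <- model %>% fit('1NNN 2NNT')

recommended_dose(fit)  # dose advocated for the next cohort
continue(fit)          # should recruitment continue?
num_patients(fit)      # total patients treated so far
n_at_dose(fit)         # patients treated at each dose
tox_at_dose(fit)       # toxicities seen at each dose
```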

Having defined this nomenclature for combining dose selection behaviours and providing a standard interface for the resulting analyses, it is simple to run simulations or calculate dose-pathways for future cohorts of patients.

escalation provides an object-oriented approach to dose-escalation clinical trials in R. See Usage

Usage

Describing outcomes in dose-finding trials

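escalation uses a succinct syntax for describing dose-finding outcomes, described in Brock (2019) for the phase I setting and in Brock et al. (2017) for the phase I/II setting. In a phase I trial, the letter T shows that a patient experienced toxicity and N shows that they did not.

In a joint phase I/II trial, like those supported by EffTox, where we have coincident efficacy and toxicity outcomes, the relevant letters are E (efficacy only), T (toxicity only), B (both efficacy and toxicity), and N (neither).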

These outcome letters are placed after integer dose-levels to show the outcomes of patients in cohorts. To show that a cohort of three patients was given dose 2, that the first two patients were without toxicity, but the third patient experienced toxicity, we would use the outcome string 2NNT.

If that cohort was followed by another cohort of three, all of whom were without toxicity, the overall outcome string would be 2NNT 2NNN.

And so on. These strings are used in the escalation package to make it easy to fit models to observed outcomes. There are many examples below.
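To make the syntax concrete, here is a minimal base-R sketch that unpacks an outcome string into one row per patient. The helper parse_outcome_string is hypothetical and written only for illustration; it is not the parser used inside the package.

```r
# Hypothetical helper: split an outcome string into patient-level rows
parse_outcome_string <- function(outcomes) {
  cohorts <- strsplit(trimws(outcomes), "\\s+")[[1]]
  rows <- lapply(seq_along(cohorts), function(i) {
    # Leading digits give the dose-level; the remaining letters are outcomes
    dose <- as.integer(sub("^([0-9]+).*$", "\\1", cohorts[i]))
    outcome <- strsplit(sub("^[0-9]+", "", cohorts[i]), "")[[1]]
    data.frame(
      cohort = i,
      dose = dose,
      outcome = outcome,
      tox = as.integer(outcome %in% c("T", "B")),  # T and B involve toxicity
      eff = as.integer(outcome %in% c("E", "B")),  # E and B involve efficacy
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

parsed <- parse_outcome_string("2NNT 2NNN")
parsed
```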

Dose selectors

A core class in the escalation package is the selector. It encapsulates the notion that a general dose-escalation design is able to recommend doses, keep track of how many patients have been treated at what doses, what toxicity outcomes have been seen, and whether a trial should continue. This general interface is true of model-based methods like the CRM and rule-based methods like the 3+3. Irrespective of the particular approach used, the interface is consistent.

In this tutorial, we will demonstrate each of the types of selector implemented in the package and how they can be combined to tailor behaviour.

At the core of the dose selection process is an algorithm or a model that selects doses in response to outcomes. The classes capable of performing this core role are:

Where indicated these methods rely on external packages. Otherwise, methods are implemented natively in escalation. We look at several of these below.

get_dfcrm

The continual reassessment method (CRM; O’Quigley, Pepe, and Fisher 1990) is implemented in the dfcrm package by Cheung (2013). The very least information we need to provide is a dose-toxicity skeleton and our target toxicity level. The skeleton represents our prior beliefs on the probabilities of toxicity at each of the doses under investigation. The model iteratively seeks a dose with toxicity probability close to the target.

The fit object will tell you the dose recommended by the CRM model to be administered next. Depending on your preference for classic R or tidyverse R, you might run:

Either way, you get the same answer. The model advocates skipping straight to dose 4. Clinicians are unlikely to feel comfortable with this. We can respecify the model to expressly not skip doses in escalation. We will do that later on.

For now, let us return to our model fit. We can ask whether the trial should keep going:

Naturally it wants to continue because dfcrm does not implement any stopping rules. Again, we will add various stopping behaviours in sections below.
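A minimal sketch of the steps above, assuming an illustrative five-dose skeleton, a 25% target, and a single cohort of three treated at dose 2 without toxicity:

```r
library(escalation)
library(magrittr)

skeleton <- c(0.05, 0.1, 0.25, 0.4, 0.6)
target <- 0.25

model <- get_dfcrm(skeleton = skeleton, target = target)
fit <- model %>% fit('2NNN')

# Classic R
recommended_dose(fit)

# Tidyverse R
fit %>% recommended_dose()

# Does the design want to keep going?
fit %>% continue()
```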

The CRM-fitting function in dfcrm accepts many arguments to customise the model form and these are passed onwards by the get_dfcrm function via the ... parameter. For example, to use the one-parameter logit model in dfcrm (rather than the default empiric model) with the intercept term fixed to take the value 4, we can specify:

intcpt and logistic are the parameter names chosen by the authors of dfcrm.
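A sketch of that specification, assuming the same illustrative skeleton and target as before. The model and intcpt arguments are those of the crm function in dfcrm and are passed through by get_dfcrm.

```r
library(escalation)
library(magrittr)

skeleton <- c(0.05, 0.1, 0.25, 0.4, 0.6)
target <- 0.25

# One-parameter logit model with the intercept fixed at 4
model <- get_dfcrm(skeleton = skeleton, target = target,
                   intcpt = 4, model = 'logistic')
fit <- model %>% fit('2NNN')
fit %>% recommended_dose()
```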

get_trialr_crm

We could instead fit the CRM models above using the trialr package (Brock 2019, 2020).

Reusing the skeleton and target variables defined above, we fit the same empiric model.

The dfcrm package, unless told otherwise, assumes that you want an empiric model where the prior variance for \(\beta\) is 1.34. In the trialr package, no such assumptions are made so we had to specify those variables.

All we have changed is the method of inference. dfcrm uses numerical integration to calculate posterior statistics and plugs those into the dose-toxicity function. In contrast, trialr fits the model using Hamiltonian MCMC sampling via Stan. Thankfully, the two models agree on the desired next dose:

The added bonus we get from the trialr fit, however, is those samples from the posterior distribution:

That facilitates really flexible inference. For example, what is the probability that toxicity at dose 3 is at least 5% greater than that at dose 2? Simple to answer using the posterior samples:
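The sketch below fits the empiric model with trialr and answers that question from the posterior samples. The skeleton, target, and outcomes are illustrative assumptions, and the code assumes that prob_tox_samples returns one column per dose, named by dose-level.

```r
library(escalation)
library(magrittr)

skeleton <- c(0.05, 0.1, 0.25, 0.4, 0.6)
target <- 0.25

# trialr makes no default assumption about the prior, so we state it
model <- get_trialr_crm(skeleton = skeleton, target = target,
                        model = 'empiric', beta_sd = sqrt(1.34))
fit <- model %>% fit('2NNN')
fit %>% recommended_dose()

# Posterior samples of the probability of toxicity at each dose
samples <- fit %>% prob_tox_samples()

# Pr(toxicity at dose 3 exceeds that at dose 2 by at least 5%)
prob_gap <- mean(samples[['3']] > samples[['2']] + 0.05)
prob_gap
```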

get_trialr_nbg

The two-parameter logistic dose-escalation method of Neuenschwander, Branson, and Gsponer (2008) (NBG) is implemented in the trialr package by Brock (2020).

The very least information we need to provide is a vector of the doses under investigation, a reference dose-level \(d^*\), our target toxicity level, and priors on the logit model intercept, \(\alpha\), and dose gradient, \(\beta\).

For illustration, let us reproduce the notorious example in Figure 1 of Neuenschwander, Branson, and Gsponer (2008) with 15 doses:

However, we see that it is a close call as to which dose is closest to the target toxicity level:

get_tpi

The Toxicity Probability Interval (TPI) method was introduced by Ji, Li, and Bekele (2007). The model requires a few parameters:

The returned model fit obeys the same interface as the other classes described here. For instance, the dose recommended for the next cohort is:
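A minimal sketch of a TPI design, where the values of k1, k2, and exclusion_certainty and the outcomes are illustrative assumptions rather than recommended settings:

```r
library(escalation)
library(magrittr)

model <- get_tpi(num_doses = 5, target = 0.25, k1 = 1, k2 = 1.5,
                 exclusion_certainty = 0.95)
fit <- model %>% fit('1NNN 2NNT')
fit %>% recommended_dose()
fit %>% continue()
```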

get_mtpi

The Modified Toxicity Probability Interval (mTPI) method was introduced by Ji et al. (2010). It is generally simpler to implement than TPI because the \(\epsilon_1\) and \(\epsilon_2\) parameters have the intuitive interpretation of forming the bounds of the interval that we regard as containing doses equivalent to the target dose. For instance, if we target a dose with toxicity probability equal to 25%, but would regard doses with toxicity probability in the region (20%, 30%) as acceptably close to the target, we run:

In this parameterisation, we exclude doses if we are 95% a-posteriori sure that the associated toxicity rate exceeds the target.
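A sketch of that parameterisation, with epsilon1 and epsilon2 set to 0.05 to give the (20%, 30%) equivalence interval around the 25% target. The number of doses and the outcomes are illustrative assumptions.

```r
library(escalation)
library(magrittr)

model <- get_mtpi(num_doses = 5, target = 0.25,
                  epsilon1 = 0.05, epsilon2 = 0.05,
                  exclusion_certainty = 0.95)
fit <- model %>% fit('1NNN 2NNT')
fit %>% recommended_dose()
```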

See the Modified Toxicity Probability Interval Design vignette for more information.

get_mtpi2

mTPI was further updated by Guo et al. (2017) to produce mTPI2. Its parameterisation is similar to mTPI.
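A minimal sketch, assuming get_mtpi2 takes the same arguments as get_mtpi and using illustrative outcomes:

```r
library(escalation)
library(magrittr)

model <- get_mtpi2(num_doses = 5, target = 0.25,
                   epsilon1 = 0.05, epsilon2 = 0.05,
                   exclusion_certainty = 0.95)
fit <- model %>% fit('1NNN 2NNT')
fit %>% recommended_dose()
```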

Once again, see the Modified Toxicity Probability Interval Design vignette for more information.

get_boin

escalation also implements the Bayesian Optimal Interval (BOIN) dose-finding design by Liu and Yuan (2015) via the BOIN package (Yuan and Liu 2018).

In contrast to CRM, BOIN does not require a dose-toxicity skeleton. In its simplest case, it requires merely the number of doses under investigation and our target toxicity level:

The BOIN dose selector natively implements stopping rules, as described by Liu & Yuan. For instance, if the bottom dose is too toxic, the design will advise that the trial halts:

This clarifies that no dose should be recommended for further study. In this setting, this is because all doses are considered too toxic. This is distinct from scenarios where a design advocates stopping a trial and recommending a dose for further study. We will encounter situations like that below.
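The two situations can be sketched as follows. The outcome strings are illustrative assumptions: the first shows three cohorts without toxicity, and the second shows three toxicities in three patients at the lowest dose.

```r
library(escalation)
library(magrittr)

model <- get_boin(num_doses = 5, target = 0.25)

# Escalation proceeds while no toxicity is seen
fit <- model %>% fit('1NNN 2NNN 3NNN')
fit %>% recommended_dose()
fit %>% continue()

# The lowest dose appears too toxic
fit_toxic <- model %>% fit('1TTT')
fit_toxic %>% recommended_dose()
fit_toxic %>% continue()
```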

Since escalation provides many flexible options for stopping, we have made it possible to suppress BOIN’s native stopping rule via use_stopping_rule = FALSE. In this instance, the user may want to add their own stopping rule, e.g. using stop_when_too_toxic.

Extra parameters are passed to the get.boundary function in the BOIN package to customise the escalation procedure. For instance, the boundaries that guide changes in dose are set to be 60% and 140% of the target toxicity rate, by default. To instead use 30% and 170%, we could run:

To observe the effect of the change, note that the default values suppress escalation in this scenario:

The parameter names p.saf and p.tox were chosen by the authors of the BOIN package.
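A sketch comparing default and custom boundaries, assuming a 30% target so that 30% and 170% of the target are 0.09 and 0.51. The outcomes are illustrative and the two designs need not disagree on this particular string.

```r
library(escalation)
library(magrittr)

target <- 0.3
outcomes <- '1NNN 2NNT'

# Default boundaries: 60% and 140% of the target
model_default <- get_boin(num_doses = 5, target = target)

# Custom boundaries: 30% and 170% of the target
model_custom <- get_boin(num_doses = 5, target = target,
                         p.saf = 0.3 * target, p.tox = 1.7 * target)

model_default %>% fit(outcomes) %>% recommended_dose()
model_custom %>% fit(outcomes) %>% recommended_dose()
```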

get_boin12

escalation also supports designs that choose doses according to co-primary efficacy and toxicity outcomes, such as the BOIN12 design (Lin et al. 2020).

We provide target toxicity and efficacy thresholds via phi_t and phi_e, the utility of ‘no efficacy and no toxicity’ via u2, and the utility of ‘efficacy with toxicity’ via u3 on a (0, 100) scale:

In contrast to the examples above, outcomes now include efficacy as well as toxicity. E reflects efficacy only, and B reflects both efficacy and toxicity. The model-fitting process is largely the same though:
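A minimal BOIN12 sketch, where the thresholds, utilities, and outcomes are illustrative assumptions:

```r
library(escalation)
library(magrittr)

model <- get_boin12(num_doses = 5, phi_t = 0.35, phi_e = 0.25,
                    u2 = 40, u3 = 60)

# E = efficacy only, T = toxicity only, B = both, N = neither
fit <- model %>% fit('1NNN 2ENT 3ETT 2EEN')
fit %>% recommended_dose()
fit %>% continue()
```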

escalation also supports other so-called phase I/II designs that select doses by efficacy and toxicity like Wages and Tait (2015) and Thall and Cook (2004). See the help pages for more information.

get_three_plus_three

The 3+3 method is an old method for dose-escalation that uses fixed cohorts of three and pre-specified rules to govern dose-selection (Korn et al. 1994; Le Tourneau, Lee, and Siu 2009).

To create a 3+3 design, we need no more information than the number of doses under investigation:

Korn et al. (1994) described a variant of 3+3 that permits deescalation to ensure that six patients are treated at a dose before it is recommended. To use that option in our model, we could have run:

The model would then advocate deescalation if at least two toxicities are seen at a dose and the dose below has fewer than 6 treated patients:
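Both variants can be sketched as below. The outcome string is an illustrative assumption showing two toxicities at dose 3 when only three patients have been treated at dose 2.

```r
library(escalation)
library(magrittr)

# Standard 3+3
model <- get_three_plus_three(num_doses = 5)

# Variant that permits deescalation
model_deesc <- get_three_plus_three(num_doses = 5, allow_deescalate = TRUE)

fit <- model_deesc %>% fit('1NNN 2NNN 3NTT')
fit %>% recommended_dose()
fit %>% continue()
```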

follow_path

The final dose selector in this section is not really a model at all, so much as a pre-specified path to follow. Let us say that we would like to escalate through the doses in the absence of toxicity, treating two patients at each of the first two doses, and three at the other doses. We can specify such a path in escalation using:

When the outcomes diverge from the pre-specified path, however, this selector does not know what to do:

That rather seems to limit its value. The point of this class is that we sometimes want to specify what is occasionally referred to as an initial escalation plan. When trial outcomes diverge from the initial plan, another method takes over. This is a perfect opportunity to show how different selectors can be joined together. Let us say that we wish to follow the initial plan described above, but when the first toxicity event is seen, we want a CRM model to take over. We simply join the functions together using the pipe operator from magrittr:

Now, when trial outcomes diverge from the path, the CRM model analyses all of the outcomes and recommends the next dose:
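A sketch of the initial escalation plan on its own and then joined to a CRM model. The path matches the description above; the skeleton, target, and outcomes are illustrative assumptions.

```r
library(escalation)
library(magrittr)

skeleton <- c(0.05, 0.1, 0.25, 0.4, 0.6)
target <- 0.25

# Two patients at each of doses 1 and 2, three at each later dose
path_only <- follow_path('1NN 2NN 3NNN 4NNN 5NNN')

# Outcomes that agree with the path so far
path_only %>% fit('1NN 2N') %>% recommended_dose()

# Outcomes that diverge from the path
path_only %>% fit('1NN 2NT') %>% recommended_dose()

# Let a CRM model take over once the path is left
model <- follow_path('1NN 2NN 3NNN 4NNN 5NNN') %>%
  get_dfcrm(skeleton = skeleton, target = target)
fit <- model %>% fit('1NN 2NT')
fit %>% recommended_dose()
```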

This concludes our look at the core dose-selecting classes. We now turn our attention to the ways in which these methods can be adapted using extra behaviours.

dont_skip_doses

We saw in the CRM example above that the design undesirably wanted to skip straight to a high dose, without trying some of the lower doses. A simple and very common constraint to impose in dose-finding trials is to avoid skipping untested doses.

Resuming our CRM example, we suppress the skipping of untested doses in escalation with:

This time, however, the model advocates dose 3. Previously, it wanted to go straight to dose 4.

We prevented skipping doses in escalation. We could have prevented skipping doses in deescalation with:
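Both options can be sketched as follows, reusing an illustrative skeleton and target. With one cohort treated at dose 2, the first design cannot recommend anything above dose 3.

```r
library(escalation)
library(magrittr)

skeleton <- c(0.05, 0.1, 0.25, 0.4, 0.6)
target <- 0.25

# Do not skip untested doses when escalating
model <- get_dfcrm(skeleton = skeleton, target = target) %>%
  dont_skip_doses(when_escalating = TRUE)
fit <- model %>% fit('2NNN')
fit %>% recommended_dose()

# Do not skip doses when deescalating
model_down <- get_dfcrm(skeleton = skeleton, target = target) %>%
  dont_skip_doses(when_deescalating = TRUE)
```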

stop_at_n

Let us now investigate some methods that facilitate stopping. The simplest condition on which to stop is when the total sample size reaches some pre-specified level. For instance, we might want to treat a maximum of 15 patients and then stop. To do this, we call the stop_at_n function and append it onto the end of a core dose selector, like this:

When this design has seen fewer than 15 patients, it will select doses and advocate that the trial continues. For instance:

Once 15 patients have been treated, the design advocates stopping. It is important to note that, even though the design has stopped, it still recommends that a dose be studied at the next trial phase:

This is in contrast to the scenario where a trial is stopped because all doses are inappropriate. In this scenario, the dose recommendation would be NA. We will encounter this in examples below.
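A sketch of this behaviour with a CRM model, assuming an illustrative skeleton and outcome strings with 12 and then 15 patients:

```r
library(escalation)
library(magrittr)

skeleton <- c(0.05, 0.1, 0.25, 0.4, 0.6)
target <- 0.25

model <- get_dfcrm(skeleton = skeleton, target = target) %>%
  stop_at_n(n = 15)

# Twelve patients: the trial continues
fit12 <- model %>% fit('1NNN 2NNN 3NNT 3NNN')
fit12 %>% continue()

# Fifteen patients: the trial stops but still recommends a dose
fit15 <- model %>% fit('1NNN 2NNN 3NNT 3NNN 3NTN')
fit15 %>% continue()
fit15 %>% recommended_dose()
```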

stop_when_n_at_dose

Another common approach is to stop a dose-finding experiment when a given number of patients have been treated at a particular dose.

Continuing with our CRM model, to stop when nine patients have been treated at the dose that is about to be recommended again, we use:

We can observe how this alters the dose-selection model. Here we see six patients treated at dose 2:

If the next cohort results in dose 2 being recommended yet again, i.e. to bring the total number of patients at dose 2 to nine or more, the model stops:

In this scenario, dose 2 is the final recommended dose and the trial stops gracefully at a pre-specified stopping rule.

This behaviour can also be configured to stop when any dose has been given n times:

Naturally, you can combine this behaviour with other behaviours. The following model stops the trial when nine patients have been evaluated at the recommended dose or when 21 patients have been treated in total, whichever occurs first:
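The variants described in this section can be sketched as below, with an illustrative skeleton and outcomes:

```r
library(escalation)
library(magrittr)

skeleton <- c(0.05, 0.1, 0.25, 0.4, 0.6)
target <- 0.25

# Stop when nine patients have been treated at the recommended dose
model_rec <- get_dfcrm(skeleton = skeleton, target = target) %>%
  stop_when_n_at_dose(n = 9, dose = 'recommended')

# Stop when any dose has been given to nine patients
model_any <- get_dfcrm(skeleton = skeleton, target = target) %>%
  stop_when_n_at_dose(n = 9, dose = 'any')

# Stop at nine at the recommended dose or 21 in total, whichever is first
model_both <- get_dfcrm(skeleton = skeleton, target = target) %>%
  stop_when_n_at_dose(n = 9, dose = 'recommended') %>%
  stop_at_n(n = 21)

fit <- model_both %>% fit('1NNN 2NNN 2NNT')
fit %>% recommended_dose()
fit %>% continue()
```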

stop_when_ci_covered

The two stopping mechanisms above scrutinise the number of patients treated. In many situations, this will be valuable. However, in other situations, we might want to stop when a threshold amount of statistical information is obtained. One way to achieve this is to stop when the confidence interval or credible interval for the probability of toxicity at a dose is covered by a specified range.

For instance, we know that the BOIN design seeks a target toxicity level, and we have used a target of 25% in our examples. We might say that we are sure enough about the recommended dose when the associated 90% credible interval (because BOIN is a Bayesian design) of the toxicity probability falls in the region 10% - 40%.

This is because the lower bound of the 90% interval for the probability of toxicity at dose 2 is at least 10%:

It may be interesting to note that our CRM model would not stop in this scenario:

This is because the lower bound of the 90% CI falls slightly outside the sought range:

As before, we can specify dose = 'recommended', dose = 'any', or a particular numerical dose-level with dose = 3, for example.
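To see what such a rule inspects, the sketch below computes a 90% interval for the probability of toxicity at each dose from a BOIN fit using prob_tox_quantile, and checks by hand whether the interval at the recommended dose lies within 10% to 40%. The outcomes are illustrative assumptions, and the manual check stands in for the stopping behaviour itself.

```r
library(escalation)
library(magrittr)

fit <- get_boin(num_doses = 5, target = 0.25) %>%
  fit('1NNN 2NNN 3NNT 3NNN 3NTN')

# Bounds of the central 90% interval at each dose
lower <- fit %>% prob_tox_quantile(p = 0.05)
upper <- fit %>% prob_tox_quantile(p = 0.95)

# Is the interval at the recommended dose inside (0.1, 0.4)?
d <- recommended_dose(fit)
covered <- lower[d] >= 0.1 && upper[d] <= 0.4
covered
```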

It should be appreciated that this approach only works when the underlying model provides a way of calculating quantiles and uncertainty intervals. The 3+3 lacks a statistical foundation and does not offer quantiles:

stop_when_too_toxic

The stopping rules considered so far stop a trial and recommend a dose once some critical threshold of information is obtained. We will naturally also want to stop if all doses are too toxic.

We saw above that some model-based dose-finding approaches can calculate quantiles. We can take this idea further and advocate stopping when there is sufficient evidence that the toxicity probability at some dose exceeds a critical threshold. In such circumstances, no dose will be recommended because all doses of the treatment will be deemed to be excessively toxic.

Let us set up a rule to stop and recommend no dose if the probability of toxicity at the lowest dose is too high:

The example above stops when there is at least a 70% posterior probability that the probability of toxicity at dose 1 exceeds 35%. With an isolated toxicity incidence at dose 1, the model advocates continuing at dose 1:

This is because the probability that the toxicity rate exceeds 35% is less than 70%:

However, with material additional toxicity at dose 1, the design now advocates stopping:

Once again, we can specify dose = 'recommended', dose = 'any', or a particular numerical dose-level with dose = 3, for example. We also require that the underlying model supports the calculation of quantiles. BOIN supports this functionality:
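The rule can be sketched with a CRM model and with BOIN. The skeleton, thresholds, and outcomes are illustrative assumptions.

```r
library(escalation)
library(magrittr)

skeleton <- c(0.05, 0.1, 0.25, 0.4, 0.6)
target <- 0.25

model <- get_dfcrm(skeleton = skeleton, target = target) %>%
  stop_when_too_toxic(dose = 1, tox_threshold = 0.35, confidence = 0.7)

# One toxicity in three patients at dose 1
fit_one <- model %>% fit('1NTN')
fit_one %>% continue()
fit_one %>% prob_tox_exceeds(0.35)

# Considerably more toxicity at dose 1
fit_many <- model %>% fit('1NTN 1TTT')
fit_many %>% continue()
fit_many %>% recommended_dose()

# The same behaviour attached to BOIN, with its native rule switched off
model_boin <- get_boin(num_doses = 5, target = target,
                       use_stopping_rule = FALSE) %>%
  stop_when_too_toxic(dose = 1, tox_threshold = 0.35, confidence = 0.7)
model_boin %>% fit('1NTN 1TTT') %>% continue()
```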

demand_n_at_dose

We have looked at many behaviours that provide stopping. We can also look at some behaviours that delay stopping.

We might want to guarantee that we treat at least n patients at a dose before we permit a dose-finding trial to stop. For instance, we might not feel comfortable recommending a dose for the next phase of study if it has only been evaluated in a small number of patients.

It makes sense for this behaviour to be used with a design that would otherwise stop. Let us say that we would normally like to stop after 18 patients have been treated. However, we will also demand that at least 6 patients be treated at the recommended dose before stopping is allowed, irrespective of the overall sample size. We specify:

In an example where 18 patients have been evaluated but only three of them at the recommended dose, dose 2, the design advocates continuing at dose 2. This is because the demand_n_at_dose function overrides the stopping behaviour of stop_at_n. It requests that the trial continue at dose 2 instead of stopping with only three patients treated at the nominal recommended dose.

It is important to recognise that the order of the functions matters. If we flip the order of the constraints in the example above, the outcome is different:

Now the stop_at_n constraint overrides the action of demand_n_at_dose to halt the trial when n=18, even though only three patients have been evaluated at dose 2. It overrides because it comes later in the decision chain. Users should be aware that commands that come later take precedence.
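The two orderings can be sketched side by side. The skeleton and the 18-patient outcome string are illustrative assumptions, so the recommended dose here need not be dose 2.

```r
library(escalation)
library(magrittr)

skeleton <- c(0.05, 0.1, 0.25, 0.4, 0.6)
target <- 0.25
outcomes <- '1NNN 1NNN 3NNN 3NNT 4NTT 4TNT'  # 18 patients

# demand_n_at_dose comes last, so it can override stop_at_n
model1 <- get_dfcrm(skeleton = skeleton, target = target) %>%
  stop_at_n(n = 18) %>%
  demand_n_at_dose(dose = 'recommended', n = 6)

# stop_at_n comes last, so it has the final say
model2 <- get_dfcrm(skeleton = skeleton, target = target) %>%
  demand_n_at_dose(dose = 'recommended', n = 6) %>%
  stop_at_n(n = 18)

fit1 <- model1 %>% fit(outcomes)
fit2 <- model2 %>% fit(outcomes)

fit1 %>% recommended_dose()
fit1 %>% continue()
fit2 %>% continue()
```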

Once again, we can specify dose = 'recommended', dose = 'any', or a particular numerical dose-level with dose = 3, for example.

In summary, the demand_n_at_dose function delays stopping in a scenario where a dose is being selected.

try_rescue_dose

In contrast to demand_n_at_dose, the try_rescue_dose function delays stopping in a scenario where no dose is going to be selected. It overrides a decision to stop and recommend no dose when fewer than n patients have been evaluated at a given dose. Thus, it provides a facility to ensure that some “rescue” dose has been tried before stopping is allowed.

This is another function where effective demonstration requires a design that would normally stop. Let us say that we will stop if we are 80% sure that the toxicity rate at the lowest dose exceeds 35%. But before we stop, we want to ensure that at least two patients have been evaluated at the lowest dose. We write:

While fewer than two patients have been evaluated at dose 1, the design will not advocate stopping, even if the posterior confidence that the toxicity rate at dose 1 exceeds 35% is greater than 80%:

Once two patients are seen at dose 1, stopping can be countenanced. If those two patients tolerate treatment at dose 1:

then stopping is not advocated because the posterior belief is now that dose 1 is not excessively toxic:

The try_rescue_dose function allows researchers to rescue situations where otherwise sensible stopping criteria may prove too sensitive to chance events in very small sample sizes.
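A sketch of this design, assuming an illustrative skeleton and outcomes in which three toxicities are seen at dose 2 before anyone has been treated at dose 1:

```r
library(escalation)
library(magrittr)

skeleton <- c(0.05, 0.1, 0.25, 0.4, 0.6)
target <- 0.25

model <- get_dfcrm(skeleton = skeleton, target = target) %>%
  stop_when_too_toxic(dose = 1, tox_threshold = 0.35, confidence = 0.8) %>%
  try_rescue_dose(dose = 1, n = 2)

# Nobody has yet been treated at dose 1
fit <- model %>% fit('2TTT')
fit %>% continue()
fit %>% recommended_dose()
fit %>% prob_tox_exceeds(0.35)

# Two patients then tolerate dose 1
fit_rescued <- model %>% fit('2TTT 1NN')
fit_rescued %>% continue()
```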

select_dose_by_cibp

This function implements the convex infinite bounds penalisation (CIBP) criterion of Mozgunov and Jaki (2020) that adjusts the way doses are selected in CRM trials. Their method is mindful of the uncertainty in the estimates of the probability of toxicity and uses an asymmetry parameter, 0 < a < 2, to penalise escalation to risky doses. The method alters the way doses are selected but not when the trial should stop. For a < 1, the criterion penalises toxic doses more heavily, making escalation decisions more conservative.
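To illustrate the effect of the asymmetry parameter, the base-R sketch below evaluates a criterion of the form (p - target)^2 / (p^a (1 - p)^(2 - a)) and picks the dose that minimises it. The formula is our reading of Mozgunov and Jaki (2020), the helper cibp_criterion is hypothetical, and the toxicity estimates are made up for illustration; none of this is the package's internal code.

```r
# Hypothetical helper: CIBP-style criterion for estimated toxicity probabilities
cibp_criterion <- function(prob_tox, target, a) {
  (prob_tox - target)^2 / (prob_tox^a * (1 - prob_tox)^(2 - a))
}

# Two doses equally far from a 25% target, one below and one above
prob_tox <- c(0.19, 0.31)
target <- 0.25

# a = 1 treats the two sides of the target symmetrically in the exponents
dose_symmetric <- which.min(cibp_criterion(prob_tox, target, a = 1))

# a < 1 penalises the more toxic dose more heavily
dose_conservative <- which.min(cibp_criterion(prob_tox, target, a = 0.1))

c(symmetric = dose_symmetric, conservative = dose_conservative)
```

In the package itself, the behaviour is added to a CRM model by piping it into select_dose_by_cibp with a chosen value of a.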

Simulation and dose-paths

We have described at length above the flexible methods that escalation provides to specify dose-escalation designs and tailor trial behaviour. Once designs are specified, we can investigate their operating characteristics by simulation using the simulate_trials function, and efficiently compare designs using the method of Sweeting et al. (2023) in simulate_compare. We can also exhaustively calculate dose recommendations for future cohorts using the get_dose_paths function. Simulation and dose-paths are each covered in a full vignette, so please check them out.
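A brief sketch of both features using a 3+3 design. The true toxicity probabilities, the number of simulated trials, and the cohort sizes are illustrative assumptions.

```r
library(escalation)
library(magrittr)

model <- get_three_plus_three(num_doses = 5)

# Simulate a small number of trials under assumed true toxicity rates
set.seed(2024)
sims <- model %>%
  simulate_trials(num_sims = 20,
                  true_prob_tox = c(0.12, 0.27, 0.44, 0.53, 0.57))
prob_recommend(sims)

# Every possible dose recommendation over the next two cohorts of three
paths <- model %>% get_dose_paths(cohort_sizes = c(3, 3))
paths
```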

Installation
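A typical installation might look like the following, assuming the released version is available on CRAN and that the development version is hosted in the brockk/escalation repository on GitHub:

```r
# Released version from CRAN
install.packages("escalation")

# Development version from GitHub (requires the devtools package)
# devtools::install_github("brockk/escalation")
```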

Future Plans

I plan to add model-fitting functions for EWOC via ewoc, further methods for phase I/II designs, and perhaps also methods for dual agents.

I want to investigate adding some further stopping functions like those researched by Zohar and Chevret (2001).

Finally, I will investigate adding time-to-event versions of the designs presented here, the so-called TITE designs. These will require a different approach to simulation because cohorts no longer apply.

Getting help

This package is still in active development. There are thousands of unit tests run each time the package code is updated. However, that certainly does not mean that the code is bug free. You should always be on the defensive. This software is offered with no guarantee at all. If you have found a bug, please drop me a line and also log it here:

References

Brock, Kristian. 2019. “trialr: Bayesian Clinical Trial Designs in R and Stan.” arXiv e-Prints, June, arXiv:1907.00161. https://arxiv.org/abs/1907.00161.

———. 2020. Trialr: Clinical Trial Designs in ’Rstan’. https://cran.r-project.org/package=trialr.

Brock, Kristian, Lucinda Billingham, Mhairi Copland, Shamyla Siddique, Mirjana Sirovica, and Christina Yap. 2017. “Implementing the EffTox Dose-Finding Design in the Matchpoint Trial.” BMC Medical Research Methodology 17 (1): 112. https://doi.org/10.1186/s12874-017-0381-x.

Cheung, Ken. 2013. Dfcrm: Dose-Finding by the Continual Reassessment Method. https://CRAN.R-project.org/package=dfcrm.

Guo, Wentian, Sue-Jane Wang, Shengjie Yang, Henry Lynn, and Yuan Ji. 2017. “A Bayesian Interval Dose-Finding Design Addressing Ockham’s Razor: mTPI-2.” Contemporary Clinical Trials 58: 23–33.

Ji, Yuan, Yisheng Li, and B. Nebiyou Bekele. 2007. “Dose-finding in phase I clinical trials based on toxicity probability intervals.” Clinical Trials 4 (3): 235–44. https://doi.org/10.1177/1740774507079442.

Ji, Yuan, Ping Liu, Yisheng Li, and B. Nebiyou Bekele. 2010. “A modified toxicity probability interval method for dose-finding trials.” Clinical Trials 7 (6): 653–63. https://doi.org/10.1177/1740774510382799.

Korn, Edward L., Douglas Midthune, T. Timothy Chen, Lawrence V. Rubinstein, Michaele C. Christian, and Richard M. Simon. 1994. “A Comparison of Two Phase I Trial Designs.” Statistics in Medicine 13 (18): 1799–1806. https://doi.org/10.1002/sim.4780131802.

Le Tourneau, Christophe, J. Jack Lee, and Lillian L. Siu. 2009. “Dose Escalation Methods in Phase I Cancer Clinical Trials.” Journal of the National Cancer Institute 101 (10): 708–20. https://doi.org/10.1093/jnci/djp079.

Lin, Ruitao, Yanhong Zhou, Fangrong Yan, Daniel Li, and Ying Yuan. 2020. “BOIN12: Bayesian Optimal Interval Phase I/II Trial Design for Utility-Based Dose Finding in Immunotherapy and Targeted Therapies.” JCO Precision Oncology 4: 1393–1402.

Liu, Suyu, and Ying Yuan. 2015. “Bayesian Optimal Interval Designs for Phase I Clinical Trials.” Journal of the Royal Statistical Society: Series C (Applied Statistics) 64 (3): 507–23. https://doi.org/10.1111/rssc.12089.

Mozgunov, Pavel, and Thomas Jaki. 2020. “Improving Safety of the Continual Reassessment Method via a Modified Allocation Rule.” Statistics in Medicine 39 (7): 906–22. https://doi.org/10.1002/sim.8450.

Neuenschwander, Beat, Michael Branson, and Thomas Gsponer. 2008. “Critical aspects of the Bayesian approach to phase I cancer trials.” Statistics in Medicine 27: 2420–39. https://doi.org/10.1002/sim.3230.

O’Quigley, J, M Pepe, and L Fisher. 1990. “Continual Reassessment Method: A Practical Design for Phase 1 Clinical Trials in Cancer.” Biometrics 46 (1): 33–48. https://doi.org/10.2307/2531628.

Sweeting, Michael, Daniel Slade, Daniel Jackson, and Kristian Brock. 2023. “Potential Outcome Simulation for Efficient Head-to-Head Comparison of Adaptive Dose-Finding Designs.” Preprint.

Thall, Peter F, and John D Cook. 2004. “Dose-Finding Based on Efficacy–Toxicity Trade-Offs.” Biometrics 60 (3): 684–93.

Wages, Nolan A, and Christopher Tait. 2015. “Seamless Phase I/II Adaptive Design for Oncology Trials of Molecularly Targeted Agents.” Journal of Biopharmaceutical Statistics 25 (5): 903–20.

Yuan, Ying, and Suyu Liu. 2018. BOIN: Bayesian Optimal INterval (BOIN) Design for Single-Agent and Drug-Combination Phase I Clinical Trials. https://CRAN.R-project.org/package=BOIN.

Zohar, Sarah, and Sylvie Chevret. 2001. “The Continual Reassessment Method: Comparison of Bayesian Stopping Rules for Dose-Ranging Studies.” Statistics in Medicine 20 (19): 2827–43. https://doi.org/10.1002/sim.920.
