- Preliminaries
- For which types of study designs can I use MCP-Mod?
- What is the difference between the original and generalized MCP-Mod, and what type of response can generalized MCP-Mod handle?
- How many doses do we need to perform MCP-Mod?
- How to determine the doses to be used for a trial using MCP-Mod?
- How to set up the candidate set of models?
- Can MCP-Mod be used in trials without placebo control?
- Why are bounds used for the nonlinear parameters in the fitMod function?
- Should model-selection or model-averaging be used for analysis?
- Which model selection criterion should be used?
- How to deal with intercurrent events and missing data?
- Can MCP-Mod be used in trials with multiple treatment regimens?
- What about dose-response estimates, when the MCP part was (or some of the model shapes were) not significant?
- References

The purpose of this FAQ document is to provide answers to some commonly asked questions, based on personal opinions and experiences. For an introduction to MCP-Mod please see Bretz et al. (2005) and Pinheiro et al. (2014).

MCP-Mod has been developed with having efficacy dose-finding studies in mind, as they are performed in Phase 2 of clinical drug-development. Typically these studies are large scale parallel group randomized studies (e.g. from around 50 to almost 1000 patients in total). It is also possible to use MCP-Mod in crossover designs using generalized MCP-Mod (see below).

Titration designs are out of scope, because the administered dose levels depend on observed responses in the same patients, thereby making any naïve dose-response modelling inappropriate.

Phase 1 dose escalation safety studies are also out of scope. The major question is dose selection for the next cohort during the trial, and tools have been developed specifically for this purpose. In addition assessment of a dose-response signal over placebo is not so much of interest in these studies.

The original MCP-Mod approach was derived for a normally distributed response variable assuming homoscedasticity across doses. The generalized MCP-Mod approach (Pinheiro et al. 2014) is a flexible extension that allows for example for binary, count, continuous or time-to-event outcomes.

In both variants one tests and estimates the dose-response relationship among \(K\) doses \(x_1,\dots,x_K\) utilizing \(M\) candidate models given by functions \(f_m(x_k, \theta_m)\).

The original MCP-Mod approach assumes normally distributed observations \[ y_{k,j} \sim \mathrm{Normal}(\mu_k, \sigma^2) \] for \(k=1,\dots,K\) and \(j=1,\dots,n_k\) in each group, where \(\mu_k = f_m(x_k, \theta_m)\) under the \(m\)-th candidate model. In the MCP part the null hypothesis of a flat response profile \(c_m^T \mu = 0\) vs \(c_m^T \mu > 0\) (or \(\neq 0\)) is tested with \(c_m\) chosen to maximize power under the \(m\)-th candidate model. Critical values are taken from the multivariate t distribution with \((\sum_{k=1}^K n_k) - k\) degrees of freedom. In the Mod part the dose-response model parameters \(\theta\) are estimated by OLS, minimizing \(\sum_{k,j} (y_{k,j} - f_m(x_{k,j}, \theta))^2\).

In the generalized MCP-Mod approach no specific type of distribution is assumed for the observations, \[ y_{k,j} \sim \mathrm{SomeDistribution}(\mu_k), \] only that \(\mu_k\) can be interpreted as a kind of “average response” for dose \(k\). The key assumption is that an estimator \(\hat\mu=(\hat\mu_1,\dots,\hat\mu_k)\) exists, which has (at least asymptotically) a multivariate normal distribution, \[ \hat\mu \sim \mathrm{MultivariateNormal}(\mu, S), \] and that a first-stage fitting procedure can provide estimates \(\hat\mu\) and \(\hat S\). The \(m\)-th candidate model is taken to imply \(\mu_k = f_m(x_k, \theta)\) and the null hypothesis \(c_m^T \mu = 0\) is tested with optimal contrasts. The estimate \(\hat S\) is used in place of the unknown \(S\), and critical values are taken from the multivariate normal distribution. Alternatively, degrees of freedom for a multivariate t distribution can be specified. For the Mod part the model parameters \(\theta\) are estimated with GLS by minimizing \[ (\hat\mu - f_m(x, \theta))^T\hat{S}^{-1}(\hat\mu - f_m(x, \theta)). \]

In generalized MCP-Mod with an ANOVA as the first stage (based on an normality assumption), the multiple contrast test (with appropriate degrees of freedom) will provide the same result as the original MCP-Mod approach.

In summary generalized MCP-Mod is a two-stage approach, where in the first stage a model is fitted, that allows to extract (covariate adjusted) estimates at each dose level, as well as an associated covariance matrix. Then in a second stage MCP-Mod is performed on these summary estimates in many ways similar as the original MCP-Mod approach.

We discuss the situation when the first stage fit is a logistic
regression in this vignette, but many
other first stage models could be used, as long as the first fit is able
to produce adjusted estimates at the doses as long as the associated
covariance matrix. See also the help page of the neurodeg data set
`?neurodeg`

, for a different longitudinal example.

When using two active doses + placebo it is technically possible to
perform the MCP and Mod steps, but in particular for the Mod step only a
very limited set of dose-response models can be fitted. In addition
limited information on the dose-response curve can be obtained. For both
the MCP and the Mod step to make sense, three active doses and placebo
should be available, with the general recommendation to use 4-7 active
doses. When these doses cover the effective range well (i.e., increasing
part and plateau), a large number of active doses is unlikely to produce
a benefit, as the simulations in Bornkamp et al.
(2007) have also shown. Optimal
design calculations can also provide useful information on the number of
doses (and which doses) to use. From experience with optimal design
calculations for different candidate sets, the number of doses from an
optimal design calculation often tend to be smaller than 7 (see also
`?optDesign`

).

To gain most information on the compound, one should evaluate a dose-range that is as large as feasible in terms of lowest and highest dose. As a rule of thumb at minimum a dose-range of > 10-fold should be investigated (i.e., the ratio of highest versus lowest dose should be > 10).

Plasma drug exposure values (e.g., steady state AUC values) can be a good predictor of effect. In these situations one can try to select doses to achieve a uniform coverage of the exposure values. These exposure values per patient per dose often follow a log-normal distribution (i.e., positively skewed, with the variance increasing with the mean), so that the spaces between doses should get larger with increasing doses. Often log-spacing of doses (i.e., the ratio of consecutive doses is constant for example equal to 2 or 3) is used.

An alternative approach to calculate adequate doses is optimal design
theory (see `?optDesign`

). The idea is to calculate a design
(i.e. the doses and dose allocation weights) under a given fixed sample
size so that the variability of the dose-response parameter estimates
(or variance of some target dose estimate) is “small” in a specified way
(see Bretz et al.
2010).

Rule of thumb: 3 - 7 dose response shapes through 2 - 4 models are often sufficient. The multiple contrast test is quite robust, even if the model-shapes are mis-specified. What information to utilize?

It is possible to use **existing information**:

*Similar compounds:* Information might be available on the
dose-response curve for a similar compound in the same indication or the
same compound in a different indication.

*Other models:* A dose-exposure-response (PK/PD) model might
have been developed based on earlier data (e.g. data from the
proof-of-concept (PoC) study). This can be used to predict the
dose-response curve at a specific time-point.

*Emax model:* An Emax type model should always be included in
the candidate set of models. Meta-analyses of the dose-response curves
over the past years showed, that in many situations the monotonic
standard Emax model, or the sigmoid Emax model is able to describe the
data adequately (see Thomas et al. 2015; Thomas and Roy 2017).

There are also some **statistical considerations** to be
aware of:

*Small number of doses and model fitting:* If only a few
active doses are feasible to be used in a trial, it is difficult to fit
the more complex models, for example the sigmoid Emax or the beta model
with four parameters in a trial with three active doses. Such models
would not be included in the candidate set and one would rather use more
dose-response models with fewer parameters to obtain an adequate breadth
of the candidate set (such as the simple Emax, exponential or quadratic
model).

Some sigmoid Emax (or beta) model shapes cannot be approximated well
by these models. If one still would like to include for example a
sigmoid shape this can be achieved by fixing the Hill parameter to a
given value (for example 3 and/or 5), and then use different sigmoid
Emax candidate models with fixed Hill parameter also for model fitting.
Model fitting of these models can be performed with the standard Emax
model but utilizing \(doses^h\) instead
of \(doses\) as the dose variable,
where \(h\) is the assumed fixed Hill
parameter (note that the interpretation of ED50 parameter returned by
`fitMod`

then changes).

*Consequence of model misspecification:* Omission of the
“correct” dose-response shape from the set of candidate models might not
necessarily have severe consequences, if other models can pick up the
omitted shape. This can be evaluated for the MCP part (impact on power)
using explicit calculations (see Pinheiro et al.
(2006) and the vignette on sample size). For the Mod
part (impact on estimation precision for dose-response and dose
estimation) using simulations see `?planMod`

.

*Impact on sample size:* Using a very broad and flexible set
of candidate models does not come “for free”. Generally the critical
value for the MCP test will increase, if many different (uncorrelated)
candidate shapes are included, and consequently also the sample size.
The actual impact will have to be investigated on a case-by-case basis.
A similar trade-off exists in terms of dose-response model fitting (Mod
part), as a broader candidate set will decrease potential bias (in the
case of a mis-specified model) but increase the variance of the
estimates.

*Umbrella-shaped dose-response curve:* While biological
exposure-response relationships are often monotonic, down-turns of the
clinical dose-response relationship at higher doses have been observed.
For example if, due to tolerability issues, more patients will
discontinue treatment with higher doses of the drug. Depending on the
estimand strategy of handling this intercurrent event (e.g. for
treatment policy or composite) this might lead to a decrease in clinical
efficacy at higher doses. It is important to discuss the plausibility of
an umbrella-shaped dose-response stage at design stage and make a
decision on whether to include such a shape or not.

*Caution with linear models:* Based on simulation studies
utilizing the AIC, it has been observed that the linear model (as it has
fewest parameters) is often too strongly favored (with the BIC this
trend is even stronger), see also results in Schorning et al. (2016). The recommendation would be
to exclude the linear model usually from the candidate set. The Emax and
exponential model (and also the sigmoid Emax model) can approximate a
linear shape well in the limiting case.

In some cases the use of a placebo group is not possible due to ethical reasons (e.g., because good treatments exist already or the condition is very severe).

In such cases, the MCP part of MCP-Mod focuses on establishing a dose-response trend among the active doses, which would correspond to a very different question rather than a dose-response effect versus placebo, and may not necessarily be of interest.

The Mod step would be conducted to model the dose-response relationship among the active doses. Due to non-inclusion of a placebo group, this may be challenging to perform.

One aim of such a dose-finding trial could be to estimate the smallest dose of the new compound achieving the same treatment effect as the active control.

Most of the common dose-response models are nonlinear in the parameters. This means that iterative algorithms need to be used to calculate the parameter estimates. Given that the number of dose levels is usually relatively small and the noise relatively large in these studies, convergence often fails. This is usually due to the fact that the best fitting model shape corresponds to the case, where one of the model parameters is infinite or 0. When observing these cases more closely, one observes that while on the parameter scale no convergence is obtained, typically convergence towards a fixed model shape is obtained.

One approach to overcome this problem is to use bounds on the
nonlinear parameters for the model, which thus ensure existence of an
estimate. In many situations the assumed bounds can be justified in
terms of requiring that the shape-space underlying the corresponding
model is covered almost exhaustively (see the `defBnds`

function, for the proposed default bounds).

When utilizing bounds for model fitting, it bootstrapping/bagging can be used for estimation of the dose-response functions and for the confidence intervals, see Pinheiro et al. (2014). Standard asymptotic confidence intervals are not reliable.

The Mod step can be performed using either a single model selected from the initial candidate set or a weighted average of the candidate models. Model averaging has two main advantages

*Improved estimation performance:* Simulations in the
framework of dose-response analyses in Phase II have shown (over a range
of simulation scenarios) that model-averaging leads to a slightly better
performance in terms of dose-response estimation and dose-estimation
(see Schorning et
al. 2016).

*Improved coverage probability of confidence intervals:* Model
averaging techniques generally lead to a better performance in terms of
confidence interval coverage under model uncertainty (confidence
intervals are typically closer to their nominal level).

There are two main (non-Bayesian) ways of performing model averaging:

*Approximate Bayesian approach:* The models are weighted
according exp(-0.5*IC), where IC is an information criterion (e.g., AIC)
corresponding to the model under consideration. All subsequent
estimation for quantities of interest would then be based on a weighted
mean with the weights above. For numerical stability the minimum IC
across all models is typically subtracted from the IC for each model,
which does not change the model weights.

*Bagging:* One takes bootstrap samples, performs model
selection on each bootstrap re-sample (using, for example AIC) and then
uses the mean over all bootstrap predictions as the overall estimate
(see Breiman
1996). As the predictions typically come from different
models (for each bootstrap resample), this method can be considered to
be an “implicit” way of model averaging. Bagging has the advantage that
one automatically gets bootstrap confidence intervals for quantities of
interest (dose-response curve or target doses) from the performed
simulations.

Whether MCP-Mod is implemented using model selection or model averaging, a suitable model selection criterion needs to be specified. See Schorning et al. (2016) for a brief review of the mathematical background of different selection criteria. A simulation in this paper supports a recommendation to utilize the AIC criterion.

As in any other trial intercurrent events and handling strategies need to be identified, as well as missing data handling (see ICH E9(R1) guideline). In many situations (e.g. if multiple imputation is used as part of the analysis) it may be easiest to use generalized MCP-Mod, where the first stage model already accounts for intercurrent events and missing data. This model is then used to produce covariate adjusted estimates at the doses (as well as their covariance matrix), which are then utilized in generalized MCP-Mod.

Many of the dose-finding trials study not only multiple doses of one treatment regimen, but include more than one treatment regimen (e.g., once daily (od), twice daily (bid)). MCP-Mod is focused around assessing only one dose-response relationship, but can be extended to handle some of these cases, when one is willing to make additional assumptions.

Out of scope are situations, when the primary question of the trial is the regimen and not the dose, e.g., multiple regimen are employed but each with only one or two doses.

Out of scope are also situations when the different regimens differ substantially. For example in situations when some treatment groups include a loading dose others do not. In a naïve dose-response modelling approach the dosing regimen cannot be easily reduced to a single dose per patient and is inappropriate.

In scope are situations when the primary question focuses around the dose-response curve in the regimen. One possible assumption is to use a dose-response model on a common dose scale (e.g. daily dose) but then to assume that some of the parameters of the dose-response curves within the regimen are shared between regimen, while others are different (e.g. same or different E0, Emax, ED50 parameters between the regimen for an Emax dose-response model). See the vignette on this topic.

To be feasible this approach requires an adequate number of doses per regimen to be able to detect a dose-response signal in each regimen and to estimate the dose-response curve in each regimen. Whether or not simplifying assumptions of parameters shared between regimen are plausible depends on the specifics of every drug.

For practical reasons, the proposal is to perform the Mod step always with all specified models (even if all or only some of the dose-response models are not significant). The obtained dose-response estimate, however, needs to be interpreted very cautiously, when no overall dose-response trend has been established in the MCP step.

Using all models is advisible, because non-significance of a particular contrast may only have been due to a particular inadequate choice of guesstimates - nevertheless once the model parameters are estimated from the data in the Mod step, the model may fit the data adequately (if not it will be downweighted automatically by the AIC).

Bornkamp, B., Bretz, F., Dmitrienko, A., Enas, G., Gaydos, B., Hsu,
C.-H., König, F., Krams, M., Liu, Q., Neuenschwander, B., Parke, T.,
Pinheiro, J., Roy, A., Sax, R., and Shen, F. (2007), “Innovative
approaches for designing and analyzing adaptive dose-ranging
trials,” *Journal of Biopharmaceutical Statistics*, 17,
965–995. https://doi.org/10.1080/10543400701643848.

Breiman, L. (1996), “Baggin predictors,” *Machine
Learning*, 24, 123–140. https://doi.org/10.1007/bf00058655.

Bretz, F., Dette, H., and Pinheiro, J. (2010), “Practical
considerations for optimal designs in clinical dose finding
studies,” *Statistics in Medicine*, 29, 731–742. https://doi.org/10.1002/sim.3802.

Bretz, F., Pinheiro, J. C., and Branson, M. (2005), “Combining
multiple comparisons and modeling techniques in dose-response
studies,” *Biometrics*, Wiley Online Library, 61, 738–748.
https://doi.org/10.1111/j.1541-0420.2005.00344.x.

Pinheiro, J., Bornkamp, B., and Bretz, F. (2006), “Design and
analysis of dose finding studies combining multiple comparisons and
modeling procedures,” *Journal of Biopharmaceutical
Statistics*, 16, 639–656. https://doi.org/10.1080/10543400600860428.

Pinheiro, J., Bornkamp, B., Glimm, E., and Bretz, F. (2014),
“Model-based dose finding under model uncertainty using general
parametric models,” *Statistics in Medicine*, 33,
1646–1661. https://doi.org/10.1002/sim.6052.

Schorning, K., Bornkamp, B., Bretz, F., and Dette, H. (2016),
“Model selection versus model averaging in dose finding
studies,” *Statistics in Medicine*, 35, 4021–4040. https://doi.org/10.1002/sim.6991.

Thomas, N., and Roy, D. (2017), “Analysis of clinical
dose–response in small-molecule drug development: 2009–2014,”
*Statistics in Biopharmaceutical Research*, 9, 137–146. https://doi.org/10.1080/19466315.2016.1256229.

Thomas, N., Sweeney, K., and Somayaji, V. (2015), “Meta-analysis
of clinical dose response in a large drug development portfolio,”
*Statistics in Biopharmaceutical Research*, 6, 302–217. https://doi.org/10.1080/19466315.2014.924876.