In this tutorial, we show how to use `ocf` to estimate the
conditional choice probabilities and the covariates' marginal effects,
and conduct inference about these statistical targets. For illustration
purposes, we use the synthetic data set provided in the `orf` package:

```
## Load packages and set the seed for reproducibility.
set.seed(1986)
library(orf)
library(ocf)

## Load data from orf package.
data(odata)

y <- as.numeric(odata[, 1])
X <- as.matrix(odata[, -1])
```

The `ocf` function constructs a collection of forests, one
for each category of `y` (three in this case). We can then
use the forests to predict out-of-sample conditional probabilities using
the `predict` method. By default, `predict` returns a matrix with the
predicted probabilities and a vector of predicted class labels (each
observation is assigned to its highest-probability class).

```
## Training-test split.
train_idx <- sample(seq_len(length(y)), floor(length(y) * 0.5))

y_tr <- y[train_idx]
X_tr <- X[train_idx, ]

y_test <- y[-train_idx]
X_test <- X[-train_idx, ]

## Fit ocf on training sample. Use default settings.
forests <- ocf(y_tr, X_tr)

## Summary of data and tuning parameters.
summary(forests)

## Out-of-sample predictions.
predictions <- predict(forests, X_test)

head(predictions$probabilities)
table(y_test, predictions$classification)
```
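
To make the link between the two outputs of `predict` concrete, here is a minimal base-R sketch of how class labels can be derived from a probability matrix. The matrix, the true outcomes, and the tie-breaking rule are made up for illustration; how `ocf` itself handles ties is not something this sketch claims to reproduce.

```r
## Toy probability matrix: 4 observations, 3 ordered classes (made-up numbers).
probs <- matrix(c(0.7, 0.2, 0.1,
                  0.1, 0.6, 0.3,
                  0.2, 0.3, 0.5,
                  0.4, 0.4, 0.2),
                ncol = 3, byrow = TRUE)

## Each row is a probability distribution, so it should sum to one.
stopifnot(all(abs(rowSums(probs) - 1) < 1e-12))

## Assign each observation to its highest-probability class.
## Ties (row 4) go to the lowest class here; this rule is our assumption.
pred_class <- max.col(probs, ties.method = "first")

## Confusion matrix and accuracy against made-up true outcomes.
y_true <- c(1, 2, 3, 2)
confusion <- table(y_true, pred_class)
accuracy <- mean(pred_class == y_true)
```

The same `table` call is what produces the confusion matrix in the block above, with `y_test` and `predictions$classification` in place of the toy vectors.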

We can also implement honesty, which is a necessary condition to
produce consistent and asymptotically normal predictions. Honest forests
use different observations to place the splits and to compute the leaf
predictions. In the following, we set `honesty = TRUE` to construct
honest forests.

```
## Honest forests.
honest_forests <- ocf(y_tr, X_tr, honesty = TRUE)
honest_predictions <- predict(honest_forests, X_test)

## Compare predictions with adaptive fit.
cbind(head(predictions$probabilities), head(honest_predictions$probabilities))
```
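
The idea behind honesty can be illustrated with a single split in base R. The sketch below is not the `ocf` implementation: the simulated data, the candidate grid, and the squared-error criterion are all assumptions chosen to keep the example short. It only shows the principle that one half of the sample decides where to split and the other half estimates the leaf means.

```r
set.seed(1)

## Toy data: one covariate and a binary indicator whose mean jumps at x = 0.5.
n <- 400
x <- runif(n)
d <- rbinom(n, size = 1, prob = ifelse(x > 0.5, 0.8, 0.2))

## Split the sample into two disjoint halves.
split_idx <- sample(seq_len(n), n / 2)
x_split <- x[split_idx];  d_split <- d[split_idx]   # used to place the split
x_est   <- x[-split_idx]; d_est   <- d[-split_idx]  # used to fill the leaves

## Step 1: choose the threshold on the splitting half only, by minimising
## the within-leaf sum of squared errors over a grid of candidates.
sse <- function(v) sum((v - mean(v))^2)
candidates <- seq(0.1, 0.9, by = 0.05)
loss <- sapply(candidates, function(thr) {
  sse(d_split[x_split <= thr]) + sse(d_split[x_split > thr])
})
threshold <- candidates[which.min(loss)]

## Step 2: estimate the leaf means on the other half only (honest estimates).
honest_left  <- mean(d_est[x_est <= threshold])
honest_right <- mean(d_est[x_est > threshold])

## Adaptive counterpart: reuse the splitting half for the leaf means.
adaptive_left  <- mean(d_split[x_split <= threshold])
adaptive_right <- mean(d_split[x_split > threshold])
```

Because the honest leaf means come from observations that played no role in choosing the threshold, they are not tuned to the noise that determined the split. The price is that each step uses only part of the sample.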

To estimate standard errors for the predicted probabilities, we set
`inference = TRUE`. This also requires setting `honesty = TRUE`: the
formula for the variance is valid only for honest predictions.
Estimating standard errors slows down the routine considerably. However,
we can increase the number of threads used to construct the forests to
speed it up.

```
## Compute standard errors.
honest_forests <- ocf(y_tr, X_tr, honesty = TRUE, inference = TRUE, n.threads = 0) # Use all CPUs.

head(honest_forests$predictions$standard.errors)
```
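
Once standard errors are available, a common next step is to build confidence intervals. The sketch below uses made-up estimates and standard errors and relies on a normal approximation, which is an assumption motivated by the asymptotic normality of honest predictions. In practice, the standard errors would come from `honest_forests$predictions$standard.errors` and the estimates from the matching predicted probabilities.

```r
## Made-up predicted probabilities and standard errors for one class.
p_hat <- c(0.25, 0.40, 0.10)
se    <- c(0.03, 0.05, 0.04)

## Two-sided 95% interval under a normal approximation.
z <- qnorm(0.975)
ci_lower <- p_hat - z * se
ci_upper <- p_hat + z * se

## Clip to [0, 1], since the targets are probabilities (optional choice).
ci <- cbind(estimate = p_hat,
            lower = pmax(ci_lower, 0),
            upper = pmin(ci_upper, 1))
ci
```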

The `marginal_effects` function post-processes the
predictions to estimate mean marginal effects, marginal effects at the
mean, or marginal effects at the median, according to the `eval`
argument. In the following, we construct our forests
in the training sample and use them to estimate the marginal effects at
the mean in the test sample.

```
## Fit ocf on training sample.
forests <- ocf(y_tr, X_tr)

## Marginal effects at the mean on test sample.
me_atmean <- marginal_effects(forests, data = X_test, eval = "atmean")
summary(me_atmean)
```
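
The three evaluation points differ only in where the effect is computed and how it is aggregated. The base-R sketch below illustrates this with a toy logistic probability curve and a central finite difference. Both are assumptions made for illustration and do not describe how `marginal_effects` works internally; the function `fd_effect` and the step size `h` are hypothetical.

```r
## Toy model (our assumption): P(Y = 1 | x) follows a logistic curve in x.
prob <- function(x) plogis(1.5 * x)

## Central finite-difference approximation of the derivative of prob at x.
fd_effect <- function(x, h = 1e-4) {
  (prob(x + h) - prob(x - h)) / (2 * h)
}

set.seed(1)
x <- rnorm(500, mean = 1)

effect_atmean   <- fd_effect(mean(x))    # effect at the mean of x
effect_atmedian <- fd_effect(median(x))  # effect at the median of x
effect_mean     <- mean(fd_effect(x))    # mean of the individual effects

c(atmean = effect_atmean, atmedian = effect_atmedian, mean = effect_mean)
```

Because the probability curve is nonlinear, the effect at the mean and the mean effect generally differ, which is why the choice of evaluation point matters when reporting results.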

As before, we can set `inference = TRUE` to estimate the
standard errors. Again, this requires the use of honest forests and
considerably slows down the routine.

```
## Honest forests.
honest_forests <- ocf(y_tr, X_tr, honesty = TRUE) # Notice we do not need inference here!

## Compute standard errors.
honest_me_atmean <- marginal_effects(honest_forests, data = X_test, eval = "atmean", inference = TRUE)

## LaTeX output.
print(honest_me_atmean, latex = TRUE)
```