In modern financial markets, volatility measures the degree of dispersion for assets and plays a crucial role in portfolio allocation, performance evaluation, and risk management. Low-frequency and high-frequency stock data are widely adopted to model the dynamic evolution of daily volatilities, while efforts made for volatility estimation and prediction in the past often employ these two types of data independently. Recent attempts to bridge the gap between these two include the realized GARCH model, the heterogeneous autoregressive (HAR) model, as well as the high-frequency based volatility (HEAVY) model. See Shephard and Sheppard (2010), Hansen, Huang, and Shek (2012), Corsi (2009) for more details.

In addition, Kim and Wang (2016) introduced the unified GARCH-Ito model by embedding the standard GARCH volatility structure in the instantaneous volatilities of an Ito diffusion process. The unified GARCH-Ito model is a continuous-time process at the high-frequency timescale and when restricted to the low-frequency timescale, retains the standard GARCH structure. Moreover, Song et al. (2020) introduced the realized GARCH-Ito model by embedding the realized GARCH model structure in the instantaneous volatilities of a jump-diffusion process. Comparing to the unified GARCH-Ito model, its conditional volatility has integrated volatility and jump variation as innovations, which are high-frequency data-based innovations that are more informative.

The **RealizedGARCHIto** package aims to provide methods for modeling the high-frequency data with unified GARCH-Ito model and realized GARCH-Ito model. It provides methods to estimate model parameters and allows one to estimate and predict conditional volatilities with the two proposed models. It also includes one sample data set that has low-frequency log returns (return) and realized measures such as realized volatility (RV), bi-power realized volatility (BPV) and jump variation (JV) computed and estimated using the CSI 300 index minute data from 2018-01-01 to 2020-06-30.

**Definition** The log price \(X_t\), \(t \in \mathbb{R}_+\) obeys the unified GARCH-Ito model if it satisfies \[\begin{equation*}
\begin{split}
dX_t=& \mu_t dt + \sigma_t dB_t \\
\sigma^2_t = & \sigma^2_{[t]} + (t-[t])\lbrace \omega + (\gamma-1) \sigma^2_{[t]} \rbrace + \beta \left(\int_{[t]}^t \sigma_s dB_s \right)^2
\end{split}
\end{equation*}\] where \(\mu_t\) is a drift, \([t]\) denotes the integer part of \(t\) and when \(t\) itself is an integer, \([t]=t-1\), \(B_t\) is a Brownian motion with respect to a filtration \(\mathcal{F}_t\), \(\sigma^2_t\) is a volatility process adapted to \(\mathcal{F}_t\), \(\theta=(\omega,\beta,\gamma)\) are the model parameters.

**Proposition** The conditional integrated volatility over the low-frequency period \(n\) retains the following iterative structure \[\begin{equation*}
E \left[ \int_{n-1}^n \sigma^2_t dt \middle| \mathcal{F}_{n-1} \right] = h_n(\theta) = \omega^g + \gamma h_{n-1}(\theta) + \beta^g Z^2_{n-1}
\end{equation*}\] where \(\tau(\theta)=(\omega^g, \beta^g, \gamma)\) are model parameters in the above low-frequency structure and are functions of the original \(\theta\), \(Z_{n-1}=X_{n-1}-X_{n-2}\) is the low-frequency log return. Moreover, \[\begin{equation*}
E[h_n(\theta)]=\frac{\omega^g}{1-\beta^g-\gamma}.
\end{equation*}\] The model parameters can be estimated by maximizing the following quasi-maximum likelihood function: \[\begin{equation*}
\hat{L}_U(\theta)=-\sum_{i=1}^n \left[ \log (h_i(\theta)) + \frac{RV_i}{h_i(\theta)} \right]
\end{equation*}\] where \(RV_i\)’s are the daily realized volatility estimates. Well-performing realized volatility estimators that can handle the market microstructure noise when frequency of the intraday data is ultra-high include the two-time scale realized volatility estimator, the multi-scale realized volatility estimator, the pre-averaging realized volatility estimator, and the kernel realized volatility estimator. See Ait-Sahalia and Yu (2009), Zhang (2006), Barndorff-Nielsen et al. (2008), Jacod et al. (2009), Christensen, Kinnebrock, and Podolskij (2010) for more details.

To maximize \(\hat{L}_U(\theta)\), we need to specify \(h_1(\theta)\) and we adopt its unconditional expectation such that \[\begin{equation*} h_1(\theta)=\frac{\omega^g}{1-\beta^g-\gamma}. \end{equation*}\] We estimate the model parameter \(\theta\) by maximizing \(\hat{L}_U(\theta)\) such that \[\begin{equation*} \hat{\theta}=\underset{\theta \in \Theta}{\text{argmax}} \hat{L}_U(\theta) \end{equation*}\] and estimate the model parameter \(\tau(\theta)\) by \(\tau(\hat{\theta})\).

`## Loading required package: GARCHIto`

```
##
## Iter: 1 fn: -5311.7549 Pars: 0.000002193 0.055580454 0.856415391
## Iter: 2 fn: -5311.7549 Pars: 0.000002193 0.055580454 0.856415391
## solnp--> Completed in 2 iterations
```

```
## omega_g beta_g gamma
## 2.192970e-06 5.558045e-02 8.564154e-01
```

**Definition** The log price \(X_t\), \(t \in \mathbb{R}_+\) obeys the realized GARCH-Ito model if it satisfies \[\begin{equation*}
\begin{split}
dX_t=& \mu dt + \sigma_t dB_t + L_t d \Lambda_t \\
\sigma^2_t = & \sigma^2_{[t]} + \gamma (t-[t])^2 \lbrace \omega_1 +\sigma^2_{[t]} \rbrace - (t-[t]) \lbrace \omega_2 +\sigma^2_{[t]} \rbrace \\\
& \, + \alpha \int_{[t]}^t \sigma^2_s ds + \beta \int_{[t]}^t L^2_s d \Lambda_s + \nu ([t]+1-t) Z^2_t
\end{split}
\end{equation*}\] where \(\mu_t\) is a drift, \([t]\) denotes the integer part of \(t\) and when \(t\) itself is an integer, \([t]=t-1\), \(Z_t=\int_{[t]}^t dW_t\), \(B_t\) and \(W_t\) are standard Brownian motions with respect to filtration \(\mathcal{F}_t\) with \(dW_t dB_t= \rho dt\), and \(\sigma^2_t\) is a volatility process adapted to \(\mathcal{F}_t\). For the jump part, \(\Lambda_t\) is the standard Poisson process with constant intensity \(\lambda\) and \(L_t\) denotes the i.i.d. jump sizes which are independent of the Poisson and continuous diffusion processes. The i.i.d. assumption on jump sizes can be further rewritten as \(L^2_t =\omega_L + M_t\) where \(M_t\)’s are i.i.d. random variables with mean zero and constant variance.

**Proposition** The conditional integrated volatility over the low-frequency period \(n\) retains the following iterative structure \[\begin{equation*}
E \left[ \int_{n-1}^n \sigma^2_t dt \middle| \mathcal{F}_{n-1} \right] = h_n(\theta) = \omega^g + \gamma h_{n-1}(\theta) + \alpha^g \int_{n-2}^{n-1} \sigma^2_s ds + \beta^g \int_{n-2}^{n-1} L^2_t d \Lambda_t
\end{equation*}\] where \(\tau(\theta)=(\omega^g, \alpha^g, \beta^g, \gamma)\) are model parameters in the above low-frequency structure and are functions of the original \(\theta\). Moreover, \[\begin{equation*}
E[h_n(\theta)]=\frac{\omega^g+\beta^g \lambda \omega_L}{1-\alpha^g-\gamma}.
\end{equation*}\] The model parameters can be estimated by maximizing the following quasi-maximum likelihood function: \[\begin{equation*}
\hat{L}_R(\theta)= - \sum_{i=1}^n \left[ \log (\hat{h}_i(\theta)) + \frac{RV_i}{\hat{h}_i(\theta)} \right]
\end{equation*}\] where \[\begin{equation*}
\hat{h}_1(\theta)=\frac{\omega^g+\beta^g \lambda \omega_L}{1-\alpha^g-\gamma}
\end{equation*}\] and \[\begin{equation*}
\hat{h}_i(\theta) = \omega^g + \gamma h_{i-1}(\theta) + \alpha^g RV_{i-1} + \beta^g JV_{i-1}, \quad i=2,\ldots,n,
\end{equation*}\] here \(RV_i\)’s are the daily realized volatility estimates and \(JV_i\)’s are the jump variation estimates. **Note**: in this pacakge, we approximate \(\lambda \omega_L\) in \(\hat{h}_1(\theta)\) by the median of all \(JV_i\)’s.

We estimate the model parameter \(\theta\) by maximizing \(\hat{L}_R(\theta)\) such that \[\begin{equation*} \hat{\theta}=\underset{\theta \in \Theta}{\text{argmax}} \hat{L}_R(\theta) \end{equation*}\] and estimate the model parameter \(\tau(\theta)\) by \(\tau(\hat{\theta})\).

```
##
## Iter: 1 fn: -5340.1082 Pars: 0.000004049 0.554107013 0.376603733
## Iter: 2 fn: -5340.1082 Pars: 0.000004049 0.554107013 0.376603733
## solnp--> Completed in 2 iterations
```

```
## omega_g alpha_g gamma
## 4.048833e-06 5.541070e-01 3.766037e-01
```

```
##
## Iter: 1 fn: -5340.1053 Pars: 0.000004144 0.557140538 0.000919110 0.371861977
## Iter: 2 fn: -5340.1053 Pars: 0.000004144 0.557140538 0.000919110 0.371861977
## solnp--> Completed in 2 iterations
```

```
## omega_g alpha_g beta_g gamma
## 4.143592e-06 5.571405e-01 9.191100e-04 3.718620e-01
```

```
plot(model_unified$sigma, cex=0.5, type="o", ylim=c(0,0.00035),
main="estimated conditional volatilities", ylab="", xlab="")
lines(model_realized_NJ$sigma,cex=0.5,type="o",col="blue",lty=2)
lines(model_realized$sigma,cex=0.5,type="o",col="red",lty=3, lwd=0.5)
legend("topleft", cex=0.8,
legend=c("Unified GARCH-Ito", "Realized GARCH-Ito No Jump","Realized GARCH-Ito with Jump"),
col = c("black", "blue", "red"),
lty=c(1,2,3))
```

The dynamic structure imposed in the unified GARCH-Ito and realized GARCH-Ito model allow us to predict future volatility by estimating the expected conditional integrated volatility, i.e. \(E[h_{n+1}(\theta) | \mathcal{F}_n ]\), with \(\hat{h}_{n+1}(\hat{\theta})\).

To obtain one-step-ahead estimated value of the conditional volatility, use $pred.

`## [1] 3.537236e-05 2.841266e-05 2.843882e-05`

To carry out rolling forecast with expanding window, we update the sample size for model construction at each rolling. To evaluate the model performance in volatility forecasting task, we compare \(\hat{h}_{n+1}(\hat{\theta})\) with \(RV_{n+1}\) and compute the squared prediction error \(\left( \hat{h}_{n+1}(\hat{\theta})-RV_{n+1} \right)^2\).

```
# conduct out of sample volatility forecasting and compute the mean squared prediction error
error_unified=NULL
error_realized_NJ=NULL
error_realized=NULL
for (i in 560:603){
sink("file")
model1=UnifiedEst(sample_data$BPV[1:i], sample_data$return[1:i])
error_unified=c(error_unified, (model1$pred-sample_data$BPV[i+1])^2)
model2=RealizedEst(sample_data$BPV[1:i])
error_realized_NJ=c(error_realized_NJ, (model2$pred-sample_data$BPV[i+1])^2)
model3=RealizedEst(sample_data$BPV[1:i], sample_data$JV[1:i])
error_realized=c(error_realized, (model3$pred-sample_data$BPV[i+1])^2)
sink()
}
error=c(mean(error_unified), mean(error_realized_NJ), mean(error_realized))
names(error)=c("Unified GARCH-Ito", "Realized GARCH-Ito No Jump", "Realized GARCH-Ito with Jump")
error
```

```
## Unified GARCH-Ito Realized GARCH-Ito No Jump
## 4.439021e-10 2.760417e-10
## Realized GARCH-Ito with Jump
## 2.752327e-10
```

Besides high- and low-frequency stock data, option data provide one more natural source for the more precise forecast of volatilities and have been investigated thoroughly since the seminal work of Black and Scholes (1973). Thus, Song et al. (2020) also discussed how to incorporate additional option data information in parameter estimation. Let \(NV_i\)’s be the estimated volatility values using option data, assume that \(NV_i\) and the conditional integrated volatility \(h_i(\theta)\) have the following linear relationship: \[\begin{equation} \label{eq:option} NV_{i}=b+a \, h_i(\theta)+e_i, \quad i=1,\ldots, n \end{equation}\] where \(b\) and \(a\) are the intercept and slope coefficients, respectively. Moreover, \(e_i\)’s are martingale differences with mean zero and variance \(\sigma^2_e\), and they are independent of the price process and the microstructure component.

Let \(\phi=(\omega^g, \alpha^g, \beta^g, \gamma, a,b,\sigma^2_e)\), the model parameters can be estimated by maximizing the following quasi-likelihood function, \[\begin{equation*} \hat{L}_O(\phi)=-\sum_{i=1}^n \left[ \log (\hat{h}_i(\theta) + \frac{RV_i}{\hat{h}_i(\theta)}) \right] - \sum_{i=1}^n \left[\log(\sigma^2_e) + \frac{(NV_{i}-b-a \hat{h}_i(\theta))^2}{\sigma^2_e} \right]. \end{equation*}\] \[\begin{equation*} \hat{\phi}=\underset{\phi \in \Phi}{\text{argmax}} \hat{L}_O(\phi). \end{equation*}\]

The homogeneous variance in the linear model can be generalized to heterogeneous variance such as replacing \(\sigma^2_e\) by \(\sigma^2_e h_i^\zeta (\theta)\), where \(\zeta >0\) is to adjust the level of heteroscedasticity with \(\zeta=0\) corresponding to the homogeneous case. One may replace \(\sigma^2_e\) by \(\sigma^2_e \hat{h}^\zeta_i(\theta)\) in the quasi-likelihood function \(\hat{L}_O(\phi)\) and then estimate \(\zeta\) jointly with the other parameters.

```
# without the consideration of price jumps
RealizedEst_Option(RV, NV) # homogeneous error
RealizedEst_Option(RV, NV, homogeneous=FALSE ) # heterogeneous error
# with the consideration of price jumps
RealizedEst_Option(RV, JV, NV) # homogeneous error
RealizedEst_Option(RV, JV, NV, homogeneous=FALSE) # heterogeneous error
```

Ait-Sahalia, Yacine, and Jialin Yu. 2009. “High Frequency Market Microstructure Noise Estimates and Liquidity Measures.” *Annals of Applied Statistics* 3 (1): 422–57.

Barndorff-Nielsen, Ole E, Peter Reinhard Hansen, Asger Lunde, and Neil Shephard. 2008. “Designing Realized Kernels to Measure the Ex Post Variation of Equity Prices in the Presence of Noise.” *Econometrica* 76 (6): 1481–1536.

Black, Fischer, and Myron Scholes. 1973. “The Pricing of Options and Corporate Liabilities.” *Journal of Political Economy* 81 (3): 637–54.

Christensen, Kim, Silja Kinnebrock, and Mark Podolskij. 2010. “Pre-Averaging Estimators of the Ex-Post Covariance Matrix in Noisy Diffusion Models with Non-Synchronous Data.” *Journal of Econometrics* 159 (1): 116–33.

Corsi, Fulvio. 2009. “A Simple Approximate Long-Memory Model of Realized Volatility.” *Journal of Financial Econometrics* 7 (2): 174–96.

Hansen, Peter Reinhard, Zhuo Huang, and Howard Howan Shek. 2012. “Realized GARCH: A Joint Model for Returns and Realized Measures of Volatility.” *Journal of Applied Econometrics* 27 (6): 877–906.

Jacod, Jean, Yingying Li, Per A Mykland, Mark Podolskij, and Mathias Vetter. 2009. “Microstructure Noise in the Continuous Case: The Pre-Averaging Approach.” *Stochastic Processes and Their Applications* 119 (7): 2249–76.

Kim, Donggyu, and Yazhen Wang. 2016. “Unified Discrete-Time and Continuous-Time Models and Statistical Inferences for Merged Low-Frequency and High-Frequency Financial Data.” *Journal of Econometrics* 194: 220–30.

Shephard, Neil, and Kevin Sheppard. 2010. “Realising the Future: Forecasting with High-Frequency-Based Volatility (Heavy) Models.” *Journal of Applied Econometrics* 25 (2): 197–231.

Song, Xinyu, Donggyu Kim, Huiling Yuan, Xiangyu Cui, Zhiping Lu, Yong Zhou, and Yazhen Wang. 2020. “Volatility Analysis with Realized GARCH-Itô Models.” *Journal of Econometrics* in press.

Zhang, Lan. 2006. “Efficient Estimation of Stochastic Volatility Using Noisy Observations: A Multi-Scale Approach.” *Bernoulli* 12 (6): 1019–43.