Constructing the likelihood

Before reading this section, statisticians unfamiliar with likelihood inference in the context of diffusion processes may find it worthwhile to read the section on likelihood inference for purely diffuse processes in the DiffusionRgqd package. However, the concepts outlined here follow naturally from those outlined in the vignette on generating transition densities.

Consider a process observed discretely at time epochs \(t_1,t_2,\ldots, t_n\) giving rise to the time series \(D_S = \{\mathbf{X}_{t_1},\mathbf{X}_{t_2},\ldots,\mathbf{X}_{t_n}\}\). In the case of jump free diffusions we operate under the assumptions that the time series is observed without error (or at least with a high degree of precision) and that the data generating process is jump free. Discarding the latter assumption and allowing the data generating process to contain jumps presents subtle yet significant complications for inference in practice. Whilst including a jump process in the dynamics of a model may serve to explain important features of the data at hand, it creates an element of inherent latency when the process of interest is observed at discrete points within a finite time horizon. This follows since the jump part is never observed directly: it is only visible through its effect on the observed trajectories of \(X_t\).

This latency manifests in various ways depending on the nature of the jump process and the resolution of the observed series. Collectively these difficulties amount to a lack of information rather than a methodological issue. For example, in practice an immeasurably weak jump signal is rarely a problem, since the motivation for modelling real world phenomena with a jump diffusion model usually stems from a priori knowledge of ‘jump’ behaviour in an observed series (Honore 1998). This does not, however, guarantee the quality of such a signal. Furthermore, strong evidence for the presence of jumps does not circumvent the need for a sufficiently long time series, since we have to distil both the rate at which jumps occur (long run dynamics) and the distributional qualities of the jumps (instantaneous dynamics) from the distorted signal, whatever the data resolution may be.

Given observations \(D_S\) and some jump diffusion model parametrized by the vector \(\boldsymbol\theta\), we can formulate the likelihood from the transitional density using the usual Markov arguments: \[ L(\boldsymbol\theta|D_S) \propto \prod_{i=1}^{n-1} f(\textbf{X}_{t_{i+1}}|\textbf{X}_{t_{i}},\boldsymbol\theta). \] Subsequently applying the excess factorization to each transition density (see the vignette on generating transition densities for more details on the factorization), we can write the likelihood as: \[ \begin{aligned} L(\boldsymbol\theta|D_S) \propto \prod_{i=1}^{n-1} \Big[&P(\textbf{N}_{t_{i+1}}-\textbf{N}_{t_{i}}=0)f_D(\textbf{X}_{t_{i+1}}|\textbf{X}_{t_{i}},\boldsymbol\theta)\\ &+ P(\textbf{N}_{t_{i+1}}-\textbf{N}_{t_{i}}>0)f_E(\textbf{X}_{t_{i+1}}|\textbf{X}_{t_{i}},\boldsymbol\theta)\Big].\\ \end{aligned} \] Using this formulation, the ability to perform inference on jump diffusions hinges on the data being of sufficiently high resolution and/or the jump signal being strong enough at the given resolution for the information about the jump process to manifest as the contrast between the diffuse and the excess distributions, \(f_D(.)\) and \(f_E(.)\) respectively. Fortunately, for the kinds of datasets to which jump models are applied, the presence of a jump signal is usually what motivates the use of a jump diffusion model in the first place. Thus, using the moment truncation methodology in conjunction with the excess factorization, it is possible to perform inference on a wide range of non-linear, time-inhomogeneous jump diffusion models.
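To make the structure of this likelihood concrete, the sketch below evaluates it for a toy model in which both \(f_D\) and \(f_E\) are available in closed form: a Brownian motion with drift plus compound Poisson jumps with Normal jump sizes. The toy model, the function name loglik_excess and the truncation point kmax are assumptions made purely for illustration; the package itself approximates the two densities via moment truncation rather than in this way. Each transition contributes the mixture of the diffuse density (no jump) and the excess density (at least one jump), and the log-likelihood is the sum of the logs of these mixtures.

```r
# Toy jump diffusion (an assumption made for illustration only):
#   dX_t = mu dt + sigma dB_t + z dN_t,  N_t ~ Poisson(lambda t),  z ~ N(a, b^2).
loglik_excess <- function(x, dt, mu, sigma, lambda, a, b, kmax = 30) {
  dx  <- diff(x)
  p0  <- exp(-lambda * dt)                       # P(no jump during a transition)
  f_D <- dnorm(dx, mu * dt, sigma * sqrt(dt))    # diffuse density
  k   <- 1:kmax
  w   <- dpois(k, lambda * dt) / (1 - p0)        # P(k jumps | at least one jump)
  f_E <- sapply(dx, function(d)                  # excess density (truncated at kmax)
    sum(w * dnorm(d, mu * dt + k * a, sqrt(sigma^2 * dt + k * b^2))))
  # Product over transitions of the two-part mixture, on the log scale:
  sum(log(p0 * f_D + (1 - p0) * f_E))
}

# Simulate a short series from the toy model and evaluate the log-likelihood.
set.seed(1)
n  <- 200; dt <- 0.1
nj <- rpois(n, 1 * dt)                           # number of jumps per transition
dx <- rnorm(n, 0.5 * dt, 0.2 * sqrt(dt)) + rnorm(n, nj * 0.5, sqrt(nj) * 0.3)
x  <- cumsum(c(0, dx))
ll <- loglik_excess(x, dt, mu = 0.5, sigma = 0.2, lambda = 1, a = 0.5, b = 0.3)
ll
```

When the jump sizes are small relative to the diffusive noise, or dt is so large that most transitions contain jumps, \(f_D\) and \(f_E\) become hard to tell apart and the likelihood carries little information about the jump parameters, which is the informational latency described above.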


Simulated diffusion processes

CIR process with state-dependent intensity and leptokurtic jump distribution

Consider a jump diffusion governed by the parametrised SDE:

\[ \begin{aligned} dX_t &= \theta_1(\theta_2-X_t)dt +\theta_3\sqrt{X_t}dB_t +dP_t\\ dP_t &= \dot{z}_t dN_t \end{aligned} \]

with intensity \(\lambda(X_t,t) = \theta_4 X_t\) and where \(\dot{z}_t \sim \mbox{Laplace}(\theta_5,\theta_6)\). For convenience we have included a simulated trajectory of this process under the parameter set \(\boldsymbol\theta = \{1,5,0.25,0.5,0.5,1\}\). Subsequently, we can run the experiment using the R code:

library(DiffusionRjgqd)
data(JSDEsim1)
attach(JSDEsim1)
plot(JSDEsim1$Xt~JSDEsim1$time,type='l',col='blue',xlab='Time (t)',ylab=expression(X[t]),main='Simulated trajectory')
#------------------------------------------------------------------------------
# Define parameterized coefficients of the process, and set up starting
# parameters.
# True model: dX_t = 1(5-X_t)dt+0.25sqrt{X_t}dW_t +dP_t
#             where dP_t = z_tdN_t describes a Poisson process with intensity:
#             lambda(X_t) = 0.5X_t
#             and
#             z_t ~ Laplace(0.5,1)
#------------------------------------------------------------------------------

# Define the model:
JGQD.remove()
G0 <-function(t){theta[1]*theta[2]}
G1 <-function(t){-theta[1]}
Q1 <-function(t){theta[3]*theta[3]}
Ja <-function(t){theta[5]}
Jb <-function(t){theta[6]}
Lam1 <-function(t){theta[4]}
priors <-function(theta){dunif(theta[4],0.01,10)*dunif(theta[6],0.01,10)}

# Define some starting parameters and run the MCMC:
updates <-50000
burns   <-10000
theta   <-c(rep(1,3),c(1,0,1))
sds     <-c(0.154,0.171,0.037,0.156,0.155,0.1)/1
model_1 <-JGQD.mcmc(JSDEsim1$Xt,JSDEsim1$time,mesh=20,theta=theta,sds=sds,
                    updates=updates,burns=burns,Jdist='Laplace')
 ================================================================
            Jump Generalized Quadratic Diffusion (JGQD)           
 ================================================================
 _____________________ Drift Coefficients _______________________
 G0 : theta[1]*theta[2]                                          
 G1 : -theta[1]                                                  
 G2                                                              
 ___________________ Diffusion Coefficients _____________________
 Q0                                                              
 Q1 : theta[3]*theta[3]                                          
 Q2                                                              
 _______________________ Jump Components ________________________
 Lam0                                                            
 Lam1 : theta[4]                                                 
 ........................... Jumps ..............................
 Laplace                                                         
 Ja : theta[5]                                                   
 Jb : theta[6]                                                   
 __________________ Distribution Approximant ____________________
 Density approx. : Saddlepoint                                   
 Trunc. Order    : 4                                             
 Dens.  Order    : 4                                             
=================================================================
                                                                       
 _______________________ Jump Components ________________________      
 Chain Updates       : 50000                                           
 Burned Updates      : 10000                                           
 Time Homogeneous    : Yes                                             
 Data Resolution     : Homogeneous: dt=0.2                             
 # Removed Transits. : None                                            
 Density approx.     : 4 Ord. Truncation +4th Ord. Saddlepoint Appr.   
 Elapsed time        : 00:04:15                                        
 ...   ...   ...   ...   ...   ...   ...   ...   ...   ...   ...       
 dim(theta)          : 6                                               
 DIC                 : 644.123                                         
 pd (eff. dim(theta)): 5.472                                           
 ----------------------------------------------------------------
JGQD.estimates(model_1,thin=200,burns)
         Estimate Lower_90 Upper_90
theta[1]     0.93     0.81     1.05
theta[2]     5.04     4.75     5.32
theta[3]     0.31     0.26     0.36
theta[4]     0.40     0.29     0.56
theta[5]     0.55     0.26     0.82
theta[6]     0.99     0.78     1.21
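
For readers who want to experiment with other parameter sets, a trajectory of this kind could be simulated along the following lines. This is a minimal Euler-type sketch and not necessarily the scheme used to generate JSDEsim1: the function names, the fine simulation step delta, the Bernoulli approximation to the jump arrivals, and the reading of Laplace(\(\theta_5,\theta_6\)) as a location-scale parametrisation are all assumptions.

```r
# Laplace(location a, scale b) draws: the difference of two independent
# Exp(1) variables is a standard Laplace variable.
rlaplace <- function(n, a, b) a + b * (rexp(n) - rexp(n))

# Euler-type simulation of
#   dX_t = theta1*(theta2 - X_t)dt + theta3*sqrt(X_t)dB_t + z dN_t,
# with intensity theta4*X_t and z ~ Laplace(theta5, theta6).
sim_cir_jump <- function(x0, Tmax, dt, theta, delta = dt / 100) {
  steps <- round(Tmax / delta)     # number of fine simulation steps
  keep  <- round(dt / delta)       # record every 'keep'-th step
  x <- numeric(steps + 1)
  x[1] <- x0
  for (i in 1:steps) {
    xi <- max(x[i], 0)                              # guard sqrt and intensity
    p  <- min(theta[4] * xi * delta, 1)             # approx. P(jump in this step)
    jump <- rbinom(1, 1, p) * rlaplace(1, theta[5], theta[6])
    x[i + 1] <- x[i] + theta[1] * (theta[2] - x[i]) * delta +
                theta[3] * sqrt(xi) * rnorm(1, sd = sqrt(delta)) + jump
  }
  idx <- seq(1, steps + 1, by = keep)
  data.frame(time = (idx - 1) * delta, Xt = x[idx])
}

set.seed(1)
sim <- sim_cir_jump(x0 = 5, Tmax = 50, dt = 0.2, theta = c(1, 5, 0.25, 0.5, 0.5, 1))
plot(Xt ~ time, data = sim, type = 'l', col = 'blue')
```

The resulting data frame has the same layout as the call to JGQD.mcmc() above expects (a vector of observations and a vector of observation times), so it can be substituted for JSDEsim1 in the experiment.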

A bivariate non-linear, coupled process with bivariate Normal jump distribution

Following along the lines of the previous example, we investigate a simulated jump diffusion governed by the parametrised SDE:

\[ \begin{aligned} dX_t &= 0.5(2+Y_t-X_t)dt+0.1\sqrt{X_tY_t}dB_t+dP_t^1\\ dY_t &= (5-Y_t)dt+0.1\sqrt{Y_t}dW_t+dP_t^2\\ \end{aligned} \] where

\[ \begin{aligned} dP_t^1&=\dot{z}_1dN_t^1\\ dP_t^2&=\dot{z}_2dN_t^1\\ \end{aligned} \]

with intensity \(\lambda(X_t,Y_t) = 1\) and where \[ \{\dot{z}_1,\dot{z}_2\}^\prime\sim\mbox{Bivariate Normal}(\{0.5,0.5\}^\prime,diag(\{0.5,0.5\})). \] Subsequently, we can run the experiment using the R code:

library(DiffusionRjgqd)
data(JSDEsim2)
attach(JSDEsim2)
data(JSDEsim3)
attach(JSDEsim3)
#------------------------------------------------------------------------------
# Define parameterized coefficients of the process, and set up starting
# parameters.
# True model: dX_t = 0.5(2+Y_t-X_t)dt+0.1sqrt{X_tY_t}dB_t +dP_t^1
#             dY_t = 1(5-Y_t)dt+0.1sqrt{Y_t}dW_t +dP_t^2
#             where dP_t^1 = z_1dN_t, dP_t^2 = z_2dN_t describes a Poisson
#             process with intensity:
#             lambda(X_t,Y_t) = 1
#             and
#             {z_1,z_2}' ~ Bivariate Normal({0.5,0.5}',diag({0.5,0.5}'))
#------------------------------------------------------------------------------
par(mfrow=c(1,1))

plot(JSDEsim2$Xt~JSDEsim2$time,type='l',col='#BBCCEE',ylim=c(-3,13),xlim=c(0,60),axes=F,main='Simulated Trajectory',xlab = 'Time',ylab ='X_t')
lines(JSDEsim2$Yt~JSDEsim2$time,type='l',col='#222299')

axis(1,at=seq(0,50,5))
axis(1,at=seq(0,50,5/5),tcl=-0.2,labels=NA)
axis(2,at=seq(-3,13,1))
axis(2,at=seq(-3,13,1/5),tcl=-0.2,labels=NA)

lines(Xjumps~Jtime,type='h',col='#BBCCEE')
lines(Yjumps~Jtime,type='h',col='#222299')
mx=mean(Xjumps)
sx=sd(Xjumps)
my=mean(Yjumps)
sy=sd(Yjumps)
segments(50,-5,50,9,lty='dotted')
xx=seq(-3,3,1/10)
yy=dnorm(xx,mx,sx)
yy = (yy-min(yy))/(max(yy)-min(yy))*9+51*1
lines(xx~yy,col='#BBCCEE')
yy=dnorm(xx,my,sy)
yy = (yy-min(yy))/(max(yy)-min(yy))*9+51*1
lines(xx~yy,col='#222299')

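# Label the empirical jump moments with the parameters they correspond to in
# the model below: theta[8], theta[9] are jump means; theta[10], theta[11] are
# jump standard deviations.
text(55,10.0,substitute(hat(theta)[8]==a,list(a=round(mx,2))),cex=0.8)
text(55,9.0,substitute(hat(theta)[9]==a,list(a=round(my,2))),cex=0.8)
text(55,8.0,substitute(hat(theta)[10]==a,list(a=round(sx,2))),cex=0.8)
text(55,7.0,substitute(hat(theta)[11]==a,list(a=round(sy,2))),cex=0.8)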
abline(h=0,lty='dotted')
X=cbind(JSDEsim2$Xt,JSDEsim2$Yt)

# Define the model:
JGQD.remove()
a00 <-function(t){theta[1]*theta[2]}
a10 <-function(t){-theta[1]}
a01 <-function(t){theta[1]}
c11 <-function(t){theta[3]*theta[3]}

b00 <-function(t){theta[4]*theta[5]}
b01 <-function(t){-theta[4]}
f01 <-function(t){theta[6]*theta[6]}

# Constant intensity
Lam00= function(t){theta[7]}

# Normal jumps:
Jmu1   <-function(t){theta[8]}
Jmu2   <-function(t){theta[9]}
Jsig11 <-function(t){theta[10]*theta[10]}
Jsig22 <-function(t){theta[11]*theta[11]}

# Some starting parameters:
theta  <-c(rep(1,11))
sds    <-c(0.08,0.22,0.01,0.04,0.16,0.01,0.10,0.07,0.09,0.05,0.09)/2
burns  <-10000
updates<-50000

res <-BiJGQD.mcmc(X,JSDEsim2$time,mesh=10,theta,sds,updates,burns=burns)
Compiling C++ code. Please wait.  
                                      
                                                                 
 ================================================================
                   GENERALIZED QUADRATIC DIFFUSON                
 ================================================================
 _____________________ Drift Coefficients _______________________
 a00 : theta[1]*theta[2]                                         
 a10 : -theta[1]                                                 
 a01 : theta[1]                                                  
 ...   ...   ...   ...   ...   ...   ...   ...   ...   ...   ... 
 b00 : theta[4]*theta[5]                                         
 b01 : -theta[4]                                                 
 ___________________ Diffusion Coefficients _____________________
 c11 : theta[3]*theta[3]                                         
 ...   ...   ...   ...   ...   ...   ...   ...   ...   ...   ... 
 ...   ...   ...   ...   ...   ...   ...   ...   ...   ...   ... 
 ...   ...   ...   ...   ...   ...   ...   ...   ...   ...   ... 
 f01 : theta[6]*theta[6]                                         
 _______________________ Jump Components ________________________
 ......................... Intensity ............................
 Lam00 : theta[7]                                                
 ........................... Jumps ..............................
 Jmu1 : theta[8]                                                 
 Jmu2 : theta[9]                                                 
 Jsig11 : theta[10]*theta[10]                                    
 Jsig22 : theta[11]*theta[11]                                    
 _____________________ Prior Distributions ______________________
                                                                 
 d(theta):None.
 ================================================================
                                                                  
 _______________________ Model/Chain Info _______________________
 Chain Updates       : 50000                                     
 Burned Updates      : 10000                                     
 Time Homogeneous    : Yes                                       
 Data Resolution     : Homogeneous: dt=0.1                       
 # Removed Transits. : None                                      
 Density approx.     : 4th Ord. Truncation, Bivariate-Saddlepoint
 Elapsed time        : 00:14:28                                  
 ...   ...   ...   ...   ...   ...   ...   ...   ...   ...   ... 
 dim(theta)          : 11                                        
 DIC                 : -855.021                                  
 pd (eff. dim(theta)): 10.704                                    
 ----------------------------------------------------------------
JGQD.estimates(res,thin=200)[,1:3]
          Estimate Lower_90 Upper_90
theta[1]      0.54     0.39     0.68
theta[2]      1.92     1.47     2.33
theta[3]      0.11     0.10     0.11
theta[4]      1.05     0.92     1.18
theta[5]      4.96     4.88     5.01
theta[6]      0.11     0.10     0.11
theta[7]      1.10     0.82     1.43
theta[8]      0.46     0.30     0.61
theta[9]      0.31     0.21     0.45
theta[10]     0.53     0.41     0.70
theta[11]     0.55     0.47     0.66
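
Since the data are simulated, the estimates can be checked against the values used in the simulation. The sketch below does this for the first nine parameters, whose simulated values can be read directly off the model stated above (drift, diffusion, intensity and jump means); the rounded interval endpoints are copied from the table. The data frame and its column names are illustrative only, and the two jump scale parameters are left out because the parametrisation of the jump covariance is not spelled out in the text.

```r
# Simulated values and rounded 90% interval endpoints for theta[1], ..., theta[9].
est <- data.frame(
  truth = c(0.5,  2,    0.1,  1,    5,    0.1,  1,    0.5,  0.5),
  lower = c(0.39, 1.47, 0.10, 0.92, 4.88, 0.10, 0.82, 0.30, 0.21),
  upper = c(0.68, 2.33, 0.11, 1.18, 5.01, 0.11, 1.43, 0.61, 0.45),
  row.names = paste0("theta[", 1:9, "]")
)
# Flag whether each interval contains the simulated value.
est$covered <- est$truth >= est$lower & est$truth <= est$upper
est
```

With these rounded values, theta[9] (the mean of the second jump component) is the only one of the nine parameters whose interval excludes the simulated value, which is in keeping with the earlier remark that jump characteristics are harder to recover than the diffusive dynamics.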

A jump diffusion model of Google equity volatility

An interesting application of jump diffusion models is the modelling of volatility time series. Indeed, much of the focus of financial mathematics goes into understanding and quantifying volatility processes and the risks associated with movements in financial phenomena. Consider for example the volatility of Google equity. The Chicago Board Options Exchange (CBOE) provides the Google equity volatility as a temporal measure of the volatility in Google’s stock price. For purposes of this exposition we source data at weekly resolution using the Quandl package:

 library(Quandl)
 library(DiffusionRjgqd)

 # Source data for the Google VIX.
 quandldata1 <- Quandl("CBOE/VXGOG", collapse="weekly",
 start_date="2013-01-01",end_date="2016-01-01", type="raw")
 Vt <- rev(quandldata1[,names(quandldata1)=='Close'])
 time1 <-rev(quandldata1[,names(quandldata1)=='Date'])

 plot(Vt~time1,type='l',col='#4683C1',main='Google Equity VIX (VXGOG)',
      xlab = 'Time',ylab ='Volatility %',lwd=1)

Now, consider our model for the volatility series:

\[ \begin{aligned} dX_t &= \theta_1(\theta_2+\theta_7\sin(8\pi t)+\theta_8\cos(8\pi t)-X_t)dt+\theta_3X_tdB_t +dP_t\\ dP_t &= \dot{z}_tdN_t\\ \end{aligned} \] where \(\lambda(X_t,t) = \theta_4\) and \(\dot{z}_t\sim \mbox{N}(\theta_5,\theta_6^2)\). In order to facilitate the estimation procedure we assume priors on the parameter space: \[ \begin{aligned} \theta_1 &\sim \mbox{U}(0,100)\\ \theta_3 &\sim \mbox{U}(0,100)\\ \theta_4 &\sim \mbox{U}(0,100)\\ \theta_6 &\sim \mbox{U}(0,100)\\ \theta_2 &\sim \mbox{N}(25,5^2).\\ \end{aligned} \] Using JGQD.mcmc() we can fit the jump diffusion process to the observed series and calculate parameter estimates:

 JGQD.remove()
 G0     = function(t){theta[1]*(theta[2]+theta[7]*sin(8*pi*t)+theta[8]*cos(8*pi*t))}
 G1     = function(t){-theta[1]}
 Q2     = function(t){theta[3]*theta[3]}
 Lam0   = function(t){theta[4]}
 Jmu    = function(t){theta[5]}
 Jsig   = function(t){theta[6]}
 priors = function(theta)
 {
    dunif(theta[1],0,100)*dunif(theta[3],0,100)*
    dunif(theta[4],0,100)*dunif(theta[6],0,100)*
    dnorm(theta[2],25,5)
 }

 X    <- Vt
 time <- cumsum(c(0,diff(as.Date(time1))*(1/365)))
 updates = 75000
 burns   = 10000
 theta   = c(6,50,3,0.05,0.1,0.1,10,10)
 sds     = c(5.30,2.2,0.09,4.96,1.58,1.63,3.17,1.57)/2
 model_1 = JGQD.mcmc(X,time,10,theta,sds,updates,burns)
 ================================================================  
            Jump Generalized Quadratic Diffusion (JGQD)            
 ================================================================  
 _____________________ Drift Coefficients _______________________  
 G0 : theta[1]*(theta[2]+theta[7]*sin(8*pi*t)+theta[8]*cos(8*pi*t))
 G1 : -theta[1]                                                    
 G2                                                                
 ___________________ Diffusion Coefficients _____________________  
 Q0                                                                
 Q1                                                                
 Q2 : theta[3]*theta[3]                                            
 _______________________ Jump Components ________________________  
 Lam0 : theta[4]                                                   
 Lam1                                                              
 ........................... Jumps ..............................  
 Normal                                                            
 Jmu : theta[5]                                                    
 Jsig : theta[6]                                                   
 __________________ Distribution Approximant ____________________  
 Density approx. : Saddlepoint                                     
 Trunc. Order    : 4                                               
 Dens.  Order    : 4     
 ================================================================
 _______________________ Jump Components ________________________      
 Chain Updates       : 75000                                           
 Burned Updates      : 10000                                           
 Time Homogeneous    : No                                              
 Data Resolution     : Homogeneous: dt=0.0192                          
 # Removed Transits. : None                                            
 Density approx.     : 4 Ord. Truncation +4th Ord. Saddlepoint Appr.   
 Elapsed time        : 00:13:31                                        
 ...   ...   ...   ...   ...   ...   ...   ...   ...   ...   ...       
 dim(theta)          : 8                                               
 DIC                 : 762.927                                         
 pd (eff. dim(theta)): 6.452                                           
 ---------------------------------------------------------------- 
 # Calculate parameter estimates:
 ests = JGQD.estimates(model_1, 250, burns)
 ests 
         Estimate Lower_90 Upper_90
theta[1]    11.43     4.04    20.39
theta[2]    25.16    23.05    28.63
theta[3]     0.64     0.52     0.77
theta[4]    11.13     5.42    19.65
theta[5]    -1.06    -3.74     1.18
theta[6]     6.40     4.39     9.31
theta[7]    -7.79   -15.41    -4.10
theta[8]     1.60    -1.72     4.53

 # Make a histogram of the mean jump probability per transition horizon:
 par(mfrow=c(1,1))
 hist(model_1$zero.jump[seq(burns,updates,1)],freq= FALSE,col='#F7F7F7',
   xlab='Probability', main='Mean Jump Probability per Transition')
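
As a rough cross-check on this histogram, note that under a constant intensity \(\lambda = \theta_4\) the number of jumps in a transition of length \(\Delta\) is Poisson with mean \(\theta_4\Delta\), so the probability of at least one jump is \(1-e^{-\theta_4\Delta}\). The sketch below plugs in the estimate and interval endpoints for theta[4] from the table above together with the data resolution dt=0.0192 reported in the model output; the variable names are illustrative and not part of the package.

```r
# Point estimate and 90% interval endpoints for the intensity theta[4].
lambda_hat <- c(est = 11.13, lower = 5.42, upper = 19.65)
dt <- 0.0192                        # weekly resolution, in years

p_none <- exp(-lambda_hat * dt)     # P(no jump during one transition)
p_jump <- 1 - p_none                # P(at least one jump during one transition)
round(rbind(p_none, p_jump), 3)
```

At the point estimate this gives a probability of roughly 0.19 of observing at least one jump in any given week. Because the transformation is monotone, applying it to the interval endpoints gives an approximate interval for the jump probability as well.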

Using the parameter estimates, we can visualise the fitted model by making use of its transitional density. In R this can be achieved using the JGQD.density() function:

 theta = ests[,1]
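 # Start the density at the first observation, X[1], which is the value
 # observed at time[1] (the series was put in chronological order above):
 res = JGQD.density(X[1],seq(10,50,1/10),time[1],time[1]+3, print.output = FALSE)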
 library(colorspace)
 colpal=function(n){rev(sequential_hcl(n,power=1,l=c(40,100)))}
 par(mfrow=c(1,1))
 filled.contour(res$time,res$Xt,t(res$density),color.palette =colpal,
 main = 'Fitted transitional density',
 plot.axes={
 axis(1);axis(2);
 lines(Vt~time,col='black')})

Now, compare this to a purely diffuse, time-inhomogeneous model of the equity volatility with the same drift and diffusion specification but no jump component:

# Purely diffuse process:
 library(DiffusionRgqd)
 GQD.remove()
 G0     = function(t){theta[1]*(theta[2]+theta[4]*sin(2*pi*t*4)+theta[5]*cos(2*pi*t*4))}
 G1     = function(t){-theta[1]}
 Q2     = function(t){theta[3]*theta[3]}
 priors = function(theta)
 {
    dunif(theta[1],0,100)*dunif(theta[3],0,100)*dnorm(theta[2],25,5)
 }
 theta   = c(3,20,1,0.1,0.1)
 sds     =  c(4.92,0.61,0.07,1.11,0.87)
 model_2 = GQD.mcmc(X,time,10,theta,sds,updates,burns, print.output = FALSE)
 #  Compare DIC values:
 JGQD.dic(list(model_1,model_2))
        Elapsed_Time Time_Homogeneous      p         DIC     pD   N
Model 1     00:13:31               No   8.00  [=] 762.39   5.23 157
Model 2     00:05:31               No   5.00      777.44   4.85 157

Based on the DIC statistics for these models (762.39 for the jump diffusion versus 777.44 for the purely diffuse model), the addition of the jump mechanism improves model fit.
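
For reference, the deviance information criterion is conventionally defined as below. This is the standard textbook definition; that the package computes its DIC and pD columns in exactly this way is an assumption here.

```latex
\[
\begin{aligned}
D(\boldsymbol\theta) &= -2\log L(\boldsymbol\theta|D_S),\\
p_D &= \bar{D} - D(\bar{\boldsymbol\theta}),\\
\mathrm{DIC} &= \bar{D} + p_D,
\end{aligned}
\]
```

Here \(\bar{D}\) denotes the posterior mean of the deviance and \(\bar{\boldsymbol\theta}\) the posterior mean of the parameters, so that \(p_D\) acts as a penalty for effective model dimension. Lower values are preferred; in the table above the difference is \(777.44-762.39=15.05\) in favour of Model 1, despite its three additional parameters.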


References

Eckner, A. (2009). “Computational techniques for basic affine models of portfolio credit risk.” Journal of Computational Finance, 13(1):63.

Honore, P. (1998). “Pitfalls in estimating jump-diffusion models.” Available at SSRN 61998.


Further reading

browseVignettes('DiffusionRjgqd')