---
title: "GSoC-2017_Additions"
author: "Vandit Jain"
date: "June 2017"
output:
  pdf_document: default
  html_document: default
bibliography: markovchainBiblio.bib
vignette: >
  %
  %
  %VignetteEncoding{UTF-8}
---
The package provides the ExpectedTime function to calculate the expected hitting time from one state to another. Let the target state be \(j\), let \(I\) be the set of all possible states, and assume that the exit rate \(q_{i} = -q_{ii}\) satisfies \(q_{i} > 0\) for every \(i \neq j\). Under this assumption, the expected hitting times form the minimal non-negative solution vector \(p\) of the system of linear equations[@NorrisBook]:
\[\begin{equation}
\begin{cases}
p_{k} = 0 & k = j \\
-\sum_{l \in I} q_{kl}p_{l} = 1 & k \neq j
\end{cases}
\label{eq:EHT}
\end{equation}\]
For example, consider the following continuous-time Markov chain:
library(markovchain)
states <- c("a","b","c","d")
byRow <- TRUE
gen <- matrix(data = c(-1, 1/2, 1/2, 0, 1/4, -1/2, 0, 1/4, 1/6, 0, -1/3, 1/6, 0, 0, 0, 0),
nrow = 4,byrow = byRow, dimnames = list(states,states))
ctmc <- new("ctmc",states = states, byrow = byRow, generator = gen, name = "testctmc")
The generator matrix of the ctmc is: \[ M = \left(\begin{matrix} -1 & 1/2 & 1/2 & 0\\ 1/4 & -1/2 & 0 & 1/4\\ 1/6 & 0 & -1/3 & 1/6\\ 0 & 0 & 0 & 0\\ \end{matrix}\right) \]
To calculate the expected time the process takes to hit state \(d\) when it starts from \(a\), we apply the ExpectedTime function. It takes four inputs: a ctmc class object, the initial state \(i\), the target state \(j\) for which the expected hitting time is required, and a logical parameter that selects the Rcpp implementation. By default the function uses Rcpp, as it is faster.
ExpectedTime(ctmc,1,4)
## [1] 7
We find that the expected time for the process to hit state \(d\), starting from \(a\), is 7 time units in this case.
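The linear system above can also be solved directly. The following is a minimal base R sketch, assuming the generator from this example; the names target, others and hittingTimes are illustrative only, and this is not necessarily how ExpectedTime is implemented internally.

```r
# Generator of the example chain (rows sum to zero, state d is absorbing)
states <- c("a", "b", "c", "d")
gen <- matrix(c(-1, 1/2, 1/2, 0,
                1/4, -1/2, 0, 1/4,
                1/6, 0, -1/3, 1/6,
                0, 0, 0, 0),
              nrow = 4, byrow = TRUE, dimnames = list(states, states))

target <- "d"                        # hypothetical name for the target state j
others <- setdiff(states, target)    # states k != j

# With p_j = 0, the equations for k != j reduce to
#   -Q[others, others] %*% p[others] = 1
p <- solve(-gen[others, others], rep(1, length(others)))
names(p) <- others

# Expected hitting times of the target from every state
hittingTimes <- c(p, setNames(0, target))
hittingTimes
```

The entry for state a can be compared with the value returned by ExpectedTime(ctmc,1,4) above.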
The package provides a function probabilityatT to calculate the probability of every state according to a given ctmc object. Kolmogorov's backward equation relates the transition matrix at any time \(t\) to the generator matrix[@dobrow2016introduction]:
\[ P'(t) = QP(t) \]
Here we use the solution of this differential equation, \(P(t) = P(0)e^{tQ}\) for \(t \geq 0\) with \(P(0) = I\). In this equation, \(P(t)\) is the transition function at time \(t\). The value \(P(t)[i][j]\) describes the conditional probability that the state at time \(t\) equals \(j\), given that it was equal to \(i\) at time \(t=0\). The function also handles the case where the ctmc object has a generator represented by columns. If the initial state is not provided, the function returns the whole transition matrix \(P(t)\).
The function is also implemented using Rcpp, which reduces the computation time and is used by default. Next, we consider two examples: one where the initial state is not given and one where it is.
In the first case, the function takes two inputs: an object of the S4 class ‘ctmc’ and the final time \(t\).
probabilityatT(ctmc,1)
## a b c d
## a 0.41546882 0.24714119 0.2703605 0.06702946
## b 0.12357060 0.63939068 0.0348290 0.20220972
## c 0.09012017 0.02321933 0.7411205 0.14553997
## d 0.00000000 0.00000000 0.0000000 1.00000000
Here we get an output in the form of a transition matrix.
In the second case we also provide an initial state:
probabilityatT(ctmc,1,1)
## [1] 0.41546882 0.24714119 0.27036052 0.06702946
In this case we get the probabilities corresponding to every state at time \(t=1\). This also includes the probability that the process is again in the initial state \(a\) at time \(t=1\).
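The relation \(P(t) = e^{tQ}\) can be checked without the package. The sketch below approximates the matrix exponential in base R with a truncated Taylor series combined with scaling and squaring. The helper matExp and its default arguments are hypothetical and chosen for illustration, and this is not claimed to be the method probabilityatT uses.

```r
# Generator of the example chain used above
states <- c("a", "b", "c", "d")
gen <- matrix(c(-1, 1/2, 1/2, 0,
                1/4, -1/2, 0, 1/4,
                1/6, 0, -1/3, 1/6,
                0, 0, 0, 0),
              nrow = 4, byrow = TRUE, dimnames = list(states, states))

# Hypothetical helper: exp(A) via Taylor series on A / 2^squarings,
# followed by repeated squaring of the result
matExp <- function(A, squarings = 10, terms = 20) {
  B <- unname(A) / 2^squarings
  E <- diag(nrow(A))
  term <- diag(nrow(A))
  for (k in seq_len(terms)) {
    term <- term %*% B / k   # B^k / k!
    E <- E + term
  }
  for (i in seq_len(squarings)) {
    E <- E %*% E
  }
  dimnames(E) <- dimnames(A)
  E
}

t1 <- 1
P1 <- matExp(t1 * gen)   # approximates P(1) = exp(1 * Q)
P1["a", ]                # distribution at time 1 when starting from a
rowSums(P1)              # every row of a transition matrix sums to 1
```

Row a of P1 can be compared with the output of probabilityatT(ctmc,1,1) above.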
The package provides a plot function for plotting a generator matrix \(Q\) in the form of a directed graph in which every possible state is assigned a node. The edges connecting these nodes are weighted: the weight of the edge going from state \(i\) to state \(j\) is equal to the value \(Q_{ij}\). This gives a picture of the generator matrix.
For example, we build a ctmc-class object to plot it.
energyStates <- c("sigma", "sigma_star")
byRow <- TRUE
gen <- matrix(data = c(-3, 3,
1, -1), nrow = 2,
byrow = byRow, dimnames = list(energyStates, energyStates))
molecularCTMC <- new("ctmc", states = energyStates,
byrow = byRow, generator = gen,
name = "Molecular Transition Model")
Now if we plot this object we get the following graph:
plot(molecularCTMC)
The figure shown is built using the \(igraph\) package. The package also provides options for plotting the graph using the \(diagram\) and \(DiagrammeR\) packages. A plot using one of these packages can be built with a command such as:
plot(molecularCTMC,package = "diagram")
Similarly, one can replace the \(diagram\) package with \(DiagrammeR\).
Continuous-time Markov chains are mathematical models used to describe the state evolution of dynamical systems under stochastic uncertainty. However, building models with continuous-time Markov chains relies on a number of assumptions that may not be realistic for the domain of application, in particular the ability to provide exact numerical parameter assessments, and the applicability of time-homogeneity and the eponymous Markov property. Hence we take imprecise continuous-time Markov chains (ICTMCs) into consideration.
More technically, an ICTMC is a set of “precise” continuous-time finite-state stochastic processes, and rather than computing expected values of functions, we seek to compute lower expectations, which are tight lower bounds on the expectations that correspond to such a set of “precise” models.
For any non-empty bounded set of rate matrices \(L\), and any non-empty set \(M\) of probability mass functions on \(X\), we define the following three sets of stochastic processes that are jointly consistent with \(L\) and \(M\):
From a practical point of view, after having specified a (precise) stochastic process, one is typically interested in the expected value of some function of interest, or the probability of some event. Similarly, in this work, our main objects of consideration will be the lower probabilities that correspond to the ICTMCs.
A map \(Q_{l}\) from \(L(X)\) to \(L(X)\) is called a lower transition rate operator if, for all \(f,g \in L(X)\), all \(\lambda \in R_{\geq 0}\), all \(\mu \in L(X)\), and all \(x \in X\)[@ictmcpaper]:
A map \(T_{l}\) from \(L (X )\) to \(L (X )\) is called a lower transition operator if, for all \(f,g \in L(X)\), all \(\lambda \in R_{\geq 0}\), all \(\mu \in L(X)\), and all \(x \in X\)[@ictmcpaper]:
impreciseProbabilityatT function
We now turn to the practical use of the ICTMC class. The ictmc class in this package represents a generator in which the row corresponding to each state of the process is governed by a separate parameter. As defined earlier, an imprecise continuous-time Markov chain is a set of precise CTMCs. This representation of a set of precise CTMCs can therefore be used to calculate transition probabilities at some future time. The function is analogous to probabilityatT, which calculates the transition function at some later time \(t\) using the generator matrix.
For every generator matrix, we have a corresponding transition function. Similarly, for every lower transition rate operator of an ICTMC, we have a corresponding lower transition operator, denoted by \(L_{t}^{s}\). Here \(t\) is the initial time and \(s\) is the final time.
Now we mention a proposition[@ictmcpaper] which states that: Let \(Q_{l}\) be a lower transition rate operator, choose any time \(t\) and \(s\) both greater than 0 such that \(t \leq s\), and let \(L_{t}^{s}\) be the lower transition operator corresponding to \(Q_{l}\). Then for any \(f \in L(X)\) and \(\epsilon \in R_{>0}\), if we choose any \(n \in N\) such that:
\[n \geq \max\left((s-t)\,||Q||,\ \frac{1}{2\epsilon}(s-t)^{2}||Q||^{2}||f||_v\right)\]
with \(||f||_{v} := \max f - \min f\), we are guaranteed that[@ictmcpaper]
\[ \left|\left| L_{t}^{s}f - \prod_{i=1}^{n}(I + \Delta Q_{l})f \right|\right| \leq \epsilon\]
with \(\Delta := \frac{s-t}{n}\).
Simply put, this tells us that if we can evaluate \(Q_{l}g\) for all \(g \in L(X)\), then we can approximate the quantity \(L_{t}^{s}f\) to arbitrary precision, for any given \(f \in L(X)\).
To explain this approximate calculation, we take a detailed example of a process with two states, healthy and sick, hence \(X = \{healthy, sick\}\). Each precise CTMC in the ICTMC has a generator of the form:
\[ Q = \left(\begin{matrix} -a & a \\ b & -b \end{matrix}\right) \] for some \(a,b \in R_{\geq 0}\). The parameter \(a\) here is the rate at which a healthy person becomes sick. Technically, this means that if a person is healthy at time \(t\), the probability that he or she will be sick at time \(t +\Delta\), for small \(\Delta\), is very close to \(\Delta a\). More intuitively, if we take the time unit to be one week, it means that he or she will, on average, become sick after \(\frac{1}{a}\) weeks. The parameter \(b\) is the rate at which a sick person becomes healthy again, and has a similar interpretation.
To completely specify the ICTMC, we take as an example the following set of generators:
\[ Q = \left\{ \left(\begin{matrix} -a & a \\ b & -b \end{matrix}\right) : a \in \left[\frac{1}{52},\frac{3}{52}\right], b \in \left[\frac{1}{2},2\right] \right\} \]
Now suppose the event of interest is that the patient is sick. This is represented by the indicator function: \[ I_{s} = \left(\begin{matrix} 0 \\ 1 \end{matrix}\right) \] We observe that \(||I_{s}|| = 1\). To use the proposition mentioned above, we first use the definition to calculate the lower transition rate operator \(Q_{l}\). Next we calculate the norm of the lower transition rate operator and use it in the proposition. We take the value of \(\epsilon\) to be 0.001.
Using the proposition we arrive at an algorithm for calculating the probability at any time \(s\), given the state at the initial time \(t\) and an ICTMC generator[@ictmcpaper].
The algorithm is as follows:
Input: A lower transition rate operator \(Q\), two time points \(t,s\) such that \(t \leq s\), a function \(f \in L(X )\) and a maximum numerical error \(\epsilon \in R_{>0}\).
Algorithm:
Output:
The vector \(g_{n}\) of conditional lower probabilities at time \(s\), accurate to within \(\epsilon\). Applying the algorithm to the above example, we get the following result:
\(g_{n} = 0.0083\) if the state at time \(t\) is \(healthy\) and \(g_{n} = 0.141\) if the state at time \(t\) is \(sick\). These are lower probabilities of being sick at time \(s\), calculated with an error of at most \(\epsilon\), i.e. \(0.001\).
Now we run the algorithm on the example through R code.
states <- c("n","y")
Q <- matrix(c(-1,1,1,-1),nrow = 2,byrow = TRUE,dimnames = list(states,states))
range <- matrix(c(1/52,3/52,1/2,2),nrow = 2,byrow = TRUE)
name <- "testictmc"
ictmc <- new("ictmc",states = states,Q = Q,range = range,name = name)
impreciseProbabilityatT(ictmc,2,0,1,10^-3,TRUE)
## [1] 0.008259774 0.140983476
The probabilities we obtain have an error of at most \(10^{-3}\).
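The iteration suggested by the proposition, \(g_{i} = g_{i-1} + \Delta Q_{l}g_{i-1}\) with \(g_{0} = f\), can be sketched in base R as follows. The helpers lowerRateOp and approxLowerExpectation are hypothetical names introduced here. Taking \(||Q||\) to be the largest absolute row sum at the upper parameter bounds is an assumption of this sketch, and the package implementation may differ in detail.

```r
states <- c("n", "y")
Q <- matrix(c(-1, 1, 1, -1), nrow = 2, byrow = TRUE,
            dimnames = list(states, states))
# Row x holds the lower and upper bound of the rate parameter of state x
rateRange <- matrix(c(1/52, 3/52, 1/2, 2), nrow = 2, byrow = TRUE)

# Lower transition rate operator: for each state, pick the rate in its
# interval that minimises rate * (Q %*% g)
lowerRateOp <- function(g, Q, rateRange) {
  v <- as.vector(Q %*% g)
  pmin(v * rateRange[, 1], v * rateRange[, 2])
}

approxLowerExpectation <- function(f, Q, rateRange, t, s, eps) {
  qNorm <- max(rowSums(abs(Q)) * rateRange[, 2])   # assumed norm of Q
  fVar <- max(f) - min(f)                          # ||f||_v
  n <- ceiling(max((s - t) * qNorm,
                   (s - t)^2 * qNorm^2 * fVar / (2 * eps)))
  delta <- (s - t) / n
  g <- f
  for (i in seq_len(n)) {
    g <- g + delta * lowerRateOp(g, Q, rateRange)  # g_i = (I + delta Q_l) g_{i-1}
  }
  g
}

indicatorSick <- c(0, 1)   # the function I_s from the example
g <- approxLowerExpectation(indicatorSick, Q, rateRange,
                            t = 0, s = 1, eps = 1e-3)
g
```

The resulting vector can be compared with the output of impreciseProbabilityatT above. With these inputs the sketch performs roughly 8000 update steps.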
The package provides the freq2Generator function. It takes a matrix of relative frequency values, along with the time taken, and returns a continuous-time Markov chain generator matrix. The function also allows the user to choose among three methods for calculating the generator matrix.
Three methods are as follows:
Here is an example matrix on which the freq2Generator function is run:
sample <- matrix(c(150,2,1,1,1,200,2,1,2,1,175,1,1,1,1,150),nrow = 4,byrow = TRUE)
sample_rel <- rbind((sample/rowSums(sample))[1:(nrow(sample)-1),],c(rep(0,nrow(sample)-1),1))
freq2Generator(sample_rel,1)
## [,1] [,2] [,3] [,4]
## [1,] -0.024212164 0.01544797 0.008764198 0
## [2,] 0.006594821 -0.01822834 0.011633520 0
## [3,] 0.013302567 0.00749703 -0.020799597 0
## [4,] 0.000000000 0.00000000 0.000000000 0
Consider two sets of states \(A\) and \(B\) of a Markov chain with transition matrix \(P\). The committor vector of the chain with respect to the sets \(A\) and \(B\) gives, for each initial state, the probability that the process will hit a state from set \(A\) before any state from set \(B\).
The committor vector \(u\) can be calculated by solving the following system of linear equations[@committorlink]:
\[ \begin{array}{l} Lu(x) = 0, x \notin A \cup B \\ u(x) = 1, x \in A \\ u(x) = 0, x \in B \end{array} \] where \(L = P -I\).
Now we apply the method to an example:
transMatr <- matrix(c(0,0,0,1,0.5,0.5,0,0,0,0,0.5,0,0,0,0,0,0.2,0.4,0,0,0,0.8,0.6,0,0.5),nrow = 5)
object <- new("markovchain", states=c("a","b","c","d","e"),transitionMatrix=transMatr, name="simpleMc")
committorAB(object,c(5),c(3))
## a b c d e
## 0.4444444 0.8888889 0.0000000 0.4444444 1.0000000
Here we get the probability that the process will hit state “e” before state “c”, for each initial state.
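The committor equations can also be solved by hand with base R. The sketch below assumes the transition matrix of this example with \(A = \{e\}\) and \(B = \{c\}\). The names setA, setB and free are illustrative, and the code is not claimed to mirror the internals of committorAB.

```r
states <- c("a", "b", "c", "d", "e")
P <- matrix(c(0, 0, 0, 1, 0.5,
              0.5, 0, 0, 0, 0,
              0.5, 0, 0, 0, 0,
              0, 0.2, 0.4, 0, 0,
              0, 0.8, 0.6, 0, 0.5),
            nrow = 5, dimnames = list(states, states))  # filled column by column

setA <- "e"
setB <- "c"
free <- setdiff(states, c(setA, setB))   # states outside A and B

L <- P - diag(5)

# Boundary conditions: u = 1 on A, u = 0 on B
u <- setNames(numeric(5), states)
u[setA] <- 1

# For x outside A and B: sum_y L[x, y] u(y) = 0, which gives
#   L[free, free] %*% u[free] = -L[free, A] %*% 1
rhs <- -rowSums(L[free, setA, drop = FALSE])
u[free] <- solve(L[free, free], rhs)
u
```

The entries of u can be compared with the output of committorAB(object,c(5),c(3)) above.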
Computation of the first passage time for individual states is already implemented in the package. The firstPassageMultiple function provides a method to get first passage probabilities for a given set of states.
Consider this example markovchain object:
statesNames <- c("a", "b", "c")
testmarkov <- new("markovchain", states = statesNames, transitionMatrix =
matrix(c(0.2, 0.5, 0.3,
0.5, 0.1, 0.4,
0.1, 0.8, 0.1), nrow = 3, byrow = TRUE,
dimnames = list(statesNames, statesNames)
))
Now we apply the firstPassageMultiple function to calculate first passage probabilities for the set of states \("b", "c"\) when the initial state is \("a"\).
firstPassageMultiple(testmarkov,"a",c("b","c"),4)
## set
## 1 0.8000
## 2 0.6000
## 3 0.2540
## 4 0.1394
This shows the first passage probability for the set of states after \(n\) steps. For instance, as shown, the probability of the process hitting any of the states among \("b", "c"\) after \(2\) steps is \(0.6000\).
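As a cross-check, the values in the output above agree with summing, step by step, the first-passage probabilities of each state in the set taken individually. The sketch below computes these in base R. The helper firstPassageTo is hypothetical, and whether firstPassageMultiple works exactly this way internally is an assumption.

```r
statesNames <- c("a", "b", "c")
P <- matrix(c(0.2, 0.5, 0.3,
              0.5, 0.1, 0.4,
              0.1, 0.8, 0.1), nrow = 3, byrow = TRUE,
            dimnames = list(statesNames, statesNames))

# Probability of reaching `target` for the first time at step k = 1..n,
# starting from `start` (start must differ from target)
firstPassageTo <- function(P, start, target, n) {
  keep <- setdiff(rownames(P), target)
  dist <- setNames(numeric(length(keep)), keep)  # mass that has not hit target yet
  dist[start] <- 1
  out <- numeric(n)
  for (k in seq_len(n)) {
    out[k] <- sum(dist * P[keep, target])        # first arrival at step k
    dist <- setNames(as.vector(dist %*% P[keep, keep]), keep)
  }
  out
}

# Sum of the individual first-passage probabilities of b and c from a
setProbs <- firstPassageTo(P, "a", "b", 4) + firstPassageTo(P, "a", "c", 4)
setProbs
```

The four entries of setProbs can be compared with the set column printed above.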