Random processes are typically collections of dependent random variables, but allowing arbitrary associations between values at different points in time makes analysis intractable. Markov processes are stochastic processes which obey a specific kind of "next step" dependence structure. For a Markov process, roughly speaking, given the present, the future is conditionally independent of the past. The dependence assumption in Markov chains allows for a rich theory and tractable probabilistic models that have applications in a wide range of situations.
Be sure to import Symbulate using the following commands.
from symbulate import *
%matplotlib inline
A discrete time Markov chain is a discrete time, discrete state random process which satisfies for all $n$:
Given $X_n$ ("the present"), $(X_{n+1}, X_{n+2}, \ldots)$ ("the future") is conditionally independent of $(X_{n-1}, X_{n-2}, \ldots, X_0)$ ("the past").
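One common way to state the Markov property formally, in terms of the conditional probabilities used throughout this page, is
$$P(X_{n+1} = j \mid X_n = i, X_{n-1} = i_{n-1}, \ldots, X_0 = i_0) = P(X_{n+1} = j \mid X_n = i).$$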
In Symbulate a discrete time Markov chain is defined with MarkovChain. The probabilistic behavior of a discrete time Markov chain is fully specified by the following, which are the parameters of MarkovChain.
state_labels: The state space of possible values of the process. (Default is to label the states 0, 1, 2, ...)
initial_dist: The initial distribution, which specifies the probability distribution at time 0.
transition_matrix: The (one-step) transition probability matrix, whose $(i, j)$ entry specifies the probability that the chain is in state $j$ at the next time step given that it is currently in state $i$: $P(X_{n+1} = j \mid X_n = i)$. All row sums must be 1.
Example. The weather in a certain city can be classified as either cloudy, rainy, or sunny and follows a discrete time Markov chain.
Suppose that it is sunny on Sunday. (So we'll call Sunday $n=0$.)
states = ["cloud", "rain", "sun"]
TransitionMatrix = [[0.3, 0.2, 0.5],
                    [0.5, 0.3, 0.2],
                    [0.3, 0.0, 0.7]]
InitialDistribution = [0, 0, 1] # sunny on Sunday
X = MarkovChain(TransitionMatrix, InitialDistribution, states)
Find the probability that it is rainy on Friday ($n=5$).
X[5].sim(10000).tabulate(normalize = True)
Find the conditional probability that it is rainy on Friday given that it is rainy on Thursday. (The following should return, approximately, the second row of the transition matrix.)
(X[5] | (X[4] == "rain")).sim(10000).tabulate(normalize = True)
Find the conditional probability that it is rainy on Friday given that it is rainy on Thursday and cloudy on Wednesday. (This demonstrates the Markov property: conditioning additionally on the value of $X_3$ does not change the conditional distribution from the previous part.)
(X[5] | ((X[4] == "rain") & (X[3] == "cloud"))).sim(10000).tabulate(normalize = True)
Find the probability that it is rainy on Friday and Saturday.
(X[5] & X[6]).sim(10000).tabulate(normalize = True)
The state space can be any list of values (like ['cloud', 'rain', 'sun']). If state_labels are not specified, the default is to label the states 0, 1, 2, ... When the states are numerical values, plots can be created, and methods like .mean() and .sd() can be applied (see the sketch after the plots below).
TransitionMatrix = [[0.3, 0.2, 0.5],
                    [0.5, 0.3, 0.2],
                    [0.3, 0.0, 0.7]]
InitialDistribution = [0, 0, 1] # sunny on Sunday
X = MarkovChain(TransitionMatrix, InitialDistribution)
X.sim(1).plot(alpha = 1)
X.sim(10).plot()
X[5].sim(10000).plot()
(X[5] | (X[4] == 1) ).sim(10000).plot()
(X[4] & X[5]).sim(10000).plot(jitter = True)
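Since the default state labels 0, 1, 2 are numerical, summary methods such as .mean() and .sd() (both mentioned above) can be applied to the simulated values. A minimal sketch for the state at time 5:
sims = X[5].sim(10000)
sims.mean(), sims.sd()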
A continuous time Markov chain is a continuous time, discrete state random process which satisfies for all $t$:
Given $X_t$ ("the present"), $(X_{u},u \ge t)$ ("the future") is conditionally independent of $(X_{s}, s \le t )$ ("the past").
In a discrete time Markov chain, state transitions occur at every point in time, $n = 0, 1, 2, \ldots$. A continuous time Markov chain behaves like a discrete time Markov chain in which the amount of time spent in each state before transitioning is exponentially distributed, with a rate that depends on the current state.
In Symbulate a continuous time Markov chain is defined with ContinuousTimeMarkovChain. The probabilistic behavior of a continuous time Markov chain is fully specified by the following, which are the parameters of ContinuousTimeMarkovChain.
state_labels: The state space of possible values of the process. (Default is to label the states 0, 1, 2, ...)
initial_dist: The initial distribution, which specifies the probability distribution at time 0.
generator_matrix: The generator matrix or transition rate matrix, $Q$, whose $(i, j)$ entry, for $j \neq i$, specifies the rate at which the chain "attempts to transition" to state $j$ given that it is currently in state $i$. Each diagonal entry $q(i, i)$ is the negative of the total transition rate out of state $i$, so all row sums are 0.
Example. The weather in a certain city can be classified as either cloudy, rainy, or sunny and follows a continuous time Markov chain.
Suppose that it is currently sunny.
states = ["cloud", "rain", "sun"]
Q = [[-0.50, 0.15, 0.35],
     [ 0.60, -1, 0.40],
     [ 1/3, 0.0, -1/3]]
InitialDistribution = [0, 0, 1] # sunny currently
X = ContinuousTimeMarkovChain(Q, InitialDistribution, states)
If it is currently sunny, find the probability that it is raining 36 hours from now. (Time is measured in days, so 36 hours corresponds to time 1.5.)
X[1.5].sim(10000).tabulate(normalize = True)
Given that it is raining 36 hours from now (time 1.5), find the probability that it is sunny 48 hours from now (time 2).
(X[2] | (X[1.5] == "rain")).sim(10000).tabulate(normalize = True)
As for discrete time Markov chains, the state space for a continuous time Markov chain can be any list of values (like ['cloud', 'rain', 'sun']). If state_labels are not specified, the default is to label the states 0, 1, 2, ... When the states are numerical values, plots can be created, and methods like .mean() and .sd() can be applied (see the sketch after the plot below).
Q = [[-0.50, 0.15, 0.35],
     [ 0.60, -1, 0.40],
     [ 1/3, 0.0, -1/3]]
InitialDistribution = [0, 0, 1] # sunny currently
X = ContinuousTimeMarkovChain(Q, InitialDistribution)
X.sim(1).plot(alpha = 1)
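As in the discrete time case, numerical state labels allow summary methods like .mean() and .sd() (mentioned above) to be applied to simulated values of the process. A minimal sketch for the value of the chain at time 1.5:
sims = X[1.5].sim(10000)
sims.plot()
sims.mean(), sims.sd()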
Events occur over time according to a Poisson process $N(t), t\ge0$, with rate $\lambda$ if at most one event occurs at a time, the times between events are independent and exponentially distributed with rate $\lambda$, and $N(t)$ counts the number of events that occur in the time interval $[0,t]$ (with $N(0) = 0$). In other words, a Poisson process is a continuous time Markov chain whose rate matrix $Q$ satisfies $$ q(i, j) = \begin{cases} \lambda, & j = i+1,\\ -\lambda, & j = i,\\ 0, & \text{otherwise.} \end{cases} $$
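To connect this rate matrix description with the ContinuousTimeMarkovChain syntax above, the following sketch builds a truncated version of $Q$ for $\lambda = 2$. (The truncation at 20 states is an assumption made only so that the matrix is finite; it has negligible effect here, since the value at time 3 should be approximately Poisson with mean $\lambda t = 6$, which rarely exceeds 20.)
lam = 2
n_states = 21   # truncate the infinite state space at 20 for illustration
Q_poisson = [[0 for j in range(n_states)] for i in range(n_states)]
for i in range(n_states - 1):
    Q_poisson[i][i] = -lam       # total rate of leaving state i
    Q_poisson[i][i + 1] = lam    # rate of jumping from i to i + 1
# the last row is all zeros, so the truncated chain simply stops there
M = ContinuousTimeMarkovChain(Q_poisson, [1] + [0] * (n_states - 1))  # starts in state 0
M[3].sim(10000).tabulate(normalize = True)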
In Symbulate, PoissonProcess defines a Poisson process; the single parameter is rate.
Example. Customers arrive according to a Poisson process with rate 2 per minute.
N = PoissonProcess(rate = 2)
Simulate and plot a single sample path, and find the process value for this sample path at time 4.
n = N.sim(1)
n.plot(alpha = 1)
n[4]
Simulate many sample paths and plot the mean function.
sims = N.sim(100)
sims.plot()
sims.mean().plot('r--')
Approximate the distribution of $N(3)$, the number of customers who arrive in the first 3 minutes, and its mean and variance. (Should be Poisson(6).)
sims = N[3].sim(10000)
sims.plot()
sims.mean(), sims.var()
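For comparison, one way to check the Poisson(6) claim (a sketch, assuming Symbulate's Poisson distribution together with the RV constructor) is to simulate directly from a Poisson(6) distribution and compare the plots and summaries:
sims_poisson = RV(Poisson(6)).sim(10000)
sims_poisson.plot()
sims_poisson.mean(), sims_poisson.var()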
For continuous time Markov chains (including Poisson processes) the times between jumps (or arrivals) and the times of the jumps themselves are random variables. These random times can be accessed with .JumpTimes() or .InterjumpTimes(). The jump times and interjump times are sequences; individual components can be accessed with brackets, e.g. .JumpTimes()[1] for the time of the first jump.
The sequence of states the chain visits can be accessed with .States().
Example. Continuing the weather example.
states = ["cloud", "rain", "sun"]
Q = [[-0.50, 0.15, 0.35],
     [ 0.60, -1, 0.40],
     [ 1/3, 0.0, -1/3]]
InitialDistribution = [0, 0, 1] # sunny currently
X = ContinuousTimeMarkovChain(Q, InitialDistribution, states)
Simulate the jump times for one sample path.
X.JumpTimes().sim(1)
Simulate the state sequence for one sample path.
X.States().sim(1)
Let $T$ denote the time at which the weather moves from being currently sunny to a different state (either rainy or cloudy). Approximate the distribution of $T$, and its mean and variance. (Since the total transition rate out of the sunny state is 1/3, $T$ should be Exponential with rate 1/3.)
T = X.JumpTimes()[1]
sims = T.sim(10000)
sims.plot()
sims.mean(), sims.var()
Note that interjump times are indexed starting at 0, so .InterjumpTimes()[0] is the time between time 0 and the first jump, .InterjumpTimes()[1] is the time between the first and second jumps, etc.
T = X.InterjumpTimes()[0]
sims = T.sim(10000)
sims.plot()
sims.mean(), sims.var()
Given the sequence of states visited, the times between jumps of a continuous time Markov chain are conditionally independent. In this example, since the chain starts in a fixed state, the first two interjump times are independent.
sims = (X.InterjumpTimes()[0] & X.InterjumpTimes()[1]).sim(10000)
sims.plot(alpha = 0.1)
sims.corr()
For a PoissonProcess, arrival times and interarrival times can be accessed with .ArrivalTimes() and .InterarrivalTimes().
Example. Let $N$ be a Poisson process with rate 2.
N = PoissonProcess(rate = 2)
Simulate the arrival times for one sample path.
N.ArrivalTimes().sim(1)
Let $T$ be the time of the first arrival. Approximate the distribution of $T$, and its mean and variance. (Should be Exponential with rate 2.)
T = N.ArrivalTimes()[1]
t = T.sim(10000)
t.plot()
t.mean(), t.var()
Approximate the conditional distribution of $T$ given that there is exactly 1 arrival in the first 3 units of time. (Should be Uniform on (0,3).)
(T | (N[3] == 1)).sim(10000).plot()
The first two interarrival times should be independent.
W0 = N.InterarrivalTimes()[0]
W1 = N.InterarrivalTimes()[1]
sims = (W0 & W1).sim(10000)
sims.plot()
sims.corr()