Symbulate Documentation

Markov Processes

Random processes are typically collections of dependent random variables, but allowing arbitrary dependence between the values at different points in time makes analysis intractable. Markov processes are stochastic processes that obey a specific kind of "next step" dependence structure. For a Markov process, roughly speaking, given the present, the future is conditionally independent of the past. This dependence assumption allows for a rich theory and tractable probabilistic models that have applications in a wide range of situations.

Be sure to import Symbulate using the following commands.

In [1]:
from symbulate import *
%matplotlib inline

Discrete time Markov chains

A discrete time Markov chain is a discrete time, discrete state random process which satisfies for all $n$:

Given $X_n$ ("the present"), $(X_{n+1}, X_{n+2}, \ldots)$ ("the future") is conditionally independent of $(X_{n-1}, X_{n-2}, \ldots, X_0)$ ("the past").

In Symbulate a discrete time Markov chain is defined with MarkovChain. The probabilistic behavior of a discrete time Markov chain is fully specified by the following, which are the parameters of MarkovChain.

  • state_labels: The state space of possible values of the process. (Default is to label the states 0, 1, 2, ...)
  • initial_dist: The initial distribution, which specifies the probability distribution at time 0.
  • transition_matrix: The (one-step) transition probability matrix, whose $(i, j)$ entry specifies the probability that the chain is in state $j$ at the next time step given that it is currently in state $i$: $P(X_{n+1} = j\, | X_n = i)$. All row sums must be 1.

Example. The weather in a certain city can be classified as either cloudy, rainy, or sunny and follows a discrete time Markov chain.

  • Given that it is cloudy today, tomorrow it will be cloudy with probability 0.3, rainy with probability 0.2, or sunny with probability 0.5.
  • Given that it is rainy today, tomorrow it will be cloudy with probability 0.5, rainy with probability 0.3, or sunny with probability 0.2.
  • Given that it is sunny today, tomorrow it will be cloudy with probability 0.3, rainy with probability 0, or sunny with probability 0.7.

Suppose that it is sunny on Sunday. (So we'll call Sunday $n=0$.)

In [2]:
states = ["cloud", "rain", "sun"]

TransitionMatrix = [[0.3, 0.2, 0.5],
                    [0.5, 0.3, 0.2],
                    [0.3, 0.0, 0.7]]

InitialDistribution = [0, 0, 1] # sunny on Sunday

X = MarkovChain(TransitionMatrix, InitialDistribution, states)
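Under the hood, one step of such a chain just samples the next state from the transition-matrix row of the current state. A minimal plain-Python sketch of that mechanism (a rough illustration, independent of Symbulate's implementation):

```python
import random

def step(state, states, P):
    """Take one step of a discrete time Markov chain.

    P[i][j] is the probability of moving to states[j] from states[i].
    """
    i = states.index(state)
    return random.choices(states, weights=P[i])[0]

states = ["cloud", "rain", "sun"]
P = [[0.3, 0.2, 0.5],
     [0.5, 0.3, 0.2],
     [0.3, 0.0, 0.7]]

# From "sun" the chain can only move to "cloud" or stay at "sun",
# since P(sun -> rain) = 0.
print(step("sun", states, P))
```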

Find the probability that it is rainy on Friday ($n=5$).

In [3]:
X[5].sim(10000).tabulate(normalize = True)
Out[3]:
Outcome  Value
cloud    0.3187
rain     0.0882
sun      0.5931
Total    1.0
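As a check on the simulation, the exact distribution on Friday can be computed by applying the transition matrix five times to the initial distribution. A plain-Python sketch (independent of Symbulate):

```python
def one_step(dist, P):
    """Advance a distribution one step: returns dist @ P."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

P = [[0.3, 0.2, 0.5],
     [0.5, 0.3, 0.2],
     [0.3, 0.0, 0.7]]
dist = [0, 0, 1]  # sunny on Sunday (n = 0)

for _ in range(5):  # distribution at n = 5 (Friday)
    dist = one_step(dist, P)

print([round(p, 4) for p in dist])  # → [0.3172, 0.0889, 0.594]
```

These exact values agree with the simulated probabilities to within simulation error.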

Find the conditional probability that it is rainy on Friday given that it is rainy on Thursday. (The following should return, approximately, the second row of the transition matrix.)

In [4]:
(X[5] | (X[4] == "rain")).sim(10000).tabulate(normalize = True)
Out[4]:
Outcome  Value
cloud    0.4983
rain     0.2955
sun      0.2062
Total    1.0

Find the conditional probability that it is rainy on Friday given that it is rainy on Thursday and cloudy on Wednesday. (This demonstrates the Markov property: conditioning additionally on the value of $X_3$ does not change the conditional distribution from the previous part.)

In [5]:
(X[5] | ((X[4] == "rain") & (X[3] == "cloud"))).sim(10000).tabulate(normalize = True)
Out[5]:
Outcome  Value
cloud    0.4981
rain     0.3001
sun      0.2018
Total    1.0

Find the probability that it is rainy on Friday and Saturday.

In [6]:
(X[5] & X[6]).sim(10000).tabulate(normalize = True)
Out[6]:
Outcome             Value
('cloud', 'cloud')  0.0981
('cloud', 'rain')   0.0672
('cloud', 'sun')    0.1594
('rain', 'cloud')   0.0423
('rain', 'rain')    0.0244
('rain', 'sun')     0.0173
('sun', 'cloud')    0.1787
('sun', 'sun')      0.4126
Total               1.0

State labels

The state space can be any list of values (like ['cloud', 'rain', 'sun']). If state_labels are not specified, the default is to label the states 0, 1, 2, ... When the states are numerical values, plots can be created, and methods like .mean() and .sd() can be applied.

In [7]:
TransitionMatrix = [[0.3, 0.2, 0.5],
                    [0.5, 0.3, 0.2],
                    [0.3, 0.0, 0.7]]

InitialDistribution = [0, 0, 1] # sunny on Sunday

X = MarkovChain(TransitionMatrix, InitialDistribution)
In [8]:
X.sim(1).plot(alpha = 1)
In [9]:
X.sim(10).plot()
In [10]:
X[5].sim(10000).plot()
In [11]:
(X[5] | (X[4] == 1) ).sim(10000).plot()
In [12]:
(X[4] & X[5]).sim(10000).plot(jitter = True)

Continuous time Markov chains

A continuous time Markov chain is a continuous time, discrete state random process which satisfies for all $t$:

Given $X_t$ ("the present"), $(X_{u},u \ge t)$ ("the future") is conditionally independent of $(X_{s}, s \le t )$ ("the past").

In a discrete time Markov chain, state transitions occur at every point in time, $n = 0, 1, 2, \ldots$. A continuous time Markov chain behaves like a discrete time Markov chain in which the times between state transitions are independent and exponentially distributed.

  • The amount of time a chain stays in a state has an exponential distribution, with a rate parameter that can depend on the current state.
  • When the chain "jumps" to a new state, the jumps behave like a discrete time Markov chain.
  • The times between jumps are independent.

In Symbulate a continuous time Markov chain is defined with ContinuousTimeMarkovChain. The probabilistic behavior of a continuous time Markov chain is fully specified by the following, which are the parameters of ContinuousTimeMarkovChain.

  • state_labels: The state space of possible values of the process. (Default is to label the states 0, 1, 2, ...)
  • initial_dist: The initial distribution, which specifies the probability distribution at time 0
  • generator_matrix: The generator matrix or transition rate matrix, $Q$, whose $(i, j)$ entry specifies the rate at which the chain "attempts to transition" to state $j$ given that it is currently in state $i$.
    • For small $h$, $P(X_{t+h} = j\, | X_t = i) \approx h q(i,j)$
    • The total departure rate from state $i$ is $\lambda(i) = \sum_{j\neq i} q(i,j)$.
    • The diagonal entries are $-1$ times the total departure rates, $q(i,i) = -\lambda(i)$, so that all row sums are 0.
    • The probability that when the chain departs state $i$ it jumps to state $j$ is $q(i,j)/\lambda(i)$.
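These relationships between $Q$, the departure rates, and the jump probabilities can be sketched in plain Python (an illustration independent of Symbulate):

```python
def jump_chain(Q):
    """Split a generator matrix Q into holding rates and jump probabilities.

    Returns (rates, J) where rates[i] = lambda(i) = -q(i, i) and
    J[i][j] = q(i, j) / lambda(i) for j != i (with J[i][i] = 0).
    """
    n = len(Q)
    rates = [-Q[i][i] for i in range(n)]
    J = [[Q[i][j] / rates[i] if j != i else 0.0 for j in range(n)]
         for i in range(n)]
    return rates, J

# A generator matrix with departure rates 0.5, 1, 1/3
Q = [[-0.50, 0.15, 0.35],
     [ 0.60, -1.0, 0.40],
     [  1/3,  0.0, -1/3]]

rates, J = jump_chain(Q)
print(rates)  # lambda(i) for each state
print(J[0])   # → [0.0, 0.3, 0.7]: from state 0, jump to 1 w.p. 0.3, to 2 w.p. 0.7
```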

Example. The weather in a certain city can be classified as either cloudy, rainy, or sunny and follows a continuous time Markov chain.

  • Given that it is cloudy currently, it will next be rainy with probability 0.3, or sunny with probability 0.7.
  • Given that it is rainy currently, it will next be cloudy with probability 0.6 or sunny with probability 0.4.
  • Given that it is sunny currently, it will next be cloudy with probability 1.
  • On average it stays cloudy for 2 days, rainy for 1 day, and sunny for 3 days.

Suppose that it is currently sunny.

In [13]:
states = ["cloud", "rain", "sun"]

Q = [[-0.50, 0.15, 0.35],
     [ 0.60,   -1, 0.40],
     [  1/3,  0.0, -1/3]]

InitialDistribution = [0, 0, 1] # sunny currently

X = ContinuousTimeMarkovChain(Q, InitialDistribution, states)

If it is currently sunny, find the probability that it is raining 36 hours from now.

In [14]:
X[1.5].sim(10000).tabulate(normalize = True)
Out[14]:
Outcome  Value
cloud    0.2864
rain     0.026
sun      0.6876
Total    1.0
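The exact distribution at time $t$ is the corresponding row of the matrix exponential $P(t) = e^{Qt}$. A plain-Python sketch (independent of Symbulate) that approximates it with a truncated Taylor series:

```python
def transition_matrix(Q, t, terms=30):
    """Approximate P(t) = exp(Q t) with a truncated Taylor series."""
    n = len(Q)
    A = [[Q[i][j] * t for j in range(n)] for i in range(n)]
    P = [[float(i == j) for j in range(n)] for i in range(n)]  # identity
    term = [row[:] for row in P]
    for k in range(1, terms):
        # term becomes A^k / k!, accumulated into P
        term = [[sum(term[i][m] * A[m][j] for m in range(n)) / k
                 for j in range(n)] for i in range(n)]
        P = [[P[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return P

Q = [[-0.50, 0.15, 0.35],
     [ 0.60, -1.0, 0.40],
     [  1/3,  0.0, -1/3]]

P = transition_matrix(Q, 1.5)
# Row for "sun": [P(cloud), P(rain), P(sun)] 1.5 days after a sunny start;
# compare with the simulated probabilities above.
print([round(p, 4) for p in P[2]])
```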

Given that it is raining 36 hours from now, find the probability that it is sunny 48 hours from now.

In [15]:
(X[2] | (X[1.5] == "rain")).sim(10000).tabulate(normalize = True)
Out[15]:
Outcome  Value
cloud    0.2268
rain     0.6115
sun      0.1617
Total    1.0

State labels

As for discrete time Markov chains, the state space for a continuous time Markov chain can be any list of values (like ['cloud', 'rain', 'sun']). If state_labels are not specified, the default is to label the states 0, 1, 2, ... When the states are numerical values, plots can be created, and methods like .mean() and .sd() can be applied.

In [16]:
Q = [[-0.50, 0.15, 0.35],
     [ 0.60,   -1, 0.40],
     [  1/3,  0.0, -1/3]]


InitialDistribution = [0, 0, 1] # sunny currently

X = ContinuousTimeMarkovChain(Q, InitialDistribution)

X.sim(1).plot(alpha = 1)

Poisson processes

Events occur over time according to a Poisson process $N(t), t\ge0$, with rate $\lambda$ if at most one event occurs at a time, the times between events are independent and exponentially distributed with rate $\lambda$, and $N(t)$ counts the number of events that occur in the time interval $[0,t]$ (with $N(0) = 0$). In other words, a Poisson process is a continuous time Markov chain whose rate matrix $Q$ satisfies $$ q(i, j) = \begin{cases} \lambda, & j = i+1,\\ -\lambda, & j = i,\\ 0, & \text{otherwise.} \end{cases} $$
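The interarrival-time description can be sketched in plain Python (an illustration independent of Symbulate): add up exponential gaps until they pass $t$, counting the events along the way.

```python
import random

def poisson_count(rate, t):
    """Simulate N(t): count exponential interarrival times until they pass t."""
    total, count = 0.0, 0
    while True:
        total += random.expovariate(rate)
        if total > t:
            return count
        count += 1

random.seed(1)
counts = [poisson_count(2, 3) for _ in range(10000)]
print(sum(counts) / len(counts))  # should be close to the mean, rate * t = 6
```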

In Symbulate, PoissonProcess defines a Poisson process; the single parameter is rate.

Example. Customers arrive according to a Poisson process with rate 2 per minute.

In [17]:
N = PoissonProcess(rate = 2)

Simulate and plot a single sample path, and find the process value for this sample path at time 4.

In [18]:
n = N.sim(1)
n.plot(alpha = 1)
n[4]
Out[18]:
Index  Result
0      14

Simulate many sample paths and plot the mean function.

In [19]:
sims = N.sim(100)
sims.plot()
sims.mean().plot('r--')

Approximate the distribution of $N(3)$, the number of customers who arrive in the first 3 minutes, and its mean and variance. (Should be Poisson(6).)

In [20]:
sims = N[3].sim(10000)
sims.plot()
sims.mean(), sims.var()
Out[20]:
(5.9922000000000004, 6.0945391600000001)

Arrival and jump times

For continuous time Markov chains (including Poisson processes), the times between jumps (or arrivals) and the times of the jumps themselves are random variables. These random times can be accessed with .JumpTimes() or .InterjumpTimes(). The jump times and interjump times are sequences; individual components can be accessed with brackets, e.g. .JumpTimes()[1] for the time of the first jump.

The sequence of states the chain visits can be accessed with .States().
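The mechanism behind .JumpTimes() and .States() can be sketched in plain Python (a rough sketch, not Symbulate's implementation): alternate an exponential holding time in the current state with a jump of the embedded discrete time chain.

```python
import random

def simulate_path(Q, states, start, t_max):
    """Simulate a continuous time Markov chain path up to time t_max.

    Returns (times, visited): times[k] is the time of the k-th jump
    (times[0] = 0) and visited[k] is the state entered at that time.
    """
    i = states.index(start)
    times, visited = [0.0], [start]
    t = 0.0
    while True:
        t += random.expovariate(-Q[i][i])  # holding time with rate lambda(i)
        if t > t_max:
            return times, visited
        # jump probabilities are proportional to the off-diagonal rates q(i, j)
        weights = [Q[i][j] if j != i else 0.0 for j in range(len(Q))]
        i = random.choices(range(len(Q)), weights=weights)[0]
        times.append(t)
        visited.append(states[i])

Q = [[-0.50, 0.15, 0.35],
     [ 0.60, -1.0, 0.40],
     [  1/3,  0.0, -1/3]]

times, visited = simulate_path(Q, ["cloud", "rain", "sun"], "sun", 20)
print(times[:3])
print(visited[:3])
```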

Example. Continuing the weather example.

In [21]:
states = ["cloud", "rain", "sun"]

Q = [[-0.50, 0.15, 0.35],
     [ 0.60,   -1, 0.40],
     [  1/3,  0.0, -1/3]]


InitialDistribution = [0, 0, 1] # sunny currently

X = ContinuousTimeMarkovChain(Q, InitialDistribution, states)

Simulate the jump times for one sample path.

In [22]:
X.JumpTimes().sim(1)
Out[22]:
Index  Result
0      (0, 6.153277099670246, 6.945079973046758, 8.242603363174428, 11.18509878348622, 13.803158694279912, ...

Simulate the state sequence for one sample path.

In [23]:
X.States().sim(1)
Out[23]:
Index  Result
0      ('sun', 'cloud', 'sun', 'cloud', 'sun', 'cloud', 'sun', 'cloud', 'rain', 'cloud', '...')

Let $T$ denote the time at which the weather moves from being currently sunny to a different state (either rainy or cloudy).

In [24]:
T = X.JumpTimes()[1]
sims = T.sim(10000)
sims.plot()
sims.mean(), sims.var()
Out[24]:
(3.080828641528218, 9.4752400141091755)

Note that InterjumpTimes are indexed starting at 0, so .InterjumpTimes()[0] is the time between time 0 and the first jump, .InterjumpTimes()[1] is the time between the first and second jumps, etc.

In [25]:
T = X.InterjumpTimes()[0]
sims = T.sim(10000)
sims.plot()
sims.mean(), sims.var()
Out[25]:
(2.9948277619747201, 8.8812241527902955)

For a continuous time Markov chain, the times between jumps are independent.

In [26]:
sims = (X.InterjumpTimes()[0] & X.InterjumpTimes()[1]).sim(10000)
sims.plot(alpha = 0.1)
sims.corr()
Out[26]:
-0.012627744617794182

For a PoissonProcess, arrival times and interarrival times can be accessed with .ArrivalTimes() and .InterarrivalTimes().

Example. Let $N$ be a Poisson process with rate 2.

In [27]:
N = PoissonProcess(rate = 2)

Simulate the arrival times for one sample path.

In [28]:
N.ArrivalTimes().sim(1)
Out[28]:
Index  Result
0      (0, 0.5663581988795613, 0.7528309589078092, 0.9275471771942947, 1.05094392384995, 1.0827271910730507...

Let $T$ be the time of the first arrival. Approximate the distribution of $T$, and its mean and variance. (Should be Exponential with rate 2.)

In [29]:
T = N.ArrivalTimes()[1]
t = T.sim(10000)
t.plot()
t.mean(), t.var()
Out[29]:
(0.50819188430727924, 0.25792126926274039)

Approximate the conditional distribution of $T$ given that there is exactly 1 arrival in the first 3 units of time. (Should be Uniform on (0,3).)

In [30]:
(T | (N[3] == 1)).sim(10000).plot()
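The same conditional uniformity can be checked with a plain-Python rejection sampler (independent of Symbulate): simulate arrival sequences, keep only those with exactly one arrival in $[0, 3]$, and examine that arrival time.

```python
import random

def first_arrival_given_one(rate, t, n):
    """Sample the first arrival time conditioned on exactly one arrival in [0, t]."""
    samples = []
    while len(samples) < n:
        arrivals, s = [], random.expovariate(rate)
        while s <= t:
            arrivals.append(s)
            s += random.expovariate(rate)
        if len(arrivals) == 1:  # keep only paths with exactly one arrival
            samples.append(arrivals[0])
    return samples

random.seed(2)
samples = first_arrival_given_one(2, 3, 1000)
print(sum(samples) / len(samples))  # should be near 1.5, the mean of Uniform(0, 3)
```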

The first two interarrival times should be independent.

In [31]:
W0 = N.InterarrivalTimes()[0]
W1 = N.InterarrivalTimes()[1]

sims = (W0 & W1).sim(10000)
sims.plot()
sims.corr()
Out[31]:
0.015410044391164382