A probability space consists of a sample space of possible outcomes and a probability measure which specifies how to assign probabilities to related events. Several common probability spaces are available in Symbulate. Users can also define their own probability spaces.
from symbulate import *
%matplotlib inline
The probability space in many elementary situations can be defined via a "box model." To define a Symbulate BoxModel, enter a list representing the tickets in the box. For example, rolling a fair six-sided die can be represented as a box model with six tickets labeled 1 through 6.
die = [1, 2, 3, 4, 5, 6]
roll = BoxModel(die)
The list of numbers could also have been created using range() in Python. Remember that Python indexing starts from 0 by default, so range() also starts at 0 unless a start value is given. Remember also that range() gives all the values up to, but not including, the stop value.
die = list(range(1, 6+1))  # this is just a list of the numbers 1 through 6
roll = BoxModel(die)
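As a quick check of the range() behavior described above, the following plain-Python sketch (no Symbulate required) compares two calls. The variable name zero_based is chosen here purely for illustration.

```python
# range(stop) starts at 0 and stops before `stop`.
zero_based = list(range(6))      # 0, 1, 2, 3, 4, 5

# range(start, stop) begins at `start` and still excludes `stop`,
# which is why the die uses 6 + 1 as the stop value.
die = list(range(1, 6 + 1))      # 1, 2, 3, 4, 5, 6

print(zero_based)
print(die)
```

Writing the stop value as 6+1 keeps the largest ticket visible in the code, which can help avoid off-by-one mistakes.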
BoxModel itself just defines the model; it does not return any values. (The same is true for any probability space.) The .draw() method can be used to simulate one draw from the BoxModel (or any probability space).
roll.draw()
BoxModel accepts the following arguments.
box: A list of "tickets" to sample from.
size: How many tickets to draw from the box.
replace: True if the draws are made with replacement; False if without replacement.
probs: Probabilities that the tickets are selected. By default, all tickets are equally likely.
order_matters: True if different orderings of the same tickets drawn are counted as different outcomes; False if the order in which the tickets are drawn is irrelevant.
Multiple tickets can be drawn from the box using the size argument.
BoxModel(die, size=3).draw()
By default, BoxModel assumes equally likely tickets. This can be changed using the probs argument, by specifying a probability value for each ticket.
Example. Suppose 32% of Americans are Democrats, 27% are Republicans, and 41% are Independents. Five randomly selected Americans are surveyed about their political party affiliation.
This situation could be represented as sampling with replacement from a box with 100 tickets (32 labeled Democrat, 27 Republican, and 41 Independent), from which 5 tickets are drawn. But rather than specifying a list of 100 tickets, we can just specify the three tickets and the corresponding probabilities with probs.
BoxModel(['D', 'R', 'I'], probs=[0.32, 0.27, 0.41], size=5).draw()
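For intuition about what a weighted draw with replacement involves, here is a rough standard-library sketch using random.choices. It is an illustration only, not a description of how Symbulate implements BoxModel; the names tickets, probs, and sample are assumptions made for this example.

```python
import random

tickets = ['D', 'R', 'I']
probs = [0.32, 0.27, 0.41]

# Draw 5 tickets with replacement; each draw picks 'D', 'R', or 'I'
# with the corresponding probability.
sample = random.choices(tickets, weights=probs, k=5)
print(sample)
```

Because the draws are random, the printed list will generally differ from run to run.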
The probs argument requires that the probabilities are already normalized to sum to 1. Non-normalized values can be handled by entering the tickets as a dictionary, specifying the label on each ticket and the number of tickets in the box with that label. Note that a dictionary is enclosed in braces {} rather than brackets [].
The following code is equivalent to the previous code, which used the probs option.
BoxModel({'D': 32, 'R': 27, 'I': 41}, size=5).draw()
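To see why the dictionary of counts matches the probs version, the counts can be normalized by hand. The sketch below is plain Python written for this illustration (the names counts, total, and normalized are assumptions), and it does not show Symbulate's internals.

```python
import math

counts = {'D': 32, 'R': 27, 'I': 41}

# Divide each count by the total number of tickets to get a probability.
total = sum(counts.values())
normalized = {label: n / total for label, n in counts.items()}

# The normalized values sum to 1 (up to floating-point rounding).
assert math.isclose(sum(normalized.values()), 1.0)
print(normalized)
```

The resulting values agree with the probabilities 0.32, 0.27, and 0.41 passed to probs earlier.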
By default, BoxModel assumes sampling with replacement; each ticket is placed back in the box before the next ticket is selected. Sampling without replacement can be handled with replace=False. (The default is replace=True.)
Example. Two people are selected at random from Anakin, Bella, Frodo, Harry, Katniss to go on a quest.
BoxModel(['A','B','F','H','K'], size=2, replace=False).draw()
Note that by default, BoxModel returns ordered outcomes, e.g. ('A', 'B') is distinct from ('B', 'A'). To return unordered outcomes, set order_matters=False.
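The difference between ordered and unordered outcomes can be made concrete by listing them for the quest example, where two of five people are chosen without replacement. This is a standard-library sketch using itertools, with variable names chosen for illustration; it enumerates the outcomes rather than simulating them.

```python
from itertools import combinations, permutations

people = ['A', 'B', 'F', 'H', 'K']

# Ordered outcomes: ('A', 'B') and ('B', 'A') are counted separately.
ordered = list(permutations(people, 2))

# Unordered outcomes: each pair of people appears only once.
unordered = list(combinations(people, 2))

print(len(ordered), len(unordered))
```

There are twice as many ordered outcomes as unordered ones here, since each pair of people can be drawn in two orders.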
Symbulate has many common probability models built in. The ProbabilitySpace command allows for user-defined probability models. The first step in creating a probability space is to define a function that explains how to draw one outcome.
Example. Ten percent of all e-mail is spam. Thirty percent of spam e-mails contain the word "money", while 2% of non-spam e-mails contain the word "money". Suppose an e-mail contains the word "money". What is the probability that it is spam?
We can think of the sample space as the set of pairs of possible email types (spam or not) and wordings (money or not), with the probability measure following the above specifications. First we draw from a BoxModel to determine the email type. Then, depending on the result of the first draw, we draw from one of two BoxModels to determine the wording. The function spam_sim below encodes these specifications; note the use of .draw().
def spam_sim():
    email_type = BoxModel(["spam", "not spam"], probs=[.1, .9]).draw()
    if email_type == "spam":
        has_money = BoxModel(["money", "no money"], probs=[.3, .7]).draw()
    else:
        has_money = BoxModel(["money", "no money"], probs=[.02, .98]).draw()
    return email_type, has_money
A ProbabilitySpace can be created once the specifications of the simulation have been defined through a function.
P = ProbabilitySpace(spam_sim)
P.draw()
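The question posed in the example can also be answered exactly with Bayes' rule, which gives a value to compare simulation results against. The sketch below uses Python's fractions module to avoid rounding; the variable names are chosen for this illustration and are not part of Symbulate.

```python
from fractions import Fraction

p_spam = Fraction(1, 10)                # P(spam)
p_money_given_spam = Fraction(3, 10)    # P(money | spam)
p_money_given_not = Fraction(2, 100)    # P(money | not spam)

# Law of total probability: P(money)
p_money = p_spam * p_money_given_spam + (1 - p_spam) * p_money_given_not

# Bayes' rule: P(spam | money)
p_spam_given_money = p_spam * p_money_given_spam / p_money

print(p_spam_given_money)         # 5/8
print(float(p_spam_given_money))  # 0.625
```

In a long run of simulated emails, the proportion of spam among those containing "money" should be close to 0.625.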
Symbulate has many commonly used probability spaces built in. Here are just a few examples.
Binomial(n=10, p=0.5).draw()
Normal(mean=0, sd=1).draw()
mean_vector = [0, 1, 2]
cov_matrix = [[1.00, 0.50, 0.25],
              [0.50, 2.00, 0.00],
              [0.25, 0.00, 4.00]]
MultivariateNormal(mean = mean_vector, cov = cov_matrix).draw()
Independent probability spaces can be constructed by multiplying (* in Python) two probability spaces. The product * syntax reflects that under independence joint probabilities are products of marginal probabilities: for example, events $A$ and $B$ are independent if and only if $P(A\cap B) = P(A)P(B)$.
Multiple independent copies of a probability space can be created by raising a probability space to a power (** in Python).
Example. Roll a fair six-sided die and a fair four-sided die.
die6 = list(range(1, 6+1, 1))
die4 = list(range(1, 4+1, 1))
rolls = BoxModel(die6) * BoxModel(die4)
rolls.draw()
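The product rule for independent events can be checked by enumerating the joint outcomes of the two dice. The following is a standard-library sketch with events A and B chosen here only as an illustration; it lists the sample space by hand rather than using Symbulate.

```python
from fractions import Fraction
from itertools import product

die6 = list(range(1, 6 + 1))
die4 = list(range(1, 4 + 1))

# Under independence, every (six-sided, four-sided) pair is equally likely.
outcomes = list(product(die6, die4))
p_each = Fraction(1, len(outcomes))

# Event A: the six-sided die is even. Event B: the four-sided die shows 1.
p_a = sum(p_each for d6, d4 in outcomes if d6 % 2 == 0)
p_b = sum(p_each for d6, d4 in outcomes if d4 == 1)
p_a_and_b = sum(p_each for d6, d4 in outcomes if d6 % 2 == 0 and d4 == 1)

print(p_a, p_b, p_a_and_b)
```

Here the probability of both events occurring equals the product of their individual probabilities, as independence requires.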
Example. A triple of independent outcomes.
(BoxModel(['H', 'T']) * Poisson(lam=2) * Exponential(rate=5)).draw()
Example. Four independent Normal(0,1) values.
P = Normal(mean=0, sd=1) ** 4
P.draw()
Infinitely many independent copies of a probability space can be created by raising the probability space to the inf power, i.e. ** inf.
Example. Infinitely many independent Normal(0, 1) values.
P = Normal(mean=0, sd=1) ** inf
P.draw()
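An infinite sequence cannot be stored in full, so an outcome of this kind is naturally produced lazily, one value at a time as needed. The sketch below shows one way to express that idea in plain Python with a generator. It is an assumption made for illustration, not a description of Symbulate's implementation, and normal_stream is a hypothetical helper.

```python
import random
from itertools import islice

def normal_stream(mean=0.0, sd=1.0):
    """Yield independent Normal(mean, sd) values one at a time, forever."""
    while True:
        yield random.gauss(mean, sd)

# Only the values actually requested are ever generated.
first_five = list(islice(normal_stream(), 5))
print(first_five)
```

Taking a finite slice, as with islice here, is what makes it practical to work with a conceptually infinite sequence.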