A **probability space** consists of a **sample space** of possible outcomes and a **probability measure** which specifies how to assign probabilities to related events. Several common probability spaces are available in Symbulate. Users can also define their own probability spaces.

- **BoxModel:** Define a simple box model probability space.
- **Draw:** Draw an outcome according to a probability model.
- **ProbabilitySpace:** Define more general probability spaces.
- **Independent spaces:** Combine independent probability spaces.

In [1]:

```
from symbulate import *
%matplotlib inline
```

The probability space in many elementary situations can be defined via a "box model." To define a Symbulate `BoxModel`, enter a list representing the tickets in the box. For example, rolling a fair six-sided die could be represented as a box model with six tickets labeled 1 through 6.

In [2]:

```
die = [1, 2, 3, 4, 5, 6]
roll = BoxModel(die)
```

The list of tickets can also be generated with `range()` in Python. Remember that Python indexing starts from 0 by default. Remember also that `range` gives you all the values up to, but *not including*, the last value.

In [3]:

```
die = list(range(1, 6+1)) # this is just a list of the numbers 1 through 6
roll = BoxModel(die)
```

`BoxModel` itself just defines the model; it does not return any values. (The same is true for any probability space.) The `.draw()` method can be used to simulate one draw from the `BoxModel` (or any probability space).

In [4]:

```
roll.draw()
```

Out[4]:

`BoxModel` accepts the following arguments:

- `box`: A list of "tickets" to sample from.
- `size`: How many tickets to draw from the box.
- `replace`: `True` if the draws are made with replacement; `False` if without replacement.
- `probs`: Probabilities that the tickets are selected. By default, all tickets are equally likely.
- `order_matters`: `True` if different orderings of the same tickets drawn are counted as different outcomes; `False` if the order in which the tickets are drawn is irrelevant.

Multiple tickets can be drawn from the box using the `size` argument.

In [5]:

```
BoxModel(die, size=3).draw()
```

Out[5]:

By default, `BoxModel` assumes equally likely tickets. This can be changed using the `probs` argument, by specifying a probability value for each ticket.

*Example.* Suppose 32% of Americans are Democrats, 27% are Republican, and 41% are Independent. Five randomly selected Americans are surveyed about their political party affiliation.

This situation could be represented as sampling with replacement from a box with 100 tickets, 32 of which are labeled Democrat, 27 Republican, and 41 Independent, from which 5 tickets are drawn. But rather than specifying a list of 100 tickets, we can just specify the three tickets and the corresponding probabilities with `probs`.

In [6]:

```
BoxModel(['D', 'R', 'I'], probs=[0.32, 0.27, 0.41], size=5).draw()
```

Out[6]:

The `probs` argument requires that the probabilities are already normalized to sum to 1. Non-normalized values can be handled by entering the tickets as a dictionary, specifying the label on each ticket and the number of tickets in the box with that label. Note that a dictionary is enclosed in braces `{}` rather than brackets `[]`.

The following code is equivalent to the previous code, which used the `probs` option.

In [7]:

```
BoxModel({'D': 32,'R': 27, 'I': 41}, size=5).draw()
```

Out[7]:
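To see why the dictionary form matches the `probs` form, the ticket counts can be normalized by hand. The sketch below uses plain Python only, not Symbulate; the names `counts`, `total`, and `probs` are illustrative.

```python
# Ticket counts per label, as in the dictionary passed to BoxModel above.
counts = {'D': 32, 'R': 27, 'I': 41}

# Dividing each count by the total number of tickets gives the
# probability of drawing that label on a single draw.
total = sum(counts.values())
probs = {label: n / total for label, n in counts.items()}

print(probs)
```

The resulting values correspond to the `probs=[0.32, 0.27, 0.41]` list used earlier, since the box holds 100 tickets in total.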

By default, `BoxModel` assumes sampling with replacement; each ticket is placed back in the box before the next ticket is selected. Sampling *without replacement* can be handled with `replace=False`. (The default is `replace=True`.)

*Example.* Two people are selected at random from Anakin, Bella, Frodo, Harry, Katniss to go on a quest.

In [8]:

```
BoxModel(['A','B','F','H','K'], size=2, replace=False).draw()
```

Out[8]:

`BoxModel` returns ordered outcomes, e.g., `('A', 'B')` is distinct from `('B', 'A')`. To return unordered outcomes, set `order_matters=False`.
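For the quest example, where two of five people are chosen without replacement, the difference between ordered and unordered outcomes can be counted directly. This is a standard-library sketch for illustration; it does not call Symbulate, and the names `ordered` and `unordered` are made up here.

```python
from itertools import combinations, permutations

people = ['A', 'B', 'F', 'H', 'K']

# Ordered outcomes: ('A', 'B') and ('B', 'A') are counted separately.
ordered = list(permutations(people, 2))

# Unordered outcomes: each pair of people appears exactly once.
unordered = list(combinations(people, 2))

print(len(ordered), len(unordered))  # 20 ordered pairs, 10 unordered pairs
```

Each unordered pair corresponds to two ordered pairs, which is why the ordered sample space is twice as large.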

Symbulate has many common probability models built in. The `ProbabilitySpace` command allows for user-defined probability models. The first step in creating a probability space is to define a function that explains how to draw one outcome.

*Example.* Ten percent of all e-mail is spam. Thirty percent of spam e-mails contain the word "money", while 2% of non-spam e-mails contain the word "money". Suppose an e-mail contains the word "money". What is the probability that it is spam?

We can think of the sample space as pairs of the possible email types (spam or not) and wordings (money or not), with the probability measure following the above specifications. First we draw from a `BoxModel` to determine the email type. Then, depending on the result of the first draw, we draw from one of two `BoxModel`s to determine the wording. The function `spam_sim` below encodes these specifications; note the use of `.draw()`.

In [9]:

```
def spam_sim():
    email_type = BoxModel(["spam", "not spam"], probs=[.1, .9]).draw()
    if email_type == "spam":
        has_money = BoxModel(["money", "no money"], probs=[.3, .7]).draw()
    else:
        has_money = BoxModel(["money", "no money"], probs=[.02, .98]).draw()
    return email_type, has_money
```

A `ProbabilitySpace` can be created once the specifications of the simulation have been defined through a function.

In [10]:

```
P = ProbabilitySpace(spam_sim)
P.draw()
```

Out[10]:
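The exact answer to the question posed in the spam example follows from Bayes' rule, which gives a value to compare simulation results against. The calculation below is plain Python and independent of Symbulate; the variable names are illustrative.

```python
from fractions import Fraction

# Probabilities stated in the example.
p_spam = Fraction(10, 100)
p_money_given_spam = Fraction(30, 100)
p_money_given_not_spam = Fraction(2, 100)

# Law of total probability: P(money).
p_money = (p_spam * p_money_given_spam
           + (1 - p_spam) * p_money_given_not_spam)

# Bayes' rule: P(spam | money).
p_spam_given_money = p_spam * p_money_given_spam / p_money

print(p_spam_given_money, float(p_spam_given_money))  # 5/8 0.625
```

Although only 10% of e-mail is spam, an e-mail containing "money" is spam with probability 0.625, because the word is far more common in spam than in non-spam.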

Symbulate has many commonly used probability spaces built in. Here are just a few examples.

In [11]:

```
Binomial(n=10, p=0.5).draw()
```

Out[11]:

In [12]:

```
Normal(mean=0, sd=1).draw()
```

Out[12]:

In [13]:

```
mean_vector = [0, 1, 2]
cov_matrix = [[1.00, 0.50, 0.25],
              [0.50, 2.00, 0.00],
              [0.25, 0.00, 4.00]]
MultivariateNormal(mean = mean_vector, cov = cov_matrix).draw()
```

Out[13]:

**Independent** probability spaces can be constructed by multiplying (`*` in Python) two probability spaces. The product `*` syntax reflects that under independence joint probabilities are products of marginal probabilities. For example, events $A$ and $B$ are independent if and only if $P(A\cap B) = P(A)P(B)$.

Multiple independent copies of a probability space can be created by raising a probability space to a power (`**` in Python).

*Example.* Roll a fair six-sided die and a fair four-sided die.

In [14]:

```
die6 = list(range(1, 6+1, 1))
die4 = list(range(1, 4+1, 1))
rolls = BoxModel(die6) * BoxModel(die4)
rolls.draw()
```

Out[14]:
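The product rule for independent events can be checked by enumerating all equally likely pairs for the two dice. This is a standard-library sketch, not Symbulate code; the events `a` and `b` and the helper `prob` are hypothetical names chosen for illustration.

```python
from fractions import Fraction
from itertools import product

die6 = list(range(1, 6 + 1))
die4 = list(range(1, 4 + 1))

# All 24 equally likely (six-sided, four-sided) outcome pairs.
outcomes = list(product(die6, die4))


def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    favorable = sum(1 for o in outcomes if event(o))
    return Fraction(favorable, len(outcomes))


def a(o):
    """Event A: the six-sided die shows an even number."""
    return o[0] % 2 == 0


def b(o):
    """Event B: the four-sided die shows 3 or 4."""
    return o[1] >= 3


p_a = prob(a)
p_b = prob(b)
p_a_and_b = prob(lambda o: a(o) and b(o))

print(p_a, p_b, p_a_and_b)  # 1/2 1/2 1/4
```

Here $P(A \cap B) = 1/4 = P(A)P(B)$, as expected for events that each depend on only one of the two independent dice.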

*Example.* A triple of independent outcomes.

In [15]:

```
(BoxModel(['H', 'T']) * Poisson(lam=2) * Exponential(rate=5)).draw()
```

Out[15]:

*Example.* Four independent Normal(0,1) values.

In [16]:

```
P = Normal(mean=0, sd=1) ** 4
P.draw()
```

Out[16]:

Infinitely many independent copies of a probability space can be created by raising the probability space to the `inf` power, i.e. `** inf`.

*Example.* Infinitely many independent Normal(0, 1) values.

In [17]:

```
P = Normal(mean=0, sd=1) ** inf
P.draw()
```

Out[17]:
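An infinite sequence cannot be stored in full, so its values have to be produced on demand. The generator below is a standard-library sketch of that idea only. It is an assumption made for illustration, not a description of how Symbulate implements `** inf`, and the function name `normal_stream` is hypothetical.

```python
import random
from itertools import islice


def normal_stream(mean=0.0, sd=1.0, seed=None):
    """Yield independent Normal(mean, sd) values one at a time, forever."""
    rng = random.Random(seed)
    while True:
        yield rng.gauss(mean, sd)


# Only the values actually requested are ever generated.
first_five = list(islice(normal_stream(seed=0), 5))
print(first_five)
```

Fixing `seed` makes the sketch reproducible; leaving it as `None` gives a different sequence on each run.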
