Lesson 8 Law of Total Probability

Motivating Example

You watch a magician place 4 ordinary quarters and 1 double-headed quarter into a box. If you select a coin from the box at random and toss it, what is the probability that it lands heads?

Theory

In some situations, calculating the probability of an event is easy, once you condition on the right information. For example, in the example above, if we knew that the coin we chose was ordinary, then: \[ P(\text{heads} | \text{ordinary}) = \frac{1}{2}. \] On the other hand, if the coin we chose was double-headed, then \[ P(\text{heads} | \text{double-headed}) = 1. \]

How do we combine these conditional probabilities to come up with \(P(\text{heads})\), the overall probability that the coin lands heads?

Theorem 8.1 (Law of Total Probability) Let \(A_1, ..., A_n\) be a partition of the possible outcomes. Then: \[ P(B) = \sum_{i=1}^n P(A_i) P(B | A_i). \]

A partition is a collection of non-overlapping events that cover all the possible outcomes. For example, in the example above, \(A_1 = \{ \text{ordinary}\}\) and \(A_2 = \{ \text{double-headed} \}\) is a partition, since the coin that we selected has to be one of the two and cannot be both.

Applying the Law of Total Probability to this problem, we have \[\begin{align*} P(\text{heads}) &= P(\text{ordinary}) P(\text{heads} | \text{ordinary}) + P(\text{double-headed}) P(\text{heads} | \text{double-headed}) \\ &= 0.8 \cdot \frac{1}{2} + 0.2 \cdot 1 \\ &= 0.6 \end{align*}\]

So the overall probability \(P(B)\) is just a weighted average of the conditional probabilities \(P(B | A_i)\), where the “weights” are \(P(A_i)\). (Note that these “weights” have to add up to 1, since \(A_1, ..., A_n\) are a partition of all the possible outcomes, whose total probability is \(1.0\).) This means that the overall probability \(P(B)\) will always lie somewhere between the conditional probabilities \(P(B | A_i)\), with more weight given to the more probable scenarios. For example, we could have predicted that \(P(\text{heads})\) would lie somewhere between \(\frac{1}{2}\) and \(1\); the fact that it is much closer to the former is because choosing the ordinary coin was more probable.

Examples

  1. Here are some things we already know about a deck of cards:
    • The top card in a shuffled deck of cards has a \(13/52\) chance of being a diamond.
    • If the top card is a diamond, then the second card has a \(12/51\) chance of being a diamond.
    • If the top card is not a diamond, then the second card has a \(13/51\) chance of being a diamond.

    Now, suppose we “burn” (i.e., discard) the top card without looking at it. What is the probability that the second card is a diamond? Use the Law of Total Probability, conditioning on the top card. Does burning cards affect probabilities?

  2. Anny is a fan of chess competitor Hikaru Nakamura, and tomorrow is the World Chess Championship. She is superstitious and believes that the weather influences how he will perform. Hikaru has a 60% chance of winning if it rains, a 25% chance if it is cloudy, and a 10% chance if it is sunny. Anny checks the weather the night before, and the forecast says that the chance of rain tomorrow is 40%; otherwise, it is equally likely to be cloudy as sunny. What is the probability that Hikaru wins the World Chess Championship?