Lesson 24 LOTUS

Motivating Example

In Lesson 23, we analyzed the St. Petersburg Paradox. There, we calculated the expected amount we win, \(E[W]\), by first deriving the p.m.f. of \(W\).

However, we know that the amount we win, \(W\), is related to the number of tosses, \(N\), by \[ W = 2^{N-1}. \] Furthermore, we know that \(N\) follows a \(\text{Geometric}(p=1/2)\) distribution. Can we calculate the expected amount we win, \(E[W] = E[2^{N-1}]\), from the p.m.f. of \(N\), without deriving the p.m.f. of \(W\)?


In this lesson, we will learn how to calculate expected values of functions of random variables. That is, we will calculate expected values of the form \(E[g(X)]\). There are two ways to do this:

  1. Calculate the p.m.f. of \(Y = g(X)\), then calculate \(E[Y]\) from the usual formula (22.1).
  2. Use the Law of the Unconscious Statistician (LOTUS), described below.

Theorem 24.1 (LOTUS) Let \(X\) be a random variable with p.m.f. \(f_X(x)\). Define \(Y = g(X)\) for some function \(g\). Then, \(E[Y] = E[g(X)]\) is \[\begin{equation} E[g(X)] = \sum_x g(x) \cdot f_X(x). \tag{24.1} \end{equation}\]

Theorem 24.1 allows us to calculate the expected value of \(Y = g(X)\), without first finding its distribution. Instead, we can just use the known distribution of \(X\).

This result is called the “Law of the Unconscious Statistician” because many people apply it intuitively, without realizing that it is a theorem that needs to be proved. Remember that \(E[g(X)]\) represents the “average” value of \(g(X)\). To calculate the average value of \(g(X)\), it makes sense to take a weighted average of the possible values \(g(x)\), where the weights are the probabilities \(f_X(x)\).
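The weighted average in (24.1) can be sketched in a few lines of Python. This is only an illustration: the helper name `lotus`, the dictionary representation of a p.m.f., and the fair-die example are assumptions made here, not part of the lesson.

```python
def lotus(g, pmf):
    """Return E[g(X)] = sum over x of g(x) * f_X(x), as in (24.1).

    `pmf` is a dict mapping each possible value x to its probability f_X(x).
    """
    return sum(g(x) * p for x, p in pmf.items())


# Hypothetical example (not from the lesson): X is the roll of a fair die.
die_pmf = {x: 1 / 6 for x in range(1, 7)}

expected_roll = lotus(lambda x: x, die_pmf)         # E[X], about 3.5
expected_square = lotus(lambda x: x ** 2, die_pmf)  # E[X^2], about 91/6

print(expected_roll)
print(expected_square)
```

Note that the p.m.f. of \(g(X)\) is never constructed; only the p.m.f. of \(X\) is used.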

Let’s start with a simple example where \(E[g(X)]\) is easy to calculate, to understand why LOTUS works.

Example 24.1 (Random Circle) We toss a fair coin twice. Let \(X\) be the number of heads. Then, the p.m.f. of \(X\) is \[ \begin{array}{r|ccc} x & 0 & 1 & 2 \\ \hline f_X(x) & .25 & .50 & .25 \end{array} \]

Now, suppose we sketch a circle whose radius is \(X\) (in feet), the random number we just generated by tossing the coin. Then, the area of this circle is a random variable \(A = \pi X^2\) (in square feet).

What is the expected area \(E[A] = E[\pi X^2]\)? Clearly, the only possible values of \(\pi X^2\) are \[\begin{align*} \pi \cdot 0^2 &= 0, & \pi \cdot 1^2 &= \pi, & \text{ and } \pi \cdot 2^2 &= 4\pi, \end{align*}\] and their probabilities are just the probabilities of \(0\), \(1\), and \(2\), respectively. That is, the p.m.f. of \(A\) is \[ \begin{array}{r|ccc} a & \pi \cdot 0^2 & \pi \cdot 1^2 & \pi\cdot 2^2 \\ \hline f_A(a) & .25 & .50 & .25 \end{array} \] Therefore, the expected area must be \[\begin{align} E[\pi X^2] &= (\pi \cdot 0^2) \cdot .25 + (\pi \cdot 1^2) \cdot .50 + (\pi \cdot 2^2) \cdot .25 \\ &= 1.5 \pi. \tag{24.2} \end{align}\] Notice that we weighted the values of \(g(x) = \pi x^2\) by the p.m.f. of \(X\) to calculate \(E[g(X)]\). This is exactly what LOTUS (Theorem 24.1) said we should do!

Notice that we get a different answer if we first evaluate the expected radius and then calculate the area of a circle with that radius: \[ \pi E[X]^2 = \pi \cdot 1^2 = \pi. \] An average circle is not the same as a circle with an average radius!

In general, \(E[g(X)] \neq g(E[X])\). To calculate \(E[g(X)]\), you apply \(g\) to each possible value of \(X\) and then average, as LOTUS prescribes. To calculate \(g(E[X])\), you first calculate the expected value \(E[X]\), then apply the function \(g\) to the result.
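The calculations in Example 24.1 can be reproduced numerically. The sketch below computes the expected area in both ways described in this lesson (through the p.m.f. of \(A\), and through LOTUS) and compares it with the area of a circle whose radius is \(E[X]\). The variable names are chosen here for illustration.

```python
import math
from collections import defaultdict

# p.m.f. of X, the number of heads in two tosses of a fair coin (Example 24.1).
pmf_x = {0: 0.25, 1: 0.50, 2: 0.25}


def area(x):
    """Area of a circle with radius x."""
    return math.pi * x ** 2


# Method 1: build the p.m.f. of A = pi * X^2, then take its expected value.
pmf_a = defaultdict(float)
for x, p in pmf_x.items():
    pmf_a[area(x)] += p
expected_area_direct = sum(a * p for a, p in pmf_a.items())

# Method 2: LOTUS. Weight g(x) = pi * x^2 by the p.m.f. of X.
expected_area_lotus = sum(area(x) * p for x, p in pmf_x.items())

# For comparison: the area of a circle whose radius is E[X].
expected_radius = sum(x * p for x, p in pmf_x.items())
area_of_expected_radius = area(expected_radius)

print(expected_area_direct / math.pi)     # about 1.5
print(expected_area_lotus / math.pi)      # about 1.5
print(area_of_expected_radius / math.pi)  # about 1.0
```

The first two numbers agree, as LOTUS promises, while the third is smaller.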

Now let’s look at the question posed at the beginning of the lesson.

Example 24.2 (St. Petersburg Paradox Revisited) We know that the p.m.f. of \(N\), a \(\text{Geometric}(p=0.5)\) random variable, is \[ f_N(n) = (1-0.5)^{n-1} 0.5 = 0.5^n. \] By LOTUS, \[\begin{align*} E[2^{N-1}] &= \sum_{n=1}^\infty 2^{n-1} \cdot (0.5)^n \\ &= \sum_{n=1}^\infty (0.5) \\ &= 0.5 + 0.5 + 0.5 + \ldots \\ &= \infty, \end{align*}\] which matches the answer we got in Lesson 23.
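A computer cannot add infinitely many terms, but it can show how the partial sums of this series behave. In the sketch below, the function name `partial_sum` and the cutoffs 10, 100, and 1000 are arbitrary choices made for illustration.

```python
def partial_sum(k):
    """Sum of the first k terms of the LOTUS series for E[2^(N-1)].

    Each term is 2^(n-1) * 0.5^n = 0.5, so the partial sum equals k / 2.
    """
    return sum(2 ** (n - 1) * 0.5 ** n for n in range(1, k + 1))


# k is kept at 1000 or below so that every factor stays within floating-point range.
for k in (10, 100, 1000):
    print(k, partial_sum(k))
```

The partial sums grow in proportion to the number of terms, with no upper bound, which is consistent with an infinite expected value.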

Here is a more complex application of LOTUS. This particular expected value may seem unmotivated, but it will come in handy later when we talk about variance.

Example 24.3 Let \(X\) be a \(\text{Binomial}(n, N_1, N_0)\) random variable, and write \(N = N_1 + N_0\). In Example 22.2, we showed that \(E[X] = n \frac{N_1}{N}\). Now, we calculate \(E[X(X-1)]\) by applying LOTUS (24.1) to the binomial p.m.f. \[ f(x) = \binom{n}{x} \frac{N_1^x N_0^{n-x}}{N^n}, x=0, 1, \ldots, n. \] \[\begin{align*} E[X(X-1)] &= \sum_{x=0}^n x (x-1) \cdot \binom{n}{x} \frac{N_1^x N_0^{n-x}}{N^n} \\ &= \sum_{x=2}^n x(x-1) \cdot \binom{n}{x} \frac{N_1^x N_0^{n-x}}{N^n}, \end{align*}\] where the only change from line 1 to line 2 was to start the sum at \(x=2\) instead of \(x=0\). (We can do this because the summand is 0 when \(x=0\) and \(x=1\). Try plugging in \(x=0\) and \(x=1\) if you do not see this.)

Next, we replace \(x(x-1) \cdot \binom{n}{x}\) using the combinatorial identity \[ x(x-1) \binom{n}{x} = n (n-1) \binom{n-2}{x-2}. \] Here is a story proof of this identity: imagine selecting a committee of \(x\) people from \(n\), where one person is the chair and another person is the vice-chair. We can either:

  1. select the committee first (\(\binom{n}{x}\)) and then select the chair and vice-chair (\(x(x-1)\)), or
  2. select the chair and vice-chair first (\(n(n-1)\)) and then select the rest of the committee (\(\binom{n-2}{x-2}\)).

Since these are two equivalent methods of selecting a committee with a chair and a vice-chair, the two expressions must be equal.
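A story proof is a complete argument, but a quick numerical check can still be reassuring. The sketch below tests the identity for every \(2 \leq x \leq n < 30\); the helper name `identity_holds` and the range of \(n\) are arbitrary choices made here.

```python
from math import comb


def identity_holds(n, x):
    """Check x(x-1) * C(n, x) == n(n-1) * C(n-2, x-2) with exact integers."""
    return x * (x - 1) * comb(n, x) == n * (n - 1) * comb(n - 2, x - 2)


# Test all committee sizes x = 2, ..., n for each n = 2, ..., 29.
all_ok = all(identity_holds(n, x) for n in range(2, 30) for x in range(2, n + 1))
print(all_ok)  # True
```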

Substituting this identity into the sum, we obtain \[\begin{align*} E[X(X-1)] &= \sum_{x=2}^n n (n-1) \binom{n-2}{x-2} \frac{N_1^x N_0^{n-x}}{N^n} \\ &= n(n-1)\sum_{x=2}^n \binom{n-2}{x-2} \frac{N_1^x N_0^{n-x}}{N^n} & (\text{pull $n(n-1)$ outside the sum}) \\ &= n(n-1) \sum_{x'=0}^{n-2} \binom{n-2}{x'} \frac{N_1^{x' + 2} N_0^{n - 2 - x'}}{N^n} & (\text{apply substitution $x' = x - 2$}) \\ &= n(n-1) \frac{N_1^2}{N^2} \sum_{x'=0}^{n-2} \underbrace{\binom{n-2}{x'} \frac{N_1^{x'} N_0^{n - 2 - x'}}{N^{n-2}}}_{\text{p.m.f. of $\text{Binomial}(n-2, N_1, N_0)$}} & (\text{pull factors of $N_1$ and $N$ outside the sum}) \\ &= n(n-1) \frac{N_1^2}{N^2} & (\text{sum of p.m.f. over all possible values is 1}) \end{align*}\]
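The closed form \(n(n-1) \frac{N_1^2}{N^2}\) can be checked against a direct evaluation of the LOTUS sum. In the sketch below, the parameter values \(n = 10\), \(N_1 = 3\), \(N_0 = 7\) are hypothetical, the function names are chosen here for illustration, and the code assumes \(N = N_1 + N_0\).

```python
from math import comb, isclose


def binomial_pmf(x, n, n1, n0):
    """Binomial p.m.f. in the (n, N_1, N_0) parametrization, assuming N = N_1 + N_0."""
    total = n1 + n0
    return comb(n, x) * n1 ** x * n0 ** (n - x) / total ** n


def expected_x_times_x_minus_1(n, n1, n0):
    """E[X(X-1)] computed by LOTUS: sum of x(x-1) * f(x) over x = 0, ..., n."""
    return sum(x * (x - 1) * binomial_pmf(x, n, n1, n0) for x in range(n + 1))


# Hypothetical parameter values, chosen only for illustration.
n, n1, n0 = 10, 3, 7

lotus_value = expected_x_times_x_minus_1(n, n1, n0)
closed_form = n * (n - 1) * n1 ** 2 / (n1 + n0) ** 2  # 90 * 9 / 100 = 8.1

print(lotus_value, closed_form)
print(isclose(lotus_value, closed_form))  # True
```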

Essential Practice

  1. Suppose we generate a random length \(L\) (in inches) from the p.m.f.

    \[ \begin{array}{r|ccc} \ell & 1 & 2 & 3 \\ \hline f(\ell) & .2 & .5 & .3 \end{array} \]

    and draw a square with that side length. Calculate \(E[L]^2\) and \(E[L^2]\). Are they the same? Which one represents the expected area of the square we drew?

  2. Let \(X\) be a \(\text{Poisson}(\mu)\) random variable. Calculate \(E[X(X-1)]\).

  3. Let \(X\) be a \(\text{Geometric}(p)\) random variable. Let \(t\) be a constant. Calculate \(M(t) = E[e^{tX}]\) as a function of \(t\). Statisticians call this the moment generating function of \(X\), while engineers may recognize this function as the Laplace transform of the p.m.f. of \(X\).

Additional Exercises

  1. Another resolution to the St. Petersburg Paradox is to consider expected utility \(E[U]\) rather than expected wealth \(E[W]\). (“Utility” is the term that economists use for “happiness”.) Because of diminishing marginal utility, the first million dollars is worth more than the next million dollars. One way to model diminishing marginal utility is to assume that \(U = \log(W)\). Show that the expected utility of the St. Petersburg game is finite, even though the expected winnings are infinite.

  2. Let \(X\) be a \(\text{Hypergeometric}(n, N_1, N_0)\) random variable. Calculate \(E[X(X-1)]\).

  3. Let \(X\) be a \(\text{Poisson}(\mu)\) random variable for \(0 < \mu < 1\). Calculate \(E[X!]\).

  4. Let \(X\) be a \(\text{NegativeBinomial}(r, p)\) random variable. Calculate \(E[(X+1)X]\).