Lesson 21 Sums of Random Variables
Theory
Let \(X\) and \(Y\) be random variables. What is the distribution of their sum—that is, the random variable \(T = X + Y\)?
In principle, we already know how to calculate this. To determine the distribution of \(T\), we need to calculate \[ f_T(t) \overset{\text{def}}{=} P(T = t) = P(X + Y = t), \] which we can do by summing the joint p.m.f. over the appropriate values: \[\begin{equation} \sum_{(x, y):\ x + y = t} f(x, y). \tag{21.1} \end{equation}\]
For example, to calculate the total number of bets that Xavier and Yolanda win, we calculate \(P(X + Y = t)\) for \(t = 0, 1, 2, \ldots, 8\). The probabilities that we would need to sum for \(t=4\) are highlighted in the joint p.m.f. table below: \[ \begin{array}{rr|cccc} & 5 & 0 & 0 & 0 & .0238 \\ & 4 & \fbox{0} & 0 & .0795 & .0530 \\ y & 3 & 0 & \fbox{.0883} & .1766 & .0294 \\ & 2 & .0327 & .1963 & \fbox{.0981} & 0 \\ & 1 & .0726 & .1090 & 0 & \fbox{0} \\ & 0 & .0404 & 0 & 0 & 0 \\ \hline & & 0 & 1 & 2 & 3\\ & & & & x \end{array}. \]
For a fixed value of \(t\), \(x\) determines the value of \(y\) (and vice versa). In particular, \(y = t - x\). So we can write (21.1) as a sum over \(x\): \[\begin{equation} f_T(t) = \sum_x f(x, t-x). \tag{21.2} \end{equation}\]
This is the general equation for the p.m.f. of the sum \(T\). If the random variables are independent, then we can actually say more.
Proof. We apply Theorem 21.1 to binomial p.m.f.s.
\[\begin{align*} f_T(t) &= \sum_{x=0}^t f_X(x) \cdot f_Y(t-x) \\ &= \sum_{x=0}^t \binom{n}{x} p^x (1-p)^{n-x} \cdot \binom{m}{t-x} p^{t-x} (1-p)^{m-(t-x)} \\ &= \sum_{x=0}^t \binom{n}{x} \binom{m}{t-x} p^t (1-p)^{n+m-t} \\ &= \binom{n+m}{t} p^t (1-p)^{n+m-t}, \end{align*}\] which is the p.m.f. of a \(\text{Binomial}(n + m, p)\) random variable.
In the last equality, we used the fact that \[\begin{equation} \sum_{x=0}^t \binom{n}{x} \binom{m}{t-x} = \binom{n+m}{t}. \tag{21.4} \end{equation}\] This equation is known as Vandermonde’s identity. One way to see it is to observe \[ \sum_{x=0}^t \frac{\binom{n}{x} \binom{m}{t-x}}{\binom{n+m}{t}} = 1, \] since we are summing the p.m.f. of a \(\text{Hypergeometric}(t, n, m)\) random variable over all of its possible values \(0, 1, 2, \ldots, t\). Now, if we multiply both sides of this equality by \[ \binom{n+m}{t}, \] we obtain Vandermonde’s identity (21.4).
However, we can see that the sum of two independent binomials must be binomial another way. \(X\) represents the number of \(\fbox{1}\)s in \(n\) draws with replacement from a box. \(Y\) represents the number of \(\fbox{1}\)s in \(m\) separate draws with replacement from the same box:
- The draws must be separate because we need \(X\) to be independent of \(Y\).
- We can use the same box because \(p\) (which corresponds to how many \(\fbox{1}\)s and \(\fbox{0}\)s there are in the box) is the same for \(X\) and \(Y\).
Essential Practice
Let \(X\) and \(Y\) be independent \(\text{Poisson}(\mu)\) and \(\text{Poisson}(\nu)\) random variables. Use convolution to find the distribution of \(X + Y\). (Hint: It is a named distribution.) Then, by making an analogy to a Poisson process, explain why this must be the distribution of \(X + Y\).
(The binomial theorem will come in handy: \(\sum_{x=0}^n \binom{n}{x} a^x b^{n-x} = (a + b)^n\).)
Let \(X\) and \(Y\) be independent \(\text{Geometric}(p)\) random variables. Use convolution to find the distribution of \(X + Y\). (Hint: It is a named distribution. It may help to remember that \(\sum_{i=1}^m 1 = m = \binom{m}{1}\).) Then, by making an analogy to a box model, explain why this has to be the distribution of \(X + Y\).
Give an example of two \(\text{Binomial}(n=3, p=0.5)\) random variables \(X\) and \(Y\), where \(T = X + Y\) does not follow a \(\text{Binomial}(n=6, p=0.5)\) distribution. Why does this not contradict Theorem 21.2?
Additional Exercises
- Let \(X\) and \(Y\) be independent random variables with the p.m.f.
\(x\) | 1 | 2 | 3 | 4 | 5 | 6 |
---|---|---|---|---|---|---|
\(f(x)\) | \(1/6\) | \(1/6\) | \(1/6\) | \(1/6\) | \(1/6\) | \(1/6\) |
Use convolution to find the p.m.f. of \(T = X + Y\). Why does the answer make sense? (Hint: \(X\) and \(Y\) represent the outcomes when you roll two fair dice.)