24  Expectations Involving Multiple Random Variables

In Chapter 23, we learned that the distribution of multiple continuous random variables can be described completely by their joint PDF. However, the joint PDF contains more information than most problems require. In this chapter, we will summarize random variables by calculating expectations of the form \(\text{E}[ g(X, Y) ]\). All of the results in this chapter are analogous to the results in Chapter 14 for discrete random variables, except with PDFs instead of PMFs and integrals instead of sums.

24.1 2D LotUS

The general tool for calculating expectations of the form \(\text{E}[ g(X, Y) ]\) is 2D LotUS. It is the natural two-variable generalization of LotUS (Theorem 21.1) and the continuous analog of the discrete version (Theorem 14.1).

Theorem 24.1 (2D LotUS) Let \(X\) and \(Y\) be random variables with joint PDF \(f_{X,Y}(x,y)\). Then, for a function \(g: \mathbb{R}^2 \to \mathbb{R}\), \[ \text{E}[ g(X,Y) ] = \int_{-\infty}^\infty \int_{-\infty}^\infty g(x,y) f_{X,Y}(x,y) \, dx \, dy. \tag{24.1}\]

The intuition is the same as Theorem 21.1; the only difference is that there are now two random variables. To calculate the expectation of \(g(X, Y)\), we weight the possible values of \(g(x, y)\) by the joint PDF \(f_{X, Y}(x, y)\).
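To make this weighting concrete, here is a small illustrative sketch (not from the text) that approximates Equation 24.1 by a Riemann sum on a grid. It uses independent \(\text{Uniform}(0, 1)\) random variables, so the joint PDF is 1 on the unit square, and \(g(x, y) = (x - y)^2\), whose exact expectation is \(1/6\).

```python
import numpy as np

# Approximate E[g(X, Y)] = double integral of g(x, y) f(x, y) dx dy
# by a Riemann sum over a grid of cell midpoints.
n = 2000
dx = dy = 1.0 / n
x = (np.arange(n) + 0.5) * dx   # midpoints in the x direction
y = (np.arange(n) + 0.5) * dy   # midpoints in the y direction
X, Y = np.meshgrid(x, y)

f = 1.0               # joint PDF of independent Uniform(0, 1)s on the unit square
g = (X - Y) ** 2      # the function whose expectation we want

approx = np.sum(g * f * dx * dy)
print(approx)         # ~ 0.16667, the exact value is 1/6
```

Shrinking the grid spacing drives the sum toward the exact integral; 2D LotUS is simply this weighted average taken in the limit.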

Example 24.1 (How far apart do Harry and Sally arrive?) In Example 23.1, we calculated the probability that Harry and Sally meet, \(P(|X - Y| < 15)\), if they each wait 15 minutes for the other. Instead, we can ask how far apart they arrive on average, \(\text{E}[ |X - Y| ]\).

By 2D LotUS (Theorem 24.1), this expectation is \[ \begin{aligned} \text{E}[ |X - Y| ] &= \int_{-\infty}^\infty \int_{-\infty}^\infty |x - y| f_{X, Y}(x, y) \, dx \, dy \\ &= \int_0^{30} \int_0^{60} |x - y| \cdot \frac{1}{1800}\,dy\,dx, \end{aligned} \] since \(X \sim \text{Uniform}(0, 30)\) and \(Y \sim \text{Uniform}(0, 60)\) are independent, so the joint PDF is \(f_{X, Y}(x, y) = \frac{1}{1800}\) on the rectangle \(0 < x < 30\), \(0 < y < 60\).

This integral requires some finesse to evaluate because of the absolute value. We will break the integral up into two integrals over different regions:

  • \(S_- = \{ (x, y): x - y < 0 \}\)
  • \(S_+ = \{ (x, y): x - y \geq 0 \}\)

in order to eliminate the absolute value. These two regions are shown in Figure 24.1.

Figure 24.1: Breaking up the support into two regions \(S_-\) and \(S_+\).

Now, we can write \(|x - y|\) as \(x - y\) on \(S_+\) and \(y - x\) on \(S_-\).

\[ \begin{aligned} \text{E}[ |X - Y| ] &= \iint_{S_+} (x - y) \frac{1}{1800} \,dy\,dx + \iint_{S_-} (y-x) \frac{1}{1800} \,dy\,dx \\ &= \frac{1}{1800} \int_0^{30} \underbrace{\int_0^{x} (x - y) \,dy}_{xy - \frac{y^2}{2}\Big|_{y=0}^{y=x}} \,dx + \frac{1}{1800} \int_0^{30} \underbrace{\int_{x}^{60} (y - x) \,dy}_{\frac{y^2}{2} - xy\Big|_{y=x}^{y=60}} \,dx \\ &= \frac{1}{1800} \int_0^{30} \frac{x^2}{2}\,dx + \frac{1}{1800} \int_0^{30} \left( 1800 - 60x + \frac{x^2}{2} \right) dx \\ &= 2.5 + 17.5 \\ &= 20. \end{aligned} \]

Therefore, Harry and Sally arrive 20 minutes apart, on average.
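As a quick sanity check (an illustrative simulation, not part of the derivation), we can estimate this expectation by drawing many independent arrival times and averaging the absolute differences:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Harry arrives Uniform(0, 30), Sally arrives Uniform(0, 60),
# independently, measured in minutes after noon.
x = rng.uniform(0, 30, size=n)
y = rng.uniform(0, 60, size=n)

print(np.mean(np.abs(x - y)))   # ~ 20
```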

Because Equation 24.1 is often cumbersome to evaluate, 2D LotUS is a tool of last resort. The remainder of this chapter is devoted to shortcuts for specific functions \(g(x, y)\) that allow us to avoid 2D LotUS. When in doubt, though, remember that 2D LotUS is always an option.

24.2 Linearity of Expectation

When \(g(x, y)\) is a linear function, there is a remarkable simplification.

Theorem 24.2 (Linearity of Expectation) Let \(X\) and \(Y\) be random variables. Then, \[ \text{E}[ X + Y ] = \text{E}[ X ] + \text{E}[ Y ]. \]

Proof

Using 2D LotUS with \(g(x,y) = x + y\), we see that \[\begin{align*} \text{E}[ X + Y ] &= \int_{-\infty}^\infty \int_{-\infty}^\infty (x+y) \, f_{X,Y}(x,y) \, dx \, dy \\ &= \int_{-\infty}^\infty \int_{-\infty}^\infty x f_{X,Y}(x,y) \, dx \, dy + \int_{-\infty}^\infty \int_{-\infty}^\infty y f_{X,Y}(x,y) \, dx \, dy & (\text{split the integrand}) \\ &= \int_{-\infty}^\infty \int_{-\infty}^\infty xf_{X,Y}(x,y) \, dy \, dx + \int_{-\infty}^\infty \int_{-\infty}^\infty y f_{X,Y}(x,y) \, dx \, dy & (\text{swap order of integration in the first term}) \\ &= \int_{-\infty}^\infty x \int_{-\infty}^\infty f_{X,Y}(x,y) \, dy \, dx + \int_{-\infty}^\infty y \int_{-\infty}^\infty f_{X,Y}(x,y) \, dx \, dy & (\text{pull out constants}) \\ &= \int_{-\infty}^\infty x f_X(x) \, dx + \int_{-\infty}^\infty y f_Y(y) \, dy & (\text{marginal PDFs}) \\ &= \text{E}[ X ] + \text{E}[ Y ]. \end{align*}\]

This result is more remarkable than it appears. It says that \(\text{E}[ X + Y ]\), which depends in principle on the joint distribution of \(X\) and \(Y\), can be calculated using only the distribution of \(X\) and the distribution of \(Y\) individually. That is, no matter how \(X\) and \(Y\) are related to each other, \(\text{E}[ X + Y ]\) is the same value.
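To see this claim in action, here is an illustrative simulation using the same \(\text{Uniform}(0, 30)\) and \(\text{Uniform}(0, 60)\) marginals as in the Harry and Sally example. In one scenario \(Y\) is independent of \(X\); in the other, \(Y = 2X\) is completely determined by \(X\) (but still \(\text{Uniform}(0, 60)\)). The average of \(X + Y\) is about 45 either way:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Same marginals in both scenarios: X ~ Uniform(0, 30), Y ~ Uniform(0, 60).
x = rng.uniform(0, 30, size=n)

y_indep = rng.uniform(0, 60, size=n)   # Y independent of X
y_dep = 2 * x                          # Y completely determined by X

print(np.mean(x + y_indep))   # ~ 45
print(np.mean(x + y_dep))     # ~ 45
```

Only the marginal distributions enter \(\text{E}[ X + Y ]\); the dependence between \(X\) and \(Y\) is irrelevant.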

By cleverly applying linearity of expectation, we can solve Example 24.1 without any double integrals!

Example 24.2 (How far apart do Harry and Sally arrive? (Linearity Version)) In Example 24.1, we calculated \(\text{E}[ |X - Y| ]\) using 2D LotUS. Of course, \(g(x, y) = |x - y|\) is not a linear function of \(x\) and \(y\), so we cannot apply linearity of expectation directly.

However, we can express the absolute difference \(|X - Y|\) as \(M - L\), where

  • \(L \overset{\text{def}}{=}\min(X, Y)\) is the lesser of the two numbers (i.e., the time the first person arrives), and
  • \(M \overset{\text{def}}{=}\max(X, Y)\) is the greater of the two numbers (i.e., the time the second person arrives).

Therefore, by linearity of expectation (Theorem 24.2),

\[ \text{E}[ |X - Y| ] = \text{E}[ M - L ] = \text{E}[ M ] - \text{E}[ L ], \] so we can evaluate the expectation by evaluating the expectations of \(M\) and \(L\) individually.

Determining \(\text{E}[ M ]\) or \(\text{E}[ L ]\) requires some work. To calculate \(\text{E}[ M ]\), we first need the distribution of \(M\). We will use the strategy from Section 20.1 and calculate its CDF first. The key trick is that the event \(\{ M \leq m \}\) occurs if and only if both \(X \leq m\) and \(Y \leq m\); because \(X\) and \(Y\) are independent, the probability of this joint event factors.

\[ \begin{align} F_M(m) &= P(M \leq m) & \text{(definition of CDF)} \\ &= P(X \leq m, Y \leq m) & \text{(equivalent events)} \\ &= P(X \leq m) P(Y \leq m) & \text{(independence of $X$ and $Y$)} \\ &= \begin{cases} 0 & m \leq 0 \\ \frac{m}{30} \cdot \frac{m}{60} & 0 < m < 30 \\ \frac{m}{60} & 30 \leq m < 60 \\ 1 & 60 \leq m \end{cases}. & \text{(CDF of uniform distribution)} \\ \end{align} \]

The PDF of \(M\) is the derivative of the CDF. \[ f_M(m) = \begin{cases} \frac{m}{900} & 0 < m < 30 \\ \frac{1}{60} & 30 \leq m < 60 \\ 0 & \text{otherwise} \end{cases}. \tag{24.2}\]

The PDF Equation 24.2 is graphed in Figure 24.2. This is not one of the named distributions, but its shape makes sense: more of the probability is concentrated at the higher end, because \(M\) represents the time that the second person arrives.

Figure 24.2: PDF of the maximum \(M\).

Now that we have the PDF of \(M\), we can calculate \(\text{E}[ M ]\) by integrating: \[ \text{E}[ M ] = \int_{-\infty}^\infty m f_M(m)\,dm = \int_0^{30} m \cdot \frac{m}{900}\,dm + \int_{30}^{60} m \cdot \frac{1}{60}\,dm = 10 + 22.5 = 32.5.\]
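The derivation can be checked by simulation (illustrative, with the same independent uniform arrival times as before): the empirical CDF of \(M = \max(X, Y)\) should track the piecewise formula above, and the sample mean should be near 32.5.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

x = rng.uniform(0, 30, size=n)
y = rng.uniform(0, 60, size=n)
m = np.maximum(x, y)

# Compare the empirical CDF of M with the derived formula at a few points.
for t in [15, 45]:
    empirical = np.mean(m <= t)
    exact = t**2 / 1800 if t < 30 else t / 60
    print(t, empirical, exact)   # 15: ~ 0.125, 45: ~ 0.75

print(np.mean(m))                # ~ 32.5
```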

To calculate \(\text{E}[ L ]\), we could follow the same process. (Exercise 24.1 asks you to work out the details.) However, once we know one of \(\text{E}[ M ]\) or \(\text{E}[ L ]\), we can determine the other using linearity of expectation.

Here’s how. First, observe that \(M + L\) is simply the sum of \(X\) and \(Y\) in some order; that is, \(M + L = X + Y\). Therefore, \[ \begin{aligned} \text{E}[ M + L ] &= \text{E}[ X + Y ] \\ &= \text{E}[ X ] + \text{E}[ Y ] & \text{(linearity of expectation)} \\ &= \frac{0 + 30}{2} + \frac{0 + 60}{2} & \text{(expectation of a uniform)} \\ &= 45 \end{aligned} \tag{24.3}\]

But again by linearity of expectation, \(\text{E}[ M + L ] = \text{E}[ M ] + \text{E}[ L ]\), so Equation 24.3 tells us that \[ \text{E}[ M ] + \text{E}[ L ] = 45. \] Since we already know that \(\text{E}[ M ] = 32.5\), we can solve for the other expectation: \(\text{E}[ L ] = 12.5\).

Now we can simply subtract the two values to obtain \[ \text{E}[ |X - Y| ] = \text{E}[ M - L ] = \text{E}[ M ] - \text{E}[ L ] = 32.5 - 12.5 = 20, \] which is the same answer as in Example 24.1 without any double integrals!
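For a purely numerical recap of this argument (an illustrative sketch using SciPy, not part of the text's derivation), we can integrate \(m \, f_M(m)\) by quadrature, recover \(\text{E}[ L ]\) from \(\text{E}[ M ] + \text{E}[ L ] = 45\), and take the difference:

```python
from scipy.integrate import quad

# PDF of M = max(X, Y), from Equation 24.2.
def f_M(m):
    if 0 < m < 30:
        return m / 900
    elif 30 <= m < 60:
        return 1 / 60
    else:
        return 0.0

# E[M] = integral of m * f_M(m); `points` flags the kink at m = 30.
E_M, _ = quad(lambda m: m * f_M(m), 0, 60, points=[30])
E_L = 45 - E_M                # linearity: E[M] + E[L] = E[X] + E[Y] = 45
print(E_M, E_L, E_M - E_L)    # ~ 32.5, 12.5, 20.0
```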

24.3 Expectation of Products

When \(g(x, y) = xy\), evaluating \(\text{E}[ g(X, Y) ] = \text{E}[ XY ]\) requires 2D LotUS in general. However, when \(X\) and \(Y\) are independent, we can break up the expectation.

Theorem 24.3 (Expectation of a product of independent random variables) If \(X\) and \(Y\) are independent random variables, then \[ \text{E}[ XY ] = \text{E}[ X ] \text{E}[ Y ]. \] Moreover, for functions \(g\) and \(h\), \[ \text{E}[ g(X) h(Y) ] = \text{E}[ g(X) ] \text{E}[ h(Y) ]. \]

Proof

Using 2D LotUS, we see that \[\begin{align*} \text{E}[ XY ] &= \int_{-\infty}^\infty \int_{-\infty}^\infty xy f_{X,Y}(x,y) \, dx \, dy \\ &= \int_{-\infty}^\infty \int_{-\infty}^\infty xy f_X(x) f_Y(y) \, dx \, dy & (\text{$X$ and $Y$ are independent}) \\ &= \int_{-\infty}^\infty y f_Y(y) \left[\int_{-\infty}^\infty x f_X(x) \, dx \right] \, dy & (\text{pull out constants with respect to $x$}) \\ &= \left[\int_{-\infty}^\infty x f_X(x) \, dx \right] \int_{-\infty}^\infty y f_Y(y) \, dy & (\text{pull out constants with respect to $y$}) \\ &= \text{E}[ X ] \text{E}[ Y ]. \end{align*}\]

The proof of the second part is similar.
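A brief illustrative simulation, using the independent \(\text{Uniform}(0, 30)\) and \(\text{Uniform}(0, 60)\) arrival times from the running example, confirms that the expectation of the product matches the product of the expectations:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

x = rng.uniform(0, 30, size=n)   # E[X] = 15
y = rng.uniform(0, 60, size=n)   # E[Y] = 30

print(np.mean(x * y))            # ~ 450 = E[X] E[Y]
print(np.mean(x) * np.mean(y))   # ~ 450
```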

Example 24.3 (Expected Product) Let \(X\) and \(Y\) be the times that Harry and Sally arrive, as in Example 23.1. Let \(L\) and \(M\) be the times that the first and second person arrive, as in Example 24.2.

What is \(\text{E}[ LM ]\), the expected product of the time that the first person arrives and the time the second person arrives?

We cannot apply Theorem 24.3 directly because \(L\) and \(M\) are not independent. If we know that the first person arrived at \(L = 12\) minutes past noon, then the second person cannot have arrived before that.

However, we can observe that \(LM\) is simply the product of Harry’s time \(X\) and Sally’s time \(Y\) in some order. That is, \(LM = XY\), where \(X\) and \(Y\) are independent. Therefore, \[ \text{E}[ LM ] = \text{E}[ XY ] = \text{E}[ X ] \text{E}[ Y ] = (15)(30) = 450. \]

On the other hand, we calculated \(\text{E}[ L ]\) and \(\text{E}[ M ]\) in Example 24.2, and \[ \text{E}[ L ] \text{E}[ M ] = (12.5) (32.5) = 406.25 \neq \text{E}[ LM ]. \]
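The contrast is easy to see in an illustrative simulation: since \(LM = XY\) draw by draw, the sample mean of \(LM\) lands near 450, while the product of the sample means of \(L\) and \(M\) lands near 406.25.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

x = rng.uniform(0, 30, size=n)
y = rng.uniform(0, 60, size=n)
l = np.minimum(x, y)
m = np.maximum(x, y)

print(np.mean(l * m))            # ~ 450, since L M = X Y
print(np.mean(l) * np.mean(m))   # ~ 406.25, not the same!
```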

Why should we care about \(\text{E}[ LM ]\), the expected product of the times that the first person and the second person arrive? It turns out to be useful for summarizing the relationship between \(L\) and \(M\). We take up this issue in Chapter 25.

24.4 Exercises

Exercise 24.1 In Example 24.2, we derived the PDF of \(M = \max(X, Y)\), the time the second person arrives. Using a similar argument, derive the PDF of \(L = \min(X, Y)\), the time the first person arrives. Then, using this PDF, calculate \(\text{E}[ L ]\), and check that it matches the answer we obtained in Example 24.2.

Hint: When calculating the CDF of \(L\), it helps to use the complement rule.

Exercise 24.2 Let \(X \sim \textrm{Exponential}(\lambda=\lambda_X)\) and \(Y \sim \textrm{Exponential}(\lambda=\lambda_Y)\) be independent random variables. Derive the distribution of \(L \overset{\text{def}}{=}\min(X, Y)\). It is one of the named distributions that we learned. Then, use this fact to derive \(\text{E}[ M ]\), where \(M \overset{\text{def}}{=}\max(X, Y)\), without any calculus.