33  Convolutions

In Chapter 31 and Chapter 32, we discussed how an estimator \(\hat\theta\) can be viewed as a random variable. Its expectation \(\text{E}\!\left[ \hat\theta \right]\) and variance \(\text{Var}\!\left[ \hat\theta \right]\) are useful for evaluating the estimator (e.g., bias, variance, MSE). In the coming chapters, we will discuss various strategies for calculating the distributions of common estimators, such as

  1. \(\bar X\) (Example 30.3, Example 32.4)
  2. \(\max(X_1, \dots, X_n)\) (Section 31.1)
  3. median of \(X_1, \dots, X_n\) (Exercise 30.6)

In the next few chapters, we will focus on the distribution of the sample mean \(\bar X\) of i.i.d. random variables. Since \[ \bar{X} = \frac{1}{n} \sum_{i=1}^n X_i \] is simply a scaling of the sum \[ S_n = \sum_{i=1}^n X_i, \] the problem of determining the distribution of the sample mean is closely related to the problem of determining the distribution of a sum of random variables.

The process used to determine the distribution of a sum is called convolution. Convolution operates on two independent random variables at a time. That is, to determine the distribution of \(S_n = \sum_{i=1}^n X_i\),

  1. we first determine the distribution of \(S_2 = X_1 + X_2\),
  2. then determine the distribution of \(S_3 = (X_1 + X_2) + X_3 = S_2 + X_3\),
  3. then determine the distribution of \(S_4 = (X_1 + X_2 + X_3) + X_4 = S_3 + X_4\),

and so on. Once we know how to calculate the distribution of the sum of two independent random variables, we can calculate the distribution of the sum of any number of independent random variables.
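To make this iterative process concrete, here is a minimal sketch in Python (using NumPy and a hypothetical fair six-sided die as the common distribution); `np.convolve` carries out exactly the pairwise sum defined in the next section.

```python
import numpy as np

# Hypothetical example: PMF of a fair six-sided die on the values 1, ..., 6.
die = np.full(6, 1 / 6)

# Build the PMF of S_n = X_1 + ... + X_n one pairwise convolution at a time.
pmf = die                       # distribution of S_1 = X_1
for _ in range(3):              # fold in X_2, X_3, X_4
    pmf = np.convolve(pmf, die)

# pmf[k] is now P(S_4 = k + 4), since the smallest possible sum of four dice is 4.
print(pmf.sum())                # should be 1 (up to rounding)
print(pmf.argmax() + 4)         # most likely sum of four dice: 14
```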

33.1 Discrete Convolutions

Let \(X\) and \(Y\) be independent discrete random variables with PMFs \(f_X(x)\) and \(f_Y(y)\), respectively. The method for determining the PMF of \(S = X + Y\) is straightforward in theory but difficult in practice.

Since the PMF of \(S\) is \[ f_S(s) = P(S = s) = P(X + Y = s), \] we just need to evaluate this probability for all possible values of \(s\). When \(X\) and \(Y\) are both integer-valued, the event \(\left\{ X + Y = s \right\}\) can be expressed as a disjoint union over all pairs of \(X\)- and \(Y\)-values that sum to \(s\): \[ \cdots \cup \left\{ X = 0, Y = s \right\} \cup \left\{ X = 1, Y = s-1 \right\} \cup \left\{ X = 2, Y = s-2 \right\} \cup \cdots. \]

Therefore, the PMF of \(S\) is \[ \begin{align*} f_S(s) = P(X + Y = s) &= \sum_{x = -\infty}^\infty P(X = x, Y = s-x) \\ &= \sum_{x=-\infty}^\infty P(X = x) P(Y = s - x) \\ &= \sum_{x=-\infty}^\infty f_X(x) f_Y(s - x), \end{align*} \] where we used independence of \(X\) and \(Y\) in the second step.

This method is summarized in the following theorem.

Theorem 33.1 (Discrete convolution) Let \(X\) and \(Y\) be independent discrete random variables with PMFs \(f_X(x)\) and \(f_Y(y)\), respectively. Then, the PMF of \(S = X + Y\) is \[ f_S(s) = \sum_x f_X(x) f_Y(s - x). \tag{33.1}\]

\(f_S\) is said to be the discrete convolution of \(f_X\) and \(f_Y\) and is denoted \[ f_S = f_X * f_Y. \]
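Equation 33.1 translates directly into code. The following is a minimal sketch, assuming integer-valued random variables whose PMFs are stored as dictionaries; the function `convolve_pmf` is our own name, not a standard library routine.

```python
def convolve_pmf(f_X, f_Y):
    """Discrete convolution (Equation 33.1) of two PMFs given as {value: probability} dicts."""
    f_S = {}
    for x, p_x in f_X.items():
        for y, p_y in f_Y.items():
            # Each pair (x, y) contributes f_X(x) * f_Y(y) to the sum s = x + y.
            f_S[x + y] = f_S.get(x + y, 0.0) + p_x * p_y
    return f_S

# Example: sum of two independent Bernoulli(0.3) random variables.
bern = {0: 0.7, 1: 0.3}
print(convolve_pmf(bern, bern))   # approximately {0: 0.49, 1: 0.42, 2: 0.09}, the Binomial(2, 0.3) PMF
```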

We now apply Theorem 33.1 to an example.

Example 33.1 (Convolution of independent Geometrics) Let \(X\) and \(Y\) be i.i.d. \(\text{Geometric}(p)\) random variables. What is the distribution of \(S = X + Y\)?

Since the possible values of \(X\) and \(Y\) are \(1, 2, \dots\), the possible values of \(S\) are \(2, 3, \dots\). Fix an integer \(s \geq 2\). Then, in order for the sum of \(X\) and \(Y\) to be equal to \(s\), \(X\) must be at most \(s - 1\). \[ \begin{align*} f_S(s) &= \sum_{x=1}^{s-1} f_X(x) f_Y(s-x) \\ &= \sum_{x=1}^{s-1} \left( (1-p)^{x-1} p \right) \left( (1-p)^{s-x-1} p \right) \\ &= \sum_{x=1}^{s-1} (1-p)^{s-2} p^2 \\ &= (s-1) (1-p)^{s-2} p^2 \\ &= \binom{s - 1}{2 - 1} p^2 (1-p)^{s-2}. \end{align*} \]

This is the PMF of \(\text{NegativeBinomial}(2,p)\) (Definition 12.2), and so, \[ S = X + Y \sim \text{NegativeBinomial}(2,p). \] This is in line with our motivation of a negative binomial random variable as the sum of independent geometric random variables in Section 12.3.
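As a sanity check, we can compare the convolution of two Geometric(\(p\)) PMFs to the NegativeBinomial(2, \(p\)) PMF numerically. Below is a minimal sketch using SciPy; the truncation point is an arbitrary choice, and SciPy's `nbinom` counts failures before the final success rather than total trials, hence the shift by 2.

```python
import numpy as np
from scipy.stats import geom, nbinom

p = 0.3
k = np.arange(1, 200)                  # truncate the infinite support (arbitrary cutoff)
f = geom.pmf(k, p)                     # P(X = k) = (1 - p)**(k - 1) * p, for k = 1, 2, ...

f_S = np.convolve(f, f)                # PMF of S = X + Y on the values 2, 3, ...
s = np.arange(2, 2 + len(f_S))

# SciPy's negative binomial counts failures, so evaluate it at s - 2.
target = nbinom.pmf(s - 2, 2, p)
print(np.max(np.abs(f_S[:50] - target[:50])))   # essentially zero (truncation error only)
```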

We can apply Theorem 33.1 to independent random variables that are not identically distributed.

Example 33.2 (Convolution of independent Poissons) Let \(X \sim \text{Poisson}(\mu_1)\) and \(Y \sim \text{Poisson}(\mu_2)\) be independent. What is the distribution of \(S = X+Y\)?

The possible values of \(X\) and \(Y\) are \(0, 1, 2, \dots\), so the possible values of \(S\) are also \(0, 1, 2, \dots\). Fix an integer \(s \geq 0\). Then, in order for the sum of \(X\) and \(Y\) to be equal to \(s\), \(X\) cannot be greater than \(s\). \[ \begin{align*} f_S(s) &= \sum_{x=0}^s f_X(x) f_Y(s - x) \\ &= \sum_{x=0}^s \left( \frac{\mu_1^x e^{-\mu_1}}{x!} \right) \left( \frac{\mu_2^{s-x} e^{-\mu_2}}{(s-x)!} \right) \\ &= \sum_{x=0}^s e^{-\mu_1 - \mu_2} \frac{\mu_1^x \mu_2^{s-x}}{x!(s-x)!} \\ &= e^{-\mu_1 - \mu_2} \sum_{x=0}^s \frac{1}{x!(s-x)!} \mu_1^x \mu_2^{s-x} \\ &= e^{-\mu_1 - \mu_2} \sum_{x=0}^s \frac{s!}{x!(s-x)!} \frac{\mu_1^x \mu_2^{s-x}}{s!} \\ &= e^{-\mu_1 - \mu_2} \sum_{x=0}^s \binom{s}{x} \frac{\mu_1^x \mu_2^{s-x}}{s!} \\ &= \frac{e^{-\mu_1 - \mu_2}}{s!} \sum_{x=0}^s \binom{s}{x} \mu_1^x \mu_2^{s-x} \\ &= \frac{e^{-\mu_1 - \mu_2}}{s!} (\mu_1 + \mu_2)^s, \end{align*}\] which is the PMF of \(\text{Poisson}(\mu_1 + \mu_2)\). Thus, \[ S = X+Y \sim \text{Poisson}(\mu_1 + \mu_2). \]

In other words, the sum of two independent Poisson random variables is another Poisson random variable, with parameter equal to the sum of the two parameters. An immediate consequence of Example 33.2, applied repeatedly, is that if \(X_1, \dots, X_n\) are i.i.d. \(\text{Poisson}(\mu)\), then \[ X_1 + \cdots + X_n \sim \text{Poisson}(n \mu). \]
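The same kind of numerical check works here. A minimal sketch (the parameter values and truncation point are arbitrary choices):

```python
import numpy as np
from scipy.stats import poisson

mu1, mu2 = 2.0, 3.5
k = np.arange(60)                          # truncate the infinite supports (arbitrary cutoff)

f_S = np.convolve(poisson.pmf(k, mu1), poisson.pmf(k, mu2))
s = np.arange(len(f_S))                    # possible sums 0, 1, 2, ...

# Compare the convolution to the Poisson(mu1 + mu2) PMF on the first 40 values.
target = poisson.pmf(s[:40], mu1 + mu2)
print(np.max(np.abs(f_S[:40] - target)))   # essentially zero
```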

33.2 Continuous Convolutions

Let \(X\) and \(Y\) be independent continuous random variables. The process of finding the PDF of \(S = X+Y\) is similar to Theorem 33.1, except with PMFs replaced by PDFs and sums replaced by integrals.

Theorem 33.2 (Continuous convolution) Let \(X\) and \(Y\) be independent continuous random variables with PDFs \(f_X(x)\) and \(f_Y(y)\), respectively. Then, the PDF of \(S = X + Y\) is \[ f_S(s) = \int_{-\infty}^\infty f_X(x) f_Y(s - x)\,dx. \tag{33.2}\]

\(f_S\) is said to be the convolution of \(f_X\) and \(f_Y\) and is denoted \[ f_S = f_X * f_Y. \]

Proof

In Chapter 20, we discussed a general way to find the distribution of a transformed random variable. We apply that same recipe here:

  • Compute the CDF \(F_S(s)\)
  • Differentiate (with respect to \(s\)) to get the PDF \(f_S(s)\).

Once we have the PDF, we can compute any probability involving \(S = X + Y\). We begin by computing the CDF. In general, \[\begin{align*} F_S(s) &= P(X + Y \leq s) \\ &= \iint_{x+y \leq s} f_{X,Y}(x,y) \, dy \, dx \\ &= \iint_{x+y \leq s} f_X(x) f_Y(y) \, dy \, dx \qquad \qquad \text{($X$ and $Y$ are independent)} \\ &= \int_{-\infty}^\infty \int_{-\infty}^{s-x} f_X(x) f_Y(y) \, dy \, dx \\ &= \int_{-\infty}^\infty f_X(x) \int_{-\infty}^{s-x} f_Y(y) \, dy \, dx \\ &= \int_{-\infty}^\infty f_X(x) \left( F_Y(y) \right) \Biggr|_{-\infty}^{s-x} \, dx \\ &= \int_{-\infty}^\infty f_X(x) F_Y(s-x) \, dx. \end{align*}\]

Now, we differentiate \(F_S\) to obtain the PDF \(f_S\). \[\begin{align*} f_S(s) &= \frac{d}{ds} \int_{-\infty}^\infty f_X(x) F_Y(s-x) \, dx \\ &= \int_{-\infty}^\infty f_X(x) \frac{d}{ds} F_Y(s-x) \, dx \\ &= \int_{-\infty}^\infty f_X(x) f_Y(s-x) \, dx. \end{align*}\]
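When no closed form suggests itself, Equation 33.2 can also be evaluated numerically. Here is a minimal sketch using SciPy's `quad`; the function `convolve_pdf` and the particular pair of densities are our own choices for illustration.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import expon, norm

def convolve_pdf(f_X, f_Y, s):
    """Evaluate the continuous convolution (Equation 33.2) at a single point s."""
    value, _ = quad(lambda x: f_X(x) * f_Y(s - x), -np.inf, np.inf)
    return value

# Example: density of S = X + Y at s = 1.0, where X ~ Exponential(1) and Y ~ Normal(0, 1).
print(convolve_pdf(expon.pdf, norm.pdf, 1.0))
```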

Now we apply Theorem 33.2 to several examples. The first example illustrates how tedious the computations can be in practice.

Example 33.3 (Convolution of independent standard normals) Let \(X\) and \(Y\) be i.i.d. \(\textrm{Normal}(\mu= 0, \sigma^2= 1)\). What is the distribution of \(S = X+Y\)?

By Theorem 33.2, we calculate \[\begin{align*} f_S(s) &= \int_{-\infty}^\infty f_X(x) f_Y(s-x) \, dx \\ &= \int_{-\infty}^\infty \left( \frac{1}{\sqrt{2\pi}} \exp\left\{ -\frac{x^2}{2} \right\} \right) \left( \frac{1}{\sqrt{2\pi}} \exp \left\{ -\frac{(s-x)^2}{2} \right\} \right) \, dx \\ &= \int_{-\infty}^\infty \frac{1}{2\pi} \exp \left\{ -\frac{x^2}{2} - \frac{(s-x)^2}{2} \right\} \, dx \\ &= \frac{1}{2\pi} \int_{-\infty}^\infty \exp \left\{ - \frac{2x^2 - 2sx + s^2}{2} \right\} \, dx \\ &= \frac{1}{2\pi} \int_{-\infty}^\infty \exp \left\{ - \frac{1}{2} \left( \left( \sqrt{2}x - \frac{s}{\sqrt{2}} \right)^2 + \frac{s^2}{2} \right) \right\} \, dx \\ &= \frac{1}{2\pi} \exp \left\{ -\frac{s^2}{4} \right\} \int_{-\infty}^\infty \exp \left\{ -\frac{1}{2} \left( \sqrt{2}x - \frac{s}{\sqrt{2}} \right)^2 \right\} \, dx \\ &= \frac{1}{2\pi} \exp \left\{ -\frac{s^2}{4} \right\} \int_{-\infty}^\infty \exp\left\{ -\frac{y^2}{2} \right\} \, \frac{dy}{\sqrt{2}} \\ &= \frac{1}{2 \sqrt{\pi}} \exp \left\{ -\frac{s^2}{4} \right\} \int_{-\infty}^\infty \underbrace{\frac{1}{\sqrt{2\pi}} \exp\left\{ -\frac{y^2}{2} \right\}}_{\text{PDF of Normal(0,1)}} \, dy \\ &= \frac{1}{2\sqrt{\pi}} \exp \left\{ -\frac{s^2}{4} \right\} \\ &= \frac{1}{\sqrt{2\pi} \cdot \sqrt{2}} \exp \left\{ -\frac{s^2}{2 \cdot 2} \right\}, \end{align*}\] which is the PDF of \(\textrm{Normal}(\mu= 0, \sigma^2= 2)\). Hence, \[ S = X + Y \sim \textrm{Normal}(\mu= 0, \sigma^2= 2). \]
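A quick simulation provides a sanity check on this result; a minimal sketch (the sample size and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
s = rng.standard_normal(1_000_000) + rng.standard_normal(1_000_000)

# If S ~ Normal(0, 2), the sample mean should be near 0 and the sample variance near 2.
print(s.mean(), s.var())
```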

The calculation in Example 33.3 was already quite formidable, and we have not even begun to consider adding general normal random variables! Clearly, a simpler approach is needed; we will present one in Chapter 34 and revisit this example in Example 34.6.

When applying Theorem 33.2, we often need to be mindful of the support of the distributions, as the next example illustrates.

Example 33.4 (Convolution of independent uniforms) Let \(X\) and \(Y\) be independent \(\textrm{Uniform}(a= 0, b= 1)\). What is the distribution of \(S = X + Y\)?

By Theorem 33.2, \[ f_S(s) = \int_{-\infty}^\infty f_X(x) f_Y(s-x) \, dx. \] Since both \(f_X(x)\) and \(f_Y(s-x)\) only take the values 0 and 1, their product is also either 0 or 1; in fact, the integrand is 1 only where both \(f_X(x)\) and \(f_Y(s-x)\) are 1. Depending on the value of \(s\), the supports of the two PDFs overlap in different ways. To make the overlap easier to visualize, we start with the following representations of \(\color{blue}f_X(x)\) and \(\color{red}f_Y(s-x)\):

Figure 33.1

We can see that the supports of \(\color{blue}f_X(x)\) and \(\color{red}f_Y(s-x)\) do not intersect if \(s < 0\):

Figure 33.2

or if \(s > 2\):

Figure 33.3

Hence, if \(s < 0\) or \(s > 2\), then \[ f_X(x) f_Y(s-x) = 0, \] and so, \(f_S(s) = 0\). Now, there are two ways for the supports to overlap.

If \(0 < s < 1\), the supports overlap in the following way:

Figure 33.4

The supports overlap between \(0\) and \(s\), and thus, \[ f_S(s) = \int_{-\infty}^\infty f_X(x) f_Y(s-x) \, dx = \int_0^s \, dx = s \] for \(0 < s < 1\).

The last type of intersection to consider happens when \(1 < s < 2\):

Figure 33.5

The supports overlap between \(s-1\) and \(1\), and so, \[ f_S(s) = \int_{-\infty}^\infty f_X(x) f_Y(s-x) \, dx = \int_{s-1}^1 \, dx = 2-s \] for \(1 < s < 2\).

Grouping everything together, we have \[ f_S(s) = \begin{cases} s, & 0 < s < 1 \\ 2-s, & 1< s < 2 \\ 0, & \text{otherwise} \end{cases}. \]

Figure 33.6: Distribution of a sum of two independent \(\textrm{Uniform}(a= 0, b= 1)\) random variables.
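The piecewise formula can be checked by simulation. A minimal sketch comparing an empirical density of simulated sums to the triangular PDF derived above (the sample size and number of bins are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
s = rng.uniform(0, 1, 500_000) + rng.uniform(0, 1, 500_000)

# Empirical density on a grid of bins over (0, 2).
hist, edges = np.histogram(s, bins=40, range=(0, 2), density=True)
mid = 0.5 * (edges[:-1] + edges[1:])

# Triangular PDF derived above, evaluated at the bin midpoints.
triangle = np.where(mid < 1, mid, 2 - mid)
print(np.max(np.abs(hist - triangle)))   # small (Monte Carlo error only)
```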

In the last example, we demonstrate how convolution can be applied repeatedly to determine the distribution of a sum of more than two random variables. Note that in the second step, the two random variables being convolved no longer have the same distribution.

Example 33.5 (Sum of independent exponentials) Let \(X\) and \(Y\) be independent \(\text{Exponential}(\lambda)\). We first calculate the density of \(S = X+Y\).

By Theorem 33.2, for \(s > 0\), \[\begin{align*} f_S(s) &= \int_{-\infty}^\infty f_X(x) f_Y(s-x) \, dx \\ &= \int_0^s f_X(x) f_Y(s-x) \, dx & (\text{$f_X(x) = 0$ for $x < 0$ and $f_Y(s-x) = 0$ for $x > s$}) \\ &= \int_0^s \left( \lambda e^{-\lambda x} \right) \left( \lambda e^{-\lambda (s-x)} \right) \, dx \\ &= \int_0^s \lambda^2 e^{-\lambda s} \, dx \\ &= \lambda^2 s e^{-\lambda s}. \end{align*}\]

What happens if we add a third exponential? Let \(W \sim \text{Exponential}(\lambda)\) be independent of \(X\) and \(Y\) (and thus of \(S\)). What is the density of \(T = X+Y+W = S+W\)?

Again, by Theorem 33.2, for \(t > 0\), \[\begin{align*} f_T(t) &= \int_{-\infty}^\infty f_S(s) f_W(t-s) \, ds \\ &= \int_0^t f_S(s) f_W(t-s) \, ds & (\text{$f_S(s) = 0$ for $s < 0$ and $f_W(t - s) = 0$ for $s > t$}) \\ &= \int_0^t \left( \lambda^2 s e^{-\lambda s} \right) \left( \lambda e^{-\lambda (t - s)} \right) \, ds \\ &= \int_0^t \lambda^3 s e^{-\lambda t} \, ds \\ &= \lambda^3 e^{-\lambda t} \int_0^t s \, ds \\ &= \frac{\lambda^3}{2} t^2 e^{-\lambda t}. \end{align*}\]
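As a check, we can convolve \(f_S\) and \(f_W\) numerically at a single point and compare against the closed form just derived. A minimal sketch (the values of \(\lambda\) and \(t\) are arbitrary):

```python
import numpy as np
from scipy.integrate import quad

lam, t = 2.0, 1.5

f_S = lambda s: lam**2 * s * np.exp(-lam * s)   # density of S = X + Y derived above (for s > 0)
f_W = lambda w: lam * np.exp(-lam * w)          # Exponential(lambda) density (for w > 0)

# Numerically evaluate the convolution integral over the overlap of the supports, 0 < s < t.
numeric, _ = quad(lambda s: f_S(s) * f_W(t - s), 0, t)
closed_form = lam**3 / 2 * t**2 * np.exp(-lam * t)
print(numeric, closed_form)   # the two should agree closely
```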

33.3 Exercises

Exercise 33.1 (Distribution of a difference) Let \(X\) and \(Y\) be independent continuous random variables. Show that \[ f_{X-Y}(z) = \int_{-\infty}^\infty f_X(x) f_Y(x-z) \, dx. \]

Exercise 33.2 (Binomial convolution) Let \(X \sim \text{Binomial}(n,p)\) and \(Y \sim \text{Binomial}(m,p)\) be independent. Use convolution to determine the distribution of \(S = X+Y\).

Explain why the result makes sense intuitively.

Exercise 33.3 (Negative binomial convolution) Let \(X \sim \text{NegativeBinomial}(r,p)\) and \(Y \sim \text{NegativeBinomial}(s,p)\) be independent. Use convolution to determine the distribution of \(S = X+Y\).

Explain why the result makes sense intuitively.

Exercise 33.4 (Sum of i.i.d. uniforms) Let \(X, Y, Z\) be i.i.d. \(\textrm{Uniform}(a= 0, b= 1)\). What is the distribution of \(S_3 = X + Y + Z\)?

Hint: Make use of what we derived above.

Exercise 33.5 (Sum of different uniforms) Let \(X \sim \textrm{Uniform}(a= 0, b= 1)\) and \(Y \sim \textrm{Uniform}(a= -1, b= 2)\) be independent. What is the distribution of \(S = X+Y\)?

Exercise 33.6 (Sum of exponentials by induction) Using induction, derive a general formula for the PDF of \(S_n = X_1 + \dots + X_n\), where \(X_i\) are i.i.d. \(\text{Exponential}(\lambda)\). Note that \(n = 2, 3\) were already done in Example 33.5.