In Chapter 8, we defined random variables and described them by their PMFs. However, we are often interested in more complicated random variables that are functions of simpler random variables. In this chapter, we describe how random variables behave under transformations. That is, if \(X\) is a random variable, what is the distribution of \(Y = g(X)\)?
10.1 Functions of a Random Variable
To build some intuition about transformations, we will revisit the random variables from Example 8.2.
Example 10.1 (Roulette and Transformations) Let \(S\) be the profit from a $1 straight-up bet on the number 23. We showed in Example 8.2 that the PMF of \(S\) is

| \(x\) | \(-1\) | \(35\) |
|---|---|---|
| \(f_{S}(x)\) | \(37/38\) | \(1/38\) |
What if \(Y\) represented the profit from a $10 straight-up bet on the same number? We can express \(Y\) as a transformation \(g(S)\); in particular, \[ Y = g(S) = 10 S. \tag{10.1}\]
The possible outcomes are \(-10\) and \(350\), so the PMF of \(Y\) is

| \(x\) | \(-10\) | \(350\) |
|---|---|---|
| \(f_{Y}(x)\) | \(37/38\) | \(1/38\) |
In fact, \(S\) itself can be written as a transformation of a simpler random variable \(I\), which is the indicator of the ball landing on the number 23. That is, \(I\) is a \(\text{Bernoulli}(p=1/38)\) random variable (Definition 8.5), with PMF

| \(x\) | \(0\) | \(1\) |
|---|---|---|
| \(f_{I}(x)\) | \(37/38\) | \(1/38\) |
Convince yourself that \[ S = 36 I - 1 \tag{10.2}\] by plugging in the possible values of \(I\).
Notice that we can do algebra with random variables. For example, we can derive a formula for \(Y\) as a transformation of \(I\): \[ Y = 10S = 10(36 I - 1) = 360 I - 10. \]
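As a quick sanity check, here is a minimal Python sketch (the simulation setup and seed are our own, not part of the example) that simulates \(I\) and applies the transformations \(S = 36I - 1\) and \(Y = 360I - 10\):

```python
import numpy as np

rng = np.random.default_rng(seed=42)

# Simulate the indicator I of the ball landing on 23: Bernoulli(p = 1/38).
n_sims = 1_000_000
I = rng.random(n_sims) < 1 / 38

# The transformations from the text: S = 36 I - 1 and Y = 360 I - 10.
S = 36 * I - 1
Y = 360 * I - 10

# Both empirical probabilities should be close to 1/38 ≈ 0.0263.
print("P(S = 35) ≈", (S == 35).mean())
print("P(Y = 350) ≈", (Y == 350).mean())
```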
The next example illustrates how to use algebra and reasoning to derive the PMF of a transformed random variable.
Example 10.2 (Transformation of a binomial) Suppose \(X\) is a \(\text{Binomial}(n,p)\) random variable. Define a new random variable \(Y\) by \(Y = n - X\). In other words, \(Y = g(X)\), where \(g(x) = n - x\).
By doing algebra on the random variables, we can see that the PMFs of \(X\) and \(Y\) are related by \[ f_Y(k) = P(Y = k) = P(n - X = k) = P(X = n - k) = f_X(n - k). \] Substituting \(n - k\) into the binomial PMF, we obtain \[
f_Y(k) = f_X(n-k) = \binom{n}{n-k} p^{n-k} (1-p)^{k} = \binom{n}{k} (1-p)^k (1 - (1-p))^{n-k},
\] which is the PMF of a binomial distribution with parameters \(n\) and \(1-p\)!
We can also reach the same conclusion by interpreting the random variables. \(X\) counts the number of successes in \(n\) independent trials, each with success probability \(p\). Then, \(Y = n-X\) counts the number of failures. Each failure happens with probability \(1-p\) and the \(n\) trials are still independent, so \(Y \sim \text{Binomial}(n,1-p)\).
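For a numerical confirmation, the following sketch (ours, with \(n = 10\) and \(p = 0.3\) chosen arbitrarily) checks that \(f_X(n-k)\) agrees with the \(\text{Binomial}(n, 1-p)\) PMF at every \(k\):

```python
from math import comb

def binom_pmf(k, n, p):
    """PMF of a Binomial(n, p) random variable at k."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.3  # arbitrary choice of parameters
for k in range(n + 1):
    # f_Y(k) = f_X(n - k) should equal the Binomial(n, 1 - p) PMF at k.
    assert abs(binom_pmf(n - k, n, p) - binom_pmf(k, n, 1 - p)) < 1e-12
print("f_X(n - k) matches the Binomial(n, 1 - p) PMF at every k")
```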
In Example 10.1 and Example 10.2, only the possible values \(x\) changed, while the probabilities stayed the same. It turns out that this is not always the case, as the next example shows.
Example 10.3 (Random walk)
A drunk man staggers out of a bar. Each step he takes is equally likely to be either 1 foot to the left or 1 foot to the right, independently of the other steps. How far is he from the bar entrance after 5 steps?
Let \(L\) be the number of steps he takes to the left. Then \(L \sim \text{Binomial}(n=5, p=1/2)\). The number of steps to the right, per Example 10.2, is \(R = 5 - L\), which is also \(\text{Binomial}(n=5, p=1/2)\). The PMF of \(L\) (and \(R\)) is \[ f_L(x) = \binom{5}{x} \left( \frac{1}{2} \right)^x \left( \frac{1}{2} \right)^{5-x} = \frac{\binom{5}{x}}{32}, \] which can be written in tabular form as
| \(x\) | \(0\) | \(1\) | \(2\) | \(3\) | \(4\) | \(5\) |
|---|---|---|---|---|---|---|
| \(f_{L}(x)\) | \(1/32\) | \(5/32\) | \(10/32\) | \(10/32\) | \(5/32\) | \(1/32\) |
Let \(X\) be his position after 5 steps, relative to the bar entrance. Then, \[ X = R - L = (5 - L) - L = 5 - 2L. \tag{10.3}\]
Since \(L\) is an integer from \(0\) to \(5\), \(X\) is an odd integer from \(-5\) to \(5\). The PMF of \(X\) has the same probabilities but distributed over different values.
| \(x\) | \(-5\) | \(-3\) | \(-1\) | \(1\) | \(3\) | \(5\) |
|---|---|---|---|---|---|---|
| \(f_{X}(x)\) | \(1/32\) | \(5/32\) | \(10/32\) | \(10/32\) | \(5/32\) | \(1/32\) |
We can write this as a formula: \[
\begin{align}
f_X(x) = P(X = x) &= P(5 - 2L = x) \\
&= P(L = (5-x)/2) \\
&= f_L\big((5-x)/2\big) \\
&= \frac{\binom{5}{(5-x)/2}}{32},
\end{align}
\] as long as we are careful to only evaluate this function at \(x = -5, -3, -1, 1, 3, 5\).
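Evaluating this formula in code reproduces the table above; here is a short sketch (ours) using Python's `math.comb`:

```python
from math import comb

# Evaluate f_X(x) = C(5, (5 - x)/2) / 32 at the possible values of X.
for x in [-5, -3, -1, 1, 3, 5]:
    print(f"f_X({x:2d}) = {comb(5, (5 - x) // 2)}/32")
```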
Let \(D\) be his distance from the bar entrance. Notice that this distance can be expressed as a transformation of \(X\) (or \(L\)). \[
D = |X| = |5 - 2L|.
\tag{10.4}\]
The possible values of \(D\) are \(1\), \(3\), and \(5\). Each of these values corresponds to two values of \(X\). To calculate the PMF of \(D\), we add the probabilities of the two \(X\) values that map to each one: \[
\begin{align}
f_D(1) &= f_X(-1) + f_X(1) = 10/32 + 10/32 = 20/32, \\
f_D(3) &= f_X(-3) + f_X(3) = 5/32 + 5/32 = 10/32, \\
f_D(5) &= f_X(-5) + f_X(5) = 1/32 + 1/32 = 2/32.
\end{align}
\]
Notice that these probabilities are different from the probabilities in the PMF of \(X\) because multiple values of \(X\) mapped to the same value of \(D\). In mathematical terms, the function \(g(x) = |x|\) is not one-to-one.
When the transformation \(Y = g(X)\) is not one-to-one, the PMFs of \(X\) and \(Y\) may not have the same probabilities. But it is straightforward to determine the PMF of \(Y\) by the following recipe:

1. Identify the possible values of \(Y\), say \(y_1, y_2, \dots\).
2. For each possible value \(y_i\), identify all the \(x\) values such that \(g(x) = y_i\) and sum their probabilities: \[ f_Y(y_i) = \sum_{x:\, g(x) = y_i} f_X(x). \tag{10.5}\]
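This recipe translates directly into code. The sketch below (the function `transform_pmf` is a name of our own invention) represents a PMF as a dictionary and applies the recipe to recover the PMF of \(D = |X|\) from Example 10.3:

```python
from collections import defaultdict

def transform_pmf(pmf_x, g):
    """Return the PMF of Y = g(X), given the PMF of X as a dict
    {x: P(X = x)}: sum the probabilities of all x mapping to each y."""
    pmf_y = defaultdict(float)
    for x, prob in pmf_x.items():
        pmf_y[g(x)] += prob
    return dict(pmf_y)

# PMF of X, the position after 5 steps (Example 10.3).
pmf_x = {-5: 1/32, -3: 5/32, -1: 10/32, 1: 10/32, 3: 5/32, 5: 1/32}

# PMF of D = |X|; expected: {5: 2/32, 3: 10/32, 1: 20/32}.
print(transform_pmf(pmf_x, abs))
```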
The next example illustrates how to apply this recipe.
Example 10.4 (Call option) In finance, a call option is a contract that allows the holder to buy a certain share at a pre-determined price (called the “strike price”) at a pre-determined time.
For example, suppose you hold a call option that allows you to buy a stock at a price of $55 in one week. The price of the stock next week is a random variable \(X\).
If \(X = 53\), then the option is worth nothing because there is no point in exercising the option for $55 when you could just buy the stock for $53.
But if \(X = 60\), then the option is worth $5 because \(60 - 55 = 5\). (In other words, you could make $5 by buying the stock for $55 and immediately selling it for $60.)
In general, the value of this option \(Y\) is a transformation of next week's stock price \(X\). In particular, \[
Y = \max(X - 55, 0).
\] This function is not one-to-one because any value of \(X\) below 55 will map to \(Y = 0\).
Suppose the PMF of \(X\) is
| \(x\) | \(50\) | \(53\) | \(57\) | \(60\) |
|---|---|---|---|---|
| \(f_X(x)\) | \(1/8\) | \(2/8\) | \(4/8\) | \(1/8\) |
The possible values of \(Y\) are \(0\), \(2\), and \(5\). To determine \(f_Y(0)\), we sum the probabilities of the values of \(X\) that map to \(Y = 0\): \[ f_Y(0) = f_X(50) + f_X(53) = 1/8 + 2/8 = 3/8. \] The other two probabilities are more straightforward, since only one value of \(X\) maps to each: \[ f_Y(2) = f_X(57) = 4/8, \qquad f_Y(5) = f_X(60) = 1/8. \]
To summarize, the PMF of \(Y\) is
| \(y\) | \(0\) | \(2\) | \(5\) |
|---|---|---|---|
| \(f_Y(y)\) | \(3/8\) | \(4/8\) | \(1/8\) |
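The same recipe applies here; a brief sketch (ours, with a hypothetical `option_value` helper) reproduces the PMF of \(Y\):

```python
def option_value(x, strike=55):
    """Payoff of a call option with the given strike price."""
    return max(x - strike, 0)

# PMF of next week's stock price X (from the table above).
pmf_x = {50: 1/8, 53: 2/8, 57: 4/8, 60: 1/8}

# Sum the probabilities of the stock prices mapping to each option value.
pmf_y = {}
for x, prob in pmf_x.items():
    y = option_value(x)
    pmf_y[y] = pmf_y.get(y, 0) + prob

print(pmf_y)  # expected: {0: 3/8, 2: 4/8, 5: 1/8}
```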
10.2 Functions of Multiple Variables
A random variable can also be a transformation of more than one random variable.
Example 10.5 (Higher of two dice rolls) Consider a game where two dice are rolled and the higher of the two numbers is taken. If the higher number is 1, 2, 3, or 4, then Player 1 wins; if the higher number is 5 or 6, then Player 2 wins. Would you rather be Player 1 or Player 2?
At first, it might seem like Player 1 has the upper hand, as there are four outcomes where they win, compared with two outcomes where Player 2 wins. The problem with this reasoning is that the outcomes are not equally likely.
We can answer this question by defining a random variable \(H\) to be the higher number on the two dice. If we denote by \(X\) and \(Y\) the numbers on the first and second die, respectively, then \(H\) is a function of both \(X\) and \(Y\): \[
H = \max(X, Y).
\tag{10.6}\]
We can determine the PMF of \(H\) by listing the 36 possible combinations of \(X\) and \(Y\), along with the corresponding value of \(H\).
| \(X\) | \(Y\) | \(H = \max(X, Y)\) |
|---|---|---|
| \(1\) | \(1\) | \(1\) |
| \(1\) | \(2\) | \(2\) |
| \(1\) | \(3\) | \(3\) |
| \(1\) | \(4\) | \(4\) |
| \(1\) | \(5\) | \(5\) |
| \(\dots\) | \(\dots\) | \(\dots\) |
| \(6\) | \(2\) | \(6\) |
| \(6\) | \(3\) | \(6\) |
| \(6\) | \(4\) | \(6\) |
| \(6\) | \(5\) | \(6\) |
| \(6\) | \(6\) | \(6\) |
We can obtain the PMF of \(H\) by counting the results.
| \(x\) | \(1\) | \(2\) | \(3\) | \(4\) | \(5\) | \(6\) |
|---|---|---|---|---|---|---|
| \(f_H(x)\) | \(1/36\) | \(3/36\) | \(5/36\) | \(7/36\) | \(9/36\) | \(11/36\) |
Using this PMF, we can calculate the probability that Player 1 wins as \[ P(H \leq 4) = 1/36 + 3/36 + 5/36 + 7/36 = 16/36, \] which is less than \(1/2\). It is better to be Player 2!
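Because all 36 combinations are equally likely, we can also obtain this PMF by brute-force enumeration; here is a minimal sketch (ours):

```python
from itertools import product
from collections import Counter

# Tally H = max(X, Y) over all 36 equally likely (X, Y) combinations.
counts = Counter(max(x, y) for x, y in product(range(1, 7), repeat=2))
pmf_h = {h: c / 36 for h, c in sorted(counts.items())}
print(pmf_h)

# Probability that Player 1 wins: P(H <= 4) = 16/36.
print("P(H <= 4) =", sum(prob for h, prob in pmf_h.items() if h <= 4))
```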
There are also cases where we can reason heuristically about the distribution of a transformation of multiple random variables.
Example 10.6 (Sum of independent Bernoullis) Let \(X_1, X_2, \dots, X_n\) be \(n\) independent \(\text{Bernoulli}(p)\) random variables. What is the distribution of \(Y = X_1 + X_2 + \cdots + X_n\)?
In Section 8.4, we likened \(X_i\) to the toss of a coin with heads probability \(p\); \(X_i\) counts the number of heads in that single toss. In this situation, we can think of tossing the coin \(n\) times, with \(X_i\) the indicator of the \(i\)th toss coming up heads.
Then, \(Y\) is the number of heads in the \(n\) tosses, and thus, \(Y \sim \text{Binomial}(n,p)\).
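A simulation sketch (ours; parameters chosen arbitrarily) makes this plausible: summing \(n\) independent Bernoulli draws behaves like sampling from \(\text{Binomial}(n, p)\) directly.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
n, p, n_sims = 10, 0.3, 100_000  # arbitrary parameters

# Each row is one experiment: n independent Bernoulli(p) indicators.
tosses = rng.random((n_sims, n)) < p
Y = tosses.sum(axis=1)  # Y = X_1 + ... + X_n

# Compare the empirical distribution of Y to direct Binomial(n, p) draws.
binom_draws = rng.binomial(n, p, size=n_sims)
for k in range(n + 1):
    print(k, (Y == k).mean(), (binom_draws == k).mean())
```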
However, a full discussion of transformations of multiple random variables will have to wait until we discuss how to describe the distribution of multiple random variables in Chapter 13.