Be sure to import Symbulate, along with Matplotlib (which is used below for plot legends and titles), using the following commands.
from symbulate import *
import matplotlib.pyplot as plt
%matplotlib inline
A continuous probability distribution is specified by a nonnegative probability density function (pdf) $f$ which can be integrated to find probabilities of events. A probability measure $P$ is specified by a density $f$ if for any event $A$
$$ P(A) = \int_A f(x)\, dx. $$A probability density function $f$ must be nonnegative and satisfy $\int_{-\infty}^\infty\, f(x)\, dx = 1$.
The density $f(x)$ at a point $x$ is not a probability. A continuous distribution assigns a probability of zero to any single value $\{x\}$, but $f(x)$ is related to the probability of being "close to" $x$:
$$ P\left(\left[x - h/2, x + h/2\right]\right) \approx hf(x) \qquad \text{for small $h$.} $$A continuous random variable $X$ defined on a probability space with probability measure $P$ has a probability density function $f_X$ if for any subset of real numbers $A$,
$$ P(X \in A) = \int_A \, f_X(x)\, dx. $$If $X$ is a continuous random variable, then $P(X=x)=0$ for any $x$, but
$$ P\left(x - h/2 < X< x + h/2\right) \approx hf_X(x) \qquad \text{for small $h$.} $$The expected value of a continuous random variable is
$$ E(X) = \int_{-\infty}^\infty \, x\, f_X(x) \, dx $$(Continuous) Uniform distributions are used to model situations in which every value in an interval is "equally likely" in the sense of having the same density. The density of a Uniform distribution with parameters $a$ and $b$ is constant on the interval $(a, b)$ and is specified by the probability density function
$$ f(x) = \frac{1}{b-a}, \quad a < x < b. $$If $X$ is a random variable with a Uniform($a, b$) distribution then $$ \begin{align*} E(X) & = \frac{a + b}{2} \\ Var(X) & = \frac{(b - a)^2}{12} \end{align*} $$
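To connect densities with probabilities of events, consider the Uniform(1, 3) distribution: its density is $1/2$ on $(1, 3)$, so $P(1.5 < X < 2.5) = \int_{1.5}^{2.5} \frac{1}{2}\, dx = 0.5$. The following sketch estimates this probability as the proportion of simulated values that fall in the interval (it assumes the simulation results can be iterated over like a list).
X = RV(Uniform(a=1, b=3))
sims = X.sim(10000)
# Proportion of simulated values in (1.5, 2.5); the exact probability is 0.5
print('Simulated P(1.5 < X < 2.5): {:.3f}'.format(sum(1 for x in sims if 1.5 < x < 2.5) / 10000))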
Example. Actual and simulated pdf for Uniform(a=1, b=3).
sims = RV(Uniform(a=1, b=3)).sim(10000)
sims.plot()
Uniform(a=1, b=3).plot()
print('The simulated mean is: {:.2f}'.format(sims.mean()))
print('The true mean is: {:.2f}'.format(Uniform(a=1, b=3).mean()))
print('The simulated variance is: {:.2f}'.format(sims.var()))
print('The true variance is: {:.2f}'.format(Uniform(a=1, b=3).var()))
Example. Uniform(a=1, b) pdf.
bs = [3, 5, 7]
for b in bs:
    Uniform(a=1, b=b).plot()
plt.legend(["b=" + str(i) for i in bs])
plt.title("Uniform(a, b) densities with a=1")
A Normal (a.k.a. Gaussian) distribution with mean $\mu$ and standard deviation $\sigma>0$ (and variance $\sigma^2$) has the probability density function
$$ f(x) = \frac{1}{\sigma \sqrt{2\pi}} \exp\left({-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}\right), \quad -\infty < x< \infty. $$If $X$ is a random variable with a Normal($\mu$, $\sigma$) distribution then
$$ \begin{align*} E(X) & = \mu \\ Var(X) & = \sigma^2 \end{align*} $$Example. Actual and simulated pdf for Normal(mean=1, sd=2).
sims = RV(Normal(mean=1, sd=2)).sim(10000)
sims.plot()
Normal(mean=1, sd=2).plot()
print('The simulated mean is: {:.2f}'.format(sims.mean()))
print('The true mean is: {:.2f}'.format(Normal(mean=1, sd=2).mean()))
print('The simulated variance is: {:.2f}'.format(sims.var()))
print('The true variance is: {:.2f}'.format(Normal(mean=1, sd=2).var()))
Example. Normal(mu, sd=1) pdf.
mus = [-2, 0, 2]
for mu in mus:
    Normal(mean=mu, sd=1).plot()
plt.legend(["mean=" + str(i) for i in mus])
plt.title("Normal densities with sigma=1")
Example. Normal(mean=0, sd) pdf.
sigmas = [1, 2, 3]
for sigma in sigmas:
    Normal(mean=0, sd=sigma).plot()
plt.legend(["sigma=" + str(i) for i in sigmas])
plt.title("Normal densities with mu=0")
A LogNormal distribution with parameters $\mu \in \mathbb{R}$ and $\sigma >0$ is specified by the probability density function
$$ f(x) = \frac{1}{x\sigma \sqrt{2\pi}} \exp\left(-\frac{1}{2}\left(\frac{\log(x)-\mu}{\sigma}\right)^2\right), \quad x > 0. $$If $X$ is a random variable with a LogNormal($\mu$, $\sigma$) distribution then
$$ \begin{align*} E(X) & = \exp(\mu + \sigma^2/2) \\ Var(X) & = \left(\exp\left(\sigma^2\right)-1\right)\exp\left(2\mu+\sigma^2\right) \end{align*} $$Example. Actual and simulated pdf for LogNormal(mu=0, sigma=1).
sims = RV(LogNormal(mu=0, sigma=1)).sim(10000)
sims.plot(bins=60)
LogNormal(mu=0, sigma=1).plot()
print('The simulated mean is: {:.2f}'.format(sims.mean()))
print('The true mean is: {:.2f}'.format(LogNormal(mu=0, sigma=1).mean()))
If $Y$ has a LogNormal($\mu,\,\sigma$) distribution, then $\log(Y)$ has a Normal distribution with mean $\mu$ and standard deviation $\sigma$.
Y = RV(LogNormal(mu=0, sigma=1))
log(Y).sim(10000).plot()
Normal(mean=0, sd=1).plot()
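Conversely, exponentiating a Normal($\mu$, $\sigma$) random variable yields a LogNormal($\mu$, $\sigma$) distribution. A short sketch of this fact, assuming the exp function applies to random variables in the same way that log does above:
Z = RV(Normal(mean=0, sd=1))
# exp of a standard Normal random variable should match LogNormal(mu=0, sigma=1)
exp(Z).sim(10000).plot(bins=60)
LogNormal(mu=0, sigma=1).plot()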
An Exponential distribution with rate parameter $\lambda>0$ is specified by the probability density function
$$ f(x) = \lambda e^{-\lambda x}, \quad x>0. $$Exponential distributions can be parametrized in terms of the rate parameter $\lambda$ (using rate=) or the scale parameter $1/\lambda$ (using scale=).
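The two parametrizations describe the same distribution: a rate of $\lambda$ corresponds to a scale of $1/\lambda$. For example, the following sketch overlays the densities of the same Exponential distribution specified both ways; the two curves should coincide.
# Exponential with rate 2 is the same distribution as Exponential with scale 1/2
Exponential(rate=2).plot()
Exponential(scale=0.5).plot()
plt.legend(["rate=2", "scale=0.5"])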
If $X$ is a random variable with an Exponential($\lambda$) distribution then
$$ \begin{align*} E(X) & = \frac{1}{\lambda} \\ Var(X) & = \frac{1}{\lambda^2} \end{align*} $$Example. Actual and simulated pdf for Exponential(rate=2).
sims = RV(Exponential(rate=2)).sim(10000)
sims.plot()
Exponential(rate=2).plot()
print('The simulated mean is: {:.2f}'.format(sims.mean()))
print('The true mean is: {:.2f}'.format(Exponential(rate=2).mean()))
print('The simulated variance is: {:.2f}'.format(sims.var()))
print('The true variance is: {:.2f}'.format(Exponential(rate=2).var()))
Example. Exponential(rate) pdf.
rates = [4, 7, 10]
for rate in rates:
    Exponential(rate).plot()
plt.legend(["lambda=" + str(i) for i in rates])
plt.title("Exponential densities with rate parameter lambda")
A Gamma distribution with shape parameter $\alpha>0$ and rate parameter $\lambda >0$ has density
$$ f(x) = \frac{\lambda^\alpha}{\Gamma(\alpha)} e^{-\lambda x} x^{\alpha -1}, \quad x>0, $$where
$$ \Gamma(\alpha) = \int_0^\infty e^{-u}u^{\alpha - 1}\, du, \quad \text{with } \Gamma(k) = (k-1)! \text{ if $k$ is a positive integer.} $$Gamma distributions can be parametrized in terms of the rate parameter $\lambda$ (using rate=) or the scale parameter $1/\lambda$ (using scale=).
If $X$ is a random variable with a Gamma($\alpha, \lambda$) distribution then
$$ \begin{align*} E(X) & = \frac{\alpha}{\lambda} \\ Var(X) & = \frac{\alpha}{\lambda^2} \end{align*} $$Example. Actual and simulated pdf for Gamma(shape=4, rate=2).
sims = RV(Gamma(shape=4, rate=2)).sim(10000)
sims.plot()
Gamma(shape=4, rate=2).plot()
print('The simulated mean is: {:.2f}'.format(sims.mean()))
print('The true mean is: {:.2f}'.format(Gamma(shape=4, rate=2).mean()))
print('The simulated variance is: {:.2f}'.format(sims.var()))
print('The true variance is: {:.2f}'.format(Gamma(shape=4, rate=2).var()))
Example. Gamma(shape=4, rate) pdf.
rates = [4, 7, 10]
for rate in rates:
    Gamma(shape=4, rate=rate).plot()
plt.legend(["lambda=" + str(i) for i in rates])
plt.title("Gamma densities with shape parameter alpha=4 and rate parameter lambda")
Example. Gamma(shape, rate=1) pdf.
shapes = [1, 2, 4]
for shape in shapes:
    Gamma(shape=shape, rate=1).plot()
plt.legend(["alpha=" + str(i) for i in shapes])
plt.title("Gamma densities with shape parameter alpha and rate parameter lambda=1")
A Gamma distribution with shape parameter $\frac{d}{2}$ and rate parameter $\frac{1}{2}$ is also known as a Chi-square (ChiSquare) distribution with $d$ degrees of freedom and has density
$$ f(x) = \frac{1}{2^{d/2}\Gamma(d/2)} x^{d/2 - 1} e^{-x/2}, \quad x>0. $$If $X$ is a random variable with a ChiSquare($d$) distribution then
$$ \begin{align*} E(X) & = d \\ Var(X) & = 2d \end{align*} $$Example. Actual and simulated pdf for ChiSquare(df=2).
sims = RV(ChiSquare(df=2)).sim(10000)
sims.plot()
ChiSquare(df=2).plot()
print('The simulated mean is: {:.2f}'.format(sims.mean()))
print('The true mean is: {:.2f}'.format(ChiSquare(df=2).mean()))
print('The simulated variance is: {:.2f}'.format(sims.var()))
print('The true variance is: {:.2f}'.format(ChiSquare(df=2).var()))
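As a quick check of the Gamma relationship above, the ChiSquare(2) density can be overlaid with the Gamma density with shape $2/2 = 1$ and rate $1/2$; the two curves should coincide.
# ChiSquare(2) is the same distribution as Gamma(shape=1, rate=0.5)
ChiSquare(df=2).plot()
Gamma(shape=1, rate=0.5).plot()
plt.legend(["ChiSquare(df=2)", "Gamma(shape=1, rate=0.5)"])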
Example. ChiSquare(df) pdf.
ds = [2, 3, 10]
for df in ds:
    ChiSquare(df).plot()
plt.legend(["d=" + str(i) for i in ds])
plt.title("Chi-square densities with d degrees of freedom")
A (Student's) t (StudentT) distribution with $d$ degrees of freedom has the probability density function
$$ f(x) = \frac{\Gamma\left(\frac{d+1}{2}\right)}{\sqrt{d\pi}\,\Gamma\left(\frac{d}{2}\right)} \left(1 + \frac{x^2}{d}\right)^{-\frac{d+1}{2}}, \quad -\infty < x < \infty. $$If $X$ is a random variable with a StudentT($d$) distribution then $$ \begin{align*} E(X) & = 0, & & \quad \text{if $d>1$,}\\ Var(X) & = \frac{d}{d - 2}, & & \quad \text{if $d>2$.} \end{align*} $$
Example. Actual and simulated pdf for StudentT(df=5).
sims = RV(StudentT(df=5)).sim(10000)
sims.plot(bins=60)
StudentT(df=5).plot()
print('The simulated mean is: {:.2f}'.format(sims.mean()))
print('The true mean is: {:.2f}'.format(StudentT(df=5).mean()))
print('The simulated variance is: {:.2f}'.format(sims.var()))
print('The true variance is: {:.2f}'.format(StudentT(df=5).var()))
Example. StudentT(df) pdf.
ds = [2, 5, 20]
for d in ds:
    StudentT(df=d).plot()
plt.legend(["d=" + str(i) for i in ds])
plt.title("Student's t densities with d degrees of freedom")
A StudentT distribution can also be defined via the following construction:
If $Z$ has a Normal(0, 1) distribution, $Y$ has a Chi-square distribution with $d$ degrees of freedom, and $Y$ and $Z$ are independent, then the random variable
$$\frac{Z}{ \sqrt{Y/d} }$$has a Student's t distribution with $d$ degrees of freedom.
d = 10
Y, Z = RV(ChiSquare(d) * Normal(0, 1))
(Z / sqrt(Y / d)).sim(10000).plot(bins=60)
StudentT(d).plot()
The Student's t distribution with 1 degree of freedom is also known as the standard Cauchy distribution. General Cauchy distributions can be obtained through location-scale transformations of the Student's t distribution with 1 degree of freedom.
The Cauchy distribution with location parameter $x_0$ and scale parameter $\gamma$ has the probability density function
$$ f(x) = \frac{1}{\pi\gamma} \frac{1}{1 + \left(\frac{x - x_0}{\gamma}\right)^2} $$If $X$ is a random variable with a Cauchy($x_0,\,\gamma$) distribution then $$ \begin{align*} E(X) & \text{ is undefined} \\ Var(X) & \text{ is undefined} \end{align*} $$
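As a quick check of the location-scale relationship mentioned above, a Cauchy($x_0$, $\gamma$) random variable can be simulated as $x_0 + \gamma T$, where $T$ has a Student's t distribution with 1 degree of freedom. A sketch with $x_0 = 10$ and $\gamma = 2$ (the heavy tails make the simulated plot ragged, as in the example below):
T = RV(StudentT(df=1))
# Location-scale transformation of a t(1) random variable
(10 + 2 * T).sim(1000).plot()
Cauchy(loc=10, scale=2).plot()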
Example. Actual and simulated pdf for Cauchy(loc=0, scale=1)
.
sims = RV(Cauchy(loc=0, scale=1)).sim(1000)
sims.plot()
Cauchy(loc=0, scale=1).plot()
Because the expected value and variance do not exist, the sample mean and sample variance are not consistent estimators of anything. (The law of large numbers does not hold for distributions for which the mean is undefined.)
print('The simulated mean is: {:.2f}'.format(sims.mean()))
print('The true mean is: {:.2f}'.format(Cauchy(loc=0, scale=1).mean()))
print('The simulated variance is: {:.2f}'.format(sims.var()))
print('The true variance is: {:.2f}'.format(Cauchy(loc=0, scale=1).var()))
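One way to see this instability: averages of independent batches of 10,000 simulated Cauchy values vary wildly from batch to batch instead of settling near a fixed value.
# Sample means of independent batches of Cauchy(0, 1) values do not stabilize
for batch in range(5):
    print('Batch {} mean: {:.2f}'.format(batch + 1, RV(Cauchy(loc=0, scale=1)).sim(10000).mean()))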
Example. Cauchy(loc=0, scale) pdf.
scales = [1, 1.5, 2]
for scale in scales:
    Cauchy(loc=0, scale=scale).plot()
plt.legend(["scale=" + str(i) for i in scales])
plt.title("Cauchy densities with location parameter 0")
Example. Cauchy(loc, scale=1) pdf.
locs = [-10, 0, 10]
for loc in locs:
    Cauchy(loc=loc, scale=1).plot()
plt.legend(["loc=" + str(i) for i in locs])
plt.title("Cauchy densities with scale parameter 1")
An F distribution with $d_1$ degrees of freedom in the numerator and $d_2$ degrees of freedom in the denominator has the probability density function
$$ f(x) = \frac{\Gamma\left(\left(d_1 + d_2\right)/2\right)d_1^{d_1/2}d_2^{d_2/2}}{\Gamma\left(d_1/2\right)\Gamma\left(d_2/2\right)} \frac{x^{d_1/2 - 1}}{\left(d_1 x + d_2\right)^{\left(d_1 + d_2\right)/2}}, \quad x > 0 $$If $X$ is a random variable with a $F(d_1, d_2)$ distribution then
$$ \begin{align*} E(X) & = \frac{d_2}{d_2 - 2} \qquad \text{for } d_2 > 2 \\ Var(X) & = \frac{2d_2^2(d_1 + d_2 - 2)}{d_1(d_2-2)^2(d_2-4)} \qquad \text{for } d_2 > 4 \end{align*} $$Example. Actual and simulated pdf for F(dfN=5, dfD=10).
sims = RV(F(dfN=5, dfD=10)).sim(10000)
sims.plot(bins=60)
F(dfN=5, dfD=10).plot()
print('The simulated mean is: {:.2f}'.format(sims.mean()))
print('The true mean is: {:.2f}'.format(F(dfN=5, dfD=10).mean()))
print('The simulated variance is: {:.2f}'.format(sims.var()))
print('The true variance is: {:.2f}'.format(F(dfN=5, dfD=10).var()))
Example. F(dfN, dfD=20) pdf.
dfNs = [10, 20, 30]
for dfN in dfNs:
    F(dfN=dfN, dfD=20).plot()
plt.legend(["d1=" + str(i) for i in dfNs])
plt.title("F densities with 20 degrees of freedom in the denominator")
Example. F(dfN=50, dfD) pdf.
dfDs = [10, 25, 50]
for dfD in dfDs:
    F(dfN=50, dfD=dfD).plot()
plt.legend(["d2=" + str(i) for i in dfDs])
plt.title("F densities with 50 degrees of freedom in the numerator")
An F distribution can also be defined via the following construction:
If $Y_1$ has a Chi-square distribution with $m$ degrees of freedom, $Y_2$ has a Chi-square distribution with $n$ degrees of freedom, and $Y_1$ and $Y_2$ are independent, then the random variable
$$\frac{Y_1/m}{Y_2/n}$$has an F distribution with $m$ degrees of freedom in the numerator and $n$ degrees of freedom in the denominator.
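A simulation sketch of this construction, mirroring the Student's t example above, with $m = 5$ and $n = 10$:
m, n = 5, 10
# Independent Chi-square random variables with m and n degrees of freedom
Y1, Y2 = RV(ChiSquare(m) * ChiSquare(n))
((Y1 / m) / (Y2 / n)).sim(10000).plot(bins=60)
F(dfN=m, dfD=n).plot()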
Beta distributions are probability distributions on bounded intervals. For a Uniform(0, 1) distribution, the probability density is constant on the interval (0, 1). Beta distributions are probability distributions on the interval (0, 1) where the density is not necessarily constant. A Beta distribution with parameters $\alpha>0$ and $\beta>0$ has the probability density function
$$ f(x) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\Gamma(\beta)} x^{\alpha-1} (1-x)^{\beta-1}, \qquad 0 < x < 1, $$where
$$ \Gamma(\alpha) = \int_0^\infty e^{-u}u^{\alpha - 1}\, du, \quad \text{with } \Gamma(k) = (k-1)! \text{ if $k$ is a positive integer} $$Note that the Beta distribution with $\alpha = \beta=1$ is the Uniform(0, 1) distribution.
If $X$ is a random variable with a Beta($\alpha, \beta$) distribution then $$ \begin{align*} E(X) & = \frac{\alpha}{\alpha + \beta} \\ Var(X) & = \frac{\alpha\beta}{(\alpha + \beta)^2(\alpha + \beta + 1)} \end{align*} $$
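As noted above, the Beta(1, 1) distribution is the Uniform(0, 1) distribution; as a quick check, the two densities can be plotted together and should overlap exactly.
# Beta(1, 1) and Uniform(0, 1) have the same constant density on (0, 1)
Beta(a=1, b=1).plot()
Uniform(a=0, b=1).plot()
plt.legend(["Beta(a=1, b=1)", "Uniform(a=0, b=1)"])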
Example. Actual and simulated pdf for Beta(a=2, b=3).
sims = RV(Beta(a=2, b=3)).sim(10000)
sims.plot()
Beta(a=2, b=3).plot()
print('The simulated mean is: {:.2f}'.format(sims.mean()))
print('The true mean is: {:.2f}'.format(Beta(a=2, b=3).mean()))
print('The simulated variance is: {:.2f}'.format(sims.var()))
print('The true variance is: {:.2f}'.format(Beta(a=2, b=3).var()))
Example. Beta(a, b=1) pdf.
alphas = [0.5, 2, 4]
for a in alphas:
    Beta(a, b=1).plot()
plt.legend(["alpha=" + str(i) for i in alphas])
plt.title("Beta densities with parameter beta=1")
Example. Beta(a=2, b) pdf.
betas = [1, 2, 4]
for b in betas:
    Beta(a=2, b=b).plot()
plt.legend(["beta=" + str(i) for i in betas])
plt.title("Beta densities with parameter alpha=2")
Beta distributions can be generalized to intervals $(c, d)$ by rescaling. If $X$ has a Beta distribution then $c + (d-c) X$ will have a distribution with the same shape on the interval $(c, d)$.
X = RV(Beta(4, 2))
c = 5
d = 15
(c + (d - c) * X).sim(10000).plot()