10. Random variables
Recommended reference: Wasserman [Was04], Sections 1.1–1.3, 2.1–2.2 and 3.1–3.3.
10.1. Introduction
A probability distribution assigns probabilities (real numbers in \([0,1]\)) to elements or subsets of a sample space. The elements of a sample space are called outcomes, and its subsets are called events.
The space of outcomes is usually of one of two kinds:
- a finite or countable set (modelling the number of particles hitting a detector, for example), or
- the real line, a higher-dimensional space or some subset of one of these (modelling the position of a particle, for example).
These correspond to the two types of probability distributions that are usually distinguished: discrete and continuous probability distributions.
10.2. Random variables
A random variable is any real-valued function on the space of outcomes of a probability distribution. Random variables can often be interpreted as observable (scalar) quantities such as length, position, energy, the number of occurrences of some event or the spin of an elementary particle.
Any random variable has a distribution function, which is how we usually describe random variables. We will look at distribution functions separately for discrete and continuous random variables.
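As a small illustration of a random variable as a function on outcomes, the sketch below uses an example that is not part of the text above: two fair dice, with the random variable given by the sum of the faces. The names `outcomes`, `total` and `distribution` are chosen here purely for illustration.

```python
from fractions import Fraction
from itertools import product

# Sample space: all ordered pairs of faces of two fair dice (36 equally likely outcomes).
outcomes = list(product(range(1, 7), repeat=2))

def total(outcome):
    """Random variable X: maps an outcome (a, b) to the real number a + b."""
    a, b = outcome
    return a + b

# Distribution of X: add up the probabilities of all outcomes mapped to each value.
p_outcome = Fraction(1, len(outcomes))
distribution = {}
for outcome in outcomes:
    value = total(outcome)
    distribution[value] = distribution.get(value, 0) + p_outcome

print(distribution[7])  # probability that the two dice sum to 7
```

Exact fractions are used so that the probabilities can be compared without rounding error.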
10.3. Discrete random variables
A discrete random variable \(X\) can take finitely many or countably many distinct values \(x_0,x_1,x_2,\ldots\) in \(\RR\). It is characterised by its probability mass function (PMF).
Definition 10.1 (Probability mass function)
The probability mass function of the discrete random variable \(X\) is defined by
\[
f_X(x) = P(X = x).
\]
One can visualise a probability mass function by placing a vertical bar of height \(f_X(x)\) at each value \(x\), as in the figure below.
from matplotlib import pyplot
from myst_nb import glue
fig, ax = pyplot.subplots()
# Draw one vertical bar of height f(x) at each possible value x.
for x, px in ((-0.2, 0.3), (0.4, 0.2), (0.7, 0.5)):
    ax.add_line(pyplot.Line2D((x, x), (0, px), linewidth=2))
ax.set_xbound(-0.3, 0.8)
ax.set_ybound(0, 0.6)
ax.set_xlabel('$x$')
ax.set_ylabel('$f(x)$')
# Store the figure under the name "pmf" so that the book can embed it.
glue("pmf", fig)
![_images/c7fc7ad649aff3efacf36dc010633e04fbda01fb3ec10b2a6851549436f02d45.png](_images/c7fc7ad649aff3efacf36dc010633e04fbda01fb3ec10b2a6851549436f02d45.png)
Fig. 10.1 Probability mass function of a discrete random variable.
Property 10.1 (Properties of a probability mass function)
Since the \(f_X(x)\) are probabilities, they satisfy
\[
0 \le f_X(x) \le 1 \quad\text{for all } x.
\]
Since the \(x_i\) are all the possible values and their total probability equals 1, we also have
\[
\sum_i f_X(x_i) = 1.
\]
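The two properties can be checked mechanically for a concrete example. The sketch below uses the values and probabilities from the code that draws Fig. 10.1; the helper `is_pmf` and its tolerance are illustrative assumptions, not part of any library.

```python
import math

# Values and probabilities taken from the code that draws Fig. 10.1.
pmf = {-0.2: 0.3, 0.4: 0.2, 0.7: 0.5}

def is_pmf(f, tol=1e-12):
    """Check that every probability lies in [0, 1] and that they sum to 1."""
    in_range = all(0 <= p <= 1 for p in f.values())
    normalised = math.isclose(sum(f.values()), 1.0, abs_tol=tol)
    return in_range and normalised

print(is_pmf(pmf))               # True
print(is_pmf({0: 0.5, 1: 0.6}))  # False: the probabilities sum to 1.1
```

The sum is compared with a tolerance because floating-point probabilities rarely add up to exactly 1.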
10.4. Continuous random variables
For a continuous random variable \(X\), the set of possible values is usually a (finite or infinite) interval, and the probability of any single value occurring is usually zero. We therefore consider the probability of the value lying in some interval. This can be described by a probability density function (PDF).
Definition 10.2 (Probability density function)
The probability density function of the continuous random variable \(X\) is a function \(f_X\colon\RR\to\RR\) such that
\[
P(a \le X \le b) = \int_a^b f_X(x)\,dx \quad\text{for all } a \le b.
\]
To visualise a continuous random variable, one often plots the probability density function, as in the figure below.
Property 10.2 (Properties of a probability density function)
- \(f_X(x)\ge0\) for all \(x\);
- \(\int_{-\infty}^\infty f_X(x)\,dx=1\).
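from matplotlib import pyplot
from myst_nb import glue
import numpy as np
# Evaluate the density f(x) = x^5 exp(-x) / 120 on a grid over [0, 16].
x = np.linspace(0, 16, 101)
fx = 1/120 * x**5 * np.exp(-x)
fig, ax = pyplot.subplots()
ax.plot(x, fx)
ax.set_xbound(0, 16)
ax.set_ybound(0, 0.2)
ax.set_xlabel('$x$')
ax.set_ylabel('$f(x)$')
# Store the figure under the name "pdf" so that the book can embed it.
glue("pdf", fig)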
![_images/f48cd4e479aac3149ce6cc1eaf71a4c09ff1e35154ba9343b91a19af0d9f26bc.png](_images/f48cd4e479aac3149ce6cc1eaf71a4c09ff1e35154ba9343b91a19af0d9f26bc.png)
Fig. 10.2 Probability density function of a continuous random variable.
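For a continuous random variable, probabilities are areas under the density. The sketch below approximates such areas numerically for the density plotted in Fig. 10.2, assuming that the density is zero for negative \(x\) (the plot only covers \(0\le x\le 16\)). The helper `integrate`, the cut-off at \(x=60\) and the interval \([4,8]\) are arbitrary choices made for illustration.

```python
import numpy as np

def density(x):
    """Density from Fig. 10.2: x^5 exp(-x) / 120 for x >= 0, assumed 0 otherwise."""
    x = np.asarray(x, dtype=float)
    return np.where(x >= 0, x**5 * np.exp(-x) / 120, 0.0)

def integrate(f, a, b, n=200_001):
    """Approximate the integral of f over [a, b] with the trapezoidal rule."""
    x = np.linspace(a, b, n)
    y = f(x)
    return float(np.sum((y[1:] + y[:-1]) / 2 * np.diff(x)))

# Total probability: the tail beyond x = 60 is negligible for this density.
total_mass = integrate(density, 0, 60)
# Probability that the value lies between 4 and 8.
p_4_8 = integrate(density, 4, 8)

print(f"total probability: {total_mass:.6f}")
print(f"P(4 <= X <= 8):    {p_4_8:.6f}")
```

The first number should be very close to 1, in line with the normalisation property of a density; the second is the area under the curve between 4 and 8.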
10.5. Expectation and variance
The expectation (or mean) of a random variable is the value that the average of many independent samples approaches as the number of samples grows. The variance and the related standard deviation measure how much samples tend to deviate from this mean.
Definition 10.3 (Expectation)
The expectation or mean of a discrete random variable \(X\) with probability mass function \(f_X\) is
\[
E(X) = \sum_i x_i f_X(x_i).
\]
The expectation or mean of a continuous random variable \(X\) with probability density function \(f_X\) is
\[
E(X) = \int_{-\infty}^\infty x f_X(x)\,dx.
\]
The expectation of \(X\) is often denoted by \(\mu\) or \(\mu(X)\).
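To connect the expectation with the average of many samples, the following sketch computes the probability-weighted sum of the values for the discrete random variable of Fig. 10.1 and compares it with the average of simulated samples. The sample size and the seed are arbitrary choices.

```python
import numpy as np

# Values and probabilities of the discrete random variable shown in Fig. 10.1.
values = np.array([-0.2, 0.4, 0.7])
probs = np.array([0.3, 0.2, 0.5])

# Expectation: each value weighted by its probability.
mean = float(np.sum(values * probs))

# Average of many independent samples drawn from the same distribution.
rng = np.random.default_rng(seed=0)  # fixed seed, chosen arbitrarily
samples = rng.choice(values, size=100_000, p=probs)

print(f"expectation:    {mean:.4f}")
print(f"sample average: {samples.mean():.4f}")
```

The two numbers should agree to about two decimal places; the remaining difference shrinks as the number of samples grows.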
Definition 10.4 (Variance and standard deviation)
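The variance of a (discrete or continuous) random variable \(X\) with mean \(\mu\) is
\[
\operatorname{Var}(X) = E\bigl((X-\mu)^2\bigr).
\]
The standard deviation of \(X\) is
\[
\sigma(X) = \sqrt{\operatorname{Var}(X)}.
\]
It is not hard to show (see Exercise 10.3) that
\[
\operatorname{Var}(X) = E(X^2) - E(X)^2. \tag{10.1}
\]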
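As a numerical sanity check (not a proof), the sketch below computes the variance of the discrete random variable of Fig. 10.1 in two ways: as the expected squared deviation from the mean, and as the expectation of the square minus the square of the expectation. The variable names are illustrative.

```python
import numpy as np

# Values and probabilities of the discrete random variable shown in Fig. 10.1.
values = np.array([-0.2, 0.4, 0.7])
probs = np.array([0.3, 0.2, 0.5])

mu = np.sum(values * probs)                          # expectation of X
var_definition = np.sum((values - mu) ** 2 * probs)  # expected squared deviation from mu
var_shortcut = np.sum(values ** 2 * probs) - mu ** 2 # expectation of X^2 minus mu^2
sigma = np.sqrt(var_definition)                      # standard deviation

print(f"variance (definition): {var_definition:.4f}")
print(f"variance (shortcut):   {var_shortcut:.4f}")
print(f"standard deviation:    {sigma:.4f}")
```

Both variance values should coincide up to floating-point rounding.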
10.6. Exercises
Exercise 10.1
Show that the function
is a probability mass function.
Exercise 10.2
Which of the following functions are probability density functions?
\(f(x)=\begin{cases} 0& \text{if }x<0\\ x\exp(-x)& \text{if }x\ge 0\end{cases}\)
\(f(x)=\begin{cases} 1/4& \text{if }-2\le x\le 2\\ 0& \text{otherwise}\end{cases}\)
\(f(x)=\begin{cases} \frac{3}{4}(x^2-1)& \text{if }-2\le x\le 2\\ 0& \text{otherwise} \end{cases}\)
Exercise 10.3
Deduce (10.1) from the definition of the variance.
Exercise 10.4
Consider a continuous random variable \(X\) with probability density function
Compute the expectation and the variance of \(X\).