A random variable \(X\) has a Gaussian (normal) distribution with mean \(\mu\) and variance \(\sigma^2\), written \(X \sim N(\mu,\sigma^2)\), if \(X\) has density
\[\begin{equation*} f(x)=\frac{1}{\sqrt{2\pi}\sigma}\exp\left\{-\frac{1}{2 \sigma^2}(x-\mu)^2\right\}. \end{equation*}\] When \(\mu=0\) and \(\sigma=1\), we say \(X \sim N(0,1)\) has a standard normal distribution.
If \(X \sim N(\mu,\sigma^2)\), then \(\frac{X-\mu}{\sigma} \sim N(0,1)\). For the standard normal distribution, 95% of the probability mass falls between -1.96 and 1.96 (within roughly 2 standard deviations of the mean). Much of hypothesis testing is based on this fact.
The normal distribution is symmetric about its mean.
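As a quick numerical check of the 95% fact and of standardization, here is a small sketch in Python (assuming numpy and scipy are available; the particular \(\mu\) and \(\sigma\) are made up for illustration):

```python
import numpy as np
from scipy import stats

# 95% of the standard normal probability mass lies between -1.96 and 1.96
print(stats.norm.cdf(1.96) - stats.norm.cdf(-1.96))   # about 0.95

# Standardizing: if X ~ N(mu, sigma^2), then (X - mu) / sigma ~ N(0, 1)
rng = np.random.default_rng(0)
mu, sigma = 5.0, 2.0                # made-up values for illustration
x = rng.normal(mu, sigma, size=100_000)
z = (x - mu) / sigma
print(z.mean(), z.std())            # close to 0 and 1
```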
A linear transformation of a multivariate normal random vector is again multivariate normal. Suppose \(X \sim N_n(\mu,\Sigma)\). If \(A_{r \times n}\) is a matrix of constants and \(b_{r \times 1}\) is a vector of constants, then \(Y=AX+b\) has the multivariate normal distribution \(Y \sim N_r(A \mu+b,A \Sigma A')\).
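As a numerical sketch of this result (Python, assuming numpy is available; the particular \(\mu\), \(\Sigma\), \(A\), and \(b\) below are made up for illustration), we can compare the theoretical mean \(A\mu + b\) and covariance \(A\Sigma A'\) with simulated values:

```python
import numpy as np

rng = np.random.default_rng(1)

# X ~ N_3(mu, Sigma): an arbitrary 3-dimensional multivariate normal
mu = np.array([1.0, 0.0, -1.0])
Sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])

# Y = A X + b, where A is 2 x 3 and b is 2 x 1
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, -1.0]])
b = np.array([0.5, -0.5])

mean_Y = A @ mu + b       # theoretical mean A mu + b
cov_Y = A @ Sigma @ A.T   # theoretical covariance A Sigma A'

# Simulation check: transform draws of X and compare moments
X = rng.multivariate_normal(mu, Sigma, size=200_000)
Y = X @ A.T + b
print(mean_Y, Y.mean(axis=0))        # simulated mean should be close
print(cov_Y)
print(np.cov(Y, rowvar=False))       # simulated covariance should be close
```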
A linear combination of independent multivariate normal random vectors is multivariate normal. Suppose \(X_1, \ldots, X_k\) are independent with \(X_i \sim N_n(\mu_i,\Sigma_i)\), \(i=1,\ldots,k\). Suppose \(a_1, \ldots, a_k\) are scalars and define \[Y=a_1 X_1 + \ldots + a_k X_k.\] Then \(Y \sim N_n(\mu^*,\Sigma^*)\), where \(\mu^*=\sum_{i=1}^k a_i \mu_i\) and \(\Sigma^*=\sum_{i=1}^k a_i^2 \Sigma_i\).
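For instance, a common special case (not spelled out above, but it follows directly from this result): taking \(k=2\) with \(a_1=1\) and \(a_2=-1\) gives the difference of two independent multivariate normals,
\[Y = X_1 - X_2 \sim N_n\left(\mu_1 - \mu_2,\; \Sigma_1 + \Sigma_2\right),\]
where the covariance matrices add (because \(a_2^2 = 1\)) even though the means are subtracted.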
We often make assumptions in hierarchical models that certain quantities follow normal distributions – these could be univariate or multivariate responses, random effects, or prior parameters. Sometimes we might assume certain quantities follow conditional normal distributions as well. Stay tuned for more details!
A random variable \(X\) has a chi-squared distribution with \(n\) degrees of freedom, written \(X \sim \chi^2(n)\), if \(X\) has density \[\begin{equation*} f(x)=\frac{1}{2^{n/2}\Gamma(n/2)}x^{n/2-1}e^{-x/2}, \quad x>0, \end{equation*}\] where \(\Gamma(a)\) is the complete gamma function, given by \(\Gamma(a)=\int_0^\infty x^{a-1} e^{-x} dx\). The chi-squared distribution is asymmetric and restricted to positive numbers. Its degrees of freedom determine the mean and variance of the distribution.
The chi-squared distribution is related to the normal distribution. If the random variable \(Z \sim N(0,1)\), then \(Z^2 \sim \chi^2(1)\). In addition, if \(Z_1, Z_2, \ldots, Z_n\) are independent, identically distributed \(N(0,1)\) random variables, then \(W=\sum_{i=1}^n Z_i^2\) has a chi-squared distribution with \(n\) degrees of freedom; that is, \(W \sim \chi^2(n)\). The mean of a \(\chi^2(n)\) distribution is \(n\), and its variance is \(2n\).
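Here is a small simulation sketch (Python, assuming numpy and scipy are available) checking these facts for \(n = 5\):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n = 5                                     # degrees of freedom
Z = rng.standard_normal(size=(100_000, n))
W = (Z ** 2).sum(axis=1)                  # W = sum of n squared N(0,1) draws

print(W.mean(), W.var())                  # close to n = 5 and 2n = 10
# Compare a simulated quantile with the chi-squared quantile
print(np.quantile(W, 0.95), stats.chi2.ppf(0.95, df=n))
```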
The chi-squared distribution comes up a lot in testing using frequentist hierarchical and multilevel models, and mixtures of chi-squared distributions are important in testing variance components in frequentist multilevel models.
This distribution has a great story!
If \(Z \sim N(0,1)\) and \(W \sim \chi^2(n)\) are independent, then \[\begin{equation*} T=\frac{Z}{\sqrt{W/n}} \end{equation*}\] has a \(t\) distribution with \(n\) degrees of freedom. We write this as \(T \sim t(n)\). Like the standard normal, the \(t\) distribution is symmetric about 0.
The degrees of freedom \(n\) determine the amount of variability in the \(t\) distribution. As the degrees of freedom increase, the variability of the \(t\) distribution decreases; in fact, as the degrees of freedom get large, the \(t\) distribution approaches the standard normal distribution. With smaller degrees of freedom, the \(t\) distribution resembles a normal distribution with fatter tails.
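A quick sketch of this convergence (Python, scipy assumed), comparing the 97.5th percentile of \(t(n)\) with the standard normal value of about 1.96:

```python
from scipy import stats

# 97.5th percentile of t(n) for increasing degrees of freedom
for df in [1, 2, 5, 10, 30, 100]:
    print(df, round(stats.t.ppf(0.975, df), 3))
print("normal", round(stats.norm.ppf(0.975), 3))   # about 1.96
```

The small-df percentiles are much larger than 1.96 (fatter tails), and they shrink toward the normal value as the degrees of freedom grow.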
A \(t(1)\) distribution, which has 1 degree of freedom, is called a Cauchy distribution; it is not well-behaved in that its tails are so heavy that its mean and variance do not even exist.
Ahh, we see the t-distribution a lot – it’s a standard distribution in frequentist testing when we use 1 df tests in linear models and ANOVA.
In case you’re getting tired and feeling demoralized by this review of distributions, you might want to take a break and read about a big mistake in Fisher’s understanding. Even a brilliant scientist can make huge mistakes, so take heart!
If \(W_1 \sim \chi^2(n_1)\) and \(W_2 \sim \chi^2(n_2)\) are independent, then \[\begin{equation*} F=\frac{W_1/n_1}{W_2/n_2} \end{equation*}\] has a central \(F\) distribution with \((n_1,n_2)\) degrees of freedom. We write this \(F \sim F(n_1,n_2)\).
We call \(n_1\) the numerator degrees of freedom and \(n_2\) the denominator degrees of freedom. If \(T \sim t(\nu)\), then \(T^2 \sim F(1,\nu)\). The \(F\) distribution is asymmetric and restricted to positive numbers.
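A quick numeric check of the relationship \(T^2 \sim F(1,\nu)\) via matching quantiles (Python, scipy assumed; \(\nu = 8\) is an arbitrary choice for illustration):

```python
from scipy import stats

nu = 8
# If T ~ t(nu), then T^2 ~ F(1, nu): the squared two-sided t critical value
# equals the upper-tail F critical value.
t_crit = stats.t.ppf(0.975, nu)          # two-sided 5% critical value for t(nu)
f_crit = stats.f.ppf(0.95, 1, nu)        # upper 5% critical value for F(1, nu)
print(t_crit ** 2, f_crit)               # both approximately 5.32
```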
The F distribution is the workhorse of testing hypotheses about the mean in linear models and ANOVA.