Consider the linear regression model:
\[
\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon},
\]
where \(\mathbf{y}\) is an \(n \times 1\) vector of observations, \(\mathbf{X}\) is an \(n \times p\) matrix of predictors, \(\boldsymbol{\beta}\) is a \(p \times 1\) vector of coefficients, and \(\boldsymbol{\varepsilon}\) is an \(n \times 1\) vector of errors.
Assume \(\boldsymbol{\varepsilon} \sim \mathcal{N}(\mathbf{0}, \sigma^2\mathbf{I})\).
The OLS estimator is given by:
\[
\hat{\boldsymbol{\beta}} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y}.
\]
Substituting the model equation \(\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}\):
\[
\hat{\boldsymbol{\beta}} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'(\mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}) = \boldsymbol{\beta} + (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\boldsymbol{\varepsilon}.
\]
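The closed-form estimator above can be computed directly. The sketch below uses a small simulated data set (the design matrix, true coefficients, and noise level are all illustrative choices, not from the text); note that `np.linalg.solve` on the normal equations is used in place of forming \((\mathbf{X}'\mathbf{X})^{-1}\) explicitly, which is numerically preferable:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: n = 100 observations, p = 3 predictors (incl. intercept)
n, p = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta = np.array([1.0, 2.0, -0.5])        # true coefficients (assumed)
sigma = 0.3
eps = rng.normal(scale=sigma, size=n)    # errors drawn from N(0, sigma^2 I)
y = X @ beta + eps

# OLS estimate: solve (X'X) beta_hat = X'y rather than inverting X'X
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat)  # close to the true beta
```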
The term \((\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\boldsymbol{\varepsilon}\) is a linear combination of the elements of \(\boldsymbol{\varepsilon}\), which are normally distributed. By the properties of multivariate normal distributions, any linear combination of normally distributed variables is also normally distributed.
We can characterize this distribution by its first two moments. Since \(\mathbb{E}[\boldsymbol{\varepsilon}] = \mathbf{0}\) and \(\operatorname{Var}[\boldsymbol{\varepsilon}] = \sigma^2\mathbf{I}\), and \(\mathbf{X}\) is treated as fixed:
\[
\mathbb{E}\left[(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\boldsymbol{\varepsilon}\right] = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\,\mathbb{E}[\boldsymbol{\varepsilon}] = \mathbf{0},
\]
\[
\operatorname{Var}\left[(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\boldsymbol{\varepsilon}\right] = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'(\sigma^2\mathbf{I})\mathbf{X}(\mathbf{X}'\mathbf{X})^{-1} = \sigma^2(\mathbf{X}'\mathbf{X})^{-1}.
\]
Thus, \((\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\boldsymbol{\varepsilon} \sim \mathcal{N}(\mathbf{0}, \sigma^2(\mathbf{X}'\mathbf{X})^{-1})\).
Therefore, \(\hat{\boldsymbol{\beta}}\) is the sum of a constant vector \(\boldsymbol{\beta}\) and a normally distributed random vector. By the properties of normal distributions, this results in a normally distributed random vector with a shifted mean.
Conclusion: The OLS estimator \(\hat{\boldsymbol{\beta}}\) follows a multivariate normal distribution:
\[
\hat{\boldsymbol{\beta}} \sim \mathcal{N}\left(\boldsymbol{\beta},\; \sigma^2(\mathbf{X}'\mathbf{X})^{-1}\right).
\]
This result shows that the OLS estimator \(\hat{\boldsymbol{\beta}}\) is normally distributed around the true parameter \(\boldsymbol{\beta}\), with a variance-covariance matrix \(\sigma^2(\mathbf{X}'\mathbf{X})^{-1}\), given that the errors are normally distributed.
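As a numerical check of this result, one can hold a design matrix fixed, redraw the errors many times, and compare the empirical mean and covariance of \(\hat{\boldsymbol{\beta}}\) with \(\boldsymbol{\beta}\) and \(\sigma^2(\mathbf{X}'\mathbf{X})^{-1}\). A minimal Monte Carlo sketch, with an assumed two-column design:

```python
import numpy as np

rng = np.random.default_rng(1)

# Fixed hypothetical design (n = 200, intercept + one predictor)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta = np.array([0.5, -1.0])   # true coefficients (assumed)
sigma = 1.0

# Redraw the errors and re-estimate beta_hat many times
draws = np.empty((5000, 2))
for i in range(5000):
    y = X @ beta + rng.normal(scale=sigma, size=n)
    draws[i] = np.linalg.solve(X.T @ X, X.T @ y)

theory = sigma**2 * np.linalg.inv(X.T @ X)   # sigma^2 (X'X)^{-1}
empirical = np.cov(draws, rowvar=False)

print(draws.mean(axis=0))  # close to beta
print(empirical)           # close to theory
```

With the design fixed across replications, the sample mean of the draws should sit near the true \(\boldsymbol{\beta}\) and the sample covariance near the theoretical matrix, up to Monte Carlo error.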