Detailed Derivation of Likelihood Function in Linear Regression

Starting Point: Linear Regression Model

We begin with the linear regression model:

\[ \mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon} \]

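where \(\mathbf{y}\) is the \(n \times 1\) vector of observed responses, \(\mathbf{X}\) is the \(n \times p\) matrix of predictors, \(\boldsymbol{\beta}\) is the \(p \times 1\) vector of coefficients, and \(\boldsymbol{\varepsilon}\) is the \(n \times 1\) vector of errors.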

Assumptions

We assume that the errors are normally distributed:

\[ \boldsymbol{\varepsilon} \sim \mathcal{N}(\mathbf{0}, \sigma^2\mathbf{I}) \]

This means the errors \(\varepsilon_i\) are independent and identically distributed as \(\mathcal{N}(0, \sigma^2)\).
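To make the model and its error assumption concrete, the following is a minimal simulation sketch. The sample size, coefficient values, error standard deviation, and random seed are all hypothetical choices made for illustration; none of them come from the derivation itself.

```python
import numpy as np

rng = np.random.default_rng(0)      # fixed seed, chosen arbitrarily

n, p = 200, 3                       # hypothetical sample size and number of predictors
sigma = 0.5                         # hypothetical error standard deviation
beta = np.array([1.0, -2.0, 0.5])   # hypothetical "true" coefficients

X = rng.normal(size=(n, p))                     # predictors; any fixed matrix would do
eps = rng.normal(loc=0.0, scale=sigma, size=n)  # i.i.d. N(0, sigma^2) errors
y = X @ beta + eps                              # the model y = X beta + eps

# Subtracting X beta from y recovers the errors (up to floating-point rounding),
# so their sample mean and variance should be close to 0 and sigma^2.
residuals = y - X @ beta
print(residuals.mean(), residuals.var())
```

With a few hundred observations, the printed mean and variance should land near 0 and \(\sigma^2 = 0.25\), though the exact values depend on the seed.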

Deriving the Likelihood Function

1. Given \(\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}\), we can write:

\[ \boldsymbol{\varepsilon} = \mathbf{y} - \mathbf{X}\boldsymbol{\beta} \]

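2. Since \(\boldsymbol{\varepsilon} \sim \mathcal{N}(\mathbf{0}, \sigma^2\mathbf{I})\) and \(\mathbf{y}\) is \(\boldsymbol{\varepsilon}\) shifted by the fixed vector \(\mathbf{X}\boldsymbol{\beta}\), \(\mathbf{y}\) given \(\boldsymbol{\beta}\) and \(\mathbf{X}\) follows a multivariate normal distribution with mean \(\mathbf{X}\boldsymbol{\beta}\) and the same covariance: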

\[ \mathbf{y} | \boldsymbol{\beta}, \mathbf{X} \sim \mathcal{N}(\mathbf{X}\boldsymbol{\beta}, \sigma^2\mathbf{I}) \]

3. The probability density function of a multivariate normal distribution \(\mathcal{N}(\boldsymbol{\mu}, \boldsymbol{\Sigma})\) is given by:

\[ f(\mathbf{x}) = (2\pi)^{-n/2} |\boldsymbol{\Sigma}|^{-1/2} \exp\left(-\frac{1}{2}(\mathbf{x} - \boldsymbol{\mu})^T \boldsymbol{\Sigma}^{-1} (\mathbf{x} - \boldsymbol{\mu})\right) \]

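4. In our case, \(\mathbf{x} = \mathbf{y}\), \(\boldsymbol{\mu} = \mathbf{X}\boldsymbol{\beta}\), and \(\boldsymbol{\Sigma} = \sigma^2\mathbf{I}\).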

5. Note that \(|\sigma^2\mathbf{I}| = (\sigma^2)^n\), because the determinant of a diagonal matrix is the product of its \(n\) diagonal entries, and \((\sigma^2\mathbf{I})^{-1} = \frac{1}{\sigma^2}\mathbf{I}\).

6. Substituting these into the multivariate normal PDF:

\[ \begin{aligned} P(\mathbf{y}|\boldsymbol{\beta}, \mathbf{X}) &= (2\pi)^{-n/2} |\sigma^2\mathbf{I}|^{-1/2} \exp\left(-\frac{1}{2}(\mathbf{y} - \mathbf{X}\boldsymbol{\beta})^T (\sigma^2\mathbf{I})^{-1} (\mathbf{y} - \mathbf{X}\boldsymbol{\beta})\right) \\[10pt] &= (2\pi)^{-n/2} (\sigma^2)^{-n/2} \exp\left(-\frac{1}{2\sigma^2}(\mathbf{y} - \mathbf{X}\boldsymbol{\beta})^T (\mathbf{y} - \mathbf{X}\boldsymbol{\beta})\right) \\[10pt] &= (2\pi\sigma^2)^{-n/2} \exp\left(-\frac{1}{2\sigma^2}(\mathbf{y} - \mathbf{X}\boldsymbol{\beta})^T(\mathbf{y} - \mathbf{X}\boldsymbol{\beta})\right) \end{aligned} \]
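The steps above can be checked numerically. The sketch below, using small hypothetical values for \(n\), \(\sigma^2\), \(\mathbf{X}\), and \(\boldsymbol{\beta}\), confirms the determinant and inverse identities from step 5 and then compares three ways of computing the same density: the general multivariate normal formula, the simplified closed form, and the product of \(n\) univariate normal densities.

```python
import numpy as np

rng = np.random.default_rng(1)      # fixed seed, chosen arbitrarily

n, p = 5, 2                         # hypothetical dimensions
sigma2 = 0.8                        # hypothetical error variance sigma^2
X = rng.normal(size=(n, p))
beta = np.array([0.5, -1.0])        # hypothetical coefficients
y = X @ beta + rng.normal(scale=np.sqrt(sigma2), size=n)

Sigma = sigma2 * np.eye(n)          # covariance sigma^2 I
r = y - X @ beta                    # residual vector y - X beta

# Step 5: |sigma^2 I| = (sigma^2)^n and (sigma^2 I)^{-1} = I / sigma^2
det_ok = np.isclose(np.linalg.det(Sigma), sigma2 ** n)
inv_ok = np.allclose(np.linalg.inv(Sigma), np.eye(n) / sigma2)

# General multivariate normal PDF with mu = X beta and Sigma = sigma^2 I
pdf_general = (
    (2 * np.pi) ** (-n / 2)
    * np.linalg.det(Sigma) ** (-0.5)
    * np.exp(-0.5 * (r @ np.linalg.inv(Sigma) @ r))
)

# Simplified closed form from the last line of the derivation
pdf_closed = (2 * np.pi * sigma2) ** (-n / 2) * np.exp(-(r @ r) / (2 * sigma2))

# Product of n independent univariate N(0, sigma^2) densities of the residuals
pdf_product = np.prod(np.exp(-r ** 2 / (2 * sigma2)) / np.sqrt(2 * np.pi * sigma2))

print(det_ok, inv_ok)
print(np.isclose(pdf_general, pdf_closed), np.isclose(pdf_closed, pdf_product))
```

Agreement with the product of univariate densities reflects the independence of the errors: a diagonal covariance lets the joint density factor into one term per observation.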

Final Result

Thus, we arrive at the likelihood function:

\[ P(\mathbf{y}|\boldsymbol{\beta}, \mathbf{X}) = (2\pi\sigma^2)^{-n/2} \exp\left(-\frac{1}{2\sigma^2}(\mathbf{y} - \mathbf{X}\boldsymbol{\beta})^T(\mathbf{y} - \mathbf{X}\boldsymbol{\beta})\right) \]

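Viewed as a function of \(\boldsymbol{\beta}\) with the data held fixed, this expression is the likelihood. Because \(\mathbf{y}\) is continuous, it is the joint probability density of the observed data \(\mathbf{y}\) given the parameters \(\boldsymbol{\beta}\) and the predictors \(\mathbf{X}\), rather than a probability, under the assumption of independent, normally distributed errors with common variance \(\sigma^2\).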
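As a small illustration of how the final expression behaves, the sketch below evaluates it at two candidate coefficient vectors. The helper name `likelihood`, the tiny data set, and the value of \(\sigma^2\) are hypothetical. The data are constructed to lie exactly on the line defined by `beta_a`, so that candidate has zero residuals.

```python
import numpy as np

def likelihood(y, X, beta, sigma2):
    """Evaluate (2*pi*sigma^2)^(-n/2) * exp(-||y - X beta||^2 / (2*sigma^2))."""
    n = y.shape[0]
    r = y - X @ beta
    return (2 * np.pi * sigma2) ** (-n / 2) * np.exp(-(r @ r) / (2 * sigma2))

X = np.array([[1.0, 0.0],
              [1.0, 1.0]])          # hypothetical 2 x 2 predictor matrix
beta_a = np.array([1.0, 2.0])       # candidate that generated the data
beta_b = np.array([0.0, 0.0])       # a poor candidate
y = X @ beta_a                      # noise-free data, for illustration only
sigma2 = 0.01                       # hypothetical small error variance

L_a = likelihood(y, X, beta_a, sigma2)
L_b = likelihood(y, X, beta_b, sigma2)
print(L_a, L_b)
```

Here `L_a` equals \(1/(2\pi \cdot 0.01)\), which is greater than 1; this is possible because the value is a density, not a probability. `L_b` is far smaller, since the residuals under `beta_b` are large relative to \(\sigma\). For large \(n\) the raw value can underflow in floating point, which is one reason to work with its logarithm in practice.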