MAP estimate of a linear model

This one is simple, elegant and a nice bridge between the frequentist and Bayesian worlds.

\begin{aligned}
θ_{MAP} & = \arg\max_{θ} \; p(θ | D) \\
& = \arg\max_{θ} \; \frac{p(D | θ) \, p(θ)}{p(D)} \\
& = \arg\max_{θ} \; p(D | θ) \, p(θ) \\
& = \arg\max_{θ} \; \left[ \log p(D | θ) + \log p(θ) \right] \\
& = \arg\max_{θ} \; \left[ \log \prod_{i = 1}^{N} p(y_{i} | x_{i}, θ) + \log p(θ) \right] \\
& = \arg\max_{θ} \; \left[ \sum_{i = 1}^{N} \log p(y_{i} | x_{i}, θ) + \log p(θ) \right] \\
& = \arg\max_{θ} \; \left[ \sum_{i = 1}^{N} \log \mathcal{N}(y_{i} | θ^{T} x_{i}, σ^{2}) + \log \mathcal{N}(θ | 0, α^{2} I) \right] \\
& = \arg\max_{θ} \; \left[ - \frac{1}{2 σ^{2}} \sum_{i = 1}^{N} (y_{i} - θ^{T} x_{i})^{2} - \frac{1}{2 α^{2}} θ^{T} θ \right] \\
& = \arg\min_{θ} \; \left[ \sum_{i = 1}^{N} (y_{i} - θ^{T} x_{i})^{2} + \frac{σ^{2}}{α^{2}} θ^{T} θ \right] \\
& = \arg\min_{θ} \; \left[ \sum_{i = 1}^{N} (y_{i} - θ^{T} x_{i})^{2} + λ θ^{T} θ \right]
\end{aligned}

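This is the well-known ridge regression objective, where $λ = \frac{σ^{2}}{α^{2}}$ is the regularisation parameter. A tighter prior (smaller $α^{2}$) or noisier observations (larger $σ^{2}$) both increase $λ$ and shrink the estimate more strongly towards zero.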
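As a quick numerical sanity check of the derivation, here is a minimal NumPy sketch. It relies on the standard closed-form minimiser of the ridge objective, $(X^{T} X + λ I)^{-1} X^{T} y$, which is not derived above, and then checks that the gradient of the negative log posterior vanishes at that point. The function names `ridge_map` and `neg_log_posterior_grad`, the synthetic data, and the values chosen for $σ^{2}$ and $α^{2}$ are all illustrative assumptions.

```python
import numpy as np


def ridge_map(X, y, sigma2, alpha2):
    """MAP estimate for Gaussian noise (variance sigma2) and a
    zero-mean isotropic Gaussian prior (variance alpha2)."""
    lam = sigma2 / alpha2  # regularisation strength from the derivation
    d = X.shape[1]
    # Solve (X^T X + lam * I) theta = X^T y rather than forming an inverse.
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)


def neg_log_posterior_grad(theta, X, y, sigma2, alpha2):
    """Gradient of -log p(D | theta) - log p(theta) with respect to theta."""
    residual = y - X @ theta
    return -(X.T @ residual) / sigma2 + theta / alpha2


# Synthetic data drawn from the assumed model (illustrative values only).
rng = np.random.default_rng(0)
N, d = 200, 3
sigma2, alpha2 = 0.25, 4.0
theta_true = rng.normal(0.0, np.sqrt(alpha2), size=d)
X = rng.normal(size=(N, d))
y = X @ theta_true + rng.normal(0.0, np.sqrt(sigma2), size=N)

theta_map = ridge_map(X, y, sigma2, alpha2)
grad = neg_log_posterior_grad(theta_map, X, y, sigma2, alpha2)

print("MAP estimate:", theta_map)
print("max |gradient| at the MAP estimate:", np.abs(grad).max())
```

The gradient should be zero up to floating-point error, which confirms that the ridge solution with $λ = σ^{2} / α^{2}$ is a stationary point of the log posterior. Because the objective is convex, that stationary point is the MAP estimate.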