# Maximum Likelihood

Let's start with the formal definition of the likelihood function:

$L_{n}(\theta )=L_{n}(\theta ;\mathbf {y} )=f_{n}(\mathbf {y} ;\theta )$

where $$\theta$$ is a vector of parameters $$\theta =\left[\theta _{1},\,\theta _{2},\,\ldots ,\,\theta _{k}\right]^{\mathsf {T}}$$ and $$f_{n}$$ is the joint density evaluated at the observed data sample $$\mathbf {y} =(y_{1},y_{2},\ldots ,y_{n})$$.

When the observations are independent, the joint density $$f_{n}(\mathbf {y} ;\theta)$$ is simply the product of the individual density functions.
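
Written out, assuming the observations are independent and share a common density $$f$$ (the symbols $$f$$ and $$\ell _{n}$$ are notation introduced here for illustration), the product and its logarithm take the form:

$f_{n}(\mathbf {y} ;\theta )=\prod _{i=1}^{n}f(y_{i};\theta )$

$\ell _{n}(\theta )=\log L_{n}(\theta )=\sum _{i=1}^{n}\log f(y_{i};\theta )$

The sum is usually easier to work with than the product, which can underflow numerically when $$n$$ is large.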

Intuitively, the idea of maximum likelihood estimation is to find the model parameters, within the parameter space $$\Theta$$, that maximize the likelihood:

${\hat {\theta }={\underset {\theta \in \Theta }{\operatorname {arg\;max} }}\ L_{n}(\theta \,;\mathbf {y} )}$
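
To make the arg max concrete, here is a minimal sketch under assumptions that are not part of the text above: the data are taken to be independent draws from an exponential distribution with a single parameter `rate`, the sample values are made up, and the helper names `log_likelihood` and `argmax_golden` are hypothetical. It maximizes the logarithm of the likelihood, which has the same maximizer as the likelihood itself, using a simple golden-section search, and compares the result with the known closed-form estimate for this model (the reciprocal of the sample mean).

```python
import math


def log_likelihood(rate, sample):
    """Log-likelihood of an independent exponential sample.

    Each observation contributes log f(y; rate) = log(rate) - rate * y.
    """
    if rate <= 0:
        return -math.inf
    return sum(math.log(rate) - rate * y for y in sample)


def argmax_golden(func, lo, hi, tol=1e-8):
    """Golden-section search for the maximizer of a unimodal function on [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if func(c) > func(d):
            b = d  # the maximum lies in [a, d]
        else:
            a = c  # the maximum lies in [c, b]
    return (a + b) / 2


# Made-up observations, used only for illustration.
sample = [0.5, 1.2, 0.3, 2.0, 0.8]

# Numerical maximum likelihood estimate over the assumed range (0, 10].
rate_hat = argmax_golden(lambda r: log_likelihood(r, sample), 1e-6, 10.0)

# Closed-form estimate for the exponential model: n / sum(y).
closed_form = len(sample) / sum(sample)

print(rate_hat, closed_form)
```

The two printed values should agree closely. For models without a closed-form solution, only the numerical route is available, and a general-purpose optimizer would typically replace the one-dimensional search used here.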