Likelihood Function
The likelihood function (often simply called the likelihood) is the Joint Probability of the observed data viewed as a function of the parameters of the chosen statistical model.
“Probability of what you see given your model”
- $\mathcal{L}(\theta \mid x) = p(x \mid \theta)$, where $p(x \mid \theta)$ is the Probability Density Function
So likelihood is just Joint Probability?
When we use the term likelihood, we're not treating $\theta$ as fixed: the data $x$ is held fixed, and the parameters $\theta$ vary.
Key idea for MLE:
- $x$ is your data
- $\theta$ are your model parameters
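A minimal sketch of the MLE idea, assuming i.i.d. Gaussian data (the parameter values, variable names, and the closed-form estimators $\hat\mu = \bar{x}$, $\hat\sigma^2 = \frac{1}{n}\sum (x_i - \bar{x})^2$ are standard, but the example setup is mine):

```python
import numpy as np

# Observed data x is fixed; we pick the theta = (mu, sigma2) that
# maximizes the log-likelihood sum_i log f(x_i | theta).
rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=2.0, size=10_000)  # data (fixed)

# For a Gaussian, the MLE has a closed form:
mu_hat = x.mean()        # MLE of the mean
sigma2_hat = x.var()     # MLE of the variance (the 1/n version)

def log_likelihood(x, mu, sigma2):
    # sum of log Gaussian densities log f(x_i | mu, sigma2)
    return np.sum(-0.5 * np.log(2 * np.pi * sigma2)
                  - (x - mu) ** 2 / (2 * sigma2))

# The MLE scores at least as high as any other candidate parameters:
assert log_likelihood(x, mu_hat, sigma2_hat) >= log_likelihood(x, 0.0, 1.0)
```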
The other way around, $p(\theta \mid x)$, is the Posterior.
“It can be shown that minimizing the KL Divergence is equivalent to minimizing the Negative Log Likelihood, which is what we usually do when training a classifier, for example.”
- YES I think I finally get that
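A short sketch of why that holds, assuming the usual setup of fitting a model $p_\theta$ to samples from the true data distribution $p_{\text{data}}$:

$$
D_{\mathrm{KL}}(p_{\text{data}} \,\|\, p_\theta)
= \underbrace{\mathbb{E}_{x \sim p_{\text{data}}}\!\left[\log p_{\text{data}}(x)\right]}_{\text{constant in } \theta}
\; - \; \mathbb{E}_{x \sim p_{\text{data}}}\!\left[\log p_\theta(x)\right]
$$

The first term does not depend on $\theta$, so minimizing the KL Divergence over $\theta$ is the same as minimizing $-\mathbb{E}\left[\log p_\theta(x)\right]$; replacing the expectation with an average over the training set gives exactly the average Negative Log Likelihood.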
Likelihood Function (Definition)
If $X = (X_1, \dots, X_n)$, where $X_i$ are i.i.d. RVs with observations $x = (x_1, \dots, x_n)$, then
$$
\mathcal{L}(\theta \mid x) = f(x \mid \theta) = \prod_{i=1}^{n} f(x_i \mid \theta)
$$
- $f(x_i \mid \theta)$ is the Probability Density Function
- $\theta$ are the parameters we are trying to estimate; which parameters they are depends on the distribution. Ex: for a Gaussian Distribution, $\theta = (\mu, \sigma^2)$
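The product in the definition can be written out directly. A small sketch with made-up observations and an assumed Gaussian $f$ (note the product of many densities underflows quickly, which is why the log-likelihood is used in practice):

```python
import numpy as np

def gaussian_pdf(x, mu, sigma2):
    # f(x | theta) for theta = (mu, sigma2)
    return np.exp(-(x - mu) ** 2 / (2 * sigma2)) / np.sqrt(2 * np.pi * sigma2)

x = np.array([4.1, 5.3, 4.8, 5.9, 5.0])  # made-up observations

# L(theta | x) = prod_i f(x_i | theta)
L = np.prod(gaussian_pdf(x, mu=5.0, sigma2=1.0))

# Equivalent, numerically safer: log L = sum_i log f(x_i | theta)
log_L = np.sum(np.log(gaussian_pdf(x, mu=5.0, sigma2=1.0)))

assert np.isclose(np.log(L), log_L)
```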
Probability vs. Likelihood
- Probability is assigning the probability of a data value given a distribution, i.e. $p(x \mid \theta)$ with $\theta$ fixed
- Likelihood is the probability of a distribution given data values, i.e. it measures how well a distribution explains the data: $\mathcal{L}(\theta \mid x)$ with $x$ fixed

Probability treats data as variable, and likelihood treats parameters as variable.
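The two readings use the same density function, just with a different argument held fixed. A minimal illustration, assuming a Gaussian density (the specific numbers are made up):

```python
import numpy as np

def f(x, mu, sigma2=1.0):
    # One Gaussian density f(x | mu, sigma2), read two ways below.
    return np.exp(-(x - mu) ** 2 / (2 * sigma2)) / np.sqrt(2 * np.pi * sigma2)

# Probability view: fix theta (mu = 0), sweep over data values x.
density_curve = [f(x, mu=0.0) for x in (-1.0, 0.0, 1.0)]

# Likelihood view: fix the datum (x = 1), sweep over the parameter mu.
likelihood_curve = [f(1.0, mu) for mu in (-1.0, 0.0, 1.0)]

# For this single datum, the likelihood peaks at mu = x = 1.
assert max(likelihood_curve) == f(1.0, 1.0)
```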
https://www.youtube.com/watch?v=pYxNSUDSFH4&ab_channel=StatQuestwithJoshStarmer