# Bayesian Inference

Bayesian inference is a method for updating the probability of an unknown quantity as we gather more data.

The basic setup is the following:

- We start with some guess about the distribution of $θ$ (the unknown quantity we want to estimate), which we call the Prior Distribution
- Then, we observe data and update our guessed distribution using Bayes’ Rule

Recommendation by soham: https://www.probabilitycourse.com/chapter9/9_1_0_bayesian_inference.php

The core of Bayesian Inference is to combine two different distributions (the Likelihood and the Prior) into one “smarter” distribution (the Posterior).
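Concretely, this combination is Bayes’ Rule:

$$
P(θ \mid X) = \frac{P(X \mid θ)\, P(θ)}{P(X)}
$$

where $P(X \mid θ)$ is the likelihood, $P(θ)$ is the prior, and the denominator $P(X) = \int P(X \mid θ)\, P(θ)\, dθ$ is a normalizing constant that does not depend on $θ$.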

Bayesian Inference has three steps:

- Calculate the Prior: Choose a PDF to model your parameter $θ$, aka the prior distribution $P(θ)$. This is your best guess about the parameter before seeing the data $X$.
- Calculate the Likelihood: Choose a PDF for $P(X∣θ)$. Here you are modeling how the data $X$ will look given the parameter $θ$.
- Calculate the Posterior: Calculate the posterior distribution $P(θ∣X)$ and pick the $θ$ that has the highest $P(θ∣X)$ (this choice is called the MAP estimate).
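The three steps can be sketched numerically. Below is a minimal, hypothetical example (not from the source): estimating a coin’s heads-probability $θ$ from 8 heads in 10 flips, using a Beta(2, 2) prior and a grid of candidate $θ$ values instead of a closed-form posterior.

```python
import numpy as np

# Grid of candidate values for theta in (0, 1), spaced 0.001 apart
theta = np.linspace(0.001, 0.999, 999)

# Step 1 — Prior P(theta): Beta(2, 2), a mild belief that the coin is fair
# (density proportional to theta * (1 - theta))
prior = theta * (1 - theta)

# Step 2 — Likelihood P(X | theta): Bernoulli model, 8 heads in 10 flips
likelihood = theta**8 * (1 - theta)**2

# Step 3 — Posterior P(theta | X) ∝ likelihood * prior, then normalize
posterior = likelihood * prior
posterior /= posterior.sum()

# Pick the theta with the highest posterior probability (the MAP estimate)
theta_map = theta[np.argmax(posterior)]
print(round(theta_map, 3))  # → 0.75
```

Note how the prior pulls the estimate toward 0.5: the raw data alone would suggest $θ = 0.8$, but the posterior peaks at 0.75 because the Beta(2, 2) prior favors a fair coin.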

Many modern Machine Learning techniques build on Bayesian Inference.