Distribution
For each distribution, we try to figure out:
- What is its support?
- What is the p.m.f.?
- What are the parameters?
- What is $E[X]$? What are $\text{Var}(X)$ and the s.d.?
- Why is it important?
Discrete vs. Continuous Distributions
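| | Discrete | Continuous |
|---|---|---|
| Expected Value | $E[X] = \sum_x x\,p(x)$ | $E[X] = \int_{-\infty}^{\infty} x\,f(x)\,dx$ |
| Variance | $\text{Var}(X) = \sum_x (x - E[X])^2\,p(x)$ | $\text{Var}(X) = \int_{-\infty}^{\infty} (x - E[X])^2\,f(x)\,dx$ |
| pmf/pdf to cdf | $F(x) = \sum_{t \le x} p(t)$ | $F(x) = \int_{-\infty}^{x} f(t)\,dt$ |
| cdf to pmf/pdf | $p(x) = F(x) - F(x^-)$ | $f(x) = F'(x)$ |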
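For the discrete column, the rows of the table can be sketched in plain Python. This is only an illustration: the fair six-sided die and the helper names (`expected_value`, `variance`, `pmf_to_cdf`, `cdf_to_pmf`) are my own assumptions, not from any library.

```python
from fractions import Fraction

# Hypothetical example: a fair six-sided die, stored as {value: probability}.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def expected_value(pmf):
    """E[X]: sum of x * p(x) over the support."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """Var(X): sum of (x - E[X])^2 * p(x) over the support."""
    mu = expected_value(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

def pmf_to_cdf(pmf):
    """F(x): running sum of p(t) for all support points t <= x."""
    total, cdf = 0, {}
    for x in sorted(pmf):
        total += pmf[x]
        cdf[x] = total
    return cdf

def cdf_to_pmf(cdf):
    """p(x): jump of F at x, i.e. F(x) minus F at the previous support point."""
    prev, out = 0, {}
    for x in sorted(cdf):
        out[x] = cdf[x] - prev
        prev = cdf[x]
    return out

print(expected_value(pmf))  # 7/2
print(variance(pmf))        # 35/12
```

Using `Fraction` keeps the arithmetic exact, so going pmf to cdf and back returns the original pmf without floating-point drift.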
Distributions
- Bernoulli Distribution
- Binomial Distribution
- Geometric Distribution
- Negative Binomial Distribution
- Normal Distribution
- Hypergeometric Distribution
- Poisson Distribution
- Multinomial Distribution
- Cauchy-Lorentz Distribution
Concepts
Measures of dispersion and symmetry
- Range = max - min
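- IQR (inter-quartile range) = $Q_3 - Q_1$
- Variance $\sigma^2$ (standard deviation $\sigma = \sqrt{\sigma^2}$)
- Skewness - measures asymmetry, i.e. whether the distribution leans towards the left or right side
- Kurtosis - measures how heavy the tails are; usually compared against the normal distribution, whose kurtosis is 3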
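A rough sketch of these measures using only the standard library. The function name `describe` is made up for this note, and it assumes the population (divide by n) versions of the moments; other quartile and sample-correction conventions give slightly different numbers.

```python
import statistics

def describe(data):
    """Dispersion and shape measures for a list of numbers.

    Assumes population moments and at least two distinct values
    (a constant sample has sd = 0, so skewness/kurtosis are undefined).
    """
    n = len(data)
    mean = statistics.fmean(data)
    var = statistics.pvariance(data, mu=mean)    # population variance
    sd = var ** 0.5                              # standard deviation
    q1, _, q3 = statistics.quantiles(data, n=4)  # quartile cut points
    # Standardized third and fourth central moments.
    skew = sum((x - mean) ** 3 for x in data) / n / sd ** 3
    kurt = sum((x - mean) ** 4 for x in data) / n / sd ** 4
    return {
        "range": max(data) - min(data),
        "iqr": q3 - q1,
        "variance": var,
        "sd": sd,
        "skewness": skew,              # 0 for a symmetric sample
        "kurtosis": kurt,              # 3 for a normal distribution
        "excess_kurtosis": kurt - 3,   # 0 for a normal distribution
    }

print(describe([1, 2, 3, 4, 5]))
```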
1D Distribution
I was having trouble understanding what a 1D distribution was, since I needed one to calculate the Wasserstein metric. I think it's just a distribution over a single variable.
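To make the 1D case concrete, here is a minimal sketch of the Wasserstein-1 distance between two empirical 1D samples. It assumes both samples have the same size and equal weights, in which case the optimal transport plan simply matches sorted values. The name `wasserstein_1d` is my own.

```python
def wasserstein_1d(xs, ys):
    """Wasserstein-1 distance between two equal-size 1D samples.

    Assumption: equal sample sizes and uniform weights, so the i-th
    smallest value of xs is matched with the i-th smallest value of ys.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch only handles equal-size samples")
    pairs = zip(sorted(xs), sorted(ys))
    return sum(abs(a - b) for a, b in pairs) / len(xs)

print(wasserstein_1d([0, 1, 3], [5, 6, 8]))  # 5.0
```

Shifting every point by 5 costs exactly 5 per unit of mass, which is why the result here is 5.0.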
2D Distribution
This is where we talk about multivariate distributions. https://en.wikipedia.org/wiki/Multivariate_normal_distribution
Distance between Distributions
How do you quantify how far apart two distributions are? There seem to be quite a few methods, which I found on Stack Overflow.
I ran into this problem while working on my Poker AI, since Euclidean distance is not the best measure for comparing distributions.
Heuristic
- Minkowski-form
- Weighted-Mean-Variance (WMV)
Nonparametric test statistics
- $\chi^2$ (Chi-Square)
- Kolmogorov-Smirnov (KS)
- Cramér-von Mises (CvM)
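As one example from this group, the two-sample Kolmogorov-Smirnov statistic is the largest gap between the two empirical CDFs. A minimal sketch (the function name `ks_statistic` is made up; it computes only the statistic, not a p-value):

```python
from bisect import bisect_right

def ks_statistic(xs, ys):
    """Two-sample KS statistic: max |F_x(t) - F_y(t)| over all t.

    The empirical CDFs only change at observed points, so it is
    enough to check the gap at every value in either sample.
    """
    xs, ys = sorted(xs), sorted(ys)

    def gap(t):
        fx = bisect_right(xs, t) / len(xs)  # fraction of xs <= t
        fy = bisect_right(ys, t) / len(ys)  # fraction of ys <= t
        return abs(fx - fy)

    return max(gap(t) for t in xs + ys)

print(ks_statistic([1, 2, 3, 4], [3, 4, 5, 6]))  # 0.5
```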
Information-theory divergences
- KL Divergence
- Jensen–Shannon divergence (metric)
- Jeffrey-divergence (numerically stable and symmetric)
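A small sketch of the first two divergences for discrete distributions given as lists of probabilities over the same bins. The function names are my own. KL is asymmetric and blows up when `q` has a zero where `p` does not; Jensen-Shannon compares both inputs to their average, which makes it symmetric and always finite (its square root is a metric).

```python
import math

def kl_divergence(p, q):
    """KL(p || q) = sum of p_i * log(p_i / q_i), in nats."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue          # 0 * log(0 / q) is taken as 0
        if qi == 0:
            return math.inf   # p puts mass where q has none
        total += pi * math.log(pi / qi)
    return total

def js_divergence(p, q):
    """Jensen-Shannon divergence: average KL to the midpoint m = (p + q) / 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

p = [0.5, 0.5]
q = [0.25, 0.75]
print(kl_divergence(p, q), kl_divergence(q, p))  # two different values
print(js_divergence(p, q), js_divergence(q, p))  # the same value twice
```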
Ground distance measures
- Histogram intersection
- Quadratic form (QF)
- Earth Mover's Distance
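A rough sketch of histogram intersection and a 1D Earth Mover's Distance, which also shows why Euclidean distance can be a poor fit for comparing distributions. It assumes both histograms are normalized, share the same bins, and that adjacent bins are one unit apart; the function names are my own.

```python
import math

def histogram_intersection(h1, h2):
    """Overlap between two normalized histograms (1.0 means identical)."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def emd_1d(h1, h2):
    """Earth Mover's Distance for 1D histograms on the same bins.

    Assumption: unit spacing between adjacent bins. The running
    surplus after each bin is the mass that must be carried on
    to the next bin, so summing its absolute value gives the cost.
    """
    carried, cost = 0.0, 0.0
    for a, b in zip(h1, h2):
        carried += a - b
        cost += abs(carried)
    return cost

base = [1.0, 0.0, 0.0]
near = [0.0, 1.0, 0.0]   # all mass one bin away
far = [0.0, 0.0, 1.0]    # all mass two bins away

# Euclidean distance cannot tell "near" from "far" ...
print(math.dist(base, near), math.dist(base, far))
# ... while EMD grows with how far the mass has to move.
print(emd_1d(base, near), emd_1d(base, far))  # 1.0 2.0
```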