Multinomial Distribution

Definition (Multinomial Distribution)
We say the vector $(X_1, X_2, \dots, X_k) \sim \text{Multi}(n, p_1, p_2, \dots, p_k)$ follows a Multinomial Distribution if:  
1. There are $n$ trials, where each trial has $k$ possible outcomes with probabilities $p_i$ for $i = 1, 2, \dots, k$, and $\sum_{i=1}^k p_i = 1$
2. The trials are independent.
3. $X_i$ is the number of successes of type $i$ in $n$ total trials, so $\sum_{i=1}^k X_i = n$

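The joint pmf is given by

$$f(x_1, x_2, \dots, x_k) = P(X_1 = x_1, \dots, X_k = x_k) = \frac{n!}{x_1! \, x_2! \cdots x_k!} \, p_1^{x_1} p_2^{x_2} \cdots p_k^{x_k}$$

where each $x_i \in \{0, 1, \dots, n\}$ and $\sum_{i=1}^k x_i = n$.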
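
As a sanity check, the pmf can be evaluated directly as the multinomial coefficient times the product of the $p_i^{x_i}$ terms. The sketch below is a minimal pure-Python version. The function name `multinomial_pmf` and the example probabilities are my own illustrative choices, not something from the course.

```python
from math import factorial, prod

def multinomial_pmf(counts, probs):
    """P(X_1 = counts[0], ..., X_k = counts[-1]) for Multi(n, probs), with n = sum(counts)."""
    if len(counts) != len(probs):
        raise ValueError("counts and probs must have the same length")
    n = sum(counts)
    # Multinomial coefficient n! / (x_1! * ... * x_k!); each division is exact.
    coeff = factorial(n)
    for x in counts:
        coeff //= factorial(x)
    # Product of p_i ** x_i over all outcome types.
    return coeff * prod(p ** x for p, x in zip(probs, counts))

# Hypothetical example: 3 trials, 3 outcome types, one trial landing on each type.
print(multinomial_pmf([1, 1, 1], [0.2, 0.3, 0.5]))
```

With $k = 2$ this reduces to the Binomial pmf, which is a quick way to check the function.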


Conditionals

I am a little iffy about what this means. They gave an example, but I don't think it is worth putting in my notes.


Serendipity

I learned this in STAT206, and now I am applying it in material from Andrej Karpathy.

torch.multinomial(p, num_samples=20, replacement=True, generator=g)

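Here `p` is a tensor of probabilities and `g` is presumably a `torch.Generator` used for reproducibility. The call draws 20 indices with replacement and returns them as a tensor, where index $i$ is drawn with probability `p[i]`.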
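
To connect the sampling call back to the definition: each drawn index is one trial, and tallying how often each index appears gives one draw of $(X_1, \dots, X_k)$. The sketch below is a pure-Python stand-in that uses `random.choices` from the standard library, not PyTorch itself. The function names, the seed, and the probabilities are all hypothetical.

```python
import random
from collections import Counter

def sample_indices(probs, num_samples, seed=None):
    """Draw outcome indices with replacement; index i is drawn with probability probs[i]."""
    rng = random.Random(seed)  # seeded RNG, playing the role of the generator `g`
    return rng.choices(range(len(probs)), weights=probs, k=num_samples)

def counts_from_indices(indices, k):
    """Tally the draws per outcome type, giving the vector (X_1, ..., X_k)."""
    tally = Counter(indices)
    return [tally[i] for i in range(k)]

probs = [0.6, 0.3, 0.1]  # illustrative probabilities, not from the notes
idx = sample_indices(probs, num_samples=20, seed=0)
x = counts_from_indices(idx, len(probs))
print(idx)  # 20 indices, each in {0, 1, 2}
print(x)    # counts per index; they sum to 20
```

The counts always sum to the number of samples, which matches the constraint $\sum_{i=1}^k X_i = n$ in the definition.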