Fundamental Properties of Information
Learned about this while trying to understand why Information Theory uses log to characterize entropy.
This is also discussed on Wikipedia: https://en.wikipedia.org/wiki/Entropy_(information_theory)#Characterization
Suppose you want a function that tells you how much “information” is gained when observing an event with probability $p$.
Shannon proved that any such function $I(p)$ satisfying the following properties must take a specific form:
- Non-negativity: $I(p) \ge 0$, and $I(1) = 0$ (no surprise for certain events). Events that always occur do not communicate information.
- Monotonicity: Rarer events carry more information: if $p < q$, then $I(p) > I(q)$.
- Additivity: Independent events multiply probabilities, so observing both should add information: $I(p \cdot q) = I(p) + I(q)$.
- see Log Laws
A function that satisfies all these is $I(p) = -\log(p)$; the base of the log only fixes the unit (base 2 gives bits), which is why information theory uses log to characterize entropy.
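As a quick sanity check (a minimal Python sketch, names mine, not from the note), we can define $I(p) = -\log_2 p$ and verify the three properties numerically:

```python
import math

def self_information(p: float) -> float:
    """Information (in bits) gained by observing an event of probability p."""
    return -math.log2(p)

# Non-negativity: a certain event carries zero information
assert self_information(1.0) == 0.0
assert self_information(0.5) >= 0.0

# Monotonicity: rarer events carry more information
assert self_information(0.1) > self_information(0.9)

# Additivity: independent events multiply probabilities, information adds
p, q = 0.3, 0.6
assert math.isclose(self_information(p * q),
                    self_information(p) + self_information(q))
```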
Intuition behind additivity:
- Imagine two independent fair coin flips. Each flip has probability $1/2$, so each carries $-\log_2(1/2) = 1$ bit. The joint outcome has probability $1/4$, whose information is $-\log_2(1/4) = 2$ bits, exactly the sum of the two flips. Independent observations should add information, and the log is the function that turns multiplied probabilities into added information.
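The coin-flip intuition above can be checked directly (a small Python sketch, assuming fair coins; the helper name `bits` is mine):

```python
import math

def bits(p: float) -> float:
    """Self-information in bits: -log2(p)."""
    return -math.log2(p)

one_flip = bits(0.5)         # one fair coin flip
two_flips = bits(0.5 * 0.5)  # two independent flips: joint probability 1/4

print(one_flip)   # 1.0 bit
print(two_flips)  # 2.0 bits

# Information of the joint outcome equals the sum of the individual flips
assert math.isclose(two_flips, one_flip + one_flip)
```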