Information Theory

Fundamental Properties of Information

Learned about this while trying to understand why Information Theory uses log to characterize entropy.

This is also discussed on Wikipedia: https://en.wikipedia.org/wiki/Entropy_(information_theory)#Characterization

Suppose you want a function I(p) that tells you how much “information” is gained when observing an event with probability p.

Shannon showed that this function is essentially determined once you require the following properties:

  1. Non-negativity: I(p) ≥ 0, and I(1) = 0 (no surprise for certain events). Events that always occur communicate no information.
  2. Monotonicity: Rarer events carry more information: I(p) > I(q) whenever p < q.
  3. Additivity: Independent events multiply probabilities, so their information should add: I(p · q) = I(p) + I(q).

A function that satisfies all these is the logarithm: I(p) = −log(p) = log(1/p). The base of the logarithm only sets the unit; base 2 gives bits.
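A minimal sketch checking the three properties numerically (assuming base 2, so the unit is bits; `surprisal` is my own name for I(p)):

```python
import math

def surprisal(p: float) -> float:
    """Information (in bits) gained from observing an event of probability p."""
    return -math.log2(p)

# 1. Non-negativity: certain events carry no information.
assert surprisal(1.0) == 0.0

# 2. Monotonicity: rarer events carry more information.
assert surprisal(0.25) > surprisal(0.5)

# 3. Additivity: probabilities multiply, information adds.
p, q = 0.5, 0.25
assert math.isclose(surprisal(p * q), surprisal(p) + surprisal(q))
```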

Intuition behind additivity:

  • Imagine flipping a fair coin repeatedly. Each flip is independent, so the probability of the exact sequence you observe halves with every flip, and each flip should contribute its own information on top of the previous ones. Two flips have joint probability 1/2 · 1/2 = 1/4, and −log₂(1/4) = 2 bits = 1 bit + 1 bit: the log turns multiplying probabilities into adding information.
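The coin-flip intuition can be sketched as follows (base 2 assumed, so each fair flip adds exactly one bit):

```python
import math

# Each independent fair coin flip halves the probability of the
# exact sequence observed, so each flip adds exactly one bit.
p_sequence = 1.0
for flips in range(1, 4):
    p_sequence *= 0.5
    bits = -math.log2(p_sequence)
    print(f"{flips} flips: p = {p_sequence}, information = {bits} bits")
```

Running this prints 1.0, 2.0, and 3.0 bits for one, two, and three flips: the information grows by exactly one bit per flip, which is the additivity property in action.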