Similarity Measure

This is a very interesting question, that I have not thought about deeply. It has a wide variety of applications:

  • How do you define that two histograms are similar to each other (like in the Poker AI)?
  • How can you tell that two images are similar to each other?
  • How can you tell that two movies are similar? (this is how Recommendation Systems) works
  • Text similarity (Plagiarism detection)

Naive solution would just be to use the Euclidean Distance, but it falls apart in practice.

For the stanford course, you used an SVM loss which acted as the distance.