Distance Metric
Choosing a good distance metric is very important for building a well-performing model.
https://medium.com/analyticsvidhya/roleofdistancemetricsinmachinelearninge43391a6bf2e
Suppose two objects $x$ and $y$ each have $p$ features: $x=(x_{1},x_{2},\dots,x_{p})$ and $y=(y_{1},y_{2},\dots,y_{p})$.
From Lecture 13 of the Carnegie Mellon course. The Minkowski metric/distance is defined by $d(x,y)=\left(\sum_{i=1}^{p}\lvert x_{i}-y_{i}\rvert^{r}\right)^{1/r}$
This generalizes distance metrics you are already very familiar with in ML:
 L1 Distance (Manhattan Distance) ($r=1$)
 L2 Distance (Euclidean Distance) ($r=2$)
 Chebyshev Distance (“sup” distance) ($r=∞$)
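The three special cases can be spot-checked with a small NumPy sketch (the `minkowski` helper below is just illustrative, not from the lecture):

```python
import numpy as np

def minkowski(x, y, r):
    """Minkowski distance between two p-dimensional points.

    r=1 gives Manhattan, r=2 Euclidean, r=inf Chebyshev."""
    diff = np.abs(np.asarray(x, dtype=float) - np.asarray(y, dtype=float))
    if np.isinf(r):
        return diff.max()               # "sup" distance: limit as r -> infinity
    return (diff ** r).sum() ** (1.0 / r)

x, y = [0, 0], [3, 4]
print(minkowski(x, y, 1))        # 7.0 (Manhattan)
print(minkowski(x, y, 2))        # 5.0 (Euclidean)
print(minkowski(x, y, np.inf))   # 4.0 (Chebyshev)
```

For large $r$ the largest coordinate difference dominates the sum, which is why the Chebyshev distance is the $r\to\infty$ limit.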
Other Distance metrics

Edit Distance: a general technique for measuring similarity, where we look at the amount of effort needed to transform one object into another. Oh, so Earth Mover’s Distance is an edit-distance-style measure then? (It measures the work needed to move one distribution’s mass into the other’s shape.)
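A minimal sketch of one concrete edit distance, the Levenshtein distance (unit cost for single-character insert/delete/substitute is an assumption of this particular variant):

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, and substitutions turning a into b."""
    # Classic dynamic program, keeping only one previous row.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute (or match)
        prev = curr
    return prev[-1]

print(edit_distance("kitten", "sitting"))  # 3
```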
Properties that every distance metric should have
 Symmetry: $D(A,B)=D(B,A)$
 Otherwise you can claim that Alex looks like Bob, but Bob looks nothing like Alex
 Constancy of Self-Similarity: $D(A,A)=0$
 Otherwise you could claim that Alex looks more like Bob than Bob does
 Positivity (Separation): $D(A,B)=0 \iff A=B$
 Otherwise there could be objects in your world that are different, but that you cannot tell apart
 Triangle Inequality $D(A,B)≤D(A,C)+D(B,C)$
 Otherwise you could claim Alex is very much like Bob and very much like Carl, even though Bob is very unlike Carl
I also see non-negativity on Wikipedia:
 Non-Negativity: $D(A,B)\geq 0$ (this actually follows from symmetry, self-similarity, and the triangle inequality)
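The axioms above can be spot-checked numerically. A small sketch using Manhattan distance on random 3-D points (the point count, seed, and tolerance are arbitrary choices):

```python
import itertools
import random

def manhattan(a, b):
    """L1 distance between two equal-length tuples of coordinates."""
    return sum(abs(x - y) for x, y in zip(a, b))

random.seed(0)
points = [tuple(random.uniform(-1, 1) for _ in range(3)) for _ in range(15)]

for A, B, C in itertools.product(points, repeat=3):
    assert manhattan(A, B) == manhattan(B, A)   # symmetry
    assert manhattan(A, A) == 0                 # constancy of self-similarity
    assert manhattan(A, B) >= 0                 # non-negativity
    # triangle inequality (tiny tolerance for floating-point rounding)
    assert manhattan(A, B) <= manhattan(A, C) + manhattan(C, B) + 1e-12

print("all four axioms hold on the sample")
```

Passing such a check on samples is of course not a proof, but it catches a broken "distance" quickly.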
Examples
Taken from Wikipedia
Metrics
 Total variation distance (sometimes just called “the” statistical distance)
 Hellinger distance
 Lévy–Prokhorov metric
 Wasserstein metric: also known as the Kantorovich metric, or earth mover’s distance
 Mahalanobis Distance
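For instance, the total variation distance between two distributions on the same finite support reduces to half an L1 distance. A quick sketch (the example distributions are made up):

```python
def total_variation(p, q):
    """Total variation distance between two discrete distributions
    on the same support: half the L1 distance between them, i.e. the
    largest difference in probability they assign to any event."""
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))

p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]
print(total_variation(p, q))  # ≈ 0.1
```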
Divergences
 Kullback–Leibler Divergence
 Rényi’s divergence
 Jensen–Shannon divergence
 Bhattacharyya distance (despite its name it is not a distance, as it violates the triangle inequality)
 f-divergence: generalizes several distances and divergences
 Discriminability index, specifically the Bayes discriminability index, is a positive-definite symmetric measure of the overlap of two distributions.
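A small sketch contrasting KL divergence (asymmetric, which is why it is a divergence rather than a metric) with Jensen–Shannon (a symmetrized, smoothed KL); the example distributions are arbitrary and assumed to have full support:

```python
import math

def kl(p, q):
    """Kullback-Leibler divergence D(p || q) for discrete distributions.
    Not symmetric, so it is not a distance metric."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js(p, q):
    """Jensen-Shannon divergence: average KL to the midpoint mixture.
    Symmetric in p and q (and its square root is a true metric)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

p, q = [0.8, 0.1, 0.1], [0.4, 0.3, 0.3]
print(kl(p, q), kl(q, p))  # two different values: KL is asymmetric
print(js(p, q), js(q, p))  # same value: JS is symmetric
```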