Label Smoothing
This is a trick to make your model less confident. Article here
The idea was introduced to me by Andrej Karpathy through his lecture.
Basically, you add 1 to every count, so that when you take the negative log likelihood, you guarantee that the probability is never 0, and so the negative log likelihood never returns infinity.
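A minimal sketch of that add-1 idea on a toy count table (the counts and the `nll` helper are made up for illustration, not from the lecture):

```python
import math

# Hypothetical toy counts, e.g. how often each next-character was seen.
counts = {"a": 5, "b": 0, "c": 3}

def nll(counts, target, smoothing=0):
    # Add `smoothing` (e.g. 1) to every count so no probability is exactly 0.
    smoothed = {k: v + smoothing for k, v in counts.items()}
    total = sum(smoothed.values())
    p = smoothed[target] / total
    return -math.log(p)  # math.log(0) would raise, i.e. NLL would be infinite

# Without smoothing, P("b") = 0/8 and the NLL blows up.
# With add-1 smoothing, P("b") = 1/11 and the NLL is finite.
print(nll(counts, "b", smoothing=1))  # log(11) ≈ 2.398
```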
Hard one-hot labels

$$y_k = \begin{cases} 1 & k = \text{target} \\ 0 & \text{otherwise} \end{cases}$$

Label smoothing (ε ∈ [0,1], K classes)

$$y_k^{LS} = (1 - \varepsilon)\, y_k + \frac{\varepsilon}{K}$$

Cross-entropy loss with label smoothing

$$\mathcal{L} = -\sum_{k=1}^{K} y_k^{LS} \log p_k$$

Expanded form

$$\mathcal{L} = -(1 - \varepsilon) \log p_{\text{target}} - \frac{\varepsilon}{K} \sum_{k=1}^{K} \log p_k$$
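The smoothed-target cross-entropy can be sketched in a few lines of NumPy (the numbers here are an arbitrary example, not from any source):

```python
import numpy as np

def smooth_labels(y_onehot, eps, K):
    # y_ls = (1 - eps) * y + eps / K : mix the one-hot target with uniform mass
    return (1 - eps) * y_onehot + eps / K

def cross_entropy(p, y):
    # -sum_k y_k * log p_k
    return -np.sum(y * np.log(p))

K = 4
y = np.array([0.0, 1.0, 0.0, 0.0])    # hard one-hot target
p = np.array([0.1, 0.7, 0.1, 0.1])    # model's predicted probabilities
y_ls = smooth_labels(y, eps=0.1, K=K) # [0.025, 0.925, 0.025, 0.025]
print(cross_entropy(p, y), cross_entropy(p, y_ls))
```

Note the smoothed targets still sum to 1, and the loss with smoothed targets is strictly larger here because some mass now sits on the low-probability classes.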
OMG isn't this just Laplace smoothing? Yeah, basically the same thing: both mix in a little uniform probability mass so nothing is ever exactly 0.