Focal Loss

Focal loss (what’s “automatic” vs “manual”)

  • The per-example “weight” is computed from the model’s current confidence.
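  • It down-weights easy examples automatically via the factor $(1 - p_t)^\gamma$, where $p_t$ is the predicted probability of the true class and $\gamma \ge 0$ is the focusing parameter:
    • If the model is confident on the true class ($p_t$ is high), then $(1 - p_t)^\gamma$ is small → loss shrinks.
    • If the model is not confident ($p_t$ is low), then $(1 - p_t)^\gamma$ is larger → loss stays big.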
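
To make the down-weighting concrete, here is a minimal sketch in plain Python. The function names and the choice of gamma = 2 are assumptions made for illustration, not part of any particular library. It computes the factor $(1 - p_t)^\gamma$ and multiplies it into ordinary cross-entropy for a single example:

```python
import math

def focal_term(p_t, gamma=2.0):
    """Modulating factor (1 - p_t)^gamma, where p_t is the true-class probability."""
    return (1.0 - p_t) ** gamma

def cross_entropy(p_t):
    """Plain cross-entropy for one example: -log(p_t)."""
    return -math.log(p_t)

def focal_loss(p_t, gamma=2.0):
    """Cross-entropy scaled by the modulating factor (no class weight here)."""
    return focal_term(p_t, gamma) * cross_entropy(p_t)

# Compare an easy, a borderline, and a hard example (gamma = 2 is an assumed value).
for p_t in (0.9, 0.5, 0.1):
    print(f"p_t={p_t:.1f}  factor={focal_term(p_t):.2f}  "
          f"CE={cross_entropy(p_t):.3f}  focal={focal_loss(p_t):.3f}")
```

With gamma = 2, the factor is about 0.01 for p_t = 0.9 but 0.81 for p_t = 0.1, so the easy example contributes almost nothing while the hard one keeps most of its cross-entropy loss. Setting gamma = 0 makes the factor 1 and recovers plain cross-entropy.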

Important nuance

  • The focusing term $(1 - p_t)^\gamma$ is the “automatic” part (it depends on the model’s prediction), although $\gamma$ itself is a hyperparameter you choose.
  • But focal loss often includes $\alpha_t$, which is a manually chosen class weight (similar idea to weighted cross-entropy).
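
The split between the two parts can be sketched for the binary case as follows. This is an illustrative single-example version, not a library implementation: the function name is hypothetical, and the defaults alpha = 0.25 and gamma = 2 are assumed values you would normally tune for your data.

```python
import math

def alpha_focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Binary alpha-weighted focal loss for one example.

    p: predicted probability of the positive class (0 < p < 1)
    y: true label, 0 or 1
    alpha, gamma: assumed illustrative defaults, chosen by hand
    """
    # "Automatic" part: driven by the model's current prediction.
    p_t = p if y == 1 else 1.0 - p
    modulating = (1.0 - p_t) ** gamma

    # "Manual" part: a fixed class weight, as in weighted cross-entropy.
    alpha_t = alpha if y == 1 else 1.0 - alpha

    return -alpha_t * modulating * math.log(p_t)

# Same prediction, different manual class weight: only alpha_t changes the result.
print(alpha_focal_loss(0.3, 1, alpha=0.25))
print(alpha_focal_loss(0.3, 1, alpha=0.75))
```

Changing alpha rescales the loss for a class by a constant regardless of what the model predicts, whereas the modulating factor changes from example to example as the model's confidence changes.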