# Loss Function §

The loss function quantifies how well or how poorly a model is doing with its predictions: the higher the loss (also called the cost), the worse the model is doing.

With Neural Networks, we use the loss function to drive Gradient Descent and backpropagation.

We can differentiate the loss function with respect to a particular set of weights, which is how we compute the gradients used to update those weights.
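As a sketch of that idea, here is a tiny (hypothetical) setup: a linear score function with an L2 loss, where the analytic gradient of the loss with respect to the weights is checked against a numerical finite-difference gradient. All names (`W`, `x`, `y`) are illustrative, not from any particular library.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((3, 4))   # weights: 3 classes, 4 features
x = rng.standard_normal(4)        # one input example
y = np.array([0.0, 1.0, 0.0])     # ground-truth target scores

def loss(W):
    scores = W @ x                     # score function: raw data -> class scores
    return np.sum((scores - y) ** 2)   # L2 loss against the truth

# Analytic gradient of the loss wrt W: dL/dW = 2 * (scores - y) outer x
grad = 2 * np.outer(W @ x - y, x)

# Numerical (finite-difference) gradient as a sanity check
h = 1e-5
num_grad = np.zeros_like(W)
for i in range(W.shape[0]):
    for j in range(W.shape[1]):
        Wp = W.copy(); Wp[i, j] += h
        Wm = W.copy(); Wm[i, j] -= h
        num_grad[i, j] = (loss(Wp) - loss(Wm)) / (2 * h)

print(np.max(np.abs(grad - num_grad)))  # difference should be tiny
```

This gradient is exactly what a gradient-descent step would use to nudge `W` in the direction that lowers the loss.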

### Score Function vs. Loss Function §

The Loss Function takes the output of a Score Function and measures its difference from the ground-truth values.

• A (parameterized) Score Function maps raw data to class scores (e.g. a linear function)
• Loss Function quantifies the agreement between the predicted scores and the ground truth labels. We minimize the loss function with respect to the parameters of the score function.
• Defining a loss function is very important because it tells you how much to punish your model for making a mistake:
• L1 Distance grows linearly, so small mistakes and large mistakes are penalized proportionally
• L2 Distance is squared, so large mistakes are penalized far more heavily than small ones
• In practice, different models use different implementations of loss functions
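The L1 vs. L2 difference above is easy to see numerically: when the error grows 10x, the L1 penalty grows 10x, but the L2 penalty grows 100x.

```python
# L1 vs. L2 penalties for a small error and a large error
errors = [0.5, 5.0]
l1 = [abs(e) for e in errors]   # linear penalty
l2 = [e ** 2 for e in errors]   # squared penalty

print(l1)  # [0.5, 5.0]   -> 10x larger error, 10x larger penalty
print(l2)  # [0.25, 25.0] -> 10x larger error, 100x larger penalty
```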

The final loss combines the Loss Function with Regularization.
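A minimal sketch of that combination, assuming an L2 data loss and L2 regularization (the function name `full_loss` and the regularization strength `lam` are hypothetical):

```python
import numpy as np

def full_loss(W, X, Y, lam=0.1):
    """Final loss = data loss averaged over examples + regularization."""
    data_loss = np.mean(np.sum((X @ W.T - Y) ** 2, axis=1))  # L2 loss per example
    reg_loss = lam * np.sum(W ** 2)                          # L2 regularization on weights
    return data_loss + reg_loss

W = np.ones((2, 3))      # weights: 2 classes, 3 features
X = np.zeros((4, 3))     # 4 examples
Y = np.zeros((4, 2))     # targets

print(full_loss(W, X, Y, lam=0.0))  # data loss only
print(full_loss(W, X, Y, lam=0.1))  # regularization adds a penalty on the weights
```

Note that the regularization term depends only on the weights, not the data, so it is minimized together with the data loss when optimizing the score function's parameters.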

I don’t know if this is related, but I am also learning from Andrej Karpathy about: