QBoard » Artificial Intelligence & ML » AI and ML - Python » How to interpret “loss” and “accuracy” for a machine learning model

How to interpret “loss” and “accuracy” for a machine learning model

  • When I train my neural network with Theano or TensorFlow, it reports a value called "loss" per epoch.

    How should I interpret this value? Is a higher loss better or worse, and what does it mean for the final performance (accuracy) of my neural network?

      August 21, 2020 4:41 PM IST
    0
  • A loss function is used to optimize a machine learning algorithm. The loss is calculated on the training and validation sets, and its interpretation tells you how well the model is doing on those two sets. It is the sum of the errors made on each example in the training or validation set. The loss value indicates how well or poorly a model behaves after each iteration of optimization.

    An accuracy metric is used to measure the algorithm’s performance in an interpretable way. The accuracy of a model is usually determined after the model parameters are learned and fixed, and is expressed as a percentage. It measures how closely your model's predictions match the true data.

    Example-
    Suppose you have 1000 test samples. If your model classifies 990 of them correctly, the model’s accuracy is 99.0%.
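    The example above can be sketched in a few lines of Python (a minimal illustration; the `accuracy` helper here is hypothetical, not from any particular library):

```python
def accuracy(predictions, targets):
    """Return accuracy as a percentage: correct predictions over total samples."""
    correct = sum(p == t for p, t in zip(predictions, targets))
    return 100.0 * correct / len(targets)

# 1000 test samples, 990 classified correctly -> 99.0% accuracy
preds   = [1] * 990 + [0] * 10
targets = [1] * 1000
print(accuracy(preds, targets))  # 99.0
```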
      September 12, 2020 3:31 PM IST
    0
  • Training a model simply means learning (determining) good values for all the weights and the bias from labeled examples. 

    Loss is the penalty for a bad prediction: a number indicating how bad the model's prediction was on a single example.

    If the model's prediction is perfect, the loss is zero; otherwise, the loss is greater. The goal of training a model is to find a set of weights and biases that have low loss, on average, across all examples. A higher loss means a worse prediction for any model.

    The loss is calculated on training and validation and its interpretation is how well the model is doing for these two sets. Unlike accuracy, a loss is not a percentage. It is a sum of the errors made for each example in training or validation sets.
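    The points above can be sketched with a simple per-example squared-error loss (an assumed choice of loss function for illustration; any standard loss behaves the same way: zero for a perfect prediction, larger for worse ones):

```python
def squared_error(prediction, target):
    """Loss for a single example: zero when the prediction is perfect."""
    return (prediction - target) ** 2

def mean_loss(predictions, targets):
    """Average the per-example losses over a training or validation set."""
    return sum(squared_error(p, t) for p, t in zip(predictions, targets)) / len(targets)

print(squared_error(3.0, 3.0))                       # 0.0 -- perfect prediction
print(squared_error(2.0, 3.0))                       # 1.0 -- off by one
print(mean_loss([2.0, 3.0, 5.0], [3.0, 3.0, 3.0]))   # (1 + 0 + 4) / 3, about 1.667
```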

    In the following diagrams, there are two graphs representing the losses of two different models: the left graph has a high loss and the right graph has a low loss.



    • The arrows represent a loss.

    • The blue lines represent predictions.


    Hope this helps!

      September 12, 2020 4:27 PM IST
    0
  • The lower the loss, the better the model (unless the model has over-fitted to the training data). The loss is calculated on the training and validation sets, and its interpretation tells you how well the model is doing on those two sets. Unlike accuracy, loss is not a percentage; it is a summation of the errors made on each example in the training or validation set.

    In the case of neural networks, the loss is usually the negative log-likelihood for classification and the residual sum of squares for regression. Then, naturally, the main objective in a learning model is to reduce (minimize) the loss function's value with respect to the model's parameters by changing the weight vector values through optimization methods, such as gradient descent using backpropagation in neural networks.
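    As a small sketch of the classification case, the negative log-likelihood on a single example is just the negative log of the probability the model assigns to the true class (an illustration only; the `nll` helper is hypothetical):

```python
import math

def nll(probs, true_class):
    """Negative log-likelihood for one example: -log(probability of the true class)."""
    return -math.log(probs[true_class])

# Confident and correct -> small loss
print(nll([0.05, 0.9, 0.05], true_class=1))  # about 0.105
# Confident and wrong -> large loss
print(nll([0.9, 0.05, 0.05], true_class=1))  # about 3.0
```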

    The loss value indicates how well or poorly a certain model behaves after each iteration of optimization. Ideally, one would expect the loss to decrease after each iteration, or after every few iterations.

    The accuracy of a model is usually determined after the model parameters are learned and fixed and no learning is taking place. The test samples are then fed to the model, and the number of mistakes (zero-one loss) the model makes is recorded after comparison to the true targets. The percentage of correctly classified samples is then reported as the accuracy.

    For example, if the number of test samples is 1000 and the model classifies 952 of those correctly, then the model's accuracy is 95.2%.
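    This evaluation step can be sketched with the zero-one loss mentioned above (a minimal illustration; both helpers are hypothetical names):

```python
def zero_one_loss(prediction, target):
    """Count one mistake per misclassified example, zero otherwise."""
    return 0 if prediction == target else 1

def evaluate(predictions, targets):
    """Sum the mistakes, then report the percentage classified correctly."""
    mistakes = sum(zero_one_loss(p, t) for p, t in zip(predictions, targets))
    return 100.0 * (len(targets) - mistakes) / len(targets)

# 1000 test samples with 48 mistakes -> 95.2% accuracy
preds   = [0] * 952 + [1] * 48
targets = [0] * 1000
print(evaluate(preds, targets))  # 95.2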

    There are also some subtleties when reducing the loss value. For instance, you may run into the problem of over-fitting, in which the model "memorizes" the training examples and becomes ineffective on the test set. Over-fitting is more likely when you do not employ regularization, when the model is very complex (the number of free parameters W is large), or when the number of data points N is very low.

    This post was edited by Viaan Prakash at August 21, 2020 4:50 PM IST
      August 21, 2020 4:49 PM IST
    0
    • Shivakumar Kota
      Hi @Viaan Prakash, thanks for your very detailed explanation.
      August 21, 2020
  • The training set is used to perform the initial training of the model, fitting the weights of the neural network.

    The validation set is used after the neural network has been trained. It is used for tuning the network's hyperparameters and comparing how changes to them affect the predictive accuracy of the model. Whereas the training set is used to fit the neural network's weights, the validation set allows fine-tuning of the hyperparameters or architecture of the model. It is useful because it allows repeatable comparison of different hyperparameters/architectures against the same data and network weights, to observe how those changes affect the predictive power of the network.

    Then the test set is used only to test the predictive accuracy of the trained neural network on previously unseen data, after training and hyperparameter/architecture selection with the training and validation sets.
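    A common way to produce the three sets is a random split of the available data. The sketch below assumes 70/15/15 ratios, which are an arbitrary illustrative choice, not a rule from the answer above:

```python
import random

def split(data, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle a copy of the data, then cut it into train/validation/test."""
    data = data[:]
    random.Random(seed).shuffle(data)
    n_train = int(len(data) * train_frac)
    n_val = int(len(data) * val_frac)
    return data[:n_train], data[n_train:n_train + n_val], data[n_train + n_val:]

train, val, test = split(list(range(100)))
print(len(train), len(val), len(test))  # 70 15 15
```

    Keeping the test set untouched until the very end is the point: if you tune hyperparameters against it, it stops being "previously unseen data".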

      August 28, 2020 12:59 PM IST
    0