QBoard » Artificial Intelligence & ML » AI and ML - Tensorflow » How to tell which Keras model is better?

How to tell which Keras model is better?

  • I don't understand which accuracy in the output to use to compare my 2 Keras models to see which one is better.

    Do I use the "acc" (from the training data) or the "val_acc" (from the validation data)?

    There is a different acc and val_acc for each epoch. How do I know the acc or val_acc for my model as a whole? Do I average the acc or val_acc values over all of the epochs?

    Model 1 Output

    Train on 970 samples, validate on 243 samples
    Epoch 1/20
    0s - loss: 0.1708 - acc: 0.7990 - val_loss: 0.2143 - val_acc: 0.7325
    Epoch 2/20
    0s - loss: 0.1633 - acc: 0.8021 - val_loss: 0.2295 - val_acc: 0.7325
    Epoch 3/20
    0s - loss: 0.1657 - acc: 0.7938 - val_loss: 0.2243 - val_acc: 0.7737
    Epoch 4/20
    0s - loss: 0.1847 - acc: 0.7969 - val_loss: 0.2253 - val_acc: 0.7490
    Epoch 5/20
    0s - loss: 0.1771 - acc: 0.8062 - val_loss: 0.2402 - val_acc: 0.7407
    Epoch 6/20
    0s - loss: 0.1789 - acc: 0.8021 - val_loss: 0.2431 - val_acc: 0.7407
    Epoch 7/20
    0s - loss: 0.1789 - acc: 0.8031 - val_loss: 0.2227 - val_acc: 0.7778
    Epoch 8/20
    0s - loss: 0.1810 - acc: 0.8010 - val_loss: 0.2438 - val_acc: 0.7449
    Epoch 9/20
    0s - loss: 0.1711 - acc: 0.8134 - val_loss: 0.2365 - val_acc: 0.7490
    Epoch 10/20
    0s - loss: 0.1852 - acc: 0.7959 - val_loss: 0.2423 - val_acc: 0.7449
    Epoch 11/20
    0s - loss: 0.1889 - acc: 0.7866 - val_loss: 0.2523 - val_acc: 0.7366
    Epoch 12/20
    0s - loss: 0.1838 - acc: 0.8021 - val_loss: 0.2563 - val_acc: 0.7407
    Epoch 13/20
    0s - loss: 0.1835 - acc: 0.8041 - val_loss: 0.2560 - val_acc: 0.7325
    Epoch 14/20
    0s - loss: 0.1868 - acc: 0.8031 - val_loss: 0.2573 - val_acc: 0.7407
    Epoch 15/20
    0s - loss: 0.1829 - acc: 0.8072 - val_loss: 0.2581 - val_acc: 0.7407
    Epoch 16/20
    0s - loss: 0.1878 - acc: 0.8062 - val_loss: 0.2589 - val_acc: 0.7407
    Epoch 17/20
    0s - loss: 0.1833 - acc: 0.8072 - val_loss: 0.2613 - val_acc: 0.7366
    Epoch 18/20
    0s - loss: 0.1837 - acc: 0.8113 - val_loss: 0.2605 - val_acc: 0.7325
    Epoch 19/20
    0s - loss: 0.1906 - acc: 0.8010 - val_loss: 0.2555 - val_acc: 0.7407
    Epoch 20/20
    0s - loss: 0.1884 - acc: 0.8062 - val_loss: 0.2542 - val_acc: 0.7449

    Model 2 Output

    Train on 970 samples, validate on 243 samples
    Epoch 1/20
    0s - loss: 0.1735 - acc: 0.7876 - val_loss: 0.2386 - val_acc: 0.6667
    Epoch 2/20
    0s - loss: 0.1733 - acc: 0.7825 - val_loss: 0.1894 - val_acc: 0.7449
    Epoch 3/20
    0s - loss: 0.1781 - acc: 0.7856 - val_loss: 0.2028 - val_acc: 0.7407
    Epoch 4/20
    0s - loss: 0.1717 - acc: 0.8021 - val_loss: 0.2545 - val_acc: 0.7119
    Epoch 5/20
    0s - loss: 0.1757 - acc: 0.8052 - val_loss: 0.2252 - val_acc: 0.7202
    Epoch 6/20
    0s - loss: 0.1776 - acc: 0.8093 - val_loss: 0.2449 - val_acc: 0.7490
    Epoch 7/20
    0s - loss: 0.1833 - acc: 0.7897 - val_loss: 0.2272 - val_acc: 0.7572
    Epoch 8/20
    0s - loss: 0.1827 - acc: 0.7928 - val_loss: 0.2376 - val_acc: 0.7531
    Epoch 9/20
    0s - loss: 0.1795 - acc: 0.8062 - val_loss: 0.2445 - val_acc: 0.7490
    Epoch 10/20
    0s - loss: 0.1746 - acc: 0.8103 - val_loss: 0.2491 - val_acc: 0.7449
    Epoch 11/20
    0s - loss: 0.1831 - acc: 0.8082 - val_loss: 0.2477 - val_acc: 0.7449
    Epoch 12/20
    0s - loss: 0.1831 - acc: 0.8113 - val_loss: 0.2496 - val_acc: 0.7490
    Epoch 13/20
    0s - loss: 0.1920 - acc: 0.8000 - val_loss: 0.2459 - val_acc: 0.7449
    Epoch 14/20
    0s - loss: 0.1945 - acc: 0.7928 - val_loss: 0.2446 - val_acc: 0.7490
    Epoch 15/20
    0s - loss: 0.1852 - acc: 0.7990 - val_loss: 0.2459 - val_acc: 0.7449
    Epoch 16/20
    0s - loss: 0.1800 - acc: 0.8062 - val_loss: 0.2495 - val_acc: 0.7449
    Epoch 17/20
    0s - loss: 0.1891 - acc: 0.8000 - val_loss: 0.2469 - val_acc: 0.7449
    Epoch 18/20
    0s - loss: 0.1891 - acc: 0.8041 - val_loss: 0.2467 - val_acc: 0.7531
    Epoch 19/20
    0s - loss: 0.1853 - acc: 0.8072 - val_loss: 0.2511 - val_acc: 0.7449
    Epoch 20/20
    0s - loss: 0.1905 - acc: 0.8062 - val_loss: 0.2460 - val_acc: 0.7531

     

      December 9, 2020 6:26 PM IST
    0
  • You need to key on decreasing val_loss or increasing val_acc; ultimately it doesn't matter much which, since the differences here are well within random/rounding error.

    In practice, the training loss can drop significantly due to over-fitting, which is why you want to look at validation loss.

    In your case, you can see that your training loss is not dropping, which means the model is learning nothing from epoch to epoch. It looks like there is nothing to learn in this model, aside from some trivial linear-like fit or cutoff value.

    Also, when a model learns nothing, or only a trivial linear relationship, you should see similar performance on training and validation data (trivial learning is almost always generalizable). You should probably shuffle your data before using the validation_split feature.
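
    That shuffling step matters because Keras' validation_split holds out the *tail* of the arrays, not a random subset. A minimal sketch with NumPy, where the array names X and y and the sizes are hypothetical stand-ins for your data:

    ```python
    import numpy as np

    # Hypothetical data: 1213 samples, 10 features each (970 train + 243 val).
    X = np.arange(1213 * 10, dtype=float).reshape(1213, 10)
    y = np.arange(1213) % 2

    # validation_split takes the last fraction of the arrays as-is,
    # so shuffle X and y with the same permutation before calling fit().
    rng = np.random.default_rng(seed=42)
    perm = rng.permutation(len(X))
    X_shuffled, y_shuffled = X[perm], y[perm]

    # model.fit(X_shuffled, y_shuffled, validation_split=0.2, ...) now holds
    # out a random 20% instead of the tail of the original ordering.
    ```

    Using one shared permutation keeps each label paired with its sample; shuffling X and y independently would silently destroy the dataset.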

      July 31, 2021 4:17 PM IST
    0
  • Do I use the "acc" (from the training data?) one or the "val acc" (from the validation data?) one?

    If you want to estimate the ability of your model to generalize to new data (which is probably what you want to do), then look at the validation accuracy, because the validation split contains only data that the model never sees during training and therefore cannot simply memorize.

    If your training data accuracy ("acc") keeps improving while your validation data accuracy ("val_acc") gets worse, you are likely in an overfitting situation, i.e. your model is essentially starting to memorize the training data.

    There are different accs and val accs for each epoch. How do I know the acc or val acc for my model as a whole? Do I average all of the epochs accs or val accs to find the acc or val acc of the model as a whole?

    Each epoch is a training pass over all of your data, during which the parameters of your model are adjusted according to your loss function. The result is a set of parameters with a certain ability to generalize to new data, and that ability is reflected by the validation accuracy. So think of every epoch as producing its own model, which can get better or worse when trained for another epoch; whether it got better or worse is judged by the change in validation accuracy (better = validation accuracy increased).

    Therefore, pick the model from the epoch with the highest validation accuracy. Don't average the accuracies over different epochs; that wouldn't make much sense. You can use the Keras callback ModelCheckpoint to automatically save the model with the highest validation accuracy (see the callbacks documentation).

    The highest validation accuracy in model 1 is 0.7778 (epoch 7) and the highest in model 2 is 0.7572 (also epoch 7). Therefore you should view model 1 as better, though it is possible that the 0.7778 was just a random outlier.
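
    That selection rule is easy to check with a few lines of plain Python over the per-epoch val_acc values transcribed from the two training logs above:

    ```python
    # Per-epoch val_acc values, copied from the two training outputs above.
    model1 = [0.7325, 0.7325, 0.7737, 0.7490, 0.7407, 0.7407, 0.7778, 0.7449,
              0.7490, 0.7449, 0.7366, 0.7407, 0.7325, 0.7407, 0.7407, 0.7407,
              0.7366, 0.7325, 0.7407, 0.7449]
    model2 = [0.6667, 0.7449, 0.7407, 0.7119, 0.7202, 0.7490, 0.7572, 0.7531,
              0.7490, 0.7449, 0.7449, 0.7490, 0.7449, 0.7490, 0.7449, 0.7449,
              0.7449, 0.7531, 0.7449, 0.7531]

    def best_epoch(val_accs):
        """Return (1-based epoch, val_acc) for the epoch with the highest val_acc."""
        i = max(range(len(val_accs)), key=val_accs.__getitem__)
        return i + 1, val_accs[i]

    print(best_epoch(model1))  # (7, 0.7778)
    print(best_epoch(model2))  # (7, 0.7572)
    ```

    In training code you would read the same lists from the History object returned by model.fit (history.history["val_acc"], or "val_accuracy" in newer Keras versions) instead of transcribing them by hand.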

    This post was edited by Viaan Prakash at December 24, 2020 11:34 AM IST
      December 24, 2020 11:30 AM IST
    0
  • Deciding which pre-trained model to use for your Deep Learning task ranks at the same level as classic dilemmas like what movie to watch on Netflix and what cereal to buy at the supermarket (P.S. buy the one with the least sugar and highest fiber content). This post uses a data-driven approach in Python to find the best Keras pre-trained model for the cats_vs_dogs dataset, and the code provided will also help you easily choose the best pre-trained model for your own problem's dataset.

    My approach is straightforward: use Python to find all the pre-trained models in Keras and then loop over them one by one, training each on the TensorFlow [cats_vs_dogs] dataset. You can replace it with any other dataset, including your own custom dataset. You can find all the code used in this post at [GitHub].

    Finally, extensive experimentation and data-driven decision making are the keys to success in all Machine Learning applications. I hope this post stimulates you to think of ways to make more data-driven decisions in your daily work.

    Num train images: 16283
    Num validation images: 6979
    Num classes: 2
    Num iterations per epoch: 508
    
    pip uninstall h5py
    pip install 'h5py<3.0.0'
    
    | model_name        | num_model_params | validation_accuracy |
    | ----------------- | ---------------- | ------------------- |
    | MobileNetV2       | 2257984          | 0.9475569725036621  |
    | MobileNet         | 3228864          | 0.9773606657981873  |
    | NASNetMobile      | 4269716          | 0.9753546118736267  |
    | DenseNet121       | 7037504          | 0.9273535013198853  |
    | DenseNet169       | 12642880         | 0.95572429895401    |
    | VGG16             | 14714688         | 0.9107322096824646  |
    | DenseNet201       | 18321984         | 0.9419687390327454  |
    | VGG19             | 20024384         | 0.8948273658752441  |
    | Xception          | 20861480         | 0.9550078511238098  |
    | InceptionV3       | 21802784         | 0.9859578609466553  |
    | ResNet50V2        | 23564800         | 0.9802263975143433  |
    | ResNet50          | 23587712         | 0.49620288610458374 |
    | ResNet101V2       | 42626560         | 0.9878206253051758  |
    | ResNet101         | 42658176         | 0.49620288610458374 |
    | InceptionResNetV2 | 54336736         | 0.9885370135307312  |
    | ResNet152V2       | 58331648         | 0.9840951561927795  |
    | ResNet152         | 58370944         | 0.49620288610458374 |
    | NASNetLarge       | 84916818         | 0.9795099496841431  |
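
    Once the loop has produced results like the table above, the data-driven pick is a one-liner. A sketch that ranks the (model, params, val_acc) records, with the numbers copied from the table; the 1-percentage-point tolerance is an arbitrary choice for illustration:

    ```python
    # (model_name, num_model_params, validation_accuracy), from the table above.
    results = [
        ("MobileNetV2",       2257984,  0.9475569725036621),
        ("MobileNet",         3228864,  0.9773606657981873),
        ("NASNetMobile",      4269716,  0.9753546118736267),
        ("DenseNet121",       7037504,  0.9273535013198853),
        ("DenseNet169",       12642880, 0.95572429895401),
        ("VGG16",             14714688, 0.9107322096824646),
        ("DenseNet201",       18321984, 0.9419687390327454),
        ("VGG19",             20024384, 0.8948273658752441),
        ("Xception",          20861480, 0.9550078511238098),
        ("InceptionV3",       21802784, 0.9859578609466553),
        ("ResNet50V2",        23564800, 0.9802263975143433),
        ("ResNet50",          23587712, 0.49620288610458374),
        ("ResNet101V2",       42626560, 0.9878206253051758),
        ("ResNet101",         42658176, 0.49620288610458374),
        ("InceptionResNetV2", 54336736, 0.9885370135307312),
        ("ResNet152V2",       58331648, 0.9840951561927795),
        ("ResNet152",         58370944, 0.49620288610458374),
        ("NASNetLarge",       84916818, 0.9795099496841431),
    ]

    # Best validation accuracy overall.
    best = max(results, key=lambda r: r[2])

    # Smallest model within 1 percentage point of the best accuracy,
    # a useful trade-off when parameter count matters.
    near_best = [r for r in results if r[2] >= best[2] - 0.01]
    smallest_near_best = min(near_best, key=lambda r: r[1])

    print(best[0])                # InceptionResNetV2
    print(smallest_near_best[0])  # InceptionV3
    ```

    On this dataset the accuracy winner is InceptionResNetV2, but InceptionV3 gets within a point of it with less than half the parameters.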
    
      December 20, 2021 12:06 PM IST
    0