
Loading a trained Keras model and continue training

  • I was wondering if it is possible to save a partly trained Keras model and resume training after loading the model again.

    The reason for this is that I will have more training data in the future and I do not want to retrain the whole model from scratch.

    The functions I am using are shown below (note that the nb_epoch argument from Keras 1 is called epochs in Keras 2):

    from keras.models import load_model

    # Partly train the model
    model.fit(first_training, first_classes, batch_size=32, epochs=20)

    # Save the partly trained model
    model.save('partly_trained.h5')

    # Load the partly trained model
    model = load_model('partly_trained.h5')

    # Continue training on the new data
    model.fit(second_training, second_classes, batch_size=32, epochs=20)

    Edit 1: added fully working example

    With the first dataset, the loss after 10 epochs is 0.0748 and the accuracy 0.9863.

    After saving, deleting and reloading the model, the loss and accuracy of the model trained on the second dataset are 0.1711 and 0.9504 respectively.

    Is this caused by the new training data, or is the model being completely re-trained? (A quick check is sketched after the example below.)

    """
    Model by: http://machinelearningmastery.com/
    """
    # load (downloaded if needed) the MNIST dataset
    import numpy
    from keras.datasets import mnist
    from keras.models import Sequential
    from keras.layers import Dense
    from keras.utils import to_categorical
    from keras.models import load_model
    numpy.random.seed(7)
    
    def baseline_model():
        model = Sequential()
        model.add(Dense(num_pixels, input_dim=num_pixels, kernel_initializer='normal', activation='relu'))
        model.add(Dense(num_classes, kernel_initializer='normal', activation='softmax'))
        model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
        return model
    
    if __name__ == '__main__':
        # load data
        (X_train, y_train), (X_test, y_test) = mnist.load_data()
    
        # flatten 28*28 images to a 784 vector for each image
        num_pixels = X_train.shape[1] * X_train.shape[2]
        X_train = X_train.reshape(X_train.shape[0], num_pixels).astype('float32')
        X_test = X_test.reshape(X_test.shape[0], num_pixels).astype('float32')
        # normalize inputs from 0-255 to 0-1
        X_train = X_train / 255
        X_test = X_test / 255
        # one hot encode outputs
        y_train = to_categorical(y_train)
        y_test = to_categorical(y_test)
        num_classes = y_test.shape[1]
    
        # build the model
        model = baseline_model()
    
        #Partly train model
        dataset1_x = X_train[:3000]
        dataset1_y = y_train[:3000]
        model.fit(dataset1_x, dataset1_y, epochs=10, batch_size=200, verbose=2)
    
        # Final evaluation of the model
        scores = model.evaluate(X_test, y_test, verbose=0)
        print("Baseline Error: %.2f%%" % (100-scores[1]*100))
    
        #Save partly trained model
        model.save('partly_trained.h5')
        del model
    
        #Reload model
        model = load_model('partly_trained.h5')
    
        #Continue training
        dataset2_x = X_train[3000:]
        dataset2_y = y_train[3000:]
        model.fit(dataset2_x, dataset2_y, epochs=10, batch_size=200, verbose=2)
        scores = model.evaluate(X_test, y_test, verbose=0)
        print("Baseline Error: %.2f%%" % (100-scores[1]*100))
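    One way to check this directly is a minimal sketch like the one below, reusing model and the partly_trained.h5 file name from the example above: compare the weights before saving with the weights after reloading. If they match, the jump in loss comes from the new data, not from a re-initialized model.

    import numpy
    from keras.models import load_model

    weights_before = model.get_weights()
    model.save('partly_trained.h5')
    reloaded = load_model('partly_trained.h5')
    weights_after = reloaded.get_weights()

    # True only if load_model restored exactly the weights that were saved
    print(all(numpy.array_equal(a, b)
              for a, b in zip(weights_before, weights_after)))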

      December 8, 2020 11:22 AM IST
  • The problem might be that you are using a different optimizer, or different arguments to your optimizer. I just had the same issue with a custom pretrained model, using

    from keras.callbacks import ReduceLROnPlateau

    reduce_lr = ReduceLROnPlateau(monitor='loss', factor=lr_reduction_factor,
                                  patience=patience, min_lr=min_lr, verbose=1)

    for the pretrained model, where the original learning rate starts at 0.0003 and is reduced during pre-training to the minimum learning rate (min_lr), which is 0.000003.

    I copied that line over to the script that uses the pre-trained model and got really bad accuracies, until I noticed that the last learning rate of the pretrained model was the minimum learning rate, i.e. 0.000003. If I start with that learning rate, I get exactly the same initial accuracies as the output of the pretrained model, which makes sense: starting with a learning rate 100 times larger than the last one used in pre-training causes gradient descent to overshoot heavily and hence sharply decreases accuracy.
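
    A minimal sketch of that fix, assuming Keras 2.x and a model saved with its optimizer state (the file name partly_trained.h5 is taken from the question; 3e-6 stands in for the saved min_lr):

    from keras.models import load_model
    from keras import backend as K

    model = load_model('partly_trained.h5')

    # Inspect the learning rate that was in effect when training stopped
    print('Saved learning rate:', K.get_value(model.optimizer.lr))

    # If you recompile or attach a fresh schedule, resume from that value
    # instead of the original 0.0003
    K.set_value(model.optimizer.lr, 3e-6)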

      December 8, 2020 11:38 AM IST
  • Most of the answers above cover important points. If you are using a recent TensorFlow version (TF 2.1 or above), the following example will help you. The model part of the code is from the TensorFlow website.

    import tensorflow as tf
    from tensorflow import keras
    mnist = tf.keras.datasets.mnist
    
    (x_train, y_train),(x_test, y_test) = mnist.load_data()
    x_train, x_test = x_train / 255.0, x_test / 255.0
    
    def create_model():
      model = tf.keras.models.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),  
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax)
        ])
    
      model.compile(optimizer='adam', loss='sparse_categorical_crossentropy',metrics=['accuracy'])
      return model
    
    # Create a basic model instance
    model = create_model()
    model.fit(x_train, y_train, epochs=10, validation_data=(x_test, y_test), verbose=1)


    Please save the model in the TensorFlow SavedModel format (*.tf). In my experience, if you have any custom loss defined, the *.h5 format will not save the optimizer state, and hence will not serve your purpose if you want to retrain the model from where you left off.

    # saving the model in tensorflow format
    model.save('./MyModel_tf',save_format='tf')
    
    
    # loading the saved model
    loaded_model = tf.keras.models.load_model('./MyModel_tf')
    
    # retraining the model
    loaded_model.fit(x_train, y_train, epochs=10, validation_data=(x_test, y_test), verbose=1)


    This approach restarts the training from where it left off before the model was saved. As mentioned by others, if you want to save the weights of the best model, or to save the weights after every epoch, you need to use the Keras ModelCheckpoint callback with options such as save_weights_only=True, save_freq='epoch', and save_best_only=True; a minimal sketch follows.
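    A minimal sketch of that callback setup, assuming the same model and data as above (the checkpoint path is a hypothetical choice; here save_weights_only=False is used so the optimizer state is kept for resuming, set it to True to store weights only):

    # Save the full model (including optimizer state) whenever val_loss improves
    checkpoint = tf.keras.callbacks.ModelCheckpoint(
        filepath='./MyModel_ckpt',    # hypothetical path; non-.h5 paths use the tf format
        save_weights_only=False,      # keep optimizer state so training can resume
        save_best_only=True,          # only overwrite when the monitored metric improves
        monitor='val_loss',
        save_freq='epoch',            # check at the end of every epoch
        verbose=1)

    model.fit(x_train, y_train, epochs=10,
              validation_data=(x_test, y_test),
              callbacks=[checkpoint], verbose=1)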
      August 28, 2021 1:36 PM IST
  • All of the above helps. You must resume from the same learning rate as the one in effect when the model and weights were saved; set it directly on the optimizer (a sketch follows below).

    Note that improvement from there is not guaranteed, because the model may have reached a local minimum, which may be global. There is no point in resuming a model in order to search for another local minimum, unless you intend to increase the learning rate in a controlled fashion and nudge the model into a possibly better minimum not far away.
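
    A minimal sketch of setting the learning rate directly on the optimizer, assuming TF 2.x and the SavedModel path from the previous answer (3e-6 is purely an illustrative value):

    import tensorflow as tf

    loaded_model = tf.keras.models.load_model('./MyModel_tf')

    # Overwrite the optimizer's learning rate with the value that was in
    # effect when the model was saved, then resume training as usual
    tf.keras.backend.set_value(loaded_model.optimizer.learning_rate, 3e-6)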

      August 30, 2021 1:25 PM IST