QBoard » Artificial Intelligence & ML » AI and ML - Tensorflow » Does model.compile() initialize all the weights and biases in Keras (tensorflow backend)?

Does model.compile() initialize all the weights and biases in Keras (tensorflow backend)?

  • When I start training a model, there is no previously saved model, so I can use model.compile() safely. I have now saved the model to an h5 file using a checkpoint callback, for further training later.

    Say I want to train the model further. I am confused at this point: can I use model.compile() here? And should it be placed before or after the model = load_model() statement? If model.compile() reinitializes all the weights and biases, I should place it before the model = load_model() statement.

    After reading some discussions, it seems to me that model.compile() is only needed when no model has been saved previously; once the model is saved, there is no need to call model.compile() again. Is that true or false? And when I want to predict with the trained model, should I call model.compile() before predicting?

      August 3, 2021 10:51 PM IST
    0
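The save-and-resume workflow described in the question can be sketched as follows. This is a minimal sketch assuming TensorFlow 2.x; the data and file path are placeholders (in practice a ModelCheckpoint callback would write the h5 file):

```python
# Minimal sketch of "compile, save, load, continue training".
# Assumes TensorFlow 2.x; data and file path are placeholders.
import os
import tempfile

import numpy as np
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

x, y = np.zeros((8, 4)), np.zeros((8, 1))
model.fit(x, y, epochs=1, verbose=0)

path = os.path.join(tempfile.mkdtemp(), "model.h5")
model.save(path)  # in practice, a ModelCheckpoint callback writes this file

# Later: load_model restores the architecture, the weights AND the compile
# configuration (loss + optimizer), so no extra compile() call is needed.
restored = tf.keras.models.load_model(path)
restored.fit(x, y, epochs=1, verbose=0)  # training simply continues
```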
  • When to use?

    If you're using compile, surely it must be after load_model(). After all, you need a model to compile. (PS: load_model automatically compiles the model with the optimizer that was saved along with the model)

    What does compile do?

    Compile defines the loss function, the optimizer and the metrics. That's all.

    It has nothing to do with the weights and you can compile a model as many times as you want without causing any problem to pretrained weights.

    You need a compiled model to train (because training uses the loss function and the optimizer). But it's not necessary to compile a model for predicting.

    Do you need to use compile more than once?

    Only if:

    • You want to change one of these:
      - Loss function
      - Optimizer / learning rate
      - Metrics
      - The trainable property of some layer
    • You loaded (or created) a model that is not compiled yet, or your load/save method didn't preserve the previous compilation.
    Consequences of compiling again:

    If you compile a model again, you will lose the optimizer states.

    This means that your training will suffer a little at the beginning while the optimizer re-adjusts the learning rate, the momentums, etc. But there is absolutely no damage to the weights (unless, of course, your initial learning rate is so large that the first training step wildly changes the fine-tuned weights).
      August 7, 2021 1:17 PM IST
    0
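The two claims above (compiling has nothing to do with the weights, and predicting does not require compilation) can be checked with a small sketch, assuming TensorFlow 2.x:

```python
# Sketch: recompiling rebuilds the optimizer but leaves the weights untouched,
# and an uncompiled model can still predict. Assumes TensorFlow 2.x.
import numpy as np
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(3,)),
    tf.keras.layers.Dense(4, activation="relu"),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
before = [w.copy() for w in model.get_weights()]

# Compile again with a different optimizer/loss: the optimizer state is reset...
model.compile(optimizer="sgd", loss="mae")
after = model.get_weights()
# ...but every weight tensor is bit-for-bit identical.
assert all(np.array_equal(a, b) for a, b in zip(before, after))

# An uncompiled model (clone_model returns one) can still predict.
clone = tf.keras.models.clone_model(model)
clone.set_weights(model.get_weights())
print(clone.predict(np.zeros((1, 3)), verbose=0).shape)
```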
  • Model.compile(
        optimizer="rmsprop",
        loss=None,
        metrics=None,
        loss_weights=None,
        weighted_metrics=None,
        run_eagerly=None,
        steps_per_execution=None,
        **kwargs
    )


    Arguments

    • optimizer: String (name of optimizer) or optimizer instance. See tf.keras.optimizers.
    • loss: Loss function. May be a string (name of loss function) or a tf.keras.losses.Loss instance. See tf.keras.losses. A loss function is any callable with the signature loss = fn(y_true, y_pred), where y_true are the ground-truth values and y_pred are the model's predictions. y_true should have shape (batch_size, d0, .. dN) (except in the case of sparse loss functions such as sparse categorical crossentropy, which expect integer arrays of shape (batch_size, d0, .. dN-1)). y_pred should have shape (batch_size, d0, .. dN). The loss function should return a float tensor. If a custom Loss instance is used and reduction is set to None, the return value has shape (batch_size, d0, .. dN-1), i.e. per-sample or per-timestep loss values; otherwise, it is a scalar. If the model has multiple outputs, you can use a different loss on each output by passing a dictionary or a list of losses. The loss value that will be minimized by the model will then be the sum of all individual losses, unless loss_weights is specified.
    • metrics: List of metrics to be evaluated by the model during training and testing. Each of these can be a string (name of a built-in function), a function, or a tf.keras.metrics.Metric instance. See tf.keras.metrics. Typically you will use metrics=['accuracy']. A function is any callable with the signature result = fn(y_true, y_pred). To specify different metrics for different outputs of a multi-output model, you could also pass a dictionary, such as metrics={'output_a': 'accuracy', 'output_b': ['accuracy', 'mse']}. You can also pass a list to specify a metric or a list of metrics for each output, such as metrics=[['accuracy'], ['accuracy', 'mse']] or metrics=['accuracy', ['accuracy', 'mse']]. When you pass the strings 'accuracy' or 'acc', this is converted to one of tf.keras.metrics.BinaryAccuracy, tf.keras.metrics.CategoricalAccuracy, or tf.keras.metrics.SparseCategoricalAccuracy based on the loss function used and the model output shape. A similar conversion is done for the strings 'crossentropy' and 'ce' as well.
    • loss_weights: Optional list or dictionary specifying scalar coefficients (Python floats) to weight the loss contributions of different model outputs. The loss value that will be minimized by the model will then be the weighted sum of all individual losses, weighted by the loss_weights coefficients. If a list, it is expected to have a 1:1 mapping to the model's outputs. If a dict, it is expected to map output names (strings) to scalar coefficients.
    • weighted_metrics: List of metrics to be evaluated and weighted by sample_weight or class_weight during training and testing.
    • run_eagerly: Bool. Defaults to False. If True, this Model's logic will not be wrapped in a tf.function. Recommended to leave this as None unless your Model cannot be run inside a tf.function. run_eagerly=True is not supported when using tf.distribute.experimental.ParameterServerStrategy.
    • steps_per_execution: Int. Defaults to 1. The number of batches to run during each tf.function call. Running multiple batches inside a single tf.function call can greatly improve performance on TPUs or on small models with a large Python overhead. At most one full epoch will be run each execution. If a number larger than the size of the epoch is passed, the execution will be truncated to the size of the epoch. Note that if steps_per_execution is set to N, the Callback.on_batch_begin and Callback.on_batch_end methods will only be called every N batches (i.e. before/after each tf.function execution).
    • **kwargs: Arguments supported for backwards compatibility only.
      October 26, 2021 12:58 PM IST
    0
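As an illustration of the arguments listed above, a two-output functional model can be compiled with per-output losses, loss_weights, and metrics. This is a sketch assuming TensorFlow 2.x; the output names output_a and output_b are made up for the example:

```python
# Sketch: compile() with per-output loss, loss_weights and metrics dicts.
# Assumes TensorFlow 2.x; output names are hypothetical.
import tensorflow as tf

inputs = tf.keras.Input(shape=(8,))
x = tf.keras.layers.Dense(16, activation="relu")(inputs)
out_a = tf.keras.layers.Dense(1, name="output_a")(x)
out_b = tf.keras.layers.Dense(3, activation="softmax", name="output_b")(x)
model = tf.keras.Model(inputs, [out_a, out_b])

model.compile(
    optimizer="rmsprop",
    # A different loss per output; the minimized value is their weighted sum.
    loss={"output_a": "mse", "output_b": "sparse_categorical_crossentropy"},
    loss_weights={"output_a": 1.0, "output_b": 0.5},
    # Different metrics per output, keyed by output name.
    metrics={"output_a": ["mae"], "output_b": ["accuracy"]},
)
print(type(model.optimizer).__name__)
```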
  • Don't forget that you also need to compile the model after changing the trainable flag of a layer, e.g. when you want to fine-tune a model like this:

    1. Load the VGG model without the top classifier.

    2. Freeze all the layers (i.e. trainable = False).

    3. Add some layers to the top.

    4. Compile and train the model on some data.

    5. Un-freeze some of the layers of VGG by setting trainable = True.

    6. Compile the model again (DON'T FORGET THIS STEP!).

    7. Train the model on some data.
      August 14, 2021 1:26 PM IST
    0
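The steps above can be sketched roughly as follows, assuming TensorFlow 2.x. weights=None is used here only to avoid the ImageNet download; real fine-tuning would use weights="imagenet" and real training data:

```python
# Sketch of the fine-tuning recipe above. Assumes TensorFlow 2.x;
# weights=None is a stand-in for weights="imagenet".
import tensorflow as tf

# 1. Load VGG without the top classifier.
base = tf.keras.applications.VGG16(weights=None, include_top=False,
                                   input_shape=(64, 64, 3))

# 2. Freeze all the layers.
base.trainable = False

# 3. Add some layers to the top.
x = tf.keras.layers.GlobalAveragePooling2D()(base.output)
outputs = tf.keras.layers.Dense(10, activation="softmax")(x)
model = tf.keras.Model(base.input, outputs)

# 4. Compile (and train) with the base frozen.
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# 5. Un-freeze some of the VGG layers.
for layer in base.layers[-4:]:
    layer.trainable = True

# 6. Compile again -- required for the trainable change to take effect;
#    a small learning rate is typical when fine-tuning.
model.compile(optimizer=tf.keras.optimizers.Adam(1e-5),
              loss="sparse_categorical_crossentropy")

# 7. model.fit(...) on your data.
```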