What is model fine-tuning?

  • What is model fine-tuning
      August 24, 2021 4:47 PM IST
  • Fine-tuning takes a model that has already been trained for a particular task and tweaks it so that it performs a second, similar task. For example, a deep learning network that has been trained to recognize cars can be fine-tuned to recognize trucks.
      September 2, 2021 4:53 PM IST
  • Fine-tuning means taking the weights of a trained neural network and using them as the initialization for a new model that is trained on data from the same domain (often images, for example). It is used to:

    1. speed up the training
    2. overcome small dataset size

    There are various strategies, such as training the whole initialized network or "freezing" some of the pre-trained weights (usually whole layers) so that they are not updated. The article "A Comprehensive guide to Fine-tuning Deep Learning Models in Keras" provides a good insight into this.
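
As a rough illustration of the two strategies in PyTorch, here is a minimal sketch. The small `nn.Sequential` network is only a stand-in for a pre-trained model (an assumption for the example; a real one would be loaded from a checkpoint), and `count_trainable` is a hypothetical helper.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Stand-in for a pre-trained network (assumption: in practice the weights
# would come from a checkpoint rather than random initialization).
model = nn.Sequential(
    nn.Linear(16, 32),  # early layer
    nn.ReLU(),
    nn.Linear(32, 32),  # middle layer
    nn.ReLU(),
    nn.Linear(32, 2),   # task-specific output layer
)


def count_trainable(module: nn.Module) -> int:
    """Hypothetical helper: number of parameters that will receive gradients."""
    return sum(p.numel() for p in module.parameters() if p.requires_grad)


# Strategy A: train the whole initialized network (everything is trainable).
all_params = count_trainable(model)

# Strategy B: freeze every layer except the last one.
for layer in list(model.children())[:-1]:
    for p in layer.parameters():
        p.requires_grad = False

head_params = count_trainable(model)

# Hand only the still-trainable parameters to the optimizer.
optimizer = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad), lr=0.01
)

print(all_params, head_params)
```

Freezing more layers trains faster and is less prone to overfitting on a small dataset, while training the whole network can adapt better when the new data differs more from the original data.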

      September 3, 2021 1:25 PM IST
  • Fine-tuning is a way of applying or utilizing transfer learning. Specifically, fine-tuning is a process that takes a model that has already been trained for one given task and then tunes or tweaks the model to make it perform a second similar task.

    Why Use Fine-Tuning?

    Assuming the original task is similar to the new task, using an artificial neural network that has already been designed and trained allows us to take advantage of what the model has already learned without having to develop it from scratch.

    When building a model from scratch, we usually must try many approaches through trial-and-error.

     

    For example, we have to decide on:

    • Number of layers
    • Types of layers
    • Order of layers
    • Number of nodes in each layer
    • How much regularization to use
    • Learning rate

    Building and validating our model can be a huge task in its own right, depending on what data we're training it on.

    This is what makes the fine-tuning approach so attractive. If we can find a trained model that already does one task well, and that task is similar to ours in at least some remote way, then we can take advantage of everything the model has already learned and apply it to our specific task.

    Now, of course, if the two tasks are different, then there will be some information that the model has learned that may not apply to our new task, or there may be new information that the model needs to learn from the data regarding the new task that wasn't learned from the previous task.

      September 7, 2021 1:31 PM IST
  • Fine-tuning takes a model that has already been trained for a particular task and tweaks it so that it performs a second, similar task. For example, a deep learning network that has been trained to recognize cars can be fine-tuned to recognize trucks.

    Since the input to the new neural network is similar to that of a pre-existing deep learning model, building the new model becomes a relatively easy task. The first step is to load the existing, already trained network, that is, its architecture and learned weights. The second step is to remove the output layer of that network, because it was trained for the task specific to the previous model. To continue with the previous example, the output layer was trained to decide whether a given image was a car or not. Our new model has to determine whether the given image is a truck or not, so the old output layer is unusable and has to be removed and replaced.

    The third step is optional and depends on how similar the two tasks are. You may need to add or remove certain layers depending on that similarity. Once you have added or removed layers as required, you freeze the layers carried over from the original model. Freezing a layer means that its weights are not updated when we train the new model on the new data for the new task.
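
The preparation steps described above can be sketched in PyTorch roughly as follows. `SmallNet` is a hypothetical toy classifier invented for this example, and the freshly constructed `car_model` only stands in for a network that was really trained on car images.

```python
import copy

import torch
from torch import nn

torch.manual_seed(0)


class SmallNet(nn.Module):
    """Hypothetical image classifier: feature layers plus an output layer."""

    def __init__(self, num_classes: int):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        self.classifier = nn.Linear(8, num_classes)

    def forward(self, x):
        return self.classifier(self.features(x))


# Step 1: load the existing model. Assumption: this untrained instance stands
# in for a "car / not car" network restored from a checkpoint.
car_model = SmallNet(num_classes=2)

# Step 2: copy it and replace the old output layer with a new, randomly
# initialized one for the "truck / not truck" task.
truck_model = copy.deepcopy(car_model)
truck_model.classifier = nn.Linear(8, 2)

# Step 3 (optional): add or remove intermediate layers here if the tasks differ.

# Step 4: freeze the layers carried over from the original model.
for p in truck_model.features.parameters():
    p.requires_grad = False

trainable = [n for n, p in truck_model.named_parameters() if p.requires_grad]
print(trainable)
```

After these steps only the parameters of the new output layer remain trainable; the copied feature layers keep the values they had in the original model.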

     

    The final step involves training the model on the new data. A new output layer is attached so that the network can be trained to identify trucks. The weights of the frozen layers stay the same, and only the new output layer (together with any layers that were left unfrozen) is trained on the new data. The output layer thus learns to produce the result intended for the new network: whether the given image is a truck or not. In this way, by reusing the weights of a network trained to identify cars, we can train a new network to identify trucks with far less effort. The two networks carry out different tasks, yet are trained on similar data.
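
A minimal sketch of this training step, assuming random tensors in place of real truck images and a tiny `backbone` in place of the frozen pre-trained layers (both are assumptions made only for illustration):

```python
import torch
from torch import nn

torch.manual_seed(0)

# Assumption: `backbone` stands in for the frozen pre-trained layers,
# `head` is the new output layer for the truck task.
backbone = nn.Sequential(nn.Linear(10, 16), nn.ReLU())
head = nn.Linear(16, 2)
model = nn.Sequential(backbone, head)

for p in backbone.parameters():
    p.requires_grad = False  # frozen: no gradient, no update

# Synthetic stand-in for the new dataset (64 samples, 2 classes).
inputs = torch.randn(64, 10)
labels = torch.randint(0, 2, (64,))

# Snapshots used afterwards to check which weights actually moved.
backbone_before = backbone[0].weight.detach().clone()
head_before = head.weight.detach().clone()

# Only the new output layer is given to the optimizer.
optimizer = torch.optim.SGD(head.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

losses = []
for _ in range(20):
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), labels)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

backbone_unchanged = torch.equal(backbone[0].weight, backbone_before)
head_changed = not torch.equal(head.weight, head_before)
print(backbone_unchanged, head_changed)
```

Comparing the snapshots confirms the behaviour described above: the frozen weights are identical after training, while the output layer has been updated.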

    Why fine-tuning a deep learning model is useful

    Whenever we are given the task of training a deep learning neural network, we usually think of training it from scratch. Training a neural network from scratch is a time- and resource-intensive process. The network needs to be fed large amounts of data before it works as intended, and gathering that data can take a long time. With fine-tuning, the network starts from weights that already encode much of what the new model needs, learned from the data used to train the previous model. Thus, a lot of time and resources are saved when a deep learning model is fine-tuned instead.

     

    Fine-tuning can also help when the data available for a new deep learning model is limited. With only a small dataset to begin with, training such a model from scratch can prove to be a problem. With fine-tuning, much of what the small dataset cannot teach is supplied by the features that previous models learned from larger, related datasets, which makes training much easier. For example, if you want to train a deep learning model to identify trucks, there might not be enough truck images available. However, a model trained on images of vehicles in general, or of cars specifically, has already learned to recognize the basic features of a vehicle. The truck-specific features can then be learned from the smaller truck dataset.

     

    Fine-tuning also makes it easy to transfer knowledge between models. The learned weights of a previous deep learning neural network can be imported into the new model. The import can cover just the input layer or a combination of the input layer and the hidden layers. Slight modifications to the imported layers might be required so that they fit the new deep learning model.
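
One way such a partial import might look in PyTorch is sketched below. `make_net` is a hypothetical helper, and the `source` network is an untrained stand-in for a previously trained model; the sketch copies the input and hidden layers and leaves the output layer of the new model untouched.

```python
import torch
from torch import nn

torch.manual_seed(0)


def make_net(num_outputs: int) -> nn.Sequential:
    """Hypothetical helper: same input/hidden layers, configurable output size."""
    return nn.Sequential(
        nn.Linear(20, 32),           # "0": input layer
        nn.ReLU(),
        nn.Linear(32, 32),           # "2": hidden layer
        nn.ReLU(),
        nn.Linear(32, num_outputs),  # "4": output layer
    )


source = make_net(num_outputs=5)  # stand-in for the previously trained model
target = make_net(num_outputs=3)  # new model for a task with 3 classes

# Keep only the input and hidden layers; the output layer has a different
# shape in the new model, so its weights are not transferred.
transferable = {
    name: tensor
    for name, tensor in source.state_dict().items()
    if not name.startswith("4.")
}

# strict=False lets load_state_dict skip the keys that are missing
# from `transferable` instead of raising an error.
result = target.load_state_dict(transferable, strict=False)
print(result.missing_keys)
```

The returned `missing_keys` lists the parameters that were not imported, which is a convenient check that only the output layer was left at its fresh initialization.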
      August 26, 2021 2:03 PM IST