I was looking for alternative ways to save a trained model in PyTorch. So far, I have found two alternatives.
1. torch.save() to save a model and torch.load() to load a model.
2. model.state_dict() to save a trained model and model.load_state_dict() to load the saved model.
I have come across this discussion, where approach 2 is recommended over approach 1.
My question is: why is the second approach preferred? Is it only because torch.nn modules have those two functions and we are encouraged to use them?
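For comparison, here is a minimal sketch of both approaches side by side. The tiny nn.Linear stand-in model, the temporary directory, and the file names are assumptions made for illustration, and passing weights_only=False to torch.load() assumes PyTorch 1.13 or newer. Approach 1 pickles the whole module object, so the file depends on the class definition being importable from the same place at load time. Approach 2 stores only the parameter tensors, which is the usual argument for preferring it.

```python
import os
import tempfile

import torch
import torch.nn as nn

model = nn.Linear(4, 2)          # hypothetical stand-in for a trained model
tmp_dir = tempfile.mkdtemp()     # hypothetical location for the saved files

# Approach 1: pickle the entire module object.
full_path = os.path.join(tmp_dir, "model_full.pt")
torch.save(model, full_path)
# Unpickling a whole module needs weights_only=False on recent PyTorch versions.
loaded_full = torch.load(full_path, weights_only=False)

# Approach 2: save only the parameters (the state dict).
state_path = os.path.join(tmp_dir, "model_state.pt")
torch.save(model.state_dict(), state_path)
restored = nn.Linear(4, 2)       # the architecture must be rebuilt in code first
restored.load_state_dict(torch.load(state_path))

# Both routes recover the same weights.
same_full = torch.equal(model.weight, loaded_full.weight)
same_state = torch.equal(model.weight, restored.weight)
print(same_full, same_state)
```

With approach 2 the loading code has to construct the model itself, which costs a line or two but keeps the saved file independent of how the project's source files are organized.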
I am confused about the method view() in the following code snippet.
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        # Flatten each sample's feature maps into a vector for the linear layers.
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

net = Net()
My confusion is regarding the following line.
x = x.view(-1, 16*5*5)
What does the tensor.view() function do? I have seen it used in many places, but I can't understand how it interprets its parameters. What happens if I give negative values as parameters to the view() function? For example, what happens if I call tensor_variable.view(1, 1, -1)?
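A small experiment makes the behaviour of view() concrete. In this sketch the tensor x is a made-up stand-in for the output of the second pooling layer, and the batch size of 4 is an arbitrary assumption. view() reinterprets the same data under a new shape, and a -1 asks PyTorch to infer that one dimension from the total number of elements.

```python
import torch

# Stand-in for the pooled feature maps: 4 samples, 16 channels, 5x5 each.
x = torch.arange(4 * 16 * 5 * 5, dtype=torch.float32).reshape(4, 16, 5, 5)

# -1 is inferred from the element count: 1600 / (16*5*5) = 4 rows.
flat = x.view(-1, 16 * 5 * 5)

# Here the first two dimensions are fixed at 1, so -1 becomes 1600.
row = x.view(1, 1, -1)

print(flat.shape)
print(row.shape)

# view() does not copy data; the result shares storage with x,
# so writing through the view changes the original tensor too.
flat[0, 0] = -1.0
print(x[0, 0, 0, 0].item())
```

Only one dimension may be given as -1; asking PyTorch to infer two at once, as in x.view(-1, -1), raises a RuntimeError. In the network above, the line x.view(-1, 16*5*5) therefore flattens each sample into a vector of 400 values while leaving the batch dimension to be inferred.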
I am training on 970 samples and validating on 243 samples.
How big should batch size and number of epochs be when fitting a model in Keras to optimize the val_acc? Is there any sort of rule of thumb to use based on data input size?
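One piece of arithmetic that helps frame the trade-off: the batch size fixes how many weight updates happen per epoch. The sketch below uses the 970 training samples from the question; the candidate batch sizes are arbitrary assumptions, not recommendations, and it assumes the final partial batch is still used, as Keras does by default when fitting on arrays.

```python
import math

n_train = 970  # training set size stated in the question

# Candidate batch sizes are illustrative assumptions only.
steps = {}
for batch_size in (16, 32, 64, 128):
    # The last batch is smaller when batch_size does not divide n_train.
    steps[batch_size] = math.ceil(n_train / batch_size)

for batch_size, n_steps in steps.items():
    print(f"batch_size={batch_size}: {n_steps} updates per epoch")
```

Smaller batches give more updates per epoch, so fewer epochs may be needed to reach the same number of updates; the value of val_acc that results still has to be checked empirically on the 243 validation samples.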