How does the “view” method work in PyTorch?

  • I am confused about the method view() in the following code snippet.
    import torch.nn as nn
    import torch.nn.functional as F

    class Net(nn.Module):
        def __init__(self):
            super(Net, self).__init__()
            self.conv1 = nn.Conv2d(3, 6, 5)
            self.pool  = nn.MaxPool2d(2,2)
            self.conv2 = nn.Conv2d(6, 16, 5)
            self.fc1   = nn.Linear(16*5*5, 120)
            self.fc2   = nn.Linear(120, 84)
            self.fc3   = nn.Linear(84, 10)
    
        def forward(self, x):
            x = self.pool(F.relu(self.conv1(x)))
            x = self.pool(F.relu(self.conv2(x)))
            x = x.view(-1, 16*5*5)
            x = F.relu(self.fc1(x))
            x = F.relu(self.fc2(x))
            x = self.fc3(x)
            return x
    
    net = Net()

    My confusion is regarding the following line.

    x = x.view(-1, 16*5*5)

     

    What does the tensor.view() function do? I have seen it used in many places, but I can't understand how it interprets its parameters.

    What happens if I pass negative values as parameters to view()? For example, what happens if I call tensor_variable.view(1, 1, -1)?

    Can anyone explain the main principle of the view() function with some examples?
      December 11, 2020 2:06 PM IST
  • The view function is meant to reshape the tensor.

    Say you have a tensor

    import torch
    a = torch.arange(1, 17)

    a is a tensor with 16 elements, the integers 1 to 16 (torch.arange excludes the end value 17; the older torch.range, which included its end value, is deprecated). If you want to reshape this tensor into a 4 x 4 tensor, you can use

    a = a.view(4, 4)

    Now a will be a 4 x 4 tensor. Note that the total number of elements must stay the same after the reshape. Trying to view the tensor a as a 3 x 5 tensor raises a RuntimeError, because 3 x 5 = 15 elements cannot hold the original 16.
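
    A minimal sketch of the element-count rule described above, assuming a recent PyTorch release; the variable names are only illustrative.

```python
import torch

a = torch.arange(1, 17)        # 16 elements: 1, 2, ..., 16
b = a.view(4, 4)               # 4 * 4 = 16, so the shape is valid
print(b.shape)                 # torch.Size([4, 4])

failed = False
try:
    a.view(3, 5)               # 3 * 5 = 15 != 16, so this is rejected
except RuntimeError as err:
    failed = True
    print("view(3, 5) failed:", err)
```

    The exact error message may differ between versions, so the sketch only relies on the exception type.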

    What is the meaning of parameter -1?

    If you don't know how many rows you need but you are sure of the number of columns, you can pass -1 for the number of rows. (This extends to tensors with more dimensions, but only one of the dimensions can be -1.) It is a way of telling the library: "give me a tensor with this many columns, and compute the number of rows needed to make that work".
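
    As a rough illustration of the -1 placeholder, including the view(1, 1, -1) call from the question, something like the following should work; the tensor a is just an example.

```python
import torch

a = torch.arange(1, 17)              # 16 elements

two_rows = a.view(-1, 8)             # rows inferred: 16 / 8 = 2
print(two_rows.shape)                # torch.Size([2, 8])

as_3d = a.view(1, 1, -1)             # last dimension inferred: 16 / (1 * 1) = 16
print(as_3d.shape)                   # torch.Size([1, 1, 16])

rejected = False
try:
    a.view(-1, -1)                   # ambiguous: two dimensions left to infer
except RuntimeError as err:
    rejected = True
    print("two -1 values rejected:", err)
```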

    This can be seen in the neural network code that you have given above. After the line 

    x = self.pool(F.relu(self.conv2(x)))

    in the forward function, x holds 16 feature maps per sample; with 32 x 32 inputs (which the 16*5*5 in fc1 implies) its shape is (batch_size, 16, 5, 5). A fully connected layer expects one flat vector per sample, so you tell PyTorch to reshape the tensor to have 16*5*5 = 400 columns and let it work out the number of rows, which comes out as the batch size.
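
    To make the shapes concrete, here is a small sketch that traces a random batch through the convolution and pooling layers from the question. The batch size of 4 and the 32 x 32 input size are assumptions; the original post does not state them.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Same layer definitions as in the question's Net class.
conv1 = nn.Conv2d(3, 6, 5)
pool = nn.MaxPool2d(2, 2)
conv2 = nn.Conv2d(6, 16, 5)

x = torch.randn(4, 3, 32, 32)        # assumed: batch of 4 RGB images, 32 x 32
x = pool(F.relu(conv1(x)))           # conv: 32 -> 28, pool: 28 -> 14
print(x.shape)                       # torch.Size([4, 6, 14, 14])
x = pool(F.relu(conv2(x)))           # conv: 14 -> 10, pool: 10 -> 5
print(x.shape)                       # torch.Size([4, 16, 5, 5])

flat = x.view(-1, 16 * 5 * 5)        # 400 columns, rows inferred as the batch size
print(flat.shape)                    # torch.Size([4, 400])
```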

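    Drawing a parallel with NumPy, view is similar to numpy's reshape function, with one difference: view never copies data. The returned tensor shares memory with the original, and view raises an error if the requested shape cannot be expressed over the existing memory layout (for example, after a transpose). torch.reshape is the closer counterpart of numpy's reshape, since it returns a view when it can and a copy otherwise.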
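
    The sketch below is one way to observe that a view shares memory with its source, and that a transposed (non-contiguous) tensor cannot be viewed flat while reshape still works. The tensors are made up for the example.

```python
import torch

a = torch.arange(6)
b = a.view(2, 3)                 # no copy: b uses the same memory as a
b[0, 0] = 100
print(a[0].item())               # 100, the write through b is visible in a
shares_memory = a.data_ptr() == b.data_ptr()

t = b.t()                        # shape (3, 2), no longer contiguous
view_failed = False
try:
    t.view(6)                    # cannot be expressed without copying
except RuntimeError:
    view_failed = True

r = t.reshape(6)                 # reshape falls back to making a copy
print(r.tolist())                # [100, 3, 1, 4, 2, 5]
```

    Calling t.contiguous().view(6) is another option; it makes the copy explicit before taking the view.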

     

      December 22, 2020 1:31 PM IST
  • Let's try to understand view through the following examples:

        a=torch.arange(1., 17.)   # torch.range(1,16) is deprecated; float bounds keep the float output shown here
    
    print(a)
    
        tensor([ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11., 12., 13., 14.,
                15., 16.])
    
    print(a.view(-1,2))
    
        tensor([[ 1.,  2.],
                [ 3.,  4.],
                [ 5.,  6.],
                [ 7.,  8.],
                [ 9., 10.],
                [11., 12.],
                [13., 14.],
                [15., 16.]])
    
    print(a.view(2,-1,4))   #3d tensor
    
        tensor([[[ 1.,  2.,  3.,  4.],
                 [ 5.,  6.,  7.,  8.]],
    
                [[ 9., 10., 11., 12.],
                 [13., 14., 15., 16.]]])
    
    print(a.view(2,-1,2))
    
        tensor([[[ 1.,  2.],
                 [ 3.,  4.],
                 [ 5.,  6.],
                 [ 7.,  8.]],
    
                [[ 9., 10.],
                 [11., 12.],
                 [13., 14.],
                 [15., 16.]]])
    
    print(a.view(4,-1,2))
    
        tensor([[[ 1.,  2.],
                 [ 3.,  4.]],
    
                [[ 5.,  6.],
                 [ 7.,  8.]],
    
                [[ 9., 10.],
                 [11., 12.]],
    
                [[13., 14.],
                 [15., 16.]]])
    

    Passing -1 as an argument tells view to work out that dimension from the others. For a 3-D shape (x, y, z), if we know y and z we can pass -1 for x, and it is computed as the total number of elements divided by y*z; the same holds for whichever single dimension we leave out. For a 2-D shape (x, y), knowing one of the two values is enough to compute the other.
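
    The arithmetic behind -1 can be sketched with a small helper. Note that infer_missing_dim is a hypothetical function written for this illustration, not part of PyTorch; it is only compared against what view itself produces.

```python
import math
import torch

def infer_missing_dim(numel, shape):
    """Return the size that a single -1 entry in `shape` stands for (hypothetical helper)."""
    known = math.prod(d for d in shape if d != -1)   # product of the given dimensions
    if numel % known != 0:
        raise ValueError("shape does not divide the number of elements")
    return numel // known

a = torch.arange(1., 17.)                            # 16 elements

for shape in [(-1, 2), (2, -1, 4), (2, -1, 2), (4, -1, 2)]:
    inferred = infer_missing_dim(a.numel(), shape)
    actual = a.view(*shape).shape[shape.index(-1)]   # what PyTorch computed
    print(shape, "->", inferred, actual)
```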

      December 22, 2020 2:46 PM IST
  • I figured out that, for this network, x.view(-1, 16 * 5 * 5) is equivalent to x.flatten(1), where the argument 1 (start_dim) means flattening starts at dimension 1, so dimension 0 (the 'sample' or batch dimension) is left intact. The latter is semantically clearer and does not require hard-coding the feature count, so I prefer flatten().
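
    A quick check of this equivalence, with made-up input shapes, might look like the sketch below. The second part shows a case where the two calls differ: a hard-coded column count can silently regroup samples when the feature-map size is not what was expected.

```python
import torch

x = torch.randn(4, 16, 5, 5)             # assumed output of the second pooling layer
v = x.view(-1, 16 * 5 * 5)
f = x.flatten(1)                          # flatten dims 1..3, keep dim 0
same = torch.equal(v, f)
print(v.shape, f.shape, same)             # both (4, 400), same values

# Different feature-map size: 25 * 16 * 4 * 4 = 6400 elements in total.
y = torch.randn(25, 16, 4, 4)
print(y.view(-1, 16 * 5 * 5).shape)       # torch.Size([16, 400]): batch size lost
print(y.flatten(1).shape)                 # torch.Size([25, 256]): batch size kept
```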
      December 22, 2020 2:49 PM IST
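  • I would like to add a small insight into how elements are ordered for .view(...)

    • For a Tensor with shape (a,b,c), the order of its elements is determined by a mixed-radix numbering system in which the first digit takes a values, the second digit takes b values and the third digit takes c values. Counting through these numbers with the last digit changing fastest gives the element order (row-major order), so the element at index (i,j,k) sits at position i*b*c + j*c + k.
    • The new Tensor returned by .view(...) preserves this order of the original Tensor; only the grouping into dimensions changes.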
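
    The two points above can be checked with a short sketch. The shape (2, 3, 4) and the index (1, 2, 3) are arbitrary choices for the illustration.

```python
import torch

a, b, c = 2, 3, 4
t = torch.arange(a * b * c).view(a, b, c)    # element values equal their flat positions

i, j, k = 1, 2, 3
flat_pos = i * b * c + j * c + k             # 12 + 8 + 3 = 23
print(t[i, j, k].item(), flat_pos)           # 23 23

# Viewing as (4, 6) keeps the same order: position 23 lands at row 23 // 6, column 23 % 6.
v = t.view(4, 6)
row, col = divmod(flat_pos, 6)               # (3, 5)
print(v[row, col].item())                    # 23
```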
      January 29, 2022 2:50 PM IST