QBoard » Artificial Intelligence & ML » AI and ML - Conceptual » How to propagate/fire recurrent neural networks(RNN)?

How to propagate/fire recurrent neural networks (RNN)?


    I'm learning about artificial neural networks and have implemented a standard feed-forward net with a couple of hidden layers. Now I'm trying to understand how a recurrent neural network (RNN) works in practice, and I'm having trouble understanding how activation/propagation flows through the network.

    In my feed-forward net, the activation is a simple layer-by-layer firing of the neurons. In a recurrent net, neurons connect back to previous layers and sometimes to themselves, so propagation must work differently. The trouble is, I can't seem to find an explanation of exactly how it happens.

    How might it occur, say, for a network like this:

    Input1 --->Neuron A1 --------->  Neuron B1 ---------------------> Output
                    ^                   ^     ^      |
                    |                   |     --------
                    |                   |
    Input2 --->Neuron A2 --------->  Neuron B2
    

    I imagined it would be a rolling activation that gradually dies down as the neurons' thresholds attenuate the firing to 0, much like in biology, but it appears there is a much more computationally efficient way to do this through derivatives?

      November 2, 2021 7:43 PM IST
    0
  • I think I now have a grasp of the basic principle that separates propagating recurrent networks from feed-forward ones: an explicit time step.

    In a feed-forward net, propagation happens layer by layer: Layer 1 neurons fire first, followed by Layers 2, 3, etc., so propagation is one neuron's activation stimulating activation in the neurons that take it as input.

    Alternatively, we can think of propagation as: the neurons whose inputs are active at any given point in time are the ones that fire. So if we have a time t=0 where the Layer 1 neurons are active, at the next time t=1 the next layer, Layer 2, will activate, since the neurons in Layer 2 take the neurons in Layer 1 as input.

    While the difference in thinking may seem like semantics, for me it was crucial in figuring out how to implement recurrent networks. In a feed-forward net the time step is implicit, and the code passes over the neuron layers in turn, activating them like falling dominoes. In a recurrent network, the falling-domino style of activation, where every neuron specifies which neuron it activates next, would be a nightmare for large, convoluted networks. Instead, it makes sense to poll every neuron in the network at a given time t, to see if it activates based on its inputs.

    There are of course many different types of recurrent neural network, but I think it is this crucial explicit time step that is the key to recurrent network propagation.
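    A minimal sketch of that explicit-time-step polling, using the asker's little two-input network (the weights, the sigmoid activation, and the connection strengths are all illustrative assumptions, not anything prescribed by the thread):

```python
import numpy as np

# Toy recurrent net from the question: two inputs -> A1, A2 -> B1, B2 -> output,
# with B1 looping back to itself and B2 feeding laterally into B1.
# All weights below are made-up illustrative values.

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def step(state, inputs):
    """One explicit time step: every neuron is polled at once,
    using only activations from the PREVIOUS step."""
    a1, a2, b1, b2, out = state
    new_a1 = sigmoid(1.0 * inputs[0])
    new_a2 = sigmoid(1.0 * inputs[1])
    # B1 sees A1 (feed-forward), B2 (lateral), and its own last value (self-loop)
    new_b1 = sigmoid(0.8 * a1 + 0.5 * b2 + 0.3 * b1)
    new_b2 = sigmoid(0.8 * a2)
    new_out = sigmoid(1.2 * b1)
    return (new_a1, new_a2, new_b1, new_b2, new_out)

state = (0.0, 0.0, 0.0, 0.0, 0.0)   # everything silent at t=0
for t in range(5):
    state = step(state, inputs=(1.0, 0.5))
    print(t, [round(v, 3) for v in state])
```

    Note that no neuron tells another neuron to fire; the loop simply recomputes the whole state vector once per time step, which is why the recurrent connections cause no trouble.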

    The differential-equations part I was wondering about comes into play if, instead of having discrete time steps t = 0, 1, 2, etc., you try to get a smoother, more continuous network flow by modeling the propagation over very small time increments, like 0.2, 0.1, 0.05, etc.
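    A rough sketch of what shrinking the time increment looks like, for a single self-connected neuron integrated with Euler steps (the leaky dynamics, the weights, and every constant here are illustrative assumptions):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Euler integration of dh/dt = -h + sigmoid(w_in*x + w_rec*h).
# As dt shrinks, the trajectory approaches the continuous-time limit.
def simulate(dt, T=2.0, w_in=1.0, w_rec=0.5, x=1.0):
    h = 0.0
    for _ in range(int(T / dt)):
        h = h + dt * (-h + sigmoid(w_in * x + w_rec * h))
    return h

for dt in (0.2, 0.1, 0.05):
    print(dt, round(simulate(dt), 4))
```

    The discrete-time-step network above is just this with dt fixed at 1; making dt smaller trades computation for smoothness.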

      November 3, 2021 1:55 PM IST
    0
  • During forward propagation, a series of calculations is performed to generate a prediction and to calculate the cost. The cost is a function that we wish to minimize. Backpropagation then calculates the gradient, i.e. the derivatives of the cost with respect to the parameters. This is useful during the optimization phase, because when the derivatives are close or equal to 0, it means our parameters have reached a minimum of the cost function.
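    A tiny one-weight illustration of that forward-cost / backward-gradient loop (the model, the data point, and the learning rate are all made-up assumptions for the sketch):

```python
# Forward pass: prediction and cost. Backward pass: dCost/dw via the chain rule.
def forward(w, x, y):
    pred = w * x
    cost = 0.5 * (pred - y) ** 2
    return pred, cost

def backward(w, x, y):
    pred, _ = forward(w, x, y)
    return (pred - y) * x   # derivative of the cost w.r.t. w

w, x, y, lr = 0.0, 2.0, 4.0, 0.1
for _ in range(50):
    w -= lr * backward(w, x, y)  # gradient descent step

# Near the minimum the derivative approaches 0 and the prediction matches y.
print(w, backward(w, x, y))
```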


      November 8, 2021 2:50 PM IST
    0
  • The input signal s(t) is given at discrete time steps t = t0, t1, t2, ..., tN. In a recurrent layer, the inputs come both from the input signal and from the state of the network, which is the excitation level of the recurrent neurons from the previous time step. So you must update the internal state from the current input signal together with the previous internal state (the excitation level of the recurrent neurons).
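    A minimal sketch of that update rule, computing the new state from the current input signal plus the previous state (the sizes, the random weights, and the tanh activation are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
W_in = rng.normal(size=(3, 2)) * 0.5    # input signal -> hidden state
W_rec = rng.normal(size=(3, 3)) * 0.5   # previous state -> hidden state (recurrence)

def update(h_prev, s_t):
    # New internal state depends on s(t) AND the previous excitation level.
    return np.tanh(W_in @ s_t + W_rec @ h_prev)

h = np.zeros(3)                          # silent network before t0
signal = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0])]
for t, s_t in enumerate(signal):
    h = update(h, s_t)
    print(t, np.round(h, 3))
```

    The same W_rec is applied at every step, which is what makes the layer recurrent rather than just deep.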
      November 8, 2021 5:15 PM IST
    0