Recurrent Neural Networks

  • Used for time series / sequential data

  • Long sequences

Derivation of gradient updates for simple RNN

LSTM

Forget Gate

Given the previous hidden state h_{t-1} and the current input x_t, the forget gate decides how much of the previous cell state to forget.

\[ f_t = \sigma\left(W_f \cdot [h_{t-1}, x_t] + b_f\right) \]
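
As a rough illustration, the forget gate can be sketched in NumPy. This assumes the standard LSTM formulation, f_t = sigmoid(W_f [h_{t-1}, x_t] + b_f), where [h_{t-1}, x_t] is the concatenation of the two vectors. The names W_f, b_f, c_prev, and the sizes used here are illustrative assumptions, not taken from the notes.

```python
import numpy as np

def sigmoid(z):
    # Logistic function: squashes every entry into the open interval (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

def forget_gate(h_prev, x_t, W_f, b_f):
    # Concatenate the previous hidden state and the current input,
    # apply an affine map, then squash with the sigmoid.
    concat = np.concatenate([h_prev, x_t])
    return sigmoid(W_f @ concat + b_f)

# Hypothetical sizes and random values, chosen only for illustration.
rng = np.random.default_rng(0)
hidden_size, input_size = 4, 3
h_prev = rng.standard_normal(hidden_size)   # h_{t-1}
x_t = rng.standard_normal(input_size)       # x_t
c_prev = rng.standard_normal(hidden_size)   # previous cell state (assumed name)
W_f = rng.standard_normal((hidden_size, hidden_size + input_size)) * 0.1
b_f = np.zeros(hidden_size)

f_t = forget_gate(h_prev, x_t, W_f, b_f)

# Each entry of f_t scales the matching entry of the previous cell state:
# values near 0 forget it, values near 1 keep it.
retained = f_t * c_prev
```

Because the sigmoid output lies strictly between 0 and 1, the gate acts as a soft, per-component mask on the previous cell state rather than a hard keep/drop switch.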