Introduction to Recurrent Neural Networks

Kinder Chen
2 min read · Mar 29, 2021

Recurrent Neural Networks (RNNs) are a special type of neural network designed for sequence problems. RNNs add explicit handling of the order between observations when approximating a mapping function from input variables to output variables, which makes neural networks capable of predicting time series. Traditional time series forecasting methods like ARIMA focus on univariate data with linear relationships. RNNs, by contrast, can learn possibly noisy and nonlinear relationships and provide direct support for multivariate and multi-step forecasting. This blog gives a brief introduction to the mechanism of RNNs.

Sequences

The defining trait of an RNN is that it evaluates sequences of data rather than individual data points. A classic example of sequence data is a time series, in which any given point in time can only be interpreted relative to the previous points in that sequence. Another great example of sequence data is text. All text data is sequence data by default, because a word only makes sense when its letters are in the proper order.

RNN Architecture

A basic RNN is just a neural network that passes its output for a given example back into itself as input for the next example. h(t) represents the model’s output at time step t, and X(t) denotes the input node at the current time step. The diagram shows an unrolled representation, laying out the components at each time step: the previous output is passed back into the model to evaluate the next time step.
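As a rough sketch of this recurrence, the output at each step can be computed as a nonlinearity applied to a weighted combination of the current input X(t) and the previous output h(t-1). The snippet below is a minimal illustration in NumPy; the dimensions, random weights, and tanh activation are assumptions for demonstration, not details from the original post.

```python
import numpy as np

def rnn_step(x_t, h_prev, W_x, W_h, b):
    # One recurrent step: combine the current input with the previous output.
    return np.tanh(x_t @ W_x + h_prev @ W_h + b)

# Hypothetical sizes for illustration only.
n_inputs, n_units = 3, 5
rng = np.random.default_rng(0)
W_x = rng.normal(scale=0.1, size=(n_inputs, n_units))
W_h = rng.normal(scale=0.1, size=(n_units, n_units))
b = np.zeros(n_units)

h = np.zeros(n_units)                      # initial state h(0)
sequence = rng.normal(size=(4, n_inputs))  # a toy sequence of 4 time steps
for x_t in sequence:                       # "unrolling" over time
    h = rnn_step(x_t, h, W_x, W_h, b)      # h(t) depends on X(t) and h(t-1)
```

Note that the same weight matrices are reused at every time step; the unrolled diagram is just the same cell repeated once per step.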

The main technique for training feedforward neural networks is to backpropagate the error and update the network weights. However, ordinary backpropagation breaks down in an RNN because of the recurrent (loop) connections. RNNs therefore use a modified form called backpropagation through time (BPTT). To learn effectively from sequence data, the model's weights must be adjusted with respect to every time step: the model starts at the most recent output and works backwards, calculating the loss and updating the weights at each step. This stepwise updating makes BPTT much more computationally expensive than vanilla backpropagation. The truncated backpropagation through time (TBPTT) algorithm breaks a long sequence into smaller subsequences, which significantly improves training time over full BPTT.
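One common way to realize the truncated variant in practice is to process the sequence in fixed-length chunks, carry the hidden state forward between chunks, but cut the gradient graph at each chunk boundary. The sketch below uses PyTorch; the sine-wave data, 25-step truncation length, layer sizes, and optimizer settings are all illustrative assumptions, not part of the original post.

```python
import torch
import torch.nn as nn

# Hypothetical toy task: predict the next value of a univariate series.
torch.manual_seed(0)
series = torch.sin(torch.linspace(0, 20, 200)).unsqueeze(-1)  # shape (200, 1)

rnn = nn.RNN(input_size=1, hidden_size=16, batch_first=True)
head = nn.Linear(16, 1)
params = list(rnn.parameters()) + list(head.parameters())
optimizer = torch.optim.SGD(params, lr=0.01)
loss_fn = nn.MSELoss()

chunk = 25  # truncation length: gradients flow at most 25 steps back
h = None
for start in range(0, len(series) - 1 - chunk, chunk):
    x = series[start:start + chunk].unsqueeze(0)          # (1, chunk, 1)
    y = series[start + 1:start + 1 + chunk].unsqueeze(0)  # next-step targets

    out, h = rnn(x, h)            # hidden state carried in from the last chunk
    loss = loss_fn(head(out), y)

    optimizer.zero_grad()
    loss.backward()               # BPTT, but only within this chunk
    optimizer.step()

    h = h.detach()                # keep the state, cut the gradient graph here
```

The key line is `h = h.detach()`: the state still flows forward across chunks, but gradients never propagate further back than one chunk, which is what keeps the cost bounded compared with full BPTT.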
