Robotics And Machine ANalytics Laboratory (RAMAN LAB) Malaviya National Institute of Technology Jaipur

Overview of Recurrent Neural Network (RNN)

The recurrent neural network (RNN) is a neural network architecture designed to process sequential data, and it is used in the AI divisions of many leading companies, for example in Apple's Siri, Google's voice search, and Baidu's text-to-speech (TTS) systems. It was the first widely used algorithm that remembers its inputs, thanks to an internal memory, and produces outputs based on all of the inputs seen so far, enabling tasks such as telling a story just by looking at a picture or understanding a video clip.

Why do we need RNNs?
Before diving into the details of how an RNN works, let's see what an RNN is capable of doing. RNNs have a wide variety of applications, a few of which are listed below:

Sentiment Classification: An RNN can predict the sentiment of a statement. The model is fed one word at a time and processes each new input together with its memory of the previous inputs, building up the meaning of the statement.

Image Captioning: An RNN can caption an image based on the elements present in it. The algorithm identifies each element in the picture individually, keeps all the outputs in its memory, and forms a sensible statement out of the resulting words.

Question Answering: An RNN can understand an input question and respond with an appropriate answer based on the knowledge encoded in the model.

How does an RNN work?

In the above figure:

x(i) – represents the new input at step i

y(i) – represents the output at step i

h(i-1) – represents the hidden state received from the previous node, i.e. the node that processed x(i-1)

In an RNN, every node receives two inputs: x, the new input, and h, the hidden state passed on by the previous node. From these two inputs, each node computes its output.

In short, an RNN:

takes a sequence of inputs (length >= 1)

applies the same set of operations to each input along the sequence

carries an internal state that represents or remembers the underlying patterns/relations/states in the sequence

generates a sequence of outputs (length >= 1)
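The steps above can be sketched as a minimal forward pass in NumPy. This is an illustrative sketch, not code from the article: the tanh activation, the weight names (Wxh, Whh, Why), and the vector sizes are common conventions chosen here as assumptions.

```python
import numpy as np

def rnn_forward(xs, Wxh, Whh, Why, bh, by):
    """Run a vanilla RNN over a sequence of input vectors xs.

    The same weights (Wxh, Whh, Why) are applied at every step,
    and the hidden state h carries memory along the sequence.
    """
    h = np.zeros((Whh.shape[0],))            # initial hidden state h(0)
    ys = []
    for x in xs:                             # one step per input in the sequence
        h = np.tanh(Wxh @ x + Whh @ h + bh)  # new hidden state from x(i) and h(i-1)
        ys.append(Why @ h + by)              # output y(i) for this step
    return ys, h

# Tiny usage example: a sequence of three 4-dimensional inputs,
# a 5-dimensional hidden state, and 2-dimensional outputs.
rng = np.random.default_rng(0)
xs = [rng.normal(size=4) for _ in range(3)]
Wxh = rng.normal(size=(5, 4))
Whh = rng.normal(size=(5, 5))
Why = rng.normal(size=(2, 5))
bh, by = np.zeros(5), np.zeros(2)

ys, h_final = rnn_forward(xs, Wxh, Whh, Why, bh, by)
print(len(ys), ys[0].shape, h_final.shape)  # 3 (2,) (5,)
```

Note how one set of weights is reused at every step: that weight sharing is what lets the network handle sequences of any length.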

The goal is to use y(t) as the output at step t and compare it with the corresponding target value from our data. The difference gives us the error. With that error in hand, we can apply a technique called Back-Propagation Through Time (BPTT). BPTT walks backwards through the unrolled network and adjusts the weights based on the error, which is what makes the network learn to do better.
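A hedged sketch of BPTT for the tanh RNN described above, using a squared-error loss (the loss choice and weight names are assumptions for illustration, not taken from the article). The key idea is that gradients are accumulated across all time steps, because the same weights are reused at every step:

```python
import numpy as np

def bptt_step(xs, targets, Wxh, Whh, Why, bh, by):
    """Forward pass plus Back-Propagation Through Time (BPTT)
    for a vanilla tanh RNN with a squared-error loss.

    Returns the total loss and the gradients of each parameter.
    """
    # Forward pass, storing the hidden states for the backward pass.
    hs = {-1: np.zeros(Whh.shape[0])}
    ys, loss = [], 0.0
    for t, x in enumerate(xs):
        hs[t] = np.tanh(Wxh @ x + Whh @ hs[t - 1] + bh)
        y = Why @ hs[t] + by
        ys.append(y)
        loss += 0.5 * np.sum((y - targets[t]) ** 2)

    # Backward pass: walk the sequence in reverse, carrying dh_next,
    # the gradient flowing into h(t) from the step after it.
    dWxh, dWhh, dWhy = np.zeros_like(Wxh), np.zeros_like(Whh), np.zeros_like(Why)
    dbh, dby = np.zeros_like(bh), np.zeros_like(by)
    dh_next = np.zeros_like(hs[0])
    for t in reversed(range(len(xs))):
        dy = ys[t] - targets[t]
        dWhy += np.outer(dy, hs[t])
        dby += dy
        dh = Why.T @ dy + dh_next        # gradient into h(t)
        draw = (1 - hs[t] ** 2) * dh     # backprop through tanh
        dWxh += np.outer(draw, xs[t])
        dWhh += np.outer(draw, hs[t - 1])
        dbh += draw
        dh_next = Whh.T @ draw           # pass gradient back to h(t-1)
    return loss, (dWxh, dWhh, dWhy, dbh, dby)
```

A single training update would then subtract a small multiple of each gradient from the corresponding weight, e.g. `Whh -= learning_rate * dWhh`.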

Drawback of RNN
Theoretically, RNNs can carry context from the beginning of a sentence, which would allow more accurate predictions of a word at the end of that sentence. In practice, this isn't necessarily true: gradients tend to vanish (or explode) as they are propagated back through many time steps, so plain RNNs struggle to retain long-range context. This is a major reason why RNNs faded out of practice for a while, until strong results were achieved by using a Long Short-Term Memory (LSTM) unit inside the neural network.

We will discuss the LSTM, a more efficient and robust version of the RNN, in our future articles.
