Recurrent Neural Networks (RNNs) with TensorFlow in Python
- Aug 17, 2024
- 7 min read
Recurrent Neural Networks (RNNs) are a powerful class of neural networks designed for sequential data, making them ideal for tasks where the order of inputs matters, such as time series prediction, natural language processing, and speech recognition. Unlike traditional feedforward neural networks, RNNs have connections that form cycles within the network, allowing information to persist and be passed from one step to the next.
In this blog, we'll explore how to implement an RNN using TensorFlow in Python. We'll cover the basics of RNNs, their architecture, and how to apply them to a simple sequence prediction task.

What are Recurrent Neural Networks (RNNs)?
A Recurrent Neural Network (RNN) is a type of artificial neural network specifically designed to handle sequential data, where the order of the data points is crucial. Unlike traditional feedforward neural networks, which assume all inputs are independent of each other, RNNs have a unique architecture that incorporates loops, allowing information to persist across different time steps. This looping mechanism enables the network to maintain a hidden state that captures the essence of the input sequence as it progresses, effectively creating a memory of previous inputs. This memory is critical for tasks like time series prediction, natural language processing, and speech recognition, where the context provided by prior inputs significantly influences the current output.
However, RNNs face challenges, particularly with long sequences, due to issues like vanishing and exploding gradients, which can make training difficult. To address these challenges, variants of RNNs, such as Long Short-Term Memory (LSTM) networks and Gated Recurrent Units (GRUs), have been developed, which include mechanisms to better manage the flow of information over extended sequences.
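A rough intuition for why gradients vanish or explode: backpropagation through time multiplies one per-step factor into the gradient for every time step it flows back through, so a factor consistently below 1 shrinks the product toward zero while a factor above 1 blows it up. A minimal numeric sketch (the factors 0.9 and 1.1 are illustrative, not taken from any real network):

```python
import numpy as np

# Gradients flowing back through T time steps pick up roughly one
# multiplicative factor per step. Repeated multiplication makes the
# product vanish (|factor| < 1) or explode (|factor| > 1).
T = 100
vanishing = 0.9 ** T   # the gradient signal all but disappears
exploding = 1.1 ** T   # the gradient signal blows up

print(f"0.9^{T} = {vanishing:.3e}")
print(f"1.1^{T} = {exploding:.3e}")
```

This is why LSTMs and GRUs add gating mechanisms: they give the network a more direct path along which information (and gradient) can flow unchanged across many steps.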
Several variants have emerged from this line of work, each suited to different requirements. The most common RNN variants are:
Vanilla RNN: This is the basic form of an RNN, where each neuron in the hidden layer has a connection to itself, allowing it to retain information over time. However, Vanilla RNNs often struggle with learning long-term dependencies due to issues like vanishing and exploding gradients, making them less effective for long sequences.
Long Short-Term Memory (LSTM): LSTMs are designed to overcome the limitations of Vanilla RNNs by introducing a more complex architecture that includes three types of gates: input, output, and forget gates. These gates regulate the flow of information into and out of the cell state, effectively allowing the network to retain or forget information over longer periods. LSTMs are highly effective in tasks that require learning long-term dependencies, such as language translation, speech recognition, and time series forecasting.
Gated Recurrent Unit (GRU): GRUs are a simplified version of LSTMs that combine the input and forget gates into a single update gate, and merge the cell state and hidden state. This streamlined architecture reduces the number of parameters and computational complexity while still addressing the vanishing gradient problem. GRUs are often preferred in scenarios where computational efficiency is crucial, and they have shown comparable performance to LSTMs in many tasks.
Each of these RNN variants has its strengths and is chosen based on the specific requirements of the task at hand, whether it involves short or long sequences, computational constraints, or the complexity of the data being processed.
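Because Keras gives SimpleRNN, LSTM, and GRU the same layer interface, switching between these variants in practice is usually a one-line change. A minimal sketch (the layer size of 50 and the input shape of 3 time steps with 1 feature are arbitrary choices for illustration):

```python
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import SimpleRNN, LSTM, GRU, Dense

def build_model(rnn_layer):
    # Identical architecture; only the recurrent layer type changes.
    model = Sequential([
        rnn_layer(50, input_shape=(3, 1)),
        Dense(1),
    ])
    model.compile(optimizer='adam', loss='mse')
    return model

vanilla = build_model(SimpleRNN)
lstm = build_model(LSTM)
gru = build_model(GRU)
```

The gated variants pay for their extra machinery in parameters: for the same 50 units, the LSTM has roughly four times the weights of the vanilla RNN and the GRU roughly three times, which is the computational trade-off described above.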
Implementing Recurrent Neural Networks (RNNs) with TensorFlow in Python
To better understand how Recurrent Neural Networks work in practice, it helps to build a simple model from scratch. In this example, we will implement an RNN using TensorFlow in Python to perform a basic sequence prediction task. The goal of the model will be to learn patterns within a sequence of numbers and predict the next value in that sequence.
This type of problem highlights the strength of RNNs, since these networks are specifically designed to process sequential data and remember information from previous time steps.
Step 1: Importing the Necessary Libraries
Before building the model, the required libraries must be imported. TensorFlow provides the core deep learning framework, while Keras simplifies the process of building neural network architectures. NumPy is used to handle numerical operations and prepare the dataset.
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import SimpleRNN, Dense
import numpy as np

In this setup, the Sequential model from Keras allows us to stack layers easily, while the SimpleRNN layer enables the network to process sequential data. The Dense layer will later be used to generate the final prediction output.
Step 2: Preparing the Dataset
For demonstration purposes, we will generate a simple numerical sequence. The model will be trained to analyze a fixed number of previous values and predict the next number in the sequence. This approach allows us to simulate how RNNs learn patterns over time.
First, we create a helper function that converts a sequence into multiple input-output pairs suitable for training the neural network.
# Generate a simple dataset
def create_dataset(sequence, n_steps):
    X, y = [], []
    for i in range(len(sequence)):
        end_ix = i + n_steps
        if end_ix > len(sequence) - 1:
            break
        seq_x, seq_y = sequence[i:end_ix], sequence[end_ix]
        X.append(seq_x)
        y.append(seq_y)
    return np.array(X), np.array(y)

This function splits the original sequence into smaller overlapping sequences. Each input sequence contains a fixed number of time steps, while the output represents the value the model should predict next.
Next, we generate a simple dataset and prepare it for training.
# Example sequence
sequence = np.array([i for i in range(10)])
# Prepare the input-output pairs
n_steps = 3
X, y = create_dataset(sequence, n_steps)

Here, the model will look at three numbers at a time and learn to predict the number that follows.
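To make the windowing concrete, here is an equivalent vectorized sketch of the same split, showing the pairs it produces for the 10-element sequence and n_steps = 3:

```python
import numpy as np

sequence = np.arange(10)   # [0, 1, ..., 9]
n_steps = 3

# Slide a window of n_steps inputs over the sequence; the value
# immediately after each window is the training target.
X = np.array([sequence[i:i + n_steps] for i in range(len(sequence) - n_steps)])
y = sequence[n_steps:]

print(X[:3])   # first three windows: [0 1 2], [1 2 3], [2 3 4]
print(y[:3])   # their targets: 3, 4, 5
print(X.shape, y.shape)   # (7, 3) and (7,)
```

Seven training pairs come out of a 10-element sequence: the last usable window is [6, 7, 8] with target 9.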
Finally, the input data must be reshaped to match the format expected by RNN layers. Recurrent Neural Networks require a three-dimensional input structure consisting of samples, time steps, and features.
# Reshape input to be [samples, time steps, features]
X = X.reshape((X.shape[0], X.shape[1], 1))

This step ensures the dataset is properly structured so that the RNN can interpret each sequence correctly during training.
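A quick shape check makes the expected layout explicit (the shapes follow from the 10-element sequence and n_steps = 3 used above):

```python
import numpy as np

X = np.arange(21).reshape(7, 3)             # stand-in for the 7 windows of 3 steps
X = X.reshape((X.shape[0], X.shape[1], 1))  # add a features axis of size 1

# RNN layers read this as: 7 samples, 3 time steps, 1 feature per step
print(X.shape)   # (7, 3, 1)
```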
Step 3: Building the RNN Model
Now that the dataset has been prepared, the next step is to construct the Recurrent Neural Network. In this example, we use a Sequential model, which allows layers to be stacked in a linear order. The architecture begins with a SimpleRNN layer that processes sequential input and learns patterns across time steps. This layer is responsible for remembering information from previous elements in the sequence and using it to influence future predictions.
The RNN layer is configured with 50 neurons and uses the ReLU activation function to introduce non-linearity into the model. After the recurrent layer processes the sequence data, a Dense layer is added to produce the final output. Since the task involves predicting a single numeric value, the Dense layer contains just one neuron.
Once the architecture is defined, the model is compiled using the Adam optimizer, which helps efficiently adjust the network’s weights during training. The mean squared error (MSE) loss function is used because this is a regression problem where the model predicts continuous numerical values.
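As a quick illustration of what the MSE loss measures (the numbers here are made up for demonstration):

```python
import numpy as np

y_true = np.array([3.0, 4.0, 5.0])   # actual next values
y_pred = np.array([2.5, 4.0, 5.5])   # hypothetical model predictions

# Mean squared error: the average of the squared differences
mse = np.mean((y_true - y_pred) ** 2)
print(mse)   # (0.25 + 0.0 + 0.25) / 3, about 0.1667
```

Squaring penalizes large errors disproportionately, which is why a steadily falling MSE during training indicates the predictions are converging on the targets.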
Finally, the model summary is printed to review the structure and verify that the layers are configured correctly before training begins.
# Define the RNN model
model = Sequential()
model.add(SimpleRNN(50, activation='relu', input_shape=(n_steps, 1)))
model.add(Dense(1))
# Compile the model
model.compile(optimizer='adam', loss='mse')
# Summarize the model
model.summary()

The model summary displays the layers, output shapes, and the number of trainable parameters, giving a quick overview of the network architecture before moving on to training.

Step 4: Training the Recurrent Neural Network Model
Once the RNN architecture has been defined and compiled, the next step is to train the model using the prepared dataset. During training, the network analyzes the input sequences and gradually adjusts its internal weights to minimize prediction errors.
In this example, the model is trained for 200 epochs, meaning the entire dataset is passed through the network 200 times. With each iteration, the optimizer updates the model’s parameters based on the calculated loss. Because this is a sequence prediction task, the mean squared error (MSE) loss function measures how far the predicted values are from the actual target values.
As training progresses, the loss should gradually decrease, indicating that the RNN is learning temporal relationships within the sequence data. The verbose=1 setting displays detailed progress during training, allowing us to observe how the model improves over time.
# Train the model
model.fit(X, y, epochs=200, verbose=1)

By the end of the training process, the network should have learned the underlying pattern in the dataset and be able to predict the next value in a sequence with reasonable accuracy.
Output for the above code:
Epoch 184/200
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 57ms/step - loss: 0.3377
...
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 57ms/step - loss: 0.3043

Step 5: Making Predictions
After the model has been trained, the next step is to evaluate its ability to make predictions on new data. This is done by providing a sequence that follows the same structure as the training inputs. The model analyzes this sequence and produces an estimated next value based on the patterns it learned during training.
To perform the prediction, we first create a new input sequence. Because RNN layers expect a three-dimensional input format consisting of samples, time steps, and features, the sequence must be reshaped to match the structure used during training. Once the input is correctly formatted, the trained model can generate a prediction.
# Demonstrate prediction
x_input = np.array([7, 8, 9])
x_input = x_input.reshape((1, n_steps, 1))
yhat = model.predict(x_input, verbose=0)
print(f"Predicted next value: {yhat[0][0]}")

In this example, the input sequence [7, 8, 9] is provided to the model. Since the training data followed a simple incremental pattern, the network predicts the next value in the sequence. The output produced by the model might look similar to the following:
Predicted next value: 10.705029487609863

The prediction is close to the expected value of 10, which shows that the Recurrent Neural Network successfully captured the underlying pattern in the dataset. Small differences occur because neural networks approximate patterns rather than calculating exact mathematical rules.
This simple example demonstrates how RNNs can learn sequential relationships and generate predictions based on previously observed patterns. In real-world applications, the same concept can be applied to far more complex tasks such as language modeling, speech recognition, and time-series forecasting.
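One common extension of this idea is multi-step forecasting: feed the model's own prediction back in as the newest input and repeat. The sketch below reuses the same toy setup end to end so it runs on its own; because the network is trained from scratch, the exact predicted numbers will vary from run to run:

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import SimpleRNN, Dense

# Rebuild the toy dataset: windows of 3 values, target is the next value
n_steps = 3
sequence = np.arange(10, dtype=np.float32)
X = np.array([sequence[i:i + n_steps] for i in range(len(sequence) - n_steps)])
y = sequence[n_steps:]
X = X.reshape((X.shape[0], n_steps, 1))

# Same architecture as in the walkthrough above
model = Sequential()
model.add(SimpleRNN(50, activation='relu', input_shape=(n_steps, 1)))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
model.fit(X, y, epochs=200, verbose=0)

# Roll the window forward: append each prediction, drop the oldest value
window = [7.0, 8.0, 9.0]
forecast = []
for _ in range(3):
    x_in = np.array(window[-n_steps:]).reshape((1, n_steps, 1))
    next_val = float(model.predict(x_in, verbose=0)[0][0])
    forecast.append(next_val)
    window.append(next_val)

print(forecast)   # ideally close to [10, 11, 12]; error compounds at each step
```

Note that each step is predicted from previously predicted values, so errors accumulate: the further out the forecast, the less reliable it becomes. This compounding is a well-known limitation of recursive forecasting with any sequence model.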
Conclusion
Recurrent Neural Networks (RNNs) play a crucial role in deep learning, particularly when working with sequential data. Unlike traditional neural networks, RNNs are designed to retain information from previous inputs, allowing them to recognize patterns that unfold over time. This capability makes them highly effective for applications such as time series forecasting, language modeling, speech recognition, and other sequence-based tasks.
Despite their strengths, standard RNNs can struggle with long-term dependencies during training. To address this limitation, more advanced architectures such as Long Short-Term Memory (LSTM) networks and Gated Recurrent Units (GRUs) were developed. These variants improve the model’s ability to capture long-range patterns by managing how information flows through the network.
Frameworks like TensorFlow make it significantly easier to design, train, and experiment with these models. By combining powerful computational capabilities with accessible APIs, TensorFlow enables developers and researchers to build sophisticated deep learning systems using Python.





