Long Short-Term Memory (LSTM) is a special type of Recurrent Neural Network (RNN) that can retain important information over time using memory cells.

This property makes LSTMs well suited to learning sequences whose elements depend on each other. They are used to build solutions such as language translation, sales time-series forecasting, chatbots, autocorrection, and next-word suggestion.

You can read more about LSTMs here.

In this case study, I will show how LSTMs can be used to learn the patterns in stock prices. Using this template, you will be able to predict tomorrow’s price of a stock based on the last 10 days’ prices.

**Pulling historical stock prices data**

To pull the data for any stock, we can use a library named ‘**nsepy**’.

If you want to do this for any other stock, just use the stock market ticker symbol for that company, as I have used ‘INFY’ for Infosys here.

```python
import pandas as pd
import numpy as np

# To remove the scientific notation from numpy arrays
np.set_printoptions(suppress=True)

# install the nsepy library to get stock prices
!pip install nsepy

############################################
# Getting Stock data using nsepy library
from nsepy import get_history
from datetime import datetime

startDate=datetime(2019, 1,1)
endDate=datetime(2020, 10, 5)

# Fetching the data
StockData=get_history(symbol='INFY', start=startDate, end=endDate)
print(StockData.shape)
StockData.head()
```

**Visualizing the stock prices movement**

```python
# Creating a column as date
StockData['TradeDate']=StockData.index

# Plotting the stock prices
%matplotlib inline
StockData.plot(x='TradeDate', y='Close', kind='line', figsize=(20,6), rot=20)
```

**Preparing the data**

The LSTM model needs its input in the form of X vs. y, where X represents the last 10 days’ prices and y represents the 11th day’s price.

By looking at a lot of such examples from the past 2 years, the LSTM will be able to learn the movement of prices. Hence, when we pass the last 10 days of prices, it will be able to predict tomorrow’s closing price.

Since LSTM is a neural-network-based algorithm, standardizing or normalizing the data is important for a fast and more accurate fit.

```python
# Extracting the closing prices of each day
FullData=StockData[['Close']].values
print(FullData[0:5])

# Feature Scaling for fast training of neural networks
from sklearn.preprocessing import StandardScaler, MinMaxScaler

# Choosing between Standardization or normalization
#sc = StandardScaler()
sc=MinMaxScaler()

DataScaler = sc.fit(FullData)
X=DataScaler.transform(FullData)
#X=FullData

print('### After Normalization ###')
X[0:5]
```
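To make the normalization step concrete, here is a minimal sketch of the arithmetic behind min-max scaling, written in plain Python instead of `MinMaxScaler`. The helper names `min_max_scale` and `min_max_inverse` and the sample prices are hypothetical and used only for illustration.

```python
# Hypothetical closing prices, for illustration only
prices = [100.0, 120.0, 110.0, 150.0]

def min_max_scale(values):
    """Map values linearly so the minimum becomes 0 and the maximum becomes 1."""
    lo, hi = min(values), max(values)
    scaled = [(v - lo) / (hi - lo) for v in values]
    return scaled, lo, hi

def min_max_inverse(scaled, lo, hi):
    """Undo min_max_scale to recover values on the original scale."""
    return [s * (hi - lo) + lo for s in scaled]

scaled, lo, hi = min_max_scale(prices)
restored = min_max_inverse(scaled, lo, hi)

print(scaled)    # every value now lies between 0 and 1
print(restored)  # back on the original price scale
```

The inverse step is the same idea that `inverse_transform` is used for later in this post, when predictions are brought back to rupee prices.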

**Preparing the data for LSTM**

```python
# split into samples
X_samples = list()
y_samples = list()

NumerOfRows = len(X)
TimeSteps=10  # next day's Price Prediction is based on last how many past day's prices

# Iterate thru the values to create combinations
for i in range(TimeSteps , NumerOfRows , 1):
    x_sample = X[i-TimeSteps:i]
    y_sample = X[i]
    X_samples.append(x_sample)
    y_samples.append(y_sample)

################################################
# Reshape the Input as a 3D (number of samples, Time Steps, Features)
X_data=np.array(X_samples)
X_data=X_data.reshape(X_data.shape[0],X_data.shape[1], 1)
print('\n#### Input Data shape ####')
print(X_data.shape)

# We do not reshape y as a 3D data as it is supposed to be a single column only
y_data=np.array(y_samples)
y_data=y_data.reshape(y_data.shape[0], 1)
print('\n#### Output Data shape ####')
print(y_data.shape)
```
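The windowing logic is easier to follow on a tiny series. The sketch below applies the same loop to six made-up values with a window of 3 instead of 10. The function name `make_windows` and the toy data are hypothetical.

```python
def make_windows(series, time_steps):
    """Pair each run of `time_steps` consecutive values with the value that follows it."""
    X_samples, y_samples = [], []
    for i in range(time_steps, len(series)):
        X_samples.append(series[i - time_steps:i])  # the previous `time_steps` values
        y_samples.append(series[i])                 # the value to predict
    return X_samples, y_samples

toy_series = [1, 2, 3, 4, 5, 6]  # hypothetical values
X_toy, y_toy = make_windows(toy_series, time_steps=3)

for inp, out in zip(X_toy, y_toy):
    print(inp, '-->', out)
# [1, 2, 3] --> 4
# [2, 3, 4] --> 5
# [3, 4, 5] --> 6
```

A series of N values yields N minus `time_steps` samples, because the first `time_steps` values have no complete history behind them.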

**Splitting the data into training and testing**

We keep the last few days of data to test what the model has learned and use the rest for training.

Here I am choosing the last 5 days for testing.

```python
# Choosing the number of testing data records
TestingRecords=5

# Splitting the data into train and test
X_train=X_data[:-TestingRecords]
X_test=X_data[-TestingRecords:]
y_train=y_data[:-TestingRecords]
y_test=y_data[-TestingRecords:]

############################################

# Printing the shape of training and testing
print('\n#### Training Data shape ####')
print(X_train.shape)
print(y_train.shape)
print('\n#### Testing Data shape ####')
print(X_test.shape)
print(y_test.shape)
```
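Note that this split is positional, not random: the samples are in time order, so slicing off the end keeps the most recent days for testing. A small sketch with made-up sample numbers shows what the negative slices select. The helper `chronological_split` is a hypothetical name.

```python
def chronological_split(samples, n_test):
    """Keep the last n_test samples for testing and everything before them for training."""
    return samples[:-n_test], samples[-n_test:]

days = list(range(1, 11))  # hypothetical: 10 samples ordered oldest to newest
train, test = chronological_split(days, n_test=3)

print(train)  # [1, 2, 3, 4, 5, 6, 7]
print(test)   # [8, 9, 10]
```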

**Visualizing the input and output data for LSTM**

Printing some sample input and output values helps you visualize how the LSTM model will learn the prices.

You can see that each input sample is an array of the last 10 prices (the full input is a 3D array of such samples) and each output is a single value, the next price.

```python
# Visualizing the input and output being sent to the LSTM model
for inp, out in zip(X_train[0:2], y_train[0:2]):
    print(inp,'--', out)
```

**Creating the Deep Learning LSTM model**

Look at the use of the LSTM function instead of Dense to define the hidden layers. The output layer has one neuron because we are predicting the next day’s price. If you want to predict for multiple days, change the output data and the number of output neurons to match the number of days to forecast.

```python
# Defining Input shapes for LSTM
TimeSteps=X_train.shape[1]
TotalFeatures=X_train.shape[2]
print("Number of TimeSteps:", TimeSteps)
print("Number of Features:", TotalFeatures)
```

In the code snippet below, I have used three hidden LSTM layers and one output layer. You can add more layers if the accuracy on your data is not good enough. Similarly, you can increase or decrease the number of neurons in each hidden layer.

Just keep in mind that the more neurons and layers you use, the slower the model becomes, because there are many more computations to be done.
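To get a feel for how layer sizes drive the amount of computation, the sketch below counts trainable weights for the layer sizes used in this post's model (LSTM layers of 10, 5 and 5 units on a single input feature, then a one-neuron output). It assumes the standard LSTM formulation with four gates and one bias vector per gate. The helper names `lstm_params` and `dense_params` are hypothetical.

```python
def lstm_params(input_dim, units):
    """Weights in one LSTM layer: 4 gates, each with input weights, recurrent weights and a bias."""
    return 4 * (units * (input_dim + units) + units)

def dense_params(input_dim, units):
    """Weights in a Dense layer: one weight per input per neuron, plus one bias per neuron."""
    return input_dim * units + units

# Layer sizes from this post: LSTM(10) -> LSTM(5) -> LSTM(5) -> Dense(1), one input feature
layers = [
    ('LSTM 1', lstm_params(input_dim=1, units=10)),
    ('LSTM 2', lstm_params(input_dim=10, units=5)),
    ('LSTM 3', lstm_params(input_dim=5, units=5)),
    ('Dense', dense_params(input_dim=5, units=1)),
]

for name, count in layers:
    print(name, count)

total_params = sum(count for _, count in layers)
print('Total', total_params)  # Total 1026
```

Because the recurrent term grows with the square of the number of units, doubling the neurons in a layer roughly quadruples its weights. You can compare these counts with the output of `regressor.summary()` once the model is built.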

Each layer has some hyperparameters which need to be tuned.

Take a look at some of the important hyperparameters of LSTM below:

**units=10**: This means we are creating a layer with ten neurons in it. Each of these ten neurons will receive the input values.

**input_shape=(TimeSteps, TotalFeatures)**: The input expected by LSTM is in 3D format. Our training data has a shape of (420, 10, 1), which is in the form (number of samples, time steps, number of features). This means we have 420 examples to learn from in the training data, and each example looks back 10 steps in time: the stock price yesterday, the day before yesterday, and so on for the last 10 days. These are known as time steps. The last number, 1, represents the number of features. Here we are using just one column, the closing stock price, hence it is equal to 1.

**kernel_initializer=’uniform’**: Before training starts, some algorithm has to decide the initial value of each weight. This parameter specifies that algorithm. You can choose different values for it, like ‘normal’ or ‘glorot_uniform’.

**activation=’relu’**: This specifies the activation function for the calculations inside each neuron. You can choose values like ‘relu’, ‘tanh’, ‘sigmoid’, etc.

**return_sequences=True**: With this setting, the LSTM layer returns its output at every time step instead of only the last one. This keeps the input of the next hidden LSTM layer in the 3D format it expects. This parameter is False for the last hidden layer because it does not have to return a 3D output to the final Dense layer.

**optimizer=’adam’**: This parameter selects the algorithm that searches for the optimum value of each weight in the neural network. ‘adam’ is one of the most commonly used optimizers; another one is ‘rmsprop’.

**batch_size=10**: This specifies how many rows are passed to the network in one go, after which the loss is calculated and the neural network adjusts its weights based on the errors. When all the rows have been passed in batches of 10 rows each, we call that one epoch, or one full data cycle. This is also known as mini-batch gradient descent. A small batch_size, like 2 or 4 rows at a time, makes the LSTM go through the data slowly with many weight updates, which could lead to overfitting. A large value, like 20 or 50 rows at a time, makes it go through the data fast with fewer updates, which could lead to underfitting. Hence a proper value should be chosen using hyperparameter tuning. The code below uses batch_size=5.

**epochs=10**: The same activity of adjusting weights is repeated 10 times, as specified by this parameter. In simple terms, the LSTM looks at the full training data 10 times and adjusts its weights. The code below uses epochs=100.
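The relationship between batch_size, epochs and the number of weight updates can be worked out with simple arithmetic. The sketch below uses the 420 training samples mentioned above. The function name `weight_updates` is hypothetical.

```python
import math

def weight_updates(n_samples, batch_size, epochs):
    """Return (updates per epoch, total updates) for mini-batch training."""
    per_epoch = math.ceil(n_samples / batch_size)  # the last batch may be smaller
    return per_epoch, per_epoch * epochs

# The values used in the explanation above
explained = weight_updates(n_samples=420, batch_size=10, epochs=10)
print(explained)  # (42, 420)

# The values used in this post's training code
in_code = weight_updates(n_samples=420, batch_size=5, epochs=100)
print(in_code)    # (84, 8400)
```

Halving the batch size doubles the number of updates per epoch, which is one reason smaller batches train more slowly.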

```python
# Importing the Keras libraries and packages
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM

# Initialising the RNN
regressor = Sequential()

# Adding the first hidden LSTM layer, which also defines the input shape
# return_sequences = True means the output of every time step is passed to the next hidden layer
regressor.add(LSTM(units = 10, activation = 'relu', input_shape = (TimeSteps, TotalFeatures), return_sequences=True))

# Adding the second hidden LSTM layer (its input shape is inferred from the previous layer)
regressor.add(LSTM(units = 5, activation = 'relu', return_sequences=True))

# Adding the third hidden LSTM layer
regressor.add(LSTM(units = 5, activation = 'relu', return_sequences=False ))

# Adding the output layer
regressor.add(Dense(units = 1))

# Compiling the RNN
regressor.compile(optimizer = 'adam', loss = 'mean_squared_error')

##################################################

import time
# Measuring the time taken by the model to train
StartTime=time.time()

# Fitting the RNN to the Training set
regressor.fit(X_train, y_train, batch_size = 5, epochs = 100)

EndTime=time.time()
print("## Total Time Taken: ", round((EndTime-StartTime)/60), 'Minutes ##')
```

**Measuring the accuracy of the model on testing data**

Now, using the trained model, we check whether the predicted prices for the last 5 days are close to the actual prices.

Notice the inverse transform of the predictions. Since we normalized the data before training the model, the predictions on the testing data are also normalized, so the inverse transformation brings the values back to the original scale. Only then should we calculate the percentage accuracy.

```python
# Making predictions on test data
predicted_Price = regressor.predict(X_test)
predicted_Price = DataScaler.inverse_transform(predicted_Price)

# Getting the original price values for testing data
orig=y_test
orig=DataScaler.inverse_transform(y_test)

# Accuracy of the predictions
print('Accuracy:', 100 - (100*(abs(orig-predicted_Price)/orig)).mean())

# Visualising the results
import matplotlib.pyplot as plt

plt.plot(predicted_Price, color = 'blue', label = 'Predicted Price')
plt.plot(orig, color = 'lightblue', label = 'Original Price')

plt.title('Stock Price Predictions')
plt.xlabel('Trading Date')
plt.xticks(range(TestingRecords), StockData.tail(TestingRecords)['TradeDate'])
plt.ylabel('Stock Price')

plt.legend()
fig=plt.gcf()
fig.set_figwidth(20)
fig.set_figheight(6)
plt.show()
```
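The accuracy figure printed above is 100 minus the mean absolute percentage error (MAPE). The sketch below spells out that calculation in plain Python on made-up numbers. The function name `percentage_accuracy` and the sample values are hypothetical.

```python
def percentage_accuracy(actual, predicted):
    """100 minus the mean absolute percentage error (MAPE)."""
    errors = [abs(a - p) / a * 100 for a, p in zip(actual, predicted)]
    return 100 - sum(errors) / len(errors)

actual_prices = [100.0, 200.0]     # hypothetical original prices
predicted_prices = [110.0, 190.0]  # hypothetical predictions

accuracy = percentage_accuracy(actual_prices, predicted_prices)
print(round(accuracy, 2))  # 92.5 (errors of 10% and 5% average to 7.5%)
```

Keep in mind that this measure is relative to the price level. Daily stock moves are usually a few percent at most, so even a simple guess can score well above 90%.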

**Visualizing the predictions for full data**

Plotting both the training and the testing data shows how well the LSTM model has fitted.

```python
# Generating predictions on full data
TrainPredictions=DataScaler.inverse_transform(regressor.predict(X_train))
TestPredictions=DataScaler.inverse_transform(regressor.predict(X_test))

FullDataPredictions=np.append(TrainPredictions, TestPredictions)
FullDataOrig=FullData[TimeSteps:]

# plotting the full data
plt.plot(FullDataPredictions, color = 'blue', label = 'Predicted Price')
plt.plot(FullDataOrig , color = 'lightblue', label = 'Original Price')

plt.title('Stock Price Predictions')
plt.xlabel('Trading Date')
plt.ylabel('Stock Price')
plt.legend()
fig=plt.gcf()
fig.set_figwidth(20)
fig.set_figheight(8)
plt.show()
```

**How to predict the stock price for tomorrow**

If you want to predict the price for tomorrow, all you have to do is pass the last 10 days’ prices to the model in the same 3D format that was used in training.

The snippet below shows how to take the last 10 prices manually and make a single prediction for the next price.

```python
# Last 10 days prices
Last10Days=np.array([1002.15, 1009.9, 1007.5, 1019.75, 975.4,
            1011.45, 1010.4, 1009,1008.25, 1017.65])

# Normalizing the data just like we did for training the model
Last10Days=DataScaler.transform(Last10Days.reshape(-1,1))

# Changing the shape of the data to 3D
# Choosing TimeSteps as 10 because we have used the same for training
NumSamples=1
TimeSteps=10
NumFeatures=1
Last10Days=Last10Days.reshape(NumSamples,TimeSteps,NumFeatures)

#############################

# Making predictions on data
predicted_Price = regressor.predict(Last10Days)
predicted_Price = DataScaler.inverse_transform(predicted_Price)
predicted_Price
```

I have used the actual prices from the stock market up to last Friday in the above snippet!

So we see the model predicts that the next closing price of **Infosys** for today (5th-Oct-2020) is **Rs 1023**! I checked and found that the closing price for the day was **Rs 1048**! Not bad for such little effort! 🙂
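One way to put this result in context is to compare it with a naive guess that simply repeats the last known close. The sketch below uses only the numbers reported above: the last input price (1017.65), the predicted price (1023) and the actual close (1048). The helper `pct_error` is a hypothetical name, and a single day is far too little data to draw conclusions from.

```python
last_close = 1017.65   # last of the 10 input prices used above
model_pred = 1023.0    # model prediction reported in this post
actual_close = 1048.0  # actual closing price reported in this post

def pct_error(actual, predicted):
    """Absolute error as a percentage of the actual value."""
    return abs(actual - predicted) / actual * 100

model_err = pct_error(actual_close, model_pred)
naive_err = pct_error(actual_close, last_close)

print('Model error: %.2f%%' % model_err)
print('Naive error: %.2f%%' % naive_err)
```

On this one day the model's error (about 2.4%) is slightly smaller than the naive guess's (about 2.9%). Running the same comparison over many test days would give a fairer picture of how much the LSTM adds.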

**What If I want to predict prices for the next 5 days?**

The model we have built above uses the last 10 days’ prices and predicts the next day’s price, because we have trained it with many past examples of the same granularity, as shown below:

*last 10 days’ prices –> 11th day’s price*

Now, if we want to predict the next 5 days’ or the next 20 days’ prices, we need to train the model with similar examples from the past, as shown below:

*last 10 days’ prices –> next 5 days’ prices*

This is also known as **Multi-Step** time series prediction, where we predict multiple time steps ahead.

To achieve this, small modifications are required in both the data preparation step and the LSTM model.

However, keep in mind that the further ahead you predict, the less accurate you are likely to be, because stock prices are volatile and no one can know what is going to happen after 10 days, or what kind of news will come that might affect the price of this stock.

Hence, it is recommended to predict as few time steps as possible, for example the next 2 days, or the next 5 days at most.

**Data Preparation for Multi Step LSTM**

I am showing how to prepare the data for predicting the next 5 days. The same code can easily be modified to predict the next 10 or 20 days as well.

```python
# Considering the Full Data again which we extracted above
# Printing the last 10 values
print('Original Prices')
print(FullData[-10:])

print('###################')

# Printing last 10 values of the scaled data which we have created above for the last model
# Here I am changing the shape of the data to one dimensional array because
# for Multi step data preparation we need the X input in this fashion
X=X.reshape(X.shape[0],)
print('Scaled Prices')
print(X[-10:])
```

I have modified the data split logic from the last model to produce the **input–>output** pairs by defining **FutureTimeSteps=5**. This specifies that we want to predict the next 5 days’ prices based on the last 10 days.

```python
# Multi step data preparation

# split into samples
X_samples = list()
y_samples = list()

NumerOfRows = len(X)
TimeSteps=10  # next few day's Price Prediction is based on last how many past day's prices
FutureTimeSteps=5 # How many days in future you want to predict the prices

# Iterate thru the values to create combinations
# The +1 keeps the final window, whose targets are the last FutureTimeSteps values
for i in range(TimeSteps , NumerOfRows-FutureTimeSteps+1 , 1):
    x_sample = X[i-TimeSteps:i]
    y_sample = X[i:i+FutureTimeSteps]
    X_samples.append(x_sample)
    y_samples.append(y_sample)

################################################

# Reshape the Input as a 3D (samples, Time Steps, Features)
X_data=np.array(X_samples)
X_data=X_data.reshape(X_data.shape[0],X_data.shape[1], 1)
print('### Input Data Shape ###')
print(X_data.shape)

# y stays 2D with shape (samples, FutureTimeSteps): one column per future day
y_data=np.array(y_samples)
print('### Output Data Shape ###')
print(y_data.shape)
```
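The multi-step pairing is easier to see on a tiny series. The sketch below uses seven made-up values, a window of 3 and a forecast horizon of 2. The function name `make_multistep_windows` and the toy data are hypothetical, and the loop bound is chosen so that the final window, whose targets are the last values of the series, is included.

```python
def make_multistep_windows(series, time_steps, future_steps):
    """Pair each run of `time_steps` values with the `future_steps` values that follow it."""
    X_samples, y_samples = [], []
    # The last usable split point is len(series) - future_steps, hence the +1
    for i in range(time_steps, len(series) - future_steps + 1):
        X_samples.append(series[i - time_steps:i])
        y_samples.append(series[i:i + future_steps])
    return X_samples, y_samples

toy_series = [1, 2, 3, 4, 5, 6, 7]  # hypothetical values
X_multi, y_multi = make_multistep_windows(toy_series, time_steps=3, future_steps=2)

for inp, out in zip(X_multi, y_multi):
    print(inp, '-->', out)
# [1, 2, 3] --> [4, 5]
# [2, 3, 4] --> [5, 6]
# [3, 4, 5] --> [6, 7]
```

Consecutive samples overlap: each one is shifted by a single day, so the same price appears in several inputs and several outputs.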

**Splitting the data into Training and Testing**

```python
# Choosing the number of testing data records
TestingRecords=5

# Splitting the data into train and test
X_train=X_data[:-TestingRecords]
X_test=X_data[-TestingRecords:]
y_train=y_data[:-TestingRecords]
y_test=y_data[-TestingRecords:]

#############################################
# Printing the shape of training and testing
print('\n#### Training Data shape ####')
print(X_train.shape)
print(y_train.shape)

print('\n#### Testing Data shape ####')
print(X_test.shape)
print(y_test.shape)
```

**Visualizing the input->output sent to LSTM Multi-step model**

Printing some records of input and output always helps to understand the process in an LSTM model.

You can see here that each input is an array of the last 10 days’ prices and each output is an array of the next 5 days’ prices.

```python
# Visualizing the input and output being sent to the LSTM model
# Based on last 10 days prices we are learning the next 5 days of prices
for inp, out in zip(X_train[0:2], y_train[0:2]):
    print(inp)
    print('====>')
    print(out)
    print('#'*20)
```

**Creating the Deep Learning Multi-Step LSTM model**

I am using the same configuration as in the last model. The only change is at the **Dense** layer, which now outputs a number of values equal to FutureTimeSteps. That is 5 in this case, since we want to predict the next 5 days.

You can change this to 10 or 20 if you want to predict for more days, but you need to prepare the data in the same manner before running this code.

```python
# Defining Input shapes for LSTM
TimeSteps=X_train.shape[1]
TotalFeatures=X_train.shape[2]
print("Number of TimeSteps:", TimeSteps)
print("Number of Features:", TotalFeatures)
```

```python
# Importing the Keras libraries and packages
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM

# Initialising the RNN
regressor = Sequential()

# Adding the first hidden LSTM layer, which also defines the input shape
# return_sequences = True means the output of every time step is passed to the next hidden layer
regressor.add(LSTM(units = 10, activation = 'relu', input_shape = (TimeSteps, TotalFeatures), return_sequences=True))

# Adding the second hidden LSTM layer (its input shape is inferred from the previous layer)
regressor.add(LSTM(units = 5, activation = 'relu', return_sequences=True))

# Adding the third hidden LSTM layer
regressor.add(LSTM(units = 5, activation = 'relu', return_sequences=False ))

# Adding the output layer
# Notice the number of neurons in the dense layer is now the number of future time steps
# Based on the number of future days we want to predict
regressor.add(Dense(units = FutureTimeSteps))

# Compiling the RNN
regressor.compile(optimizer = 'adam', loss = 'mean_squared_error')

###################################################################

import time
# Measuring the time taken by the model to train
StartTime=time.time()

# Fitting the RNN to the Training set
regressor.fit(X_train, y_train, batch_size = 5, epochs = 100)

EndTime=time.time()
print("############### Total Time Taken: ", round((EndTime-StartTime)/60), 'Minutes #############')
```

**Measuring the Accuracy of the model on testing data**

Since this is a multi-step model trained to predict the next 5 days, each prediction generates 5 days’ prices, which we can match with the original prices.

We will compare them one row at a time.

```python
# Making predictions on test data
predicted_Price = regressor.predict(X_test)
predicted_Price = DataScaler.inverse_transform(predicted_Price)
print('#### Predicted Prices ####')
print(predicted_Price)

# Getting the original price values for testing data
orig=y_test
orig=DataScaler.inverse_transform(y_test)
print('\n#### Original Prices ####')
print(orig)
```

Each row of the two outputs holds 5 days of prices: the original prices in one and the predicted prices in the other.

Using a simple for-loop, each row of original values is compared with the corresponding row of predicted values.

```python
import matplotlib.pyplot as plt

for i in range(len(orig)):
    Prediction=predicted_Price[i]
    Original=orig[i]

    # Visualising the results
    plt.plot(Prediction, color = 'blue', label = 'Predicted Price')
    plt.plot(Original, color = 'lightblue', label = 'Original Price')

    plt.title('### Accuracy of the predictions:'+ str(100 - (100*(abs(Original-Prediction)/Original)).mean().round(2))+'% ###')
    plt.xlabel('Trading Date')

    # Test samples are consecutive sliding windows, each shifted by one day.
    # Sample s has its targets in rows [TimeSteps+s, TimeSteps+s+FutureTimeSteps) of StockData
    startDateIndex=TimeSteps + (len(X_data)-TestingRecords) + i
    endDateIndex=startDateIndex + FutureTimeSteps
    plt.xticks(range(FutureTimeSteps), StockData.iloc[startDateIndex : endDateIndex , :]['TradeDate'])
    plt.ylabel('Stock Price')

    plt.legend()
    fig=plt.gcf()
    fig.set_figwidth(20)
    fig.set_figheight(3)
    plt.show()
```

**Making predictions for the next 5 days**

If you want to predict the prices for the next 5 days, all you have to do is pass the last 10 days’ prices to the model in the same 3D format that was used in training.

The snippet below shows how to pass the last 10 values manually to get the next 5 days’ price predictions.

```python
# Last 10 days' closing prices
Last10DaysPrices=np.array([1376.2, 1371.75,1387.15,1370.5 ,1344.95,
                   1312.05, 1316.65, 1339.45, 1339.7 ,1340.85])

# Reshaping the data to (-1,1) because it's a single entry
Last10DaysPrices=Last10DaysPrices.reshape(-1, 1)

# Scaling the data on the same level on which model was trained
X_test=DataScaler.transform(Last10DaysPrices)

NumberofSamples=1
TimeSteps=X_test.shape[0]
NumberofFeatures=X_test.shape[1]
# Reshaping the data as 3D input
X_test=X_test.reshape(NumberofSamples,TimeSteps,NumberofFeatures)

# Generating the predictions for next 5 days
Next5DaysPrice = regressor.predict(X_test)

# Generating the prices in original scale
Next5DaysPrice = DataScaler.inverse_transform(Next5DaysPrice)
Next5DaysPrice
```
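The multi-step model in this post produces all 5 values at once, because its Dense layer has FutureTimeSteps outputs. A different strategy, not used in this post, is recursive forecasting: predict one step with a one-step model, append that prediction to the window, and repeat. The sketch below illustrates the idea with a stand-in for the trained model. The names `recursive_forecast` and `mean_of_window` are hypothetical, and averaging the window is only a placeholder for a real one-step predictor.

```python
def recursive_forecast(window, predict_next, steps):
    """Predict `steps` values one at a time, feeding each prediction back into the window."""
    window = list(window)
    forecasts = []
    for _ in range(steps):
        next_value = predict_next(window)
        forecasts.append(next_value)
        window = window[1:] + [next_value]  # slide: drop the oldest value, append the prediction
    return forecasts

def mean_of_window(window):
    """Hypothetical stand-in for a trained one-step model."""
    return sum(window) / len(window)

forecasts = recursive_forecast([1.0, 2.0, 3.0], mean_of_window, steps=2)
print(forecasts)  # first value is 2.0; the second is the mean of [2.0, 3.0, 2.0]
```

With the recursive approach, later predictions depend on earlier ones, so errors can accumulate. With the direct approach used in this post, the 5 outputs are produced together from the same 10 real prices.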

**Conclusion**

This prediction is only short-term. The moment you try to predict for many days, like the next 30 or 60 days, it fails miserably. This is not because our LSTM model is bad, but because stock markets are highly volatile. So, don’t bet your money simply on this model! Do some research and then use this model as a complementary tool for analysis!

I hope you enjoyed reading this post and that it helped clear some of your doubts regarding LSTM models. Consider sharing this post with your friends to help spread the knowledge 🙂

patickyu: Thank you so much for your code.

I just have a question. What if I want to predict the next 5 days’ prices?

How can I change your code?

Farukh Hashmi: I have updated the post for multiple days’ price predictions!

Rob: When using the previous 10 days to predict 5 days in the future, does the program use a “sliding window” to make its predictions? In other words, the 11th day knows about the previous 10 days, but does the 12th day know about the prediction made for the 11th day?

Nathaniel Kren: Thank you kindly for sharing your code! I used your model as a basis for my Capstone project in Computer Science. On that note, when was this post published? I want to properly reference your work in my paper.

Farukh Hashmi: Hi Nathaniel,

I am very happy to see that this post helped you in your project. It was published on 5th-Oct-2020. You can reference it with this date.

Rob: Hi,

When predicting the next 5 days based on the previous 10 days, does the program use a sliding window? In other words, the 11th day makes a prediction based on the previous 10 days, but does the 12th day know about the result from the 11th day before it makes its prediction?

Farukh Hashmi: Hi Rob!

Yes, your observation is correct. If you look at the LSTM diagram, you can see that the next time step (12th day) knows what the prediction for the previous time step (11th day) was, and so on.

Samus: I think there is something wrong with the part where you’re predicting prices for the next 5 days.

So you’re using a batch size of 10 to predict a 5 days output.

1st batch:

input: from 9/7/2020 to 9/18/2020

output: from 8/31/2020 to 9/4/2020

2nd batch:

input: from 9/8/2020 to 9/21/2020

output: from 9/7/2020 to 9/11/2020

And so on.

I mean, look at the first batch. Your input is a window from 9/7 to 9/18 to predict prices from 8/31 to 9/4 that would be actually in the past.

Farukh Hashmi: Hi Samus,

The model uses the last 10 days of prices to predict the next 5 days.

Please print the X_test[-5: ] and y_test[-5: ] to check the last five input and output pairs that are fed to the LSTM model. It will make things clear for you.