Predicting stock prices using Deep Learning LSTM model in Python

LSTM architecture

Long Short-Term Memory (LSTM) is a special type of Recurrent Neural Network (RNN) that can retain important information over time using memory cells.

This property makes LSTMs well suited to learning sequences whose elements depend on one another, and helps build solutions such as language translation, sales time-series forecasting, chatbots, autocorrection, and next-word suggestion.

You can read more about LSTMs here.

In this case study, I will show how LSTMs can be used to learn the patterns in stock prices. Using this template you will be able to predict tomorrow’s price of a stock based on the last 10 days’ prices.


Pulling historical stock prices data

To pull the data for any stock we can use a library named ‘nsepy’.

If you want to do this for any other stock, just use the stock market ticker symbol for that company like I have used ‘INFY’ for Infosys here.
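A minimal sketch of the download step. The `get_history(symbol=..., start=..., end=...)` call is the one quoted in the comments on this post; the two-year window, the end date, and the helper names `two_year_window` and `fetch_history` are assumptions made for illustration. The nsepy import sits inside the function so the date logic runs even where the library is not installed.

```python
from datetime import date, timedelta

def two_year_window(end_date):
    """Return (start, end) covering roughly the two years before end_date."""
    return end_date - timedelta(days=2 * 365), end_date

def fetch_history(symbol, start, end):
    """Download daily prices for an NSE ticker symbol (needs network access)."""
    # Imported lazily so this sketch can be loaded without nsepy installed
    from nsepy import get_history
    return get_history(symbol=symbol, start=start, end=end)

# Assumed end date, chosen only for illustration
startDate, endDate = two_year_window(date(2020, 10, 5))
print(startDate, endDate)

# Uncomment to actually download the data:
# StockData = fetch_history('INFY', startDate, endDate)
```

Swapping 'INFY' for another ticker symbol is the only change needed to pull a different stock.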

Historical stock prices for Infosys


Visualizing the stock prices movement

Stock prices of Infosys for last two years to be used in training of LSTM model


Preparing the data

The LSTM model needs its input data in the form of X vs y, where X represents the last 10 days’ prices and y represents the 11th-day price.

By looking at a lot of such examples from the past 2 years, the LSTM will be able to learn the movement of prices. Hence, when we pass it the last 10 days of prices, it will be able to predict tomorrow’s closing price.

Since LSTM is a neural-network-based algorithm, standardizing or normalizing the data is strongly recommended for a faster and more accurate fit.


Preparing the data for LSTM

Getting data ready for LSTM
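The preparation described above can be sketched with NumPy alone. This is an illustration under assumptions, not the original notebook code: the prices are made up, min-max scaling to the 0-1 range is one possible normalization, and the helper names `min_max_scale` and `make_windows` are hypothetical.

```python
import numpy as np

def min_max_scale(prices):
    """Scale prices to the 0-1 range; return the scaled array and (min, max)."""
    prices = np.asarray(prices, dtype=float)
    lo, hi = prices.min(), prices.max()
    return (prices - lo) / (hi - lo), (lo, hi)

def make_windows(series, time_steps=10):
    """Build X of shape (samples, time_steps, 1) and y of shape (samples,)."""
    X, y = [], []
    for i in range(time_steps, len(series)):
        X.append(series[i - time_steps:i])  # the previous `time_steps` values
        y.append(series[i])                 # the value that follows them
    X = np.array(X).reshape(-1, time_steps, 1)
    return X, np.array(y)

# Made-up closing prices, used only to show the resulting shapes
closing = np.linspace(900.0, 1000.0, 30)
scaled, (price_min, price_max) = min_max_scale(closing)
X, y = make_windows(scaled, time_steps=10)
print(X.shape, y.shape)
```

The minimum and maximum are returned alongside the scaled values because they are needed later to bring predictions back to the original price scale.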


Splitting the data into training and testing

We keep the last few days of data to test what the model has learned and use the rest for training.

Here I am choosing the last 5 days for testing.

Splitting the data into training and testing for LSTM
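A short sketch of this split, assuming X and y are NumPy arrays ordered from oldest to newest. The function name `split_last_n` and the placeholder arrays are assumptions; 425 samples are used so that the training part matches the (420, 10, 1) shape mentioned later in this post.

```python
import numpy as np

def split_last_n(X, y, testing_records=5):
    """Keep the final `testing_records` samples for testing, the rest for training."""
    X_train, X_test = X[:-testing_records], X[-testing_records:]
    y_train, y_test = y[:-testing_records], y[-testing_records:]
    return X_train, X_test, y_train, y_test

# Placeholder arrays, only to show the shapes
X = np.zeros((425, 10, 1))
y = np.zeros(425)
X_train, X_test, y_train, y_test = split_last_n(X, y, testing_records=5)
print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)
```

Slicing from the end, rather than shuffling, keeps the test samples strictly later in time than the training samples.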


Visualizing the input and output data for LSTM

Printing some sample input and output values to help you visualize how the LSTM model will learn the prices.

You can see the input is a 3D array of the last 10 prices and the output is a 1D array of the next price.

Sample input and output values for LSTM


Creating the Deep Learning LSTM model

Note the use of the LSTM function instead of Dense to define the hidden layers. The output layer has one neuron because we are predicting the next day’s price. If you want to predict multiple days, change the input data accordingly and set the number of output neurons equal to the number of days in the forecast.

In the code snippet below I have used three hidden LSTM layers and one output layer. You can add more layers if the accuracy on your data is not good enough. Similarly, you can increase or decrease the number of neurons in each hidden layer.

Just keep in mind that the more neurons and layers you use, the slower the model becomes, because there are many more computations to be done.

Each layer has some hyperparameters that need to be tuned.

Take a look at some of the important hyperparameters of LSTM below

  • units=10: This means we are creating a layer with ten neurons in it. Each of these ten neurons will be receiving the values of inputs.
  • input_shape=(TimeSteps, TotalFeatures): The input expected by LSTM is in 3D format. Our training data has a shape of (420, 10, 1), which is in the form (number of samples, time steps, number of features). This means we have 420 examples to learn from in the training data, and each example looks back 10 steps in time: the stock price yesterday, the day before yesterday, and so on for the last 10 days. These are known as time steps. The last number, ‘1’, represents the number of features. Here we are using just one column, ‘Closing Stock Price’, hence it is equal to ‘1’.
  • kernel_initializer=’uniform’: When the Neurons start their computation, some algorithm has to decide the value for each weight. This parameter specifies that. You can choose different values for it like ‘normal’ or ‘glorot_uniform’.
  • activation=’relu’: This specifies the activation function for the calculations inside each neuron. You can choose values like ‘relu’, ‘tanh’, ‘sigmoid’, etc.
  • return_sequences=True: With this setting, an LSTM layer returns its output at every time step instead of only at the last one. This keeps the input to the next LSTM layer in the 3D format it expects. This parameter is False for the last hidden layer because it does not have to return a 3D output to the final Dense layer.
  • optimizer=’adam’: This parameter helps to find the optimum values of each weight in the neural network. ‘adam’ is one of the most useful optimizers, another one is ‘rmsprop’
  • batch_size=10: This specifies how many rows will be passed to the network in one go, after which the error (loss) is calculated and the neural network adjusts its weights based on it.
    When all the rows have been passed in batches of 10 rows each, as specified by this parameter, we call that one epoch, or one full data cycle. This is also known as mini-batch gradient descent. A small value of batch_size makes the LSTM look at the data slowly, like 2 rows at a time or 4 rows at a time, with many noisy weight updates per epoch, which can contribute to overfitting. A large value, like 20 or 50 rows at a time, makes the LSTM go through the data quickly with fewer weight updates per epoch, which can contribute to underfitting. Hence a proper value must be chosen using hyperparameter tuning.
  • epochs=10: The same activity of adjusting weights is repeated 10 times, as specified by this parameter. In simple terms, the LSTM looks at the full training data 10 times and adjusts its weights.

Training LSTM model on stock prices data
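The hyperparameters listed above can be put together in a sketch like the following. It assumes TensorFlow's Keras API is installed; the layer sizes (10, 5, 5), the mean squared error loss, and the function name `build_lstm_model` are assumptions, not the original notebook code. `Input(shape=...)` plays the role of `input_shape`, and the default weight initializer is used instead of kernel_initializer=’uniform’.

```python
def build_lstm_model(time_steps=10, total_features=1, output_units=1):
    """Three stacked LSTM layers followed by a Dense output layer."""
    # Imported here so the function can be defined without TensorFlow installed
    from tensorflow.keras import Input
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import LSTM, Dense

    model = Sequential([
        Input(shape=(time_steps, total_features)),  # (time steps, features)
        # return_sequences=True keeps the output 3D for the next LSTM layer
        LSTM(units=10, activation='relu', return_sequences=True),
        LSTM(units=5, activation='relu', return_sequences=True),
        # The last LSTM layer returns only its final time step (2D output)
        LSTM(units=5, activation='relu', return_sequences=False),
        Dense(units=output_units),  # one neuron: the next day's price
    ])
    model.compile(optimizer='adam', loss='mean_squared_error')
    return model

# Typical use, assuming X_train has shape (samples, 10, 1):
# model = build_lstm_model()
# model.fit(X_train, y_train, batch_size=10, epochs=10)
```

With 420 training rows and batch_size=10, each epoch consists of 42 weight updates, so epochs=10 gives 420 updates in total.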


Measuring the accuracy of the model on testing data

Now using the trained model, we are checking if the predicted prices for the last 5 days are close to the actual prices or not.

Notice the inverse transform of the predictions. Since we normalized the data before training the model, the predictions on the testing data will also be normalized, so the inverse transformation brings the values back to the original scale. Only then should we calculate the percentage accuracy.

Measuring the accuracy of LSTM model on testing data
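A sketch of the order of operations: undo the scaling first, then measure accuracy. It assumes min-max scaling to the 0-1 range was used and defines percentage accuracy as 100 minus the mean absolute percentage error; both are assumptions, as are the made-up numbers and the helper names.

```python
import numpy as np

def inverse_min_max(scaled, price_min, price_max):
    """Undo 0-1 min-max scaling to return values to the original price scale."""
    return np.asarray(scaled, dtype=float) * (price_max - price_min) + price_min

def percentage_accuracy(actual, predicted):
    """100 minus the mean absolute percentage error."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    mape = np.mean(np.abs((actual - predicted) / actual)) * 100
    return 100 - mape

# Made-up numbers, only to show the order of operations
price_min, price_max = 500.0, 1000.0
scaled_predictions = np.array([0.90, 0.95])
predicted = inverse_min_max(scaled_predictions, price_min, price_max)
actual = np.array([1000.0, 1000.0])
accuracy = percentage_accuracy(actual, predicted)
print(predicted, accuracy)
```

Computing the percentage on scaled values instead would give a different and misleading number, because the scaled prices sit near zero where percentage errors are inflated.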


Visualizing the predictions for full data

Plotting both the training and the testing data to see how well the LSTM model has fitted.

LSTM model prediction plot on full data


How to predict the stock price for tomorrow

If you want to predict the price for tomorrow, all you have to do is pass the last 10 days’ prices to the model in the same 3D format that was used in training.

The below snippet shows you how to take the last 10 prices manually and do a single prediction for the next price.

Making a single prediction with LSTM
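A sketch of how the last 10 prices can be shaped for one prediction. The prices and scaling bounds here are placeholders, not real quotes, and `prepare_single_input` is a hypothetical helper that assumes the same min-max scaling used during training.

```python
import numpy as np

def prepare_single_input(last_prices, price_min, price_max, time_steps=10):
    """Scale the latest prices and reshape them to (1, time_steps, 1)."""
    last_prices = np.asarray(last_prices, dtype=float)
    if last_prices.shape != (time_steps,):
        raise ValueError(f"expected exactly {time_steps} prices")
    scaled = (last_prices - price_min) / (price_max - price_min)
    # 1 sample, `time_steps` time steps, 1 feature
    return scaled.reshape(1, time_steps, 1)

# Placeholder values, not real quotes
last_10_prices = np.linspace(1000.0, 1018.0, 10)
X_new = prepare_single_input(last_10_prices, price_min=500.0, price_max=1100.0)
print(X_new.shape)

# With a trained model and the same scaling bounds used in training:
# scaled_pred = model.predict(X_new)
# next_price = scaled_pred * (price_max - price_min) + price_min
```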

I have used the actual prices from the stock market up to last Friday in the above snippet!

So we see the model predicts that the next closing price of Infosys for today (5th-Oct-2020) is Rs 1023! I checked and found the closing price for the day was Rs 1048! Not bad for such little effort! 🙂


What If I want to predict prices for the next 5 days?

The model we have built above uses the last 10 days’ prices to predict the next day’s price, because we trained it with many past examples of the same granularity, as shown below:

last 10 days prices–> 11th day price

Now if we want to predict the prices for the next 5 days or the next 20 days, we need to train the model with similar examples from the past, as shown below:

last 10 days prices–> Next 5 days prices

This is also known as Multi-Step time series prediction, where we predict multiple time steps ahead.

Achieving this requires small modifications in both the data preparation step and the LSTM model.

However, keep in mind that the further ahead you predict, the less accurate you are likely to be, because stock prices are volatile and no one can know what is going to happen after 10 days. What kind of news will come out that might affect the price of this stock?

Hence, it is recommended to predict as few time steps as possible, for example the next 2 days or the next 5 days at most.


Data Preparation for Multi Step LSTM

I am showing how to prepare the data for predicting next 5 days. The same code can be easily modified to predict next 10 days or 20 days as well.

Sample data for LSTM multi step stock prices prediction

I have modified the data split logic from the last model to produce the input–>output pairs by defining FutureTimeSteps=5. This specifies that we want to predict the next 5 days’ prices based on the last 10 days.

Input/output shape for LSTM multi step model
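A sketch of the multi-step data preparation with NumPy. The function name `make_multistep_windows` and the toy series are assumptions; the important detail is that X is 3D while y stays 2D, with one column per future day.

```python
import numpy as np

def make_multistep_windows(series, time_steps=10, future_time_steps=5):
    """Pair each window of `time_steps` values with the next `future_time_steps`."""
    series = np.asarray(series, dtype=float)
    X, y = [], []
    last_start = len(series) - time_steps - future_time_steps
    for i in range(last_start + 1):
        X.append(series[i:i + time_steps])
        y.append(series[i + time_steps:i + time_steps + future_time_steps])
    X = np.array(X).reshape(-1, time_steps, 1)  # (samples, time steps, features)
    y = np.array(y)                             # (samples, future_time_steps)
    return X, y

# Toy series 0, 1, ..., 29 so the pairs are easy to read
series = np.arange(30, dtype=float)
X, y = make_multistep_windows(series, time_steps=10, future_time_steps=5)
print(X.shape, y.shape)
print(X[0].ravel(), '-->', y[0])
```

Every output row lies strictly after its input window in time, which can be checked by printing a few pairs from the end of X and y.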


Splitting the data into Training and Testing

Training and Testing Data shape for Multi step LSTM model


Visualizing the input->output sent to LSTM Multi-step model

Printing some records of input and output always helps to understand the process in an LSTM model.

You can see here the input is a 3D array of the last 10 days’ prices and the output is an array of the next 5 days’ prices.

Input vs output model for Multi step LSTM model


Creating the Deep Learning Multi-Step LSTM model

I am using the same configuration as in the last model. The only change is at the Dense layer, which now outputs a number of values equal to FutureTimeSteps. That is 5 in this case, since we want to predict the next 5 days.

You can change this to 10 or 20 if you want to predict more days, but you need to prepare the data in the same manner before running this code.

Defining the time steps and features for the LSTM model

Training the multi step LSTM model
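A sketch of the multi-step variant, assuming TensorFlow's Keras API is installed. The layer sizes, the loss, and the function name `build_multistep_lstm` are assumptions; the point being illustrated is the Dense layer with one output neuron per forecast day.

```python
def build_multistep_lstm(time_steps=10, total_features=1, future_time_steps=5):
    """Stacked LSTM layers with one output neuron per forecast day."""
    # Imported here so the function can be defined without TensorFlow installed
    from tensorflow.keras import Input
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import LSTM, Dense

    model = Sequential([
        Input(shape=(time_steps, total_features)),
        LSTM(units=10, activation='relu', return_sequences=True),
        LSTM(units=5, activation='relu', return_sequences=True),
        LSTM(units=5, activation='relu', return_sequences=False),
        Dense(units=future_time_steps),  # 5 outputs: one price per future day
    ])
    model.compile(optimizer='adam', loss='mean_squared_error')
    return model

# Typical use, assuming X_train is (samples, 10, 1) and y_train is (samples, 5):
# model = build_multistep_lstm()
# model.fit(X_train, y_train, batch_size=10, epochs=10)
```

Because all 5 values come out of a single Dense layer, y_train must be 2D with shape (samples, 5), not 3D.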


Measuring the Accuracy of the model on testing data

Since this is a multi-step model trained to predict the next 5 days, each prediction will generate 5 days’ prices, which we can match with the original prices.

We will compare them one row at a time.

Original and Predicted prices in LSTM Multi Step model

Each row represents the original prices and the predicted prices.

Using a simple for-loop, each row of original values is compared with the predicted values.

LSTM predictions part-1

LSTM predictions Part-2

LSTM  Multi step predictions part-3
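The row-by-row comparison can be sketched as follows. The numbers are made up, and accuracy is again taken to be 100 minus the mean absolute percentage error, which is an assumption about how the percentage is defined; `rowwise_accuracy` is a hypothetical helper.

```python
import numpy as np

def rowwise_accuracy(original, predicted):
    """Return one accuracy value (100 - MAPE) per row of 5-day forecasts."""
    original = np.asarray(original, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    accuracies = []
    for orig_row, pred_row in zip(original, predicted):
        mape = np.mean(np.abs((orig_row - pred_row) / orig_row)) * 100
        accuracies.append(100 - mape)
    return accuracies

# Made-up prices already on the original (inverse-transformed) scale
original = np.array([[100.0, 100.0, 100.0, 100.0, 100.0],
                     [200.0, 200.0, 200.0, 200.0, 200.0]])
predicted = np.array([[90.0, 100.0, 100.0, 100.0, 110.0],
                      [200.0, 200.0, 200.0, 200.0, 200.0]])

for row_number, acc in enumerate(rowwise_accuracy(original, predicted), start=1):
    print(f"Row {row_number}: accuracy {acc:.2f}%")
```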


Making predictions for the next 5 days

If you want to predict the prices for the next 5 days, all you have to do is pass the last 10 days’ prices to the model in the same 3D format that was used in training.

The below snippet shows you how to pass the last 10 values manually to get the next 5 days’ price predictions.

Next 5 days stock price prediction using LSTM multi step model.


Conclusion

This prediction is only short-term. The moment you try to predict multiple days ahead, like the next 30 days or 60 days, this fails miserably. Not because our LSTM model is bad, but because stock markets are highly volatile. So, don’t bet your money simply on this model! Do some research and then use this model as a complementary tool for analysis!

I hope you enjoyed reading this post and it helped you clear some of your doubts regarding LSTM models. Consider sharing this post with your friends to help spread the knowledge 🙂

Author Details
Lead Data Scientist
Farukh is an innovator in solving industry problems using Artificial intelligence. His expertise is backed with 10 years of industry experience. Being a senior data scientist he is responsible for designing the AI/ML solution to provide maximum gains for the clients. As a thought leader, his focus is on solving the key business problems of the CPG Industry. He has worked across different domains like Telecom, Insurance, and Logistics. He has worked with global tech leaders including Infosys, IBM, and Persistent systems. His passion to teach inspired him to create this website!

33 thoughts on “Predicting stock prices using Deep Learning LSTM model in Python”

  1. thank you so much for your code
    I just have question. What if i want to predict next 5 days price.
    how can I change your code?

      1. When using the previous 10 days to predict 5 days in the future, does the program use a “sliding window” to make its predictions? In other words, the 11th day knows about the previous 10 days, but does the 12th day know about the prediction made for the 11th day?

      2. Wonderful! Wonderful! Very good! Thank You! I would like to refer to the source code for LSTM and use it in a research paper on stock error correction using lstm. I want to get permission to use it. From Busan in Korea
        Thank you for your permission.

  2. Nathaniel Kren

    Thank you kindly for sharing your code! I used your model as a basis for my Capstone project in Computer Science. On that note, when was this post published? I want to properly reference your work in my paper.

    1. Hi Nathaniel
      I am very happy to see that this post helped you in your project. It was published on 5th-Oct-2020. You can reference it with this date.

  3. Hi,
    When predicting the next 5 days based on the previous 10 days, does the program use a sliding window? In other words, the 11th day makes a prediction based on the previous 10 days, but does the 12th day know about the result from the 11th day before it makes its prediction?

    1. Hi Rob!
      Yes your observation is correct. If you look at the LSTM diagram. You can see that the next time step(12th day) knows what was the prediction for previous time step(11th day). So on and so forth.

  4. I think there is something wrong with that part where you’re predicting prices for the next 5 days.
    So you’re using a batch size of 10 to predict a 5 days output.

    1st batch:
    input: from 9/7/2020 to 9/18/2020
    output: from 8/31/2020 to 9/4/2020

    2nd batch:
    input: from 9/8/2020 to 9/21/2020
    output: from 9/7/2020 to 9/11/2020
    And so on.
    I mean, look at the first batch. Your input is a window from 9/7 to 9/18 to predict prices from 8/31 to 9/4 that would be actually in the past.

    1. Farukh Hashmi

      Hi Samus,

      The model uses the last 10 days of prices to predict the next 5 days.
      Please print the X_test[-5: ] and y_test[-5: ] to check the last five input and output pairs that are fed to the LSTM model. It will make things clear for you.

    1. Hi Paddy,

      The Neural network begins by “randomly” initializing the weights for every neuron even if you keep all the parameters the same and even if you use a seed value. Hence, every time you train it, a slightly different accuracy % will be observed. This is expected behavior.

  5. Thank you for this insightful notebook. It has been of much help to me. I realized this is a univariate model, I want to know if it is possible to upload a multivariate for for us?. Thank you

    1. Hi Pius!

      I am glad to see that this post helped you! Yes, there is a multivariate version possible for LSTM, but I will suggest you to use ANN in such scenarios. I will try to upload a multivariate version for this notebook in the future! Currently, I am busy working on video tutorials of these concepts 🙂

  6. Hi , thank you for sharing this, How to use the combination of trade volumes,Open , High, Low values along with Close, as LSTM input. Instead of single script, can we use the LSTM to check more scripts using single model

    1. Hi Mengop!

      If you are using Open, High, Low prices as predictors, then it really becomes a supervised ML problem, instead of a sequence learning problem.
      Hence use of ANN/XGB etc. would be more sensible.

  7. I am having problem in running the 5 step forecasting code from the above code giving me ### Input Data Shape ###
    (840, 10, 1)
    ### Output Data Shape ###
    (840, 5, 1)

    1. Hi Suresh,

      The output data shape does not look correct.
      It should be (840,5). Make sure you don’t convert the output data in 3D shape.

      Regards,
      Farukh Hashmi

  8. hi just wondering why it takes a very long time to get the historical data here:
    StockData=get_history(symbol=’INFY’, start=startDate, end=endDate)
    Thanks

  9. hello,
    thank you so much for the detailed description and the details on the post. The tone and details of trivial information is really useful. For instance, the section “How to predict the stock price for tomorrow” is really useful.

  10. Hello Farukh,
    I have a question about the epochs. For instance, the use of epochs=100 seems to produce a more accurate model than the one with epochs=10.

    What would the behavior be if we set epochs=200 or 1000?

    Overfitting? no real improvement?

    many thanks in advance

    1. Hi Joa,
      You are correct in observing that higher epoch produces more accurate model. But, there is a sweet spot value of epoch for every data which we need to find by trying out multiple values of epoch under the hyperparameter tuning step. Let’s say in this data you tried epoch=500 and it gave 90% accuracy as compared to epoch=10 which produced 60% accuracy. So you will try epoch= 600 which may give you 91% accuracy. The change in accuracy will hit a saturation level. That is the point we need to find. Because if you keep increasing the epochs, the test accuracy will start decreasing… That’s when you know it is overfitting.
      Hope that helps!

        1. Hi Sarah,

          Thank you for pointing this out! I generally use a third piece called validation set kept totally away from training and testing. On that data, the normalization is done using the normalization fit object of train data. I have not shown that concept in this case study.

          Regards,
          Farukh Hashmi

  11. Hi,
    I think normalization should be applied after splitting (fit_transform on train, and transform on test), otherwise your model is gaining information from the test sample, which it shouldn’t do.

    Regards,

    1. Hi Tony,

      I think you are pointing towards the validation data. Yes, you can keep some data separate apart from Train and test, and use the fit object from model training to normalize it while generating predictions for it. In this case study, I have not shown this part. Thank you for pointing 🙂

      Regards,
      Farukh Hashmi

  12. Magnus Christensen

    Hi,

    Great example!
    How would you alter this code for a dataset with more than one variable (more features)?

    Thank you in advance!
