6

First of all, I must say I'm a beginner at this AI stuff. I have followed most of the tutorials about stock market prediction, and all of them are pretty much the same. These tutorials take a data set and split it into two sets: a training set and a test set. They use the closing price of the stocks to train a model. They then feed the model the test data set, which contains the closing price, and show two graphs. Then they say the actual and the predicted graphs are pretty much the same.

The GitHub repo of the tutorial: https://github.com/surajr/Stock-Predictor-using-LSTM/blob/master/Stock-Predictor-using-LSTM.ipynb

These are my questions:

1. Why do all those tutorials put the closing price in the test set as well? They are only supposed to insert dates, right? Because we are predicting the closing price. This is confusing. Please explain.
2. No one tells me how to predict the next 7 days' values. So if we have a model, how do we get the next 7 days' closing values?

Please help me to clarify this. Thanks a lot.

4 Answers

1

Take a look at this link. I think it will get you going in the right direction.

https://www.datacamp.com/community/tutorials/lstm-python-stock-market

ASH
  • Many thanks, ASH! Very useful. This link also links to a nice explanation of LSTM. – harp1814 Jan 25 '21 at 20:17
  • Good luck with it. This stuff really works! I made just over $500k in the market last year, by following the LSTM strategy, and a handful of other investment strategies. – ASH Jan 25 '21 at 20:43
0

Why all those tutorials are putting closing price in the testing set also?

The ultimate goal is to predict the movement (growth), which is the closing price minus the opening price. The ideal model is one whose calculated growth on the test data set is very close to the actual growth. Growth is the main problem the model is trying to solve, and it is the point of reference when you calculate the accuracy of the trained model.

They are only suppose to insert dates right? Because we are predicting the closing price

The model predicts the growth based on given factors. For a company, many factors are quantified per day. I suspect the tutorial you followed uses a test set extracted for one particular day across different stocks, for example extracting all parameters for all companies but only on the 10th of January, and then checks how accurate the trained model is. The training set, on the other hand, most of the time contains stock data for more than one day.

No one is telling me how to predict next 7 days values. So if we have a model, how to get next 7 days closing value?

To predict the stock price relatively accurately, you need a well-trained model. To get one, you need to train your model on many, many factors. The same model cannot predict stocks in different countries. One model might be suitable for predicting technology stocks (AAPL) but not other fields.

Overall, this is a complicated subject. Financial advisers pay a massive amount of money just to use reliable models. Most of them use multiple models based on their clients' portfolios. These tutorials introduce the subject to you and teach you the main concept. IMHO, the next step would be learning and then competing on Kaggle.
Community
0

In the training set, closing value is included as an input because it is relevant to the "next day's" price, or "price in X days" (for models that predict price movement over more than 1 day).

Note, in the training data, typically the future price (today + 1 day) is the target value (train_Y).

In the testing data, the closing price is included as an input for the same reason: the model is still predicting a "future price" from it.

In determining the accuracy of the model, the price prediction for (today + X days) is compared against the future value (test_Y) to determine the effectiveness of the prediction. Just like a human stock trader, if you are guessing/predicting whether the FUTURE price will be Y (i.e. up/down), then you would have access to the current day's end-of-day closing price, which is why it is a relevant input.

Obviously, in a real-world model, the accuracy of the prediction would only be known AFTER X days pass. When training and then testing a model, the data is typically historical, so out-of-sample values (like the price of today + X days) are used for accuracy determination, though the FUTURE value should definitely not be an input.
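To make the input/target split concrete, here is a minimal sketch, not the tutorial's actual code. The helper name `make_windows`, the made-up prices, and the 80/20 split are all assumptions for illustration. It shows that the test inputs are past closing prices, while the value being predicted (test_Y) is a later closing price that is never part of the input.

```python
import numpy as np

def make_windows(close, window=3, horizon=1):
    """Hypothetical helper: X holds `window` past closes,
    y holds the close `horizon` days after the window ends."""
    X, y = [], []
    for i in range(len(close) - window - horizon + 1):
        X.append(close[i:i + window])              # known past prices (input)
        y.append(close[i + window + horizon - 1])  # future price (target)
    return np.array(X), np.array(y)

# Made-up closing prices: 100, 101, ..., 109
close = np.arange(100.0, 110.0)
X, y = make_windows(close, window=3, horizon=1)

# Chronological split: earlier windows for training, later ones for testing
split = int(len(X) * 0.8)
X_train, y_train = X[:split], y[:split]
X_test, y_test = X[split:], y[split:]

# The first test input is three past closes; its target is the next close
print(X_test[0], "->", y_test[0])
```

The test set looks exactly like the training set: closing prices appear on the input side in both. The only difference is that the model never sees `y_test` during training, so it can be used to score the predictions.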

-1
  1. Why all those tutorials are putting closing price in the testing set also? -> The closing price is an input variable that is required to calculate the stock price.

  2. Looking at the code, it seems to predict the stock price from 22 days of history:

X_train (1173, 22, 3) y_train (1173,) X_test (130, 22, 3) y_test (130,)

I think you should re-train with (~~~, 7, 3) to predict the price 7 days after today.
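As a rough sketch of what such a re-training set could look like (my own assumption, not code from the linked notebook): the length of the history window and how far ahead the target lies are two separate choices. The helper `make_windows`, its `horizon` parameter, and the random placeholder data below are hypothetical. Here each sample keeps 22 days of 3 features, and the target is the close 7 days after the window ends.

```python
import numpy as np

def make_windows(features, target, window=22, horizon=7):
    """Hypothetical helper: X has shape (samples, window, n_features),
    y is the target value `horizon` days after each window ends."""
    X, y = [], []
    last_start = len(features) - window - horizon
    for i in range(last_start + 1):
        X.append(features[i:i + window])
        y.append(target[i + window + horizon - 1])
    return np.array(X), np.array(y)

# Placeholder data: 100 days, 3 made-up features per day
rng = np.random.default_rng(0)
features = rng.random((100, 3))
close = features[:, 2]  # assume the last column is the closing price

X, y = make_windows(features, close, window=22, horizon=7)
print(X.shape, y.shape)

# After training on (X, y), the most recent 22 days form the input
# for a forecast 7 days past the end of the data
latest = features[-22:].reshape(1, 22, 3)
print(latest.shape)
```

With a model trained on pairs like these, passing `latest` to its predict method would give one value for 7 days ahead. Getting all 7 intermediate days would need either one target per day or feeding predictions back in step by step.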

puhuk
  • So, I understand that the closing price is good for predicting the future price. My question is: why are we using it in the test set as well? In the training set, it is fine. But why are we using it in the test set? Aren't we supposed to predict the closing price using the test data set? – Pasindu Dineth Peiris Feb 18 '20 at 05:38