Using LSTMs to Predict Stock Prices

Category: IT Technology · Published: 4 years ago

Face it. We’re dumb.

A blindfolded monkey could manage a portfolio better than any human ever could. No, I’m actually serious. Numerous studies reportedly show monkeys outperforming humans in the stock market. A monkey allegedly generated 8x more profit in a quarter than traders on the New York Stock Exchange. How can a clueless primate beat the so-called “geniuses” of Wall Street? The answer: dumb luck.

Humans try to gauge and predict stock prices all the time, using fancy statistics and trends to figure it out. But the truth is, humans aren’t able to comprehend all the different variables that go into a stock price. We aren’t able to synthesize a company’s reputation, the projects it has announced and is working on, its track record, and past stock market data, and then factor all of that into informed decisions on the stock market. We can’t consider every single variable and stat that ever existed about a company. But computers can. More specifically, neural networks are designed to do exactly that.

Long Short Term Memory Networks

A neural network, in a mathematical sense, is a differentiable function that takes in an input and computes an output. Essentially, it’s your standard f(x), except that alongside the input x the function has millions of adjustable parameters. Our goal is to optimize these parameters so that we can feed in any input and get the desired output. These are neural networks in a nutshell, but an LSTM has some special properties.

A Long Short-Term Memory neural network is composed of LSTM units, or cells. Each unit performs its own special computations and passes its output along to the next unit as input. In short, the main goal of an LSTM is to let data that was passed in earlier influence the current output. Things like time-series data or stock market data depend on past versions of themselves, so an LSTM remembers the past and uses it to try to predict the future. Here’s how it works.

How data is propagated

Stock market data, like any data that depends on past versions of itself, takes different values at different times. Each point in time is called a time step, and the data is fed into the LSTM one time step per cell. For example, the 3rd LSTM unit takes in the 3rd time step of data, X3.

Quick notation: the subscript t-1 indicates values from the previous LSTM cell, and t indicates values from the current LSTM cell.

Cell State

C denotes the cell state of the LSTM. The cell state is a vector of values that is passed from cell to cell along its own path. Each cell can use the cell state in its calculations or change it entirely. Because the cell state passes through every LSTM unit, it can carry information from any earlier unit into the output of the next one. This is where the long-term memory aspect of the LSTM comes from.

Input Gate

The input gate takes in the hidden state of the last cell (ht-1) and the data input of the current time step (Xt) and uses this information to decide what should be written into the cell state (C). It first combines the data input and the hidden state into one vector. It then passes that vector through two trained layers, one ending in a sigmoid and one in a hyperbolic tangent, and multiplies the results together: the tanh proposes candidate values, and the sigmoid, which lies between 0 and 1, scales how much of each candidate gets through. The product is added to the cell state, so a sigmoid output near 1 makes the cell state memorize that info, while an output near 0 marks it as not important to the long-term memory of the network. In short, the input gate determines if the information is important for the long term.
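Written out, the input gate looks roughly like this. These are the standard LSTM equations, which is an assumption on my part since they aren’t spelled out above. Here [h_{t-1}, x_t] is the previous hidden state and the current input joined into one vector, W_i, W_C, b_i and b_C are trained weights and biases, and the amount written into the cell state is the element-wise product of i_t and \tilde{C}_t.

```latex
i_t = \sigma\left(W_i\,[h_{t-1}, x_t] + b_i\right), \qquad
\tilde{C}_t = \tanh\left(W_C\,[h_{t-1}, x_t] + b_C\right)
```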

Forget Gate

The forget gate takes in the hidden state at the previous time step (ht-1) and the data input of the current time step (Xt). It combines them into one vector, applies trained weights, and then applies the sigmoid activation, giving a value between 0 and 1 for each entry of the cell state. The cell state is multiplied by these values: an output near 1 keeps that entry, while an output near 0 resets it toward 0, essentially forgetting its long-term memory. This is why it’s called the forget gate: it has the capability to forget everything that was learned thus far. This helps in time-series predictions, as a piece of data can act as a reset to the current trend and the LSTM needs to factor that in. However, an LSTM will rarely wipe the cell state completely.
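In the standard LSTM formulation (again an assumption, not something taken from the diagrams), the forget gate and the resulting cell state update can be sketched as below. W_f and b_f are trained weights and biases, \odot is element-wise multiplication, and i_t \odot \tilde{C}_t is the input gate’s contribution from the previous section.

```latex
f_t = \sigma\left(W_f\,[h_{t-1}, x_t] + b_f\right), \qquad
C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t
```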

Output Gate

The output gate prepares the hidden state vector (ht) for the next cell. It takes the last hidden state vector and the input data, applies trained weights and a sigmoid to them, and uses the result to scale the tanh of the cell state. This is the short-term memory aspect of the neural network, as it is propagating new information into the next cell.
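As a sketch in the standard LSTM notation (assumed here, with W_o and b_o as trained weights and biases), the output gate and the new hidden state are:

```latex
o_t = \sigma\left(W_o\,[h_{t-1}, x_t] + b_o\right), \qquad
h_t = o_t \odot \tanh(C_t)
```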

Notice how all three gates work together around the cell state (C). At the end of all the units, the final hidden state, which is read out of the cell state, is passed through dense layers that make sense of all the long-term memory and important info about the data to create a prediction of the next time step. Everything leads up to this cell state, and it’s the heart of the LSTM. Training the LSTM network is done to make sure that the long-term info makes it out to the end. Now that you have a good understanding of LSTMs, let’s see how I applied them to stock market data.
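To make the gates concrete, here is a minimal pure-Python sketch of one LSTM cell stepping through a short sequence. It follows the standard LSTM formulation, which is an assumption rather than the exact internals of any library. The names lstm_step, init_params and run_lstm, the random weights, and the sample prices are all made up for illustration.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, b, v):
    """Return W @ v + b for a list-of-rows matrix W and vectors b, v."""
    return [sum(w * x for w, x in zip(row, v)) + b_k for row, b_k in zip(W, b)]

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step; returns the new hidden state and cell state."""
    z = h_prev + x_t  # list concatenation: [h_{t-1}, x_t]
    f = [sigmoid(a) for a in affine(params["Wf"], params["bf"], z)]    # forget gate
    i = [sigmoid(a) for a in affine(params["Wi"], params["bi"], z)]    # input gate
    g = [math.tanh(a) for a in affine(params["Wc"], params["bc"], z)]  # candidate values
    o = [sigmoid(a) for a in affine(params["Wo"], params["bo"], z)]    # output gate
    # Long-term memory: keep part of the old state, add part of the candidates.
    c_t = [f_k * c_k + i_k * g_k for f_k, c_k, i_k, g_k in zip(f, c_prev, i, g)]
    # Short-term memory: a gated view of the cell state.
    h_t = [o_k * math.tanh(c_k) for o_k, c_k in zip(o, c_t)]
    return h_t, c_t

def init_params(hidden, inputs, seed=0):
    """Random weights and zero biases for the four gate layers (illustrative only)."""
    rng = random.Random(seed)
    params = {}
    for name in ("f", "i", "c", "o"):
        params["W" + name] = [[rng.uniform(-0.5, 0.5) for _ in range(hidden + inputs)]
                              for _ in range(hidden)]
        params["b" + name] = [0.0] * hidden
    return params

def run_lstm(sequence, hidden, params):
    """Feed the sequence through the cell one time step at a time."""
    h, c = [0.0] * hidden, [0.0] * hidden
    for x_t in sequence:
        h, c = lstm_step(x_t, h, c, params)
    return h, c

params = init_params(hidden=3, inputs=1)
prices = [[0.10], [0.12], [0.11], [0.15]]  # made-up, already scaled prices
h_final, c_final = run_lstm(prices, hidden=3, params=params)
```

The weights here are untrained, so h_final means nothing yet. The point is only to show how the same cell is reused at every time step while h and c carry the memory forward.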

Stock Market Predictor

*Created using TensorFlow and Keras.

The Data

The data used for this project was Apple’s stock price over the last 5 years, broken down into time steps of 10 minutes each. A neural network needs labelled examples to train on, so the data was arranged in a specific way: each input was 50 consecutive time steps, and the label (which is what the neural network is trying to predict) was the 51st time step. In other words, the neural network tries to predict the price at the next time step from the 50 before it.
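Here is a rough sketch of how a price series can be cut into those 50-step inputs and next-step labels. The function name make_windows and the sample series are hypothetical, and this is not necessarily how the original preprocessing was written.

```python
def make_windows(prices, window=50):
    """Split a price series into (input window, next-step label) pairs."""
    inputs, labels = [], []
    for start in range(len(prices) - window):
        inputs.append(prices[start:start + window])  # 50 consecutive time steps
        labels.append(prices[start + window])        # the 51st time step
    return inputs, labels

series = [100.0 + 0.5 * k for k in range(60)]  # made-up prices, not real Apple data
X, y = make_windows(series, window=50)
```

With 60 prices and a window of 50, this gives 10 training examples. Before going into a Keras LSTM with input_shape=(50, 1), each window would still need to be reshaped so that every time step holds one feature.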

The Architecture

Surprise, surprise! I used a lot of LSTMs for this neural network. Each LSTM layer had 96 units and, with return_sequences=True, passed its hidden state at every time step into the next LSTM layer as input. At the end, a dense layer took in the output of the last LSTM layer and made sense of it. That dense layer has one node, which means it outputs one number: the predicted value of the next time step. The network as listed below has roughly 408,000 parameters it could optimize. The dropout layers were used to make sure that the neural network wasn’t just memorizing the data and the output. I know, all this work just to find the value for the next time step.

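from tensorflow.keras import Sequential, layers

# Six stacked LSTM layers of 96 units, each followed by 20% dropout.
# Input: 50 time steps with 1 feature (the price) per step.
model = Sequential([
    layers.LSTM(units=96, return_sequences=True, input_shape=(50, 1)), layers.Dropout(0.2),
    layers.LSTM(units=96, return_sequences=True), layers.Dropout(0.2),
    layers.LSTM(units=96, return_sequences=True), layers.Dropout(0.2),
    layers.LSTM(units=96, return_sequences=True), layers.Dropout(0.2),
    layers.LSTM(units=96, return_sequences=True), layers.Dropout(0.2),
    layers.LSTM(units=96), layers.Dropout(0.2),  # last LSTM returns only its final output
    layers.Dense(1),  # one node: the predicted next price
])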
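As a sanity check, the trainable parameters of the listing above can be counted by hand. This sketch assumes the usual LSTM layout of four gate layers, each with weights over the previous hidden state and the input plus a bias; the helper names are made up. Dropout layers add no parameters.

```python
def lstm_params(units, input_dim):
    """Four gate layers, each with weights over [h_{t-1}, x_t] plus a bias vector."""
    return 4 * (units * (units + input_dim) + units)

def dense_params(units, input_dim):
    """One weight per input for each node, plus one bias per node."""
    return units * input_dim + units

layer_counts = (
    [lstm_params(96, 1)]          # first LSTM sees 1 feature per time step
    + [lstm_params(96, 96)] * 5   # the other five see the 96 outputs of the layer before
    + [dense_params(1, 96)]       # final one-node dense layer
)
total = sum(layer_counts)
print(total)  # 408289
```

That count is for the architecture exactly as written in the listing. If the trained model had extra layers that are not shown, its total would be higher.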

Training

This model was trained using the popular Adam optimizer and the mean squared error (MSE) loss function. The loss function measures how badly the model performed, and the optimizer adjusts the network’s parameters to push that loss toward a minimum. It ran over 100 epochs and reached a loss < 0.00001 (it performs pretty well).
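To show what “loss” and “optimizer” mean here, below is a toy sketch of the Adam update rule minimizing MSE for a single weight. It is a stand-in for illustration, not the training code for the LSTM: the function adam_fit, the data, and the learning rate are all made up, and the beta and epsilon values are just the commonly used defaults.

```python
import math

def mse(predictions, targets):
    """Mean squared error between two equal-length lists."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)

def adam_fit(xs, ys, epochs=100, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """Fit y = w * x with Adam; returns the weight and the loss at each epoch."""
    w, m, v = 0.0, 0.0, 0.0
    history = []
    for step in range(1, epochs + 1):
        preds = [w * x for x in xs]
        history.append(mse(preds, ys))
        # Gradient of the MSE with respect to w.
        grad = sum(2 * (p - y) * x for p, y, x in zip(preds, ys, xs)) / len(xs)
        m = beta1 * m + (1 - beta1) * grad       # running mean of gradients
        v = beta2 * v + (1 - beta2) * grad ** 2  # running mean of squared gradients
        m_hat = m / (1 - beta1 ** step)          # bias correction for early steps
        v_hat = v / (1 - beta2 ** step)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w, history

xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.0, 2.0, 4.0, 6.0]  # made-up data whose true weight is 2
w, history = adam_fit(xs, ys)
```

In Keras, the same choice of optimizer and loss is usually expressed as model.compile(optimizer="adam", loss="mean_squared_error") followed by model.fit with the training windows and epochs=100.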

The Results are In

Here are a few examples of stocks it was used to predict. Red shows the predicted price and blue shows the actual price.

Overall, the model works decently well… but that doesn’t mean you should use this on the stock market. The stock market is very volatile, and this model only uses the past trends of a stock to predict its next price.
