Report on

Stock Price Prediction using Machine Learning – LSTM Neural Network

Dr. Shaveta Arora

Presented By:
Parth Bathla 18ECU016
ACKNOWLEDGEMENT

I take this opportunity to express my profound gratitude and deep regards to Dr. Shaveta
Arora ma’am for their exemplary guidance, monitoring and constant encouragement in
completion of this report. Their efforts shall carry me a long way in the journey of life
on which I am about to embark.

I also thank my friends who have always been there with me whenever I needed their
help.
Abstract

Accurate prediction of stock market returns is a very challenging task due to the volatile
and non-linear nature of financial stock markets. In this project we present and review a
more feasible method to predict stock movement with higher accuracy. The first thing we
considered is the dataset, for which we used the historical stock price records of TATA
Global pvt. ltd. The dataset was pre-processed and tuned for real-time stock analysis.

Stock prices are predicted in order to maximize profit and minimize losses. Techniques
that predict the value of a stock in advance, by analysing the trend over the last few
years, could prove highly useful for making moves in the stock market.

In this project, we will use both technical and qualitative analysis, taking into
consideration the historical opening and closing prices of the company's stock as well as
external factors such as the company profile, the market situation and political aspects.

The datasets used for stock price prediction are very large and non-linear, so an
efficient model is needed that can identify the hidden patterns and complex relations in
the data. Therefore, machine learning is used as the technique, improving efficiency by
almost 70-86 percent compared with past methods of prediction.

Stock market prediction performs best when it is treated as a regression problem, but it
also performs well when treated as a classification problem. The aim is to design a model
that learns from market information using machine learning strategies and forecasts the
future patterns in stock value development.
Dataset (Tata Global)

To build the stock price prediction model, we will use the NSE TATA GLOBAL
dataset, which contains the price history of Tata Global Beverages Limited on the
National Stock Exchange of India.

There are multiple variables in the dataset – date, open, high, low, last, close,
total_trade_quantity, and turnover.

• The columns Open and Close represent the starting and final price at which the
stock is traded on a particular day.
• High, Low and Last represent the maximum, minimum, and last price of the share
for the day.
• Total Trade Quantity is the number of shares bought or sold in the day,
and Turnover (Lacs) is the turnover in the stock on a given date, in lakhs.

The profit or loss calculation is usually determined by the closing price of a stock for the
day; hence we will be considering the closing price as the target variable.
Introduction

Machine learning has significant applications in stock price prediction. In this
machine learning project, we predict the returns on stocks using the LSTM (Long
Short-Term Memory) neural network.

What is LSTM?
Long short-term memory (LSTM) is an artificial recurrent neural network (RNN)
architecture used in the field of deep learning. Unlike standard feedforward neural
networks, LSTM has feedback connections.

LSTM is able to store past information that is important and forget the information that
is not. LSTM has three gates:

• The input gate: It adds information to the cell state
• The forget gate: It removes the information that is no longer required by the
model
• The output gate: It selects the information to be shown as output
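The three gates described above can be sketched as a single LSTM time step in NumPy. This is only an illustration of the standard LSTM equations, not the Keras implementation used later in this report; the helper name `lstm_step`, the stacking order of the weight rows, and the random weights are assumptions made for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step (illustrative helper, not a library function).

    W: (4*units, input_dim), U: (4*units, units), b: (4*units,)
    Rows are assumed stacked as: input gate, forget gate, candidate, output gate.
    """
    units = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b
    i = sigmoid(z[0 * units:1 * units])   # input gate: how much new information to add
    f = sigmoid(z[1 * units:2 * units])   # forget gate: how much old cell state to keep
    g = np.tanh(z[2 * units:3 * units])   # candidate values for the cell state
    o = sigmoid(z[3 * units:4 * units])   # output gate: how much of the state to expose
    c_t = f * c_prev + i * g              # updated cell state
    h_t = o * np.tanh(c_t)                # output (hidden state)
    return h_t, c_t

# run three scaled "prices" through a tiny cell with random weights
rng = np.random.default_rng(0)
units, input_dim = 4, 1
W = rng.normal(size=(4 * units, input_dim))
U = rng.normal(size=(4 * units, units))
b = np.zeros(4 * units)
h, c = np.zeros(units), np.zeros(units)
for price in [0.2, 0.4, 0.3]:
    h, c = lstm_step(np.array([price]), h, c, W, U, b)
```

With all weights set to zero, every gate evaluates to 0.5 and the candidate to 0, so the cell state is simply halved at each step. That makes the role of the forget gate easy to see.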

Predicting how the stock market will perform is one of the most difficult things to do.
There are many factors involved in the prediction: physical and psychological factors,
rational and irrational behaviour, and so on. All these aspects combine to make share
prices volatile and very difficult to predict with a high degree of accuracy.

In this article, we will work with historical data about the stock prices of a publicly listed
company. We will implement a mix of machine learning algorithms to predict the future
stock price of this company, starting with simple algorithms like averaging and linear
regression, and then move on to advanced techniques like Auto ARIMA and LSTM.
Problem Statement

We will dive into the implementation part of this article soon, but first it’s important to
establish what we’re aiming to solve. Broadly, stock market analysis is divided into two
parts – Fundamental Analysis and Technical Analysis.

• Fundamental Analysis involves analysing the company’s future profitability
based on its current business environment and financial performance.

• Technical Analysis, on the other hand, includes reading the charts and using
statistical figures to identify the trends in the stock market.

As you might have guessed, our focus will be on the technical analysis part. We’ll be
using a dataset from Quandl, which provides historical data for various stocks, and for
this particular project, I have used the data for ‘Tata Global Beverages’. Time to dive
in!
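As a small illustration of the kind of statistical figure used in technical analysis, the sketch below computes a simple moving average of closing prices. It is not part of the model built in this report; the helper name `moving_average` and the 3-day window are assumptions, and the five prices are the closing prices from the first rows of the dataset, arranged oldest first.

```python
import numpy as np

def moving_average(prices, window):
    """Simple moving average over a fixed window (illustrative helper)."""
    prices = np.asarray(prices, dtype=float)
    kernel = np.ones(window) / window
    # "valid" keeps only positions where the full window fits
    return np.convolve(prices, kernel, mode="valid")

closes = [230.90, 227.60, 218.20, 209.20, 215.15]  # oldest first
ma3 = moving_average(closes, 3)
```

A falling moving average, as in this sample, is one simple indication of a downward trend over the window.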
Library Imports:

#import packages
import pandas as pd
import numpy as np

#to plot within notebook
import matplotlib.pyplot as plt
%matplotlib inline

#setting figure size
from matplotlib.pylab import rcParams
rcParams['figure.figsize'] = 20,10

#for normalizing data
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler(feature_range=(0, 1))

#including LSTM model library
from keras.models import Sequential
from keras.layers import Dense, Dropout, LSTM
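The normalization step imported above rescales prices to the range 0 to 1. A minimal NumPy sketch of that min-max arithmetic, and of the inverse step needed to turn predictions back into prices, is shown here. The helper names are hypothetical; the report itself uses scikit-learn's MinMaxScaler.

```python
import numpy as np

def minmax_fit(values):
    """Return the minimum and maximum used for scaling (illustrative helper)."""
    values = np.asarray(values, dtype=float)
    return values.min(), values.max()

def minmax_transform(values, lo, hi):
    # map lo -> 0 and hi -> 1
    return (np.asarray(values, dtype=float) - lo) / (hi - lo)

def minmax_inverse(scaled, lo, hi):
    # undo the scaling to recover values in the original units
    return np.asarray(scaled, dtype=float) * (hi - lo) + lo

closes = np.array([230.90, 227.60, 218.20, 209.20, 215.15])
lo, hi = minmax_fit(closes)
scaled = minmax_transform(closes, lo, hi)
restored = minmax_inverse(scaled, lo, hi)
```

The same minimum and maximum must be reused for the inverse step, which is why a single fitted scaler object is kept for both scaling the inputs and un-scaling the predictions.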
Price Prediction Code
Jupyter File Link: Stock Price Prediction using LSTM.ipynb

Reading the .csv file

#read the file
df = pd.read_csv("D:\\College Files\\AI,ML,DL\\NSE-TATAGLOBAL11.csv")

#print the head and looking at first five rows of the data
print(df.head())
print('\n Shape of the data:')
print(df.shape)

X = df.iloc[:, [2, 3]].values


y = df.iloc[:, 4].values
         Date    Open    High     Low    Last   Close  Total Trade Quantity  Turnover (Lacs)
0  2019-10-08  208.00  222.25  206.85  216.00  215.15               4642146         10062.83
1  2019-10-05  217.00  218.60  205.90  210.25  209.20               3519515          7407.06
2  2019-10-04  223.50  227.80  216.15  217.25  218.20               1728786          3815.79
3  2019-10-03  230.00  237.50  225.75  226.45  227.60               1708590          3960.27
4  2019-10-01  234.55  234.60  221.05  230.30  230.90               1534749          3486.05

 Shape of the data:
(1235, 8)

Analyzing the closing price


#Setting index as date
df["Date"]=pd.to_datetime(df.Date,format="%Y-%m-%d")
df.index=df['Date']
#plot
plt.figure(figsize=(16,8))
plt.plot(df["Close"],label='Close Price history')

Sorting the dataset by date and keeping the closing price

#creating dataframe with only the Date and Close columns, oldest row first
data = df.sort_index(ascending=True, axis=0)
new_data = pd.DataFrame(index=range(0,len(df)),columns=['Date', 'Close'])

for i in range(0,len(data)):
    #use positional access because data is indexed by date
    new_data.loc[i, 'Date'] = data['Date'].iloc[i]
    new_data.loc[i, 'Close'] = data['Close'].iloc[i]

#setting index
new_data.index = new_data.Date
new_data.drop('Date', axis=1, inplace=True)
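The row-by-row loop above can also be written with pandas indexing alone. The following is a sketch on a three-row toy frame rather than the actual CSV; the variable names `raw` and `close_by_date` are assumptions made for the example.

```python
import pandas as pd

# toy data in the same newest-first order as the CSV
raw = pd.DataFrame({
    "Date": ["2019-10-08", "2019-10-05", "2019-10-04"],
    "Close": [215.15, 209.20, 218.20],
})
raw["Date"] = pd.to_datetime(raw["Date"], format="%Y-%m-%d")

# index by date, sort oldest first, keep only the target column
close_by_date = raw.set_index("Date").sort_index()[["Close"]]
```

This version keeps the Close column as a numeric dtype, whereas filling an empty DataFrame cell by cell leaves it with the generic object dtype.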

Splitting the data into train and valid sets

#creating train and valid sets
dataset = new_data.values
train = dataset[0:987,:]
valid = dataset[987:,:]
Converting data
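#converting dataset into x_train and y_train
#note: the scaler is fitted on the whole dataset (train and valid rows)
scaler = MinMaxScaler(feature_range=(0, 1))
scaled_data = scaler.fit_transform(dataset)

x_train, y_train = [], []

#each sample is the previous 60 scaled closing prices; the target is the next one
for i in range(60,len(train)):
    x_train.append(scaled_data[i-60:i,0])
    y_train.append(scaled_data[i,0])
x_train, y_train = np.array(x_train), np.array(y_train)

#reshape to (samples, time steps, features) as expected by the LSTM layer
x_train = np.reshape(x_train, (x_train.shape[0],x_train.shape[1],1))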
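The windowing performed in the loop above can be sketched as a small stand-alone function. The name `make_windows` is hypothetical and the input is a toy series; the point is only to show how a look-back of 60 turns 987 training rows into 927 samples of shape (60, 1).

```python
import numpy as np

def make_windows(series, lookback):
    """Build (samples, lookback, 1) inputs and next-step targets (illustrative helper)."""
    series = np.asarray(series, dtype=float)
    # window i holds the `lookback` values that come just before position i
    X = np.array([series[i - lookback:i] for i in range(lookback, len(series))])
    y = series[lookback:]
    return X.reshape(-1, lookback, 1), y

series = np.arange(10) / 10.0
X, y = make_windows(series, lookback=3)
# X[0] holds 0.0, 0.1, 0.2 and its target y[0] is 0.3
```

Applied to a series of length 987 with a look-back of 60, the function yields 987 - 60 = 927 samples, which matches the 927 steps shown in the training log when the batch size is 1.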

Creating LSTM network


# create and fit the LSTM network
'''
Long Short-Term Memory (LSTM) networks are a type of recurrent neural network
capable of learning order dependence in sequence prediction problems.
'''

model = Sequential()
model.add(LSTM(units=50, return_sequences=True,
               input_shape=(x_train.shape[1],1)))
model.add(LSTM(units=50))
model.add(Dense(1))

model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(x_train, y_train, epochs=1, batch_size=1, verbose=2)
927/927 - 28s - loss: 0.0015
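To get a feel for the size of the network defined above, the number of trainable parameters can be worked out by hand. The sketch below assumes the usual LSTM layout of four weight blocks (three gates plus the candidate), each with input weights, recurrent weights and a bias, and no peephole connections. The helper names are hypothetical.

```python
def lstm_params(units, input_dim):
    # four blocks, each with input weights, recurrent weights and a bias vector
    return 4 * (units * (input_dim + units) + units)

def dense_params(outputs, inputs):
    # one weight per input plus one bias, for each output
    return outputs * inputs + outputs

first = lstm_params(50, 1)     # first LSTM layer sees 1 feature per time step
second = lstm_params(50, 50)   # second LSTM layer sees the 50 outputs of the first
head = dense_params(1, 50)     # final Dense layer producing one price
total = first + second + head
```

Under these assumptions the three layers contribute 10,400, 20,200 and 51 parameters, for 30,651 in total.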

#predicting values, using past 60 values from the train data
inputs = new_data[len(new_data) - len(valid) - 60:].values
inputs = inputs.reshape(-1,1)
inputs = scaler.transform(inputs)

X_test = []
for i in range(60,inputs.shape[0]):
    X_test.append(inputs[i-60:i,0])
X_test = np.array(X_test)

X_test = np.reshape(X_test, (X_test.shape[0],X_test.shape[1],1))

#predict on the scaled windows, then convert back to prices
closing_price = model.predict(X_test)
closing_price = scaler.inverse_transform(closing_price)

Plotting Final prediction

#for plotting
train = new_data[:987]
#copy so that adding a column does not write into a slice of new_data
valid = new_data[987:].copy()
valid['Predictions'] = closing_price
plt.plot(train['Close'])
plt.plot(valid[['Close','Predictions']])

[<matplotlib.lines.Line2D at 0x18145e96100>,
<matplotlib.lines.Line2D at 0x18145e961f0>]

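#root-mean-square error (RMSE) between the actual and predicted closing prices
#(an error measure in price units, not a classification accuracy)
rms = np.sqrt(np.mean(np.power((valid-closing_price),2)))
rms
Close 10.174441
Predictions 0.000000
dtype: float64

The Close row is the RMSE of the predictions, about 10.17. The Predictions row is zero
because that column is compared with itself.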
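The error measure computed above is a root-mean-square error. A minimal stand-alone version that compares only the actual and predicted prices, and therefore returns a single number, might look like the sketch below. The function name `rmse` and the toy values are assumptions made for the example.

```python
import numpy as np

def rmse(actual, predicted):
    """Root-mean-square error between two sequences of prices (illustrative helper)."""
    # flatten so that a (n, 1) prediction array lines up with a length-n price series
    actual = np.asarray(actual, dtype=float).ravel()
    predicted = np.asarray(predicted, dtype=float).ravel()
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

actual = [100.0, 102.0, 104.0]
predicted = [[101.0], [100.0], [104.0]]   # shaped like a model's (n, 1) output
score = rmse(actual, predicted)
```

In the notebook this would be called with the Close column of the validation set and the array of predicted closing prices.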
