Vehicle Count Prediction
A PROJECT REPORT
Submitted in Partial Fulfillment of the requirements for the Award of the Degree of
BACHELOR OF TECHNOLOGY
IN
INFORMATION TECHNOLOGY
By
D. VAMSI (171FA07056)
K.PAVAN (171FA07066)
in
INTERNET OF THINGS
Dr. N. Veeranjeneyulu
Professor
Vadlamudi,Guntur,A.P
India
CERTIFICATE
This is to certify that the dissertation entitled “Vehicle Count Prediction Using Sensor Data”,
submitted by D.VAMSI (171FA07056), G.SRAVANKUMAR (171FA07058), K.SRIHARISH
(171FA07062) and K.PAVAN (171FA07066) in partial fulfillment of the requirements for the
award of the degree of Bachelor of Technology at Vignan’s Foundation for Science, Technology and
Research University, Guntur, is a record of bonafide work carried out by them under my guidance
and supervision. The results embodied in this thesis have not been submitted to any other university
or institution for the award of any degree or diploma.
Professor, Professor,
EXTERNAL SIGNATURE
ACKNOWLEDGEMENT
It is with great pleasure and an immense sense of gratitude that we acknowledge
the help of these individuals.
We express our sincere gratitude to our internal project guide, Dr. N.
Veeranjeneyulu, Professor, Department of Information Technology, VFSTR. He has been a
constant source of inspiration for us, and we are deeply thankful to him for his support and
valuable advice.
We are extremely grateful to our departmental staff members, lab technicians and non-teaching
staff members for their help throughout our project.
Finally, we express our heartfelt thanks to all of our friends who helped us in the successful
completion of this project.
D.VAMSI (171FA07056)
K.PAVAN (171FA07066)
DECLARATION
We hereby declare that the project titled “Vehicle Count Prediction Using Sensor Data” is a
bonafide original record of work done by us at VFSTR, Vadlamudi, towards the partial fulfillment of
the requirements for the award of the degree of Bachelor of Technology in Information Technology at
VFSTR, Vadlamudi. We also state that this project has not been submitted anywhere else in
partial fulfillment of any degree of this or any other university.
Date
Place Signature
TABLE OF CONTENTS
1 Objective
2 Abstract
3 Problem Statement
4 Introduction
5 Dataset Description
6 Requirements
7 Methodology
8 Algorithm
9 Source Code
10 Evaluation Metric & Results
11 Conclusion
12 References
IOT Minor Project
OBJECTIVE
The main objective of this project is to apply different machine learning algorithms to
IoT sensor data to predict the number of vehicles passing through a particular junction. The work involves:
Analyzing the data
Finding hidden trends
Applying machine learning algorithms
Dept of IT Page 1
ABSTRACT
IoT devices are becoming increasingly popular, and their widespread use yields huge amounts of
raw data. Machine learning (ML) can process this data effectively to derive useful
insights. ML is becoming an essential
player in a growing array of areas, including image recognition, natural language
processing, forecasting, prediction, and process optimization. ML has evolved to the point of
being able to draw patterns and inferences from real-time data streams and
make those results available to analysts, as well as embed them directly in business processes.
In this project we predict the traffic pattern at each of the four junctions of a city for the next 4 months
using ensembling techniques (regression analysis) and other algorithms.
PROBLEM STATEMENT
You are working with the government to transform your city into a smart city. The vision is to
convert it into a digital and intelligent city to improve the efficiency of services for the citizens.
One of the problems faced by the government is traffic. You are a data scientist working to
manage the traffic of the city better and to provide input on infrastructure planning for the future.
The government wants to implement a robust traffic system for the city by being prepared for
traffic peaks. They want to understand the traffic patterns of the four junctions of the city. Traffic
patterns on holidays, as well as on various other occasions during the year, differ from normal
working days. This is important to take into account for your forecasting.
Your task
To predict traffic patterns in each of these four junctions for the next 4 months.
The sensors on each of these junctions were collecting data at different times, hence you will see
traffic data from different time periods. To add to the complexity, some of the junctions have
provided limited or sparse data requiring thoughtfulness when creating future projections.
Depending upon the historical data of 20 months, the government is looking to you to deliver
accurate traffic projections for the coming four months. Your algorithm will become the
foundation of a larger transformation to make your city smart and intelligent.
INTRODUCTION
Sensor data is the output of a device that detects and responds to some type of input from the
physical environment. The output may be used to provide information or input to another system
or to guide a process. An IoT system consists of sensors/devices which “talk” to the cloud
through some kind of connectivity. Once the data gets to the cloud, software processes it and
then might decide to perform an action, such as sending an alert or automatically adjusting
the sensors/devices without the need for the user.
With a sensor, a machine observes the environment and information can be collected.
A sensor measures a physical quantity and converts it into a signal. Sensors translate
measurements from the real world into data for the digital domain.
DATASET DESCRIPTION
Train.csv (48120 x 4)
Variable    Description
ID          Unique ID
DateTime    Hourly datetime variable
Junction    Junction type
Vehicles    Number of vehicles (target)
Test.csv (11808 x 3)
Variable    Description
ID          Unique ID
DateTime    Hourly datetime variable
Junction    Junction type
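The schema above can be illustrated with a tiny in-memory sample. This is a minimal sketch, assuming pandas is available; the three rows and their ID values are invented for illustration and are not rows from Train.csv.

```python
import io
import pandas as pd

# Hypothetical rows that follow the Train.csv schema (values are made up).
sample_csv = io.StringIO(
    "DateTime,Junction,Vehicles,ID\n"
    "2016-01-01 00:00:00,1,15,1\n"
    "2016-01-01 01:00:00,1,13,2\n"
    "2016-01-01 02:00:00,1,10,3\n"
)

# parse_dates converts the DateTime column to datetime64 values on load.
sample = pd.read_csv(sample_csv, parse_dates=["DateTime"])

# Consecutive rows of one junction should be exactly one hour apart.
gaps = sample.groupby("Junction")["DateTime"].diff().dropna()
print(sample.dtypes)
print(gaps.unique())
```

Checking the gaps per junction is one simple way to spot the missing or sparse periods mentioned in the problem statement.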
REQUIREMENTS
Software Requirements
Windows
Hardware Requirements
Intel i3 processor
Hard disk
METHODOLOGY
We predict the vehicle count at a particular junction using data generated by a sensor.
The vehicle count is numeric, so we apply regression techniques to the
data.
Regression:
A regression problem is one in which the output variable is a real or continuous value, such as “salary” or
“weight”. Many different models can be used; the simplest is linear regression, which tries to fit
the data with the hyperplane that best approximates the points.
Regression analysis is a statistical process for estimating the relationships between a dependent
variable (the criterion) and one or more independent variables (the predictors). It
explains how the criterion changes in relation to changes in selected predictors. Formally, it estimates the
conditional expectation of the criterion given the predictors, that is, the average value of the
dependent variable when the independent variables are held at given values. Three major uses of
regression analysis are determining the strength of predictors, forecasting an effect, and trend
forecasting.
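As an illustration of the simplest case, a straight line can be fitted by ordinary least squares. This is a minimal sketch on synthetic data, assuming NumPy is available; the data and the variable names are made up and are not part of the project.

```python
import numpy as np

# Synthetic data: y = 2*x + 1 plus small noise (made-up numbers).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=0.1, size=x.size)

# Design matrix with a column of ones for the intercept.
X = np.column_stack([x, np.ones_like(x)])

# Ordinary least squares: minimise ||X @ w - y||^2.
w, *_ = np.linalg.lstsq(X, y, rcond=None)
slope, intercept = w
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```

The recovered slope and intercept should be close to the values used to generate the data, which is what "best fit" means here.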
Ensembling Techniques:
Bagging and boosting are two of the most commonly used ensembling techniques in machine learning.
Bagging algorithms:
RandomForest Regressor
Bagging Regressor
Boosting algorithms:
AdaBoost Regressor
Light GBM (LGBM Regressor)
CatBoost Regressor
Gradient Boosting Regressor
Other algorithms:
Decision Tree
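The difference between the two families can be sketched with scikit-learn. This is an illustrative example only, assuming scikit-learn is installed: the data is synthetic, the hyperparameters are arbitrary, and it is not the configuration used in the project.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor

# Synthetic stand-in for hourly counts: a daily cycle plus noise (made up).
rng = np.random.default_rng(42)
hour = rng.integers(0, 24, size=500)
X = hour.reshape(-1, 1)
y = 20 + 10 * np.sin(2 * np.pi * hour / 24) + rng.normal(scale=1.0, size=500)

# Bagging: many trees fitted independently on bootstrap samples, then averaged.
bagged = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

# Boosting: trees fitted one after another, each correcting the previous errors.
boosted = GradientBoostingRegressor(n_estimators=50, random_state=0).fit(X, y)

X_new = np.array([[8], [18]])  # 08:00 and 18:00
pred_bagged = bagged.predict(X_new)
pred_boosted = boosted.predict(X_new)
print(pred_bagged, pred_boosted)
```

Both models expose the same `fit` and `predict` interface, so swapping one ensemble for another only changes the constructor line.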
ALGORITHM
1. Load the data (train data and test data)
2. Analyse the data
3. Build a regression model
4. Performance evaluation (RMSE)
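The flow above can be sketched end to end on synthetic data. This is a minimal sketch, assuming pandas and scikit-learn are available; the generated series, the two features, the 50/10-day split and the model settings are all illustrative assumptions, not the project's actual setup.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error

# Step 1, load: a synthetic hourly series stands in for train.csv (made-up data).
rng = np.random.default_rng(0)
n_hours = 24 * 60
stamps = pd.Timestamp("2016-01-01") + pd.to_timedelta(np.arange(n_hours), unit="h")
hours = stamps.hour.to_numpy()
vehicles = 20 + 10 * np.sin(2 * np.pi * hours / 24) + rng.normal(scale=2.0, size=n_hours)
data = pd.DataFrame({"DateTime": stamps, "Vehicles": vehicles})

# Step 2, analyse: derive simple calendar features from the timestamp.
data["Hour"] = data["DateTime"].dt.hour
data["Dayofweek"] = data["DateTime"].dt.dayofweek

# Step 3, build: fit on the first 50 days and hold out the last 10 days.
split = 24 * 50
features = ["Hour", "Dayofweek"]
train, valid = data.iloc[:split], data.iloc[split:]
model = RandomForestRegressor(n_estimators=50, random_state=0)
model.fit(train[features], train["Vehicles"])

# Step 4, evaluate: RMSE on the held-out block.
pred = model.predict(valid[features])
rmse = float(np.sqrt(mean_squared_error(valid["Vehicles"], pred)))
print(f"hold-out RMSE: {rmse:.2f}")
```

Holding out the most recent block, rather than a random sample, mirrors the forecasting task, where the model is always scored on dates later than those it was trained on.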
In [11]: import pandas as pd
import numpy as np
In [12]: df=pd.read_csv("C:/sravan//train.csv")
df1=pd.read_csv("C:/sravan//test.csv")
In [13]: df.head()
Out[13]:
DateTime Junction Vehicles ID
In [14]: def Create(df):
    # Parse the timestamp column once and derive calendar features from it.
    dt = pd.to_datetime(df['DateTime']).dt
    df['Year'] = dt.year
    df['Month'] = dt.month
    df['Day'] = dt.day
    df['Dayofweek'] = dt.dayofweek
    df['DayOfyear'] = dt.dayofyear
    # .dt.week was removed in pandas 2.0; isocalendar().week is its replacement.
    df['Week'] = dt.isocalendar().week.astype('int64')
    df['Quarter'] = dt.quarter
    df['Is_month_start'] = dt.is_month_start
    df['Is_month_end'] = dt.is_month_end
    df['Is_quarter_start'] = dt.is_quarter_start
    df['Is_quarter_end'] = dt.is_quarter_end
    df['Is_year_start'] = dt.is_year_start
    df['Is_year_end'] = dt.is_year_end
    # Semester 1 covers quarters 1-2, semester 2 covers quarters 3-4.
    df['Semester'] = np.where(df['Quarter'].isin([1,2]),1,2)
    # dayofweek runs from Monday=0 to Sunday=6.
    df['Is_weekend'] = np.where(df['Dayofweek'].isin([5,6]),1,0)
    df['Is_weekday'] = np.where(df['Dayofweek'].isin([0,1,2,3,4]),1,0)
    df['Days_in_month'] = dt.days_in_month
    df['Hour'] = dt.hour
    return df
In [15]: df=Create(df)
In [16]: df1=Create(df1)
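The weekend and semester flags produced by Create can be checked by hand on dates whose weekday is known. This is a small standalone sketch, assuming pandas and NumPy; the two timestamps are hand-picked examples and the `demo` frame is hypothetical, not part of the notebook.

```python
import numpy as np
import pandas as pd

# Two hand-picked timestamps: 1 January 2017 was a Sunday, 2 January a Monday.
demo = pd.DataFrame({"DateTime": ["2017-01-01 08:00:00", "2017-01-02 17:00:00"]})
stamp = pd.to_datetime(demo["DateTime"])

demo["Dayofweek"] = stamp.dt.dayofweek  # Monday=0 ... Sunday=6
demo["Is_weekend"] = np.where(demo["Dayofweek"].isin([5, 6]), 1, 0)
demo["Quarter"] = stamp.dt.quarter
demo["Semester"] = np.where(demo["Quarter"].isin([1, 2]), 1, 2)
demo["Hour"] = stamp.dt.hour
print(demo)
```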
In [17]: df.columns
       'Is_month_end', 'Is_quarter_start', 'Is_quarter_end', 'Is_year_start',
       'Is_year_end', 'Semester', 'Is_weekend', 'Is_weekday', 'Days_in_month',
       'Hour'],
      dtype='object')
In [18]: df1.columns
In [19]: target=df['Vehicles']
df=df.drop(['DateTime','Vehicles'],axis=1)
df1=df1.drop(['DateTime'],axis=1)
df['Year'].hist(figsize=(8,8),color="green")
In [20]: df['DayOfyear'].hist(figsize=(8,8))
In [21]: df['Dayofweek'].hist(figsize=(8,8),color="yellow")
In [22]: df['Year'].hist(figsize=(8,8),color="red")
In [23]: df['Month'].hist(figsize=(12,8))
In [24]: df['Year'].hist(figsize=(12,8))
In [25]: df['Day'].hist(figsize=(13,8))
In [26]: df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 48120 entries, 0 to 48119
Data columns (total 20 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Junction 48120 non-null int64
1 ID 48120 non-null int64
2 Year 48120 non-null int64
3 Month 48120 non-null int64
4 Day 48120 non-null int64
5 Dayofweek 48120 non-null int64
6 DayOfyear 48120 non-null int64
7 Week 48120 non-null int64
8 Quarter 48120 non-null int64
9 Is_month_start 48120 non-null bool
10 Is_month_end 48120 non-null bool
11 Is_quarter_start 48120 non-null bool
12 Is_quarter_end 48120 non-null bool
13 Is_year_start 48120 non-null bool
14 Is_year_end 48120 non-null bool
15 Semester 48120 non-null int32
16 Is_weekend 48120 non-null int32
17 Is_weekday 48120 non-null int32
18 Days_in_month 48120 non-null int64
19 Hour 48120 non-null int64
dtypes: bool(6), int32(3), int64(11)
memory usage: 4.9 MB
In [27]: df1.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 11808 entries, 0 to 11807
Data columns (total 20 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Junction 11808 non-null int64
1 ID 11808 non-null int64
2 Year 11808 non-null int64
3 Month 11808 non-null int64
4 Day 11808 non-null int64
5 Dayofweek 11808 non-null int64
6 DayOfyear 11808 non-null int64
7 Week 11808 non-null int64
8 Quarter 11808 non-null int64
9 Is_month_start 11808 non-null bool
10 Is_month_end 11808 non-null bool
11 Is_quarter_start 11808 non-null bool
12 Is_quarter_end 11808 non-null bool
13 Is_year_start 11808 non-null bool
14 Is_year_end 11808 non-null bool
15 Semester 11808 non-null int32
16 Is_weekend 11808 non-null int32
17 Is_weekday 11808 non-null int32
18 Days_in_month 11808 non-null int64
19 Hour 11808 non-null int64
dtypes: bool(6), int32(3), int64(11)
memory usage: 1.2 MB
In [28]: target
Out[28]: 0 15
1 13
2 10
3 7
4 9
..
48115 11
48116 30
48117 16
48118 22
48119 12
Name: Vehicles, Length: 48120, dtype: int64
In [ ]:
99.5207692247877
In [46]: r1.head()
Out[46]:
0
0 63.97
1 51.26
2 39.55
3 35.77
4 32.64
62.00277919615114
99.30552540068676
94.23985890770552
r5=pd.DataFrame(r5)
print(a.score(df,target)*100)
In [51]: r5.head()
Out[51]:
0
0 78.128751
1 68.017779
2 59.221489
3 52.589897
4 47.134557
4.258104738154613
r8=pd.DataFrame(r8)
print(a.score(df,target)*100)
88.64108507601487
In [63]: df1.to_csv(r"C:/sravan//results.csv")
The evaluation metric for this competition is Root Mean Squared Error (RMSE).
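RMSE is the square root of the mean of the squared differences between actual and predicted values, so it is expressed in the same unit as the target (vehicles per hour). A minimal sketch of the computation, assuming NumPy; the helper name `rmse` and the sample numbers are illustrative only.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Made-up counts: the errors are 1, -1, 2 and 0, so the mean squared error is 1.5.
actual = [15, 13, 10, 7]
predicted = [14, 14, 8, 7]
print(rmse(actual, predicted))
```

For comparison, the `score` method of scikit-learn style regressors returns the coefficient of determination (R squared) rather than RMSE, so a percentage obtained from `score` is not directly comparable with an RMSE value.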
Predicted Results:
RMSE Scores:
CONCLUSION
Most of the regression algorithms performed well, and the RMSE decreased after feature engineering.
Among all the models, the Random Forest Regressor performed best on the 25% public test
data, while the Gradient Boosting Regressor performed best on the 75% private test data.
We therefore consider the Gradient Boosting Regressor the best regression technique for this problem.
REFERENCES
1. https://datahack.analyticsvidhya.com/contest/janatahack-machine-learning-for-iot/#ProblemStatement
2. https://www.google.com/search?safe=strict&rlz=1C1JZAP_enIN913IN913&sxsrf=ALe
Kk00HjvD6Aj8yCreDdhHxPgQYZVFTEA%3A1608041780996&ei=NMXYX6aoPLLj
z7sPiMmO0AU&q=machine+lenring%27+regression+paers+vehicle+sensor+data+iot&
oq=machine+lenring%27+regression+paers+vehicle+sensor+data+iot&gs_lcp=CgZwc3k
tYWIQAzIHCCEQChCgATIHCCEQChCgAToECAAQRzoJCAAQyQMQDRAeOgYI
ABANEB46CQgAEMkDEBYQHjoGCAAQFhAeOggIABAWEAoQHjoLCAAQyQMQ
CBANEB46CAghEBYQHRAeOgQIIRAKUKQ0WORsYNNtaAFwAXgDgAHAA4gBg
TKSAQowLjE3LjkuMS4ymAEAoAEBqgEHZ3dzLXdpesgBCMABAQ&sclient=psy-
ab&ved=0ahUKEwimko-5ltDtAhWy8XMBHYikA1oQ4dUDCA0&uact=5