DSBDA Mini Project
On
“Covid-19 Vaccination Analysis”
A Report Submitted for a mini project for Data Science & Big Data Analytics
Laboratory (310256) in 2nd Semester of Third Year Computer Engineering
Academic Year 2023-24
Submitted by-
Sr No. Name of Student Roll No.
1. Aniket Prakash Salunkhe T213031
Guided by-
Prof. Rupali R. Jadhav
CERTIFICATE
This is to certify that the project entitled “Covid-19 Vaccination Analysis” was carried out by Aniket
Salunkhe (T213031), a bona fide student of this institute, and that the work was carried out by
him under the supervision of Prof. Rupali Jadhav. It is approved in partial fulfilment of the
requirements of Savitribai Phule Pune University for the award of Third Year Engineering
(Computer Engineering). It is certified that all corrections and suggestions indicated for internal
assignment have been incorporated in the report. The project report has been approved as it satisfies
the academic requirements in respect of project work prescribed for the Bachelor of Engineering
Degree.
I take this opportunity to thank my project guide, Prof. Rupali Jadhav, and the Head of
Department, Prof. A. V. Mote, for their valuable guidance and for providing all the necessary
facilities, which were indispensable in the completion of this project report. I am also thankful
to all the staff members of the Computer Engineering Department for their valuable time, support,
comments, suggestions and persuasion. I would also like to thank the institute for providing the
required facilities, Internet access and important books.
INDEX
1. ABSTRACT
2. SOFTWARE REQUIREMENT
3. INTRODUCTION
4. PROBLEM STATEMENT
6. THEORY
8. CONCLUSION
9. REFERENCES
ABSTRACT
In India, a large country of about 1.3 billion people, the disease was first detected on
January 30, 2020, in a student returning from Wuhan. The total number of confirmed infections
in India as of May 3, 2020, was more than 37,000 and growing fast. Most of the prior
research and media coverage focused on the number of infections in the entire country. However,
given the size and diversity of India, it is important to look at the spread of the disease in each
state separately, since the situations are quite different. In this report, we aim to analyse data
on the number of infected people in each Indian state using a CSV dataset and to predict the number
of vaccinations for that state. We hope that such state-wise predictions will help the state
governments channel their limited health care resources better.
Additionally, the report addresses challenges and obstacles encountered during the
vaccination rollout, such as supply chain disruptions, vaccine hesitancy, and equity concerns.
Through predictive modeling and machine learning algorithms, we anticipate future trends in
vaccination uptake and identify strategies to overcome barriers to achieving herd immunity.
Overall, this project report provides valuable insights into the Covid-19 vaccination
campaign's progress and effectiveness, offering recommendations for policymakers, healthcare
providers, and public health officials to optimize vaccination efforts and combat the ongoing
pandemic effectively.
Keywords:
SOFTWARE REQUIREMENT
INTRODUCTION
The highly infectious coronavirus disease (COVID-19) was first detected in Wuhan, China,
in December 2019 and subsequently spread to 212 countries and territories around the world,
infecting millions of people. In India, a large country of about 1.3 billion people, the disease was
first detected on January 30, 2020, in a student returning from Wuhan. The total number of
confirmed infections in India as of May 3, 2020, was more than 37,000 and growing
fast. An effective rollout of vaccinations against COVID-19 offers the most promising prospect
of bringing the pandemic to an end. We present the Our World in Data COVID-19 vaccination
dataset, a public dataset that tracks the scale and rate of the vaccine rollout across the
country. This dataset is updated regularly and includes data on the total number of vaccinations
administered, first and second doses administered, Male (Doses Administered), Female (Doses
Administered), and Transgender (Doses Administered) for the states for which data are available
(28 states as of 9 August 2021). The intention is to maintain the database for the foreseeable
future, as the vaccination campaign continues to progress, and to include additional states
as they implement their vaccination campaigns. For each state, the dataset tracks:
• the total number of COVID-19 vaccinations administered
• the number of persons vaccinated with the first dose
• the number of persons vaccinated with the second dose
• the number of males vaccinated
• the number of females vaccinated
In this project, we use Python libraries to perform various operations on the state-wise COVID-19
vaccination dataset, which is supplied as a CSV file.
The emergence of the Covid-19 pandemic in late 2019 presented an unprecedented global
health crisis, challenging governments, healthcare systems, and communities worldwide. In
response, the scientific community mobilized with remarkable speed to develop vaccines
against the novel coronavirus, leading to the approval and distribution of multiple vaccines in
record time. By leveraging advanced data analytics techniques, we seek to explore various
dimensions of the vaccination campaign, including coverage rates, distribution strategies,
effectiveness, and associated challenges.
This project uses several Python libraries to analyse the data and make predictions. In this
section, we import all the required libraries: pandas, NumPy, matplotlib.pyplot, seaborn, and
wordcloud.
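Before importing, it can help to confirm that the libraries listed above are installed. The following is a minimal sketch of one way to do that; the helper name `missing_libraries` and the list `REQUIRED_LIBRARIES` are illustrative and not part of the original project.

```python
import importlib.util

# Import names of the libraries listed above; "wordcloud" is the import name
# of the word cloud package.
REQUIRED_LIBRARIES = ["pandas", "numpy", "matplotlib", "seaborn", "wordcloud"]


def missing_libraries(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_libraries(REQUIRED_LIBRARIES)
if missing:
    print("Install before running the analysis:", ", ".join(missing))
else:
    print("All required libraries are available.")
```

Once every library is available, the conventional imports are `import pandas as pd`, `import numpy as np`, `import matplotlib.pyplot as plt`, `import seaborn as sns`, and `from wordcloud import WordCloud`.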
Problem Statement:
Use the following covid_vaccine_statewise.csv dataset and perform the following analytics on
the given dataset:
https://www.kaggle.com/sudalairajkumar/covid19-in-india?select=covid_vaccine_statewise.csv
a. Describe the dataset
b. Number of persons state wise vaccinated for first dose in India
c. Number of persons state wise vaccinated for second dose in India
d. Number of Males vaccinated.
Objective:
The main objective of the project on Covid-19 Vaccination Analysis and Prediction is to
manage the details of state-wise vaccination. It manages information about the males and
females vaccinated, the types of Covid vaccine, and the total number of doses. The project is built
entirely at the administrative end, so only the administrator is guaranteed access. The purpose of the
project is to analyse and predict Covid-19 vaccination and to reduce the manual work of managing
the records, predictions, and results. It tracks the details of male, female, and total vaccination.
• Covid-19 Vaccination Analysis and Prediction also manages the state-wise totals of males and
females fully vaccinated.
• It tracks the information relating to Covid-19 vaccination and manages the information and
description of the country's vaccination.
Outcome:
THEORY
Read the CSV file covid_vaccine_statewise.csv using pandas read_csv() function and
show the output using head() function.
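The reading step can be sketched as follows. To keep the example self-contained, it reads a few made-up rows from an in-memory string instead of the real file; the column names other than "Second Dose Administered" are assumptions about the CSV layout, and the numbers are illustrative only.

```python
import io

import pandas as pd

# Made-up sample rows in the shape of covid_vaccine_statewise.csv.
# The numbers are illustrative only and are NOT taken from the real dataset.
SAMPLE_CSV = """Updated On,State,First Dose Administered,Second Dose Administered
16/01/2021,Maharashtra,100,0
17/01/2021,Maharashtra,250,0
16/01/2021,Kerala,80,0
17/01/2021,Kerala,190,
"""

# With the real file this would be: df = pd.read_csv("covid_vaccine_statewise.csv")
df = pd.read_csv(io.StringIO(SAMPLE_CSV))

# head(n) returns the first n rows (5 by default)
print(df.head(3))
```

An empty field, such as the last value in the sample, is read as a missing value (NaN), which is why the cleaning step that follows is needed.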
3. Data Cleaning:
As seen above, the dataset has many null values. We need to clean the data to deal with
them before performing any further analysis. Cleaning the dataset involves several steps,
some of which are shown below.
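The report does not list the exact cleaning steps, so the following is only a sketch of typical ones under stated assumptions: the column names are assumed, the values are made up, the country-level row labelled "India" is an assumed aggregate, and filling remaining gaps with zero is a modelling choice rather than the author's confirmed method.

```python
import numpy as np
import pandas as pd

# Illustrative rows only; values are made up and column names are assumed.
df = pd.DataFrame(
    {
        "State": ["India", "Kerala", "Kerala", "Goa"],
        "First Dose Administered": [500.0, 80.0, np.nan, 40.0],
        "Second Dose Administered": [120.0, np.nan, np.nan, 10.0],
    }
)

# 1. Count the missing values in each column
null_counts = df.isna().sum()
print(null_counts)

# 2. Drop rows where every dose column is empty (nothing was reported)
dose_columns = ["First Dose Administered", "Second Dose Administered"]
cleaned = df.dropna(subset=dose_columns, how="all")

# 3. Treat the remaining gaps as zero doses (an assumption, not a given)
cleaned = cleaned.fillna({col: 0 for col in dose_columns})

# 4. Remove the country-level aggregate row so states are not double counted
cleaned = cleaned[cleaned["State"] != "India"].reset_index(drop=True)
print(cleaned)
```

Whether to fill or drop missing values depends on what a gap means in the source data, so step 3 should be revisited once the real file has been inspected.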
4. Data Pre-processing:
In this section, we summarise the dataset and draw some visuals to get insights from it.
The describe() function in pandas returns summary statistics for each numeric feature in the
dataset, including the count, mean, standard deviation, minimum, maximum, and median.
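As a small illustration of what describe() returns, the sketch below runs it on a made-up frame; the values are not from the real dataset and the column names are assumed.

```python
import pandas as pd

# Made-up values for illustration; not taken from the real dataset.
df = pd.DataFrame(
    {
        "First Dose Administered": [10.0, 20.0, 30.0, 40.0],
        "Second Dose Administered": [0.0, 5.0, 10.0, 15.0],
    }
)

summary = df.describe()
print(summary)

# The "50%" row is the median; there is no row literally named "median".
median_first = summary.loc["50%", "First Dose Administered"]
print("Median first doses:", median_first)
```

The result is itself a DataFrame, so individual statistics can be picked out with `.loc` as shown.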
Commands & Output
# returns the shape of the dataset in the format of (rows, columns)
df.shape
(7845, 24)
A. Describe the dataset: df.describe()
C. Number of persons state wise vaccinated for second dose in India:
# returns the average of second dose administered
avg_seconddose = df["Second Dose Administered"].astype("float").mean(axis = 0)
print("Average of Second Dose:", avg_seconddose)
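The average above is a single number for the whole table. A per-state figure, as the problem statement asks for, could be obtained with a group-by. The sketch below assumes a "State" column and assumes the dose columns hold cumulative counts, so that the largest value per state is its latest total; the data are made up.

```python
import pandas as pd

# Made-up cumulative counts for illustration; column names other than
# "Second Dose Administered" are assumptions about the CSV layout.
df = pd.DataFrame(
    {
        "State": ["Kerala", "Kerala", "Goa", "Goa"],
        "First Dose Administered": [80, 190, 40, 95],
        "Second Dose Administered": [0, 30, 0, 12],
    }
)

# If the counts are cumulative, the largest value per state is its latest total.
statewise = (
    df.groupby("State")[["First Dose Administered", "Second Dose Administered"]]
    .max()
    .sort_values("Second Dose Administered", ascending=False)
)
print(statewise)
```

If the columns instead hold daily counts, `.sum()` would be the appropriate aggregation in place of `.max()`.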
female = df["Female(Individuals Vaccinated)"].sum()
print("The total number of female individuals vaccinated is", int(female))
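If the gender columns hold cumulative counts, summing every row would count the same people many times. A hedged alternative is to take the latest value per state and add the states together. In this sketch the data are made up, and "Male(Individuals Vaccinated)" is assumed to mirror the "Female(Individuals Vaccinated)" column used above.

```python
import pandas as pd

# Made-up cumulative counts; "Male(Individuals Vaccinated)" is assumed to
# mirror the "Female(Individuals Vaccinated)" column used above.
df = pd.DataFrame(
    {
        "State": ["Kerala", "Kerala", "Goa", "Goa"],
        "Male(Individuals Vaccinated)": [50, 120, 20, 30],
        "Female(Individuals Vaccinated)": [40, 70, 10, 30],
    }
)

gender_columns = ["Male(Individuals Vaccinated)", "Female(Individuals Vaccinated)"]

# Latest cumulative value per state, then add the states together
latest_per_state = df.groupby("State")[gender_columns].max()
totals = latest_per_state.sum()

# Percentage share of each gender, as a pie chart would display it
shares = (totals / totals.sum() * 100).round(1)
print(totals)
print(shares)
```

The two totals are the kind of values a male/female pie chart needs as its `sizes`, and `shares` shows the percentages such a chart would print.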
G. Data Visualization:
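# Chart 1: males vs. females vaccinated.
# males_vaccinated and females_vaccinated are assumed to be totals computed
# earlier, in the same way as `female` above.
labels = ['Male', 'Female']
sizes = [males_vaccinated, females_vaccinated]
colors = ['cyan', 'salmon']
plt.figure(figsize=(4, 4))
plt.pie(sizes, labels=labels, colors=colors, autopct='%1.1f%%', startangle=140)
plt.title('Total Number of Males and Females Vaccinated')
plt.axis('equal')  # keep the pie circular
plt.show()
# Chart 2: first vs. second doses. The labels and sizes must be redefined,
# otherwise the gender data is plotted again under the dose title.
# first_dose and second_dose are hypothetical names for the dose totals.
labels = ['First Dose', 'Second Dose']
sizes = [first_dose, second_dose]
plt.figure(figsize=(4, 4))
plt.pie(sizes, labels=labels, colors=colors, autopct='%1.1f%%', startangle=140)
plt.title('Total Number of First and Second Doses Administered')
plt.axis('equal')
plt.show()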
CONCLUSION
The COVID vaccination dataset provides the vaccination status across India, detailing male
and female vaccination by state. Through summary statistics and visualization, we can examine
the count, minimum, maximum, mean, and standard deviation of the vaccination figures. To declare
the end of the pandemic, we look for a consistent decrease in the Daily Infection Rate (DIR) over
14 days until it reaches zero or turns negative.
REFERENCES
https://www.w3schools.com/python/
https://www.kaggle.com/sudalairajkumar/covid19-in-india?select=covid_vaccine_statewise.csv
https://www.tableau.com/learn/articles/data-visualization#:~:text=Data%20visualization%20is%20the%20graphical,outliers%2C%20and%20patterns%20in%20data.
https://www.python.org/
Submitted By:
Aniket Salunkhe
Roll no.: T213031
Div: TE-C