
MACHINE LEARNING WITH FLASK

CHAPTER 1
INTRODUCTION
1.1 COURSE OBJECTIVES
The objective of Machine Learning is to discover patterns in user data and then
make predictions based on these often intricate patterns, in order to answer business
questions and solve business problems.

i. To gain skills and knowledge:
This internship provided us with the essential skills and knowledge required
in the fields of basic Python and Machine Learning. The tools used during
the tenure helped us gain knowledge of programming languages.
ii. To get field work experience:
By taking this training we enhanced our knowledge of Machine Learning and
gained insight into how a model is developed using a Machine Learning
algorithm.
iii. To enhance communication skills:
By interacting with my trainers and classmates I learned a lot. It helped me
to enhance my communication skills and present my work with confidence.
It boosted my confidence to design more web pages and create some great
designs just for fun.
iv. To build a network:
Having learned how to code in Python and use its packages, it becomes easy to
build and design our own models with better accuracy.

1.2 NEED FOR INTERNSHIP

1. Application of education and career exploration.


Internships are a great way to connect classroom knowledge to real-world experience.
Dept. of CSE, SVCE 2021-2022 Page | 1
Learning is one thing, but taking those skills into the workforce and applying them is a
great way to explore different career paths and specializations that suit individual
interests.
2. Gain experience and increase marketability.
Having an internship gives you experience in the career field you want to pursue. Not
only does this give individuals an edge over other candidates when applying for jobs,
but it also prepares them for what to expect in their field and increases confidence in
their work.
3. Networking.
Having an internship benefits you in the working environment, and it also builds your
professional network. Internships provide a great environment to meet professionals in
the career field you want to pursue, as well as other interns who have similar interests.
4. Professionalism.
Internships can provide students with the soft skills needed in the workplace and in
leadership positions. Skills, such as communication, leadership, problem-solving, and
teamwork can all be learned through an internship and utilized beyond that experience.
5. Learn how a professional workplace operates.
Internships help students learn about workplace culture, employee relations, and
leadership structure, which should help them onboard in their first professional job with
more ease than if they had no professional experience.

1.3 THE BENEFITS OF INTERNSHIP

1.3.1 Benefits to the Industry

a. Availability of ready to contribute candidates for employment.

b. Year-round source of highly motivated pre-professionals.

c. Students bring new perspectives to problem solving.


d. Visibility of the organization is increased on campus.

e. Availability of quality candidates for temporary or seasonal positions and projects.

f. Freedom for industrial staff to pursue more creative projects.

g. Availability of a flexible, cost-effective workforce not requiring a long-term
employer commitment.

h. Proven, cost-effective way to recruit and evaluate potential employees.

i. Enhancement of the employer’s image in the community by contributing to the
educational enterprise.

1.3.2 Benefits to Students


a. An opportunity to get hired by the industry/ organization.

b. Practical experience in an organizational setting.

c. Excellent opportunity to see how the theoretical aspects learned in classes are
integrated into the practical world. On-floor experience provides much more
professional experience which is often worth more than classroom teaching.
d. Helps them decide if the industry and the profession is the best career option to
pursue.

e. Opportunity to learn new skills and supplement knowledge.

f. Opportunity to practice communication and teamwork skills.

g. Opportunity to learn strategies like time management and multi-tasking in an
industrial setup.

h. Opportunity to meet new people and learn networking skills.

i. Makes a valuable addition to their resume.

j. Enhances their candidacy for higher education.



k. Creating network and social circle and developing relationships with industry
people.

l. Provides opportunity to evaluate the organization before committing to a full-time
position.

1.3.3 Benefits to the Institute

a. Build industrial relations.

b. Makes the placement process easier.

c. Improve institutional credibility & branding.

d. Helps in retention of the students.

e. Curriculum revision can be made based on feedback from Industry/ students.

f. Improvement in the teaching-learning process.

1.4 COMPANY PROFILE

Livewire is a division of CADD Centre Training Services, headquartered in Chennai, India.
CADD Centre was formed as a training services company in 1988 and has since established
several brands across various domains, focusing on the technical skill development of
students and professionals. Livewire was established in 2013 under CADD Centre Training
Services to bring all specializations in the Electronics and IT domains under one umbrella.
Livewire delivers NSDC-approved training and is also authorized by MSME to deliver
training and internships to students. As for training delivery methodology, CADD
Centre and its brands are ISO 9001:2015/29990:2010 certified for the quality of their
training delivery methods and standards.


Fig 1.4.1 Company Logo

Services Offered By The Organization:

➢ Technical Training to Corporate.

➢ Technical Training to Students.

➢ MSME approved Internship Programs.

➢ NSDC Approved coaching as per the guidelines.

➢ Software Sales and Implementation services.


CHAPTER 2

MACHINE LEARNING WITH PYTHON

2.1 SCOPE OF DOMAIN


Machine learning is an application of artificial intelligence (AI) that gives systems the
ability to automatically learn and improve from experience without being explicitly
programmed. Machine learning focuses on the development of computer programs that can
access data and use it to learn for themselves.

The process of learning begins with observations or data, such as examples, direct
experience, or instruction, in order to look for patterns in data and make better decisions in
the future based on the examples that we provide. The primary aim is to allow computers to
learn automatically, without human intervention or assistance, and adjust their actions accordingly.

2.1.1 TYPES OF MACHINE LEARNING

The types of machine learning algorithms differ in their approach, the type of data they
input and output, and the type of task or problem that they are intended to solve. Broadly
Machine Learning can be categorized into four categories.

1. Supervised Learning

Supervised learning is a type of learning in which we are given a data set and already
know what the correct outputs should look like, on the assumption that there is a
relationship between the input and the output. It is the task of learning a function that
maps an input to an output based on example input-output pairs; the function is inferred
from labeled training data consisting of a set of training examples. Supervised learning
problems are categorized into regression and classification problems. In supervised
learning, an AI system is presented with labeled data, which means that each data point is
tagged with the correct label. The goal is to approximate the mapping function so well that,
given new input data (x), you can predict the output variable (Y) for that data.
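The idea of learning a mapping from labeled input-output pairs can be sketched in a few lines of plain Python. This is only an illustration, not the method used in the project: the helper name `fit_line` and the training values are made up, and a straight line fitted by least squares stands in for the learned function.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = w * x + b to labeled (x, y) pairs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    w = cov_xy / var_x
    b = mean_y - w * mean_x
    return w, b


# Labeled training data (illustrative values): inputs x with known outputs Y.
train_x = [1.0, 2.0, 3.0, 4.0]
train_y = [3.0, 5.0, 7.0, 9.0]  # generated by Y = 2x + 1

w, b = fit_line(train_x, train_y)
prediction = w * 5.0 + b  # predict Y for a new, unseen input x = 5
print(w, b, prediction)
```

The fitted parameters recover the rule that generated the labels, so the prediction for the unseen input follows the same rule.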

2. Unsupervised Learning

Unsupervised learning is a type of learning that allows us to approach problems with
little or no idea of what our results should look like. We can derive structure by
clustering the data based on relationships among the variables in the data. With unsupervised
learning there is no feedback based on the prediction results. It is a type of self-organized
learning that helps find previously unknown patterns in a data set without
pre-existing labels. In unsupervised learning, an AI system is presented with unlabeled,
uncategorized data and the system’s algorithms act on the data without prior training.
The output depends on the coded algorithms. Subjecting a system to unsupervised
learning is one way of testing AI.
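Clustering without labels can be illustrated with a small k-means sketch on one-dimensional data. The function name `kmeans_1d`, the sample values, and the starting centers are assumptions chosen for illustration; the project itself does not use clustering.

```python
def kmeans_1d(values, centers, iterations=10):
    """Cluster 1-D values around the given initial centers (plain k-means)."""
    centers = list(centers)
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            # Assign each value to its nearest center.
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Move each center to the mean of its cluster (keep it if the cluster is empty).
        centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters


values = [1.0, 1.2, 0.8, 9.0, 9.5, 10.5]  # unlabeled data (illustrative)
centers, clusters = kmeans_1d(values, centers=[0.0, 5.0])
print(centers)
print(clusters)
```

No labels are supplied at any point; the two groups emerge only from the distances between the values.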

3. Semi-Supervised Learning

Semi-supervised learning falls somewhere between supervised and unsupervised
learning, since it uses both labeled and unlabeled data for training, typically a small
amount of labeled data and a large amount of unlabeled data. Systems that use this
method can considerably improve learning accuracy. Usually, semi-supervised learning
is chosen when acquiring labeled data requires skilled and relevant resources to train
on it or learn from it, whereas acquiring unlabeled data generally doesn’t require
additional resources.

4. Reinforcement Learning

Reinforcement learning is a learning method in which an agent interacts with its environment by
producing actions and discovering errors or rewards. Trial-and-error search and delayed
reward are the most relevant characteristics of reinforcement learning. This method
allows machines and software agents to automatically determine the ideal behavior
within a specific context in order to maximize performance. Simple reward feedback is
required for the agent to learn which action is best. The agent receives rewards for
performing correctly and penalties for performing incorrectly, and it learns without
intervention from a human by maximizing its reward and minimizing its penalty. The
approach is closely related to dynamic programming and trains algorithms using a system
of reward and punishment.
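The reward-and-penalty loop can be sketched with a toy agent that chooses among a few actions. Everything here is an illustrative assumption: the function `run_agent`, the fixed reward values, and the epsilon-greedy exploration rule are not taken from the internship work.

```python
import random


def run_agent(rewards, steps=200, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a toy environment with one fixed reward per action."""
    rng = random.Random(seed)
    n_actions = len(rewards)
    estimates = [0.0] * n_actions  # the agent's current estimate of each action's reward
    counts = [0] * n_actions
    total = 0.0
    for step in range(steps):
        if step < n_actions:
            action = step  # try every action once
        elif rng.random() < epsilon:
            action = rng.randrange(n_actions)  # explore: pick a random action
        else:
            action = max(range(n_actions), key=lambda a: estimates[a])  # exploit
        reward = rewards[action]  # feedback from the environment
        counts[action] += 1
        # Incremental average of the rewards seen so far for this action.
        estimates[action] += (reward - estimates[action]) / counts[action]
        total += reward
    return estimates, total


# Negative values act as penalties, positive values as rewards (illustrative numbers).
estimates, total = run_agent(rewards=[-1.0, 0.5, 2.0])
best_action = max(range(len(estimates)), key=lambda a: estimates[a])
print(estimates, best_action)
```

Because the agent mostly repeats the action with the highest estimated reward and only occasionally explores, it settles on the best-rewarded action without being told which one it is.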

2.1.2 OBJECTIVES

➢ To understand the basics of Python Programming and its Libraries.

➢ To understand the method of data analysis, algorithms and mathematical models to train
the sample data in Machine Learning.

➢ To discover patterns in user data and then make predictions based on these often
intricate patterns, in order to answer business questions and solve business problems.

➢ To analyze the data and identify trends.

➢ To come up with computer programs that have the capability to improve themselves
based on new data without requiring any explicit programming for the same.

➢ To build websites and software, automate tasks, and conduct data analysis and develop
prototypes.

2.1.3 APPLICATIONS

1. Virtual Personal Assistants: Names like Siri and Alexa bring to mind the capabilities of
virtual assistants. You can ask Siri to make a call for you or play music. You can ask
Alexa for today’s weather forecast. You can even set an alarm or send an SMS. What makes
this easy is that you only need to speak and the assistant listens to your command, which
is especially handy for differently abled users. Such assistants take note of how you interact with
them and use that to make your next experience with them better.

2. Online Customer Support: Websites such as educational and shopping platforms will often
pop up a live chat to help you with your questions. A visitor with a head full of questions is
more likely to leave than to stay and possibly make a purchase. Some websites use a chat-bot
instead, which pulls information from the website and tries to address the customer’s queries.


3. Online Fraud Detection: Payment services such as PayPal depend on the trust of their
users. PayPal uses machine learning to defend against illegal acts like money laundering. By
comparing millions of transactions, it can find out which ones are illegitimate.

4. Product Recommendations: Shopping platforms like Amazon and Jabong notice what
products you look at and suggest similar products to you. If this puts a product you like in
front of you and results in a purchase, it’s a win for them. For this, the platform also uses
your wish-list and cart contents.

5. Automatic Translation: Machine Learning lets us translate text into another language.
The ML algorithm for this figures out how words fit together and then uses this information to
improve the quality of a translation.

2.1.4 INTRODUCTION TO PYTHON

Python is a widely used high-level, general-purpose, interpreted, dynamic programming
language. Its design philosophy emphasizes readability, and its syntax allows programmers
to express concepts in fewer lines of code than would be possible in languages such as C++
or Java. The language provides constructs intended to enable clear programs on both a small
and a large scale.

Python supports multiple programming paradigms, including object-oriented,
imperative, functional, and procedural styles. It features a dynamic type
system and automatic memory management and has a large and comprehensive standard
library. Python interpreters are available for many operating systems, allowing
Python code to run on a wide variety of systems.


2.1.4.1 PYTHON LIBRARIES

NumPy

NumPy, short for Numerical Python, is the foundational package for scientific computing
in Python. Much of the work in this report relies on NumPy and on libraries built on top
of NumPy. It provides, among other things:

➢ A fast and efficient multidimensional array object, ndarray
➢ Functions for performing element-wise computations with arrays or mathematical
operations between arrays
➢ Tools for reading and writing array-based data sets to disk
➢ Linear algebra operations, Fourier transform, and random number generation
➢ Tools for integrating C, C++, and Fortran code with Python

Beyond the fast array-processing capabilities that NumPy adds to Python, one of its
primary purposes with regard to data analysis is to serve as the primary container for data
to be passed between algorithms. For numerical data, NumPy arrays are a much
more efficient way of storing and manipulating data than the built-in Python
data structures. Also, libraries written in a lower-level language, such as C or
Fortran, can operate on the data stored in a NumPy array without copying any data.

Pandas

pandas provides rich data structures and functions designed to make working with structured
data fast, easy, and expressive. It is one of the critical ingredients enabling
Python to be a powerful and productive data analysis environment. The primary object in
pandas used here is the DataFrame, a two-dimensional, tabular, column-oriented data
structure with both row and column labels. pandas combines the high-performance array
computing features of NumPy with the flexible data manipulation capabilities of
spreadsheets and relational databases (such as SQL). It provides sophisticated indexing
functionality that makes it easy to reshape, slice and dice, perform aggregations, and select
subsets of data. pandas is the primary data-handling tool used in this work. For financial
users, pandas features rich, high-performance time series functionality and tools well suited
to working with financial data. The pandas name itself is derived from panel data, an
econometrics term for multidimensional structured datasets, and from Python data analysis itself.

Scikit-learn (sklearn)

scikit-learn, imported as sklearn, is a Python library that provides many unsupervised and
supervised learning algorithms. It builds on libraries you might already be familiar with,
such as NumPy, SciPy, and Matplotlib, and works well with pandas data.

The functionality that scikit-learn provides includes:

➢ Regression, including Linear and Logistic Regression.
➢ Classification, including K-Nearest Neighbors.
➢ Clustering, including K-Means and K-Means++.
➢ Model selection.
➢ Preprocessing, including Min-Max Normalization.

Matplotlib

Matplotlib is a Python library used to create 2D graphs and plots from Python scripts. It
has a module named pyplot that makes plotting easy by providing features to
control line styles, font properties, axis formatting, and so on. It supports a very wide variety of
graphs and plots, such as histograms, bar charts, power spectra, and error charts. It is used along
with NumPy to provide an environment that is an effective open-source alternative to
MATLAB. It can also be used with graphics toolkits like PyQt and wxPython.


2.2 TOOLS NEED TO BE IMPLEMENTED WITH DETAILS

Anaconda

The Anaconda distribution is a free and open-source distribution of the Python and R programming
languages. It can be installed on Windows, Linux, and macOS. It provides more than
1500 Python and data science packages suitable for developing machine learning and
deep learning models. The Anaconda distribution installs Python together with tools
such as Jupyter Notebook, Spyder, and the Anaconda Prompt. It is therefore a very convenient packaged
solution that you can easily download and install on your computer; it automatically installs
Python along with some basic IDEs and libraries.

Jupyter Notebook

The Jupyter Notebook is the original web application for creating and sharing computational
documents. It is open source and offers a simple, streamlined, document-centric experience:
you can use it to create and share documents that contain live code, equations,
visualizations, and text. Jupyter Notebook is maintained by the people at Project
Jupyter.

Fig 2.2.1 Jupyter Notebook


Jupyter Interface

Now you’re in the Jupyter Notebook interface, and you can see all the files in your current
directory. All Jupyter Notebooks are identifiable by the notebook icon next to their name. If you
already have a Jupyter Notebook in your current directory that you want to view, find it in your
files list and click it to open.

Fig 2.2.2 Jupyter Interface

FLASK

A web framework is an architecture containing tools, libraries, and functionality for building
and maintaining large web projects quickly and efficiently. Frameworks are designed to
streamline programs and promote code reuse. To create the server side of a web application, you
need a server-side language. Python is home to numerous such frameworks, the best known of
which are Django and Flask. Flask is a lightweight micro-framework based on
Werkzeug and Jinja2. It is called a micro-framework because it aims to keep its core functionality
small yet extensible, so that it can cover an array of small and large applications. Flask
depends on two external libraries: the Jinja2 template engine and the Werkzeug WSGI toolkit.
Although many web frameworks are available, Flask tends to be well suited because of its:

➢ Built-in development server and fast debugger.
➢ Integrated support for unit testing.
➢ RESTful request dispatching.
➢ Jinja2 templating.
➢ Support for secure cookies.
➢ Lightweight and modular design, which allows for a flexible framework.


CHAPTER 3
WORK CARRIED OUT
3.1 PROBLEMS/CHALLENGES

Task 1: To learn the basics of Python and its packages.
Task 2: To learn the basic concepts of Machine Learning.
Task 3: To learn the working and implementation of machine learning algorithms.
Task 4: To design a credit risk classification model using Flask.

The work carried out during the internship is summarized in the table below.


3.2 PLAN OF WORK

Fig 3.2.1 Workflow Diagram


3.3 METHODOLOGY

3.3.1 Data Source

The credit risk classification dataset was collected from the Kaggle online repository. It
contains customer transaction and demographic data, and labels customers as risky or
not risky for specific banking products.
• Features of dataset

payment_data.csv: customer’s card payment history.
id: customer id
OVD_t1: number of times overdue type 1
OVD_t2: number of times overdue type 2
OVD_t3: number of times overdue type 3
OVD_sum: total overdue days
pay_normal: number of times normal payment
prod_code: credit product code
prod_limit: credit limit of product
update_date: account update date
new_balance: current balance of product
highest_balance: highest balance in history
report_date: date of recent payment

customer_data.csv: customer’s demographic data and category attributes, which have been encoded.
Category features are fea_1, fea_3, fea_5, fea_6, fea_7, fea_9.
label is 1: the customer is in high credit risk
label is 0: the customer is in low credit risk

3.3.2 Dataset collection

The data collection process involves the selection of quality data for analysis. Here we used
the credit risk classification dataset taken from Kaggle. We explored different ways and
sources for collecting relevant and comprehensive data, interpreting it, and analyzing the results
with the help of statistical techniques.


Fig 3.3.2.1 Customer dataset

Fig 3.3.2.2 Payment Dataset


3.3.3 DATA VISUALIZATION

A large amount of information is easier to understand and analyze when it is
represented in graphic form. Some companies specify that a data analyst must know how
to create slides, diagrams, charts, and templates. In our approach, we plotted how many
customers are labeled as a credit risk and how many are not.

Fig 3.3.3.1 Plot label vs count

3.3.4 Data pre-processing


Cleaning: The data we want to process is rarely clean; it may contain noise
or missing values. If we process it as it is, we cannot get good results, so these
problems must be eliminated first. This process is called data cleaning. Missing
values can be filled in, and noise reduced, with techniques such as replacing a
missing entry with the most common value of that attribute.
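Filling gaps with the most common value, or with the mean for numeric columns, can be sketched in plain Python as follows. The helper names and the sample columns are hypothetical; missing entries are represented by `None`.

```python
from collections import Counter


def fill_with_mode(values):
    """Replace None entries with the most common observed value."""
    observed = [v for v in values if v is not None]
    mode = Counter(observed).most_common(1)[0][0]
    return [mode if v is None else v for v in values]


def fill_with_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


codes = [10, None, 10, 6, None]   # hypothetical categorical column
scores = [1200.0, None, 1300.0]   # hypothetical numeric column

filled_codes = fill_with_mode(codes)
filled_scores = fill_with_mean(scores)
print(filled_codes)
print(filled_scores)
```

The mode suits categorical or coded columns, while the mean is a common choice for continuous values.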


Transformation: This involves changing data from one format to another so that it
is easier to work with, by applying normalization, smoothing,
generalization, and aggregation techniques to the data.
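As an example of one such transformation, min-max normalization rescales a column to the range 0 to 1. The sketch below uses a hypothetical helper `min_max_normalize` and made-up balance values. Rescaling matters for distance-based methods such as KNN, because a feature with a large numeric range would otherwise dominate the distance.

```python
def min_max_normalize(values):
    """Rescale values linearly so the smallest maps to 0.0 and the largest to 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


balances = [2000.0, 4000.0, 10000.0]  # illustrative values
scaled = min_max_normalize(balances)
print(scaled)
```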

Integration: The data we need to process may not come from a single source;
sometimes it comes from several. If we do not integrate these sources, processing
becomes a problem. Integration is therefore one of the important phases of data
pre-processing, and several issues have to be considered when integrating.
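Integrating two sources that share a key can be sketched as an inner join. The function `inner_join` and the sample records below are illustrative assumptions; the field names `id`, `label`, and `new_balance` follow the dataset description in section 3.3.1.

```python
def inner_join(customers, payments, key="id"):
    """Combine payment rows with the matching customer row; drop unmatched rows."""
    by_id = {row[key]: row for row in customers}
    merged = []
    for pay in payments:
        cust = by_id.get(pay[key])
        if cust is not None:
            merged.append({**cust, **pay})
    return merged


customers = [{"id": 1, "label": 0}, {"id": 2, "label": 1}]
payments = [
    {"id": 1, "new_balance": 50.0},
    {"id": 1, "new_balance": 75.0},
    {"id": 3, "new_balance": 20.0},  # no matching customer
]

merged = inner_join(customers, payments)
print(merged)
```

One issue to consider when integrating is visible here: a customer with several payment records appears several times in the result, and records without a match on both sides are lost.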

Reduction: The data we work on may be complex and difficult to
understand, so to make it manageable for the system we reduce
it to the required format so that we can achieve good results.

3.3.5 Dataset splitting

A dataset used for machine learning should be partitioned into three subsets: training,
test, and validation sets.

Training set: A data scientist uses a training set to train a model and define the optimal
parameters it has to learn from the data.

Validation set: A validation set is used to tune model settings and compare candidate
models before the final evaluation on the test set. In this project the data was split into
training and test sets only.

Test set: A test set is needed to evaluate the trained model and its capability for
generalization, that is, the model’s ability to identify patterns in new, unseen data
after having been trained on the training data. It is crucial to use different subsets for
training and testing to avoid model overfitting, which is the incapacity for generalization
mentioned above.
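A train/test partition can be sketched with the standard library alone. The helper `train_test_split_simple` is hypothetical and only shuffles and slices; the defaults of 40% test data and seed 1234 mirror the values used in the source code of Chapter 4, which additionally keeps the class proportions equal in both subsets (stratification).

```python
import random


def train_test_split_simple(rows, test_size=0.4, seed=1234):
    """Shuffle a copy of the rows and split it into (train, test) lists."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_size))
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(10))  # stand-in for ten data records
train, test = train_test_split_simple(rows)
print(len(train), len(test))
```

Fixing the seed makes the split reproducible, so that repeated runs evaluate the model on the same unseen records.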


3.3.6 Model training

After a data scientist has preprocessed the collected data and split it into training and
test sets, they can proceed with model training. This process entails “feeding” the
algorithm with training data. The algorithm processes the data and outputs a model that is
able to find a target value (attribute) in new data, that is, the answer you want to get with
predictive analysis. The purpose of model training is to develop such a model.

K Nearest Neighbor (KNN) Algorithm

The K Nearest Neighbors (KNN) algorithm measures the distance between a query
scenario and a set of scenarios in the data set. We can compute the distance between
two scenarios using some distance function d(x,y), where x,y are scenarios composed
of N features, such that x={x1,…,xN}, y={y1,…,yN} .

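The two distance functions are:

Absolute distance measuring:

\[ d_A(x, y) = \sum_{i=1}^{N} |x_i - y_i| \]

Euclidean distance measuring:

\[ d_E(x, y) = \sqrt{\sum_{i=1}^{N} (x_i - y_i)^2} \]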

The model for KNN is the entire training dataset. When a prediction is required for an
unseen data instance, the KNN algorithm searches through the training dataset for
the k most similar instances. The prediction attribute of the most similar instances is
summarized and returned as the prediction for the unseen instance.
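The search-and-summarize procedure can be sketched from scratch as follows. This is a minimal illustration, not the scikit-learn implementation used in the project: the function names, the toy points, and the choice of k = 3 are assumptions, and the neighbors' labels are summarized by majority vote.

```python
import math
from collections import Counter


def euclidean(x, y):
    """Euclidean distance between two equal-length feature tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def knn_predict(train_x, train_y, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    distances = sorted(
        (euclidean(x, query), label) for x, label in zip(train_x, train_y)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Toy training set (illustrative): two features per instance, labels 0 and 1.
train_x = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
train_y = [0, 0, 0, 1, 1, 1]

pred_low = knn_predict(train_x, train_y, (1.2, 1.4))
pred_high = knn_predict(train_x, train_y, (8.7, 8.6))
print(pred_low, pred_high)
```

Note that there is no training step: the stored data is the model, and all the work happens at prediction time.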

The similarity measure depends on the type of data. For real-valued data, the
Euclidean distance can be used. For other types of data, such as categorical or binary data,
the Hamming distance can be used.
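For categorical data, the Hamming distance simply counts the positions at which two instances differ. A minimal sketch, with made-up attribute values:

```python
def hamming(x, y):
    """Number of positions at which two equal-length sequences differ."""
    if len(x) != len(y):
        raise ValueError("inputs must have the same length")
    return sum(1 for a, b in zip(x, y) if a != b)


a = ("student", "urban", "yes")  # illustrative categorical instances
b = ("student", "rural", "no")
d = hamming(a, b)
print(d)
```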


Instance-based algorithms are those algorithms that model the problem using data
instances (or rows) in order to make predictive decisions. The KNN algorithm is an
extreme form of instance-based method because all training observations are retained
as part of the model.
It is a competitive learning algorithm, because it internally uses competition between
model elements (data instances) in order to make a predictive decision. The objective
similarity measure between data instances causes each data instance to compete to
“win”, that is, to be most similar to a given unseen data instance and contribute to a prediction.

Fig 3.3.6.1 Flow chart of KNN algorithm


3.3.7 Model evaluation and testing

The goal of this step is to develop the simplest model which is able to formulate a target
value fast and well enough. A data scientist can achieve this goal through model tuning.
That’s the optimization of model parameters to achieve an algorithm’s best
performance.

3.3.8 Accuracy

Classification accuracy is what we usually mean when we use the term accuracy. It is
the ratio of the number of correct predictions to the total number of input samples. In this
project the predictions were obtained from a KNN algorithm applied to the customers’
attribute values. The accuracy obtained is 80%; the features can be tuned for higher accuracy.
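The ratio can be computed directly from the true and predicted labels. The sketch below uses made-up labels (not the project's results) and also counts the four confusion-matrix cells, since accuracy alone can hide how a model behaves on the less frequent class.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) counts for a binary problem."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive:
            if t == positive:
                tp += 1
            else:
                fp += 1
        else:
            if t == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


# Illustrative labels: 1 = high credit risk, 0 = low credit risk.
y_true = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 1, 1, 0]

acc = accuracy(y_true, y_pred)
tp, fp, tn, fn = confusion_counts(y_true, y_pred)
print(acc, tp, fp, tn, fn)
```

In this toy example eight of ten predictions are correct, yet one of the three high-risk customers is missed, which the accuracy figure by itself does not show.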


CHAPTER 4
RESULTS AND DISCUSSIONS

4.1 IMPLEMENTATION

Source Code
➢ Importing libraries

import numpy as np
import pandas as pd
import seaborn as sns
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

➢ Importing the dataset


customer_df = pd.read_csv(r"C:\Users\thejaswini va\DataScience\customer_data.csv")
customer_df
payment_df = pd.read_csv(r"C:\Users\thejaswini va\DataScience\payment_data.csv")
payment_df

➢ Data pre-processing
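customer_df.columns
payment_df.columns
payment_df['id'].nunique()
customer_df['id'].nunique()
# Fill missing values: column mean for fea_2, zero for highest_balance
customer_df['fea_2'] = customer_df['fea_2'].fillna(customer_df['fea_2'].mean())
payment_df['highest_balance'] = payment_df['highest_balance'].fillna(0)
# Join customer and payment records on the customer id
final_df = pd.merge(customer_df, payment_df, how='inner', on='id')
final_df
# Assumption: k_df, used in the following steps, is not defined in the report;
# it is taken here to be the merged frame. Any remaining missing values in the
# model features must be handled before fitting.
k_df = final_df.copy()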

➢ Data visualization
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline

# Number of records per report date, split by risk label
sns.countplot(x='report_date', hue='label', data=k_df)

plt.figure(figsize=(14, 7))

# Left panel: share of each label as a pie chart
plt.subplot(121)
k_df["label"].value_counts().plot.pie(
    autopct="%1.0f%%", colors=sns.color_palette("prism", 7), startangle=60,
    labels=["0", "1"], wedgeprops={"linewidth": 2, "edgecolor": "k"},
    explode=[.1, 0], shadow=True)
plt.title("Distribution of Target variable")

# Right panel: count of each label as a horizontal bar chart
plt.subplot(122)
ax = k_df["label"].value_counts().plot(kind="barh")
for i, j in enumerate(k_df["label"].value_counts().values):
    ax.text(.7, i, str(j), weight="bold", fontsize=20)
plt.title("Count of Target variable")
plt.show()

➢ Splitting of dataset
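# Feature columns used by the model
x = k_df[['fea_2', 'fea_4', 'fea_8', 'fea_10',
          'fea_11', 'OVD_t1', 'OVD_t2',
          'OVD_sum', 'prod_code', 'pay_normal', 'prod_limit',
          'new_balance', 'highest_balance']]
x.shape
# Target column (assumption: y is not defined in the report; the risk label is used)
y = k_df['label']
# 60/40 split that keeps the label proportions equal in both subsets
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.4, stratify=y, random_state=1234)
print(x_train.shape, x_test.shape, y_train.shape, y_test.shape)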

➢ Training the model with KNN algorithm


knn = KNeighborsClassifier()
knn.fit(x_train, y_train)
y_pred = knn.predict(x_test)
print(y_pred)
# scikit-learn metrics expect the true labels first, then the predictions
print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred))
acc = accuracy_score(y_test, y_pred)
print(acc * 100)

➢ Creating the pickle file


!pip install Flask
from flask import Flask
import pickle
# Save the trained model so that the Flask application can load it
with open("customerpaymentriskflask.pkl", "wb") as model_file:
    pickle.dump(knn, model_file)

➢ Flask frontend code


from flask import Flask, render_template, request
import pickle

app1 = Flask(__name__)

# Load the trained KNN model saved in the previous step
with open("customerpaymentriskflask.pkl", "rb") as file:
    rf = pickle.load(file)

@app1.route("/", methods=['GET'])
def home():
    return render_template("index1.html")

@app1.route('/predict', methods=['POST'])
def predict():
    # Read the form fields in the same order as the training columns
    a = float(request.form['fea_2'])
    b = int(request.form['fea_4'])
    c = int(request.form['fea_8'])
    d = int(request.form['fea_10'])
    e = float(request.form['fea_11'])
    f = int(request.form['OVD_t1'])
    g = int(request.form['OVD_t2'])
    h = int(request.form['OVD_sum'])
    i = int(request.form['prod_code'])
    j = int(request.form['pay_normal'])
    # Field name as given in the report; it must match the name used in index1.html
    k = float(request.form['prob_limit'])
    l = float(request.form['new_balance'])
    m = float(request.form['highest_balance'])
    y_pred = rf.predict([[a, b, c, d, e, f, g, h, i, j, k, l, m]])
    print(y_pred)
    # label 1 = high credit risk, label 0 = low credit risk (see section 3.3.1)
    if y_pred[0] == 1:
        return render_template("index1.html",
                               prediction_value="The customer is in high credit risk")
    else:
        return render_template("index1.html",
                               prediction_value="The customer is in low credit risk")

if __name__ == "__main__":
    app1.run(debug=True)
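The conversion of form fields into a feature row can be factored into a small, framework-independent helper so that missing or malformed input produces a message instead of an unhandled exception. The sketch below is an assumption-laden illustration: `parse_features` and `FEATURE_TYPES` are hypothetical names, and the form field names are assumed to match the training column names exactly. Inside a Flask route it could be called as `parse_features(request.form)`.

```python
# Ordered (field name, type) pairs, matching the column order used for training.
FEATURE_TYPES = [
    ("fea_2", float), ("fea_4", int), ("fea_8", int), ("fea_10", int),
    ("fea_11", float), ("OVD_t1", int), ("OVD_t2", int), ("OVD_sum", int),
    ("prod_code", int), ("pay_normal", int), ("prod_limit", float),
    ("new_balance", float), ("highest_balance", float),
]


def parse_features(form):
    """Convert submitted form values into the ordered feature row the model expects.

    Returns (row, errors); row is None when any field is missing or malformed.
    """
    row, errors = [], []
    for name, cast in FEATURE_TYPES:
        raw = form.get(name)
        if raw is None or str(raw).strip() == "":
            errors.append(f"{name}: missing")
            continue
        try:
            row.append(cast(raw))
        except ValueError:
            errors.append(f"{name}: expected {cast.__name__}")
    return (row, []) if not errors else (None, errors)


# Usage with plain dictionaries standing in for a submitted form.
good_form = {name: "1" for name, _ in FEATURE_TYPES}
row, errors = parse_features(good_form)

bad_form = dict(good_form, fea_4="abc")
del bad_form["new_balance"]
bad_row, bad_errors = parse_features(bad_form)
print(row, errors)
print(bad_row, bad_errors)
```

Keeping the field list in one place also guards against the form order and the training column order drifting apart.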


4.2 SCREEN SHOTS

Fig 4.2.1 The web page for entering the values

Fig 4.2.2 Predicting the customer at high risk


Fig 4.2.3 Predicting the customer at low risk


CONCLUSION

These five weeks of internship at LIVEWIRE improved our overall understanding of and
gave us insight into Python and Machine Learning. The work carried out during the
internship focused on writing Python code for various logic and problem statements. We
learned Machine Learning using Python and developed a mini project on credit risk
prediction using Machine Learning techniques. This internship at Livewire also helped
with overall personality development through interaction with many members. It helped us
integrate conceptual knowledge with real-life applications, and it provided working
experience with real-life professionals, which will certainly help us in our careers ahead.


