ML Unit No.4 Naïve Bayes Classifiers PPT Notes
Bayes Theorem
Prof.Sachin S. Patil
D.Y.Patil University Ambi,Pune
Bayes Theorem
Bayes' Theorem states that the conditional probability of an event,
based on the occurrence of another event, is equal to the likelihood
of the second event given the first event, multiplied by the prior
probability of the first event, divided by the probability of the
second event.
Bayes Theorem
• Naïve Bayes algorithm is a supervised learning algorithm, which is
based on Bayes theorem and used for solving classification
problems.
• It is mainly used in text classification problems that involve a high-
dimensional training dataset.
• The Naïve Bayes classifier is one of the simplest and most effective
classification algorithms; it helps build fast machine learning
models that can make quick predictions.
Bayes Theorem
• It is a probabilistic classifier, which means it predicts on the basis of the
probability that an object belongs to each class.
• https://codinginfinite.com/naive-bayes-classification-numerical-example/
Why is it called Naïve Bayes?
• The Naïve Bayes algorithm is made up of the two words Naïve and Bayes, which
can be described as:
• Naïve: It assumes that the occurrence of a particular feature is independent of the
occurrence of the other features. For example, if a fruit is identified on the basis of color,
shape, and taste, then a red, spherical, and sweet fruit is recognized as an apple; each
feature contributes to the classification independently of the others.
• Bayes: It is called Bayes because it depends on the principle of Bayes' Theorem.
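For reference, Bayes' theorem on which the classifier is based can be written as:
P(A|B) = P(B|A) * P(A) / P(B)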
Where,
P(A|B) is Posterior probability: Probability of hypothesis A on the observed event B.
P(B|A) is Likelihood probability: Probability of the evidence B given that hypothesis A is
true.
P(A) is Prior Probability: Probability of hypothesis before observing the evidence.
P(B) is Marginal Probability: Probability of Evidence.
Bayes Theorem
• P(A) represents the prior probability that event A will take place.
Training dataset of weather conditions (Outlook) and the target variable Play:
No.  Outlook    Play
0    Rainy      Yes
1    Sunny      Yes
2    Overcast   Yes
3    Overcast   Yes
4    Sunny      No
5    Rainy      Yes
6    Sunny      Yes
7    Overcast   Yes
8    Rainy      No
9    Sunny      No
10   Sunny      Yes
11   Rainy      No
12   Overcast   Yes
13   Overcast   Yes
Frequency table for the Weather Conditions:
Weather    Yes   No
Overcast   5     0
Rainy      2     2
Sunny      3     2
Total      10    4
Likelihood table for the weather conditions:
Weather    No                   Yes
Overcast   0                    5                    P(Overcast) = 5/14 = 0.35
Rainy      2                    2                    P(Rainy)    = 4/14 = 0.29
Sunny      2                    3                    P(Sunny)    = 5/14 = 0.35
All        P(No) = 4/14 = 0.29  P(Yes) = 10/14 = 0.71
Applying Bayes' theorem:
P(Yes|Sunny) = P(Sunny|Yes) * P(Yes) / P(Sunny)
P(Sunny|Yes) = 3/10 = 0.3
P(Sunny) = 0.35
P(Yes) = 0.71
So P(Yes|Sunny) = 0.3 * 0.71 / 0.35 = 0.60

Applying Bayes' theorem:
P(No|Sunny) = P(Sunny|No) * P(No) / P(Sunny)
P(Sunny|No) = 2/4 = 0.5
P(No) = 0.29
P(Sunny) = 0.35
So P(No|Sunny) = 0.5 * 0.29 / 0.35 = 0.41

As we can see from the above calculation, P(Yes|Sunny) > P(No|Sunny).
Hence, on a Sunny day, the player can play the game.
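The same calculation can be reproduced in a few lines of Python. This is a minimal sketch (not part of the original slides); the frequency counts are taken from the tables above.

# Minimal sketch: reproduce P(Yes|Sunny) and P(No|Sunny) from the frequency table above.
counts = {                     # weather -> (number of Yes days, number of No days)
    "Overcast": (5, 0),
    "Rainy": (2, 2),
    "Sunny": (3, 2),
}
total = sum(y + n for y, n in counts.values())        # 14 records in total
total_yes = sum(y for y, _ in counts.values())        # 10 Yes
total_no = sum(n for _, n in counts.values())         # 4 No

def posterior(weather):
    yes, no = counts[weather]
    p_weather = (yes + no) / total                    # P(weather), e.g. P(Sunny) = 5/14
    p_yes_given_w = (yes / total_yes) * (total_yes / total) / p_weather   # Bayes' theorem
    p_no_given_w = (no / total_no) * (total_no / total) / p_weather
    return p_yes_given_w, p_no_given_w

p_yes, p_no = posterior("Sunny")
print(p_yes, p_no)   # ~0.60 and ~0.40 (the slides get 0.41 because they round P(No) and P(Sunny) first)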
Applications of Naïve Bayes Classifier:
• Typical applications include spam detection, medical diagnosis, image
recognition, and natural language processing.
• Spam detection
• Spam detection is among the machine learning tasks where Bayes'
theorem is most frequently used. By using Bayes' theorem to calculate
the likelihood that a message is spam, a machine learning algorithm
can accurately detect unwanted emails and block them from reaching a
user's mailbox.
Advantages of Naïve Bayes Classifier:
• Naïve Bayes is one of the fastest and easiest ML algorithms for
predicting the class of a dataset.
Naïve Bayes in Scikit-learn
# Fitting Naive Bayes to the Training set
from sklearn.naive_bayes import GaussianNB      # import GaussianNB from sklearn.naive_bayes
classifier = GaussianNB()                       # create a classifier object from GaussianNB()
classifier.fit(x_train, y_train)                # fit the GaussianNB classifier to the training dataset

# Predicting the Test set results
y_pred = classifier.predict(x_test)

# Comparing actual response values (y_test) with predicted response values (y_pred)
from sklearn import metrics
print("Gaussian Naive Bayes model accuracy (in %):", metrics.accuracy_score(y_test, y_pred) * 100)
https://iq.opengenus.org/bernoulli-naive-bayes/
Que: Using Bernoulli Naïve Bayes, find the probability of buys_computer = yes or no for
the instance X = (age = youth, income = medium, student = yes, credit_rating = fair),
i.e., predict its class label (yes or no).
Solution
• P(C1) = P(buys_computer = yes) = 9/14 = 0.643 (since 9 of the 14 training rows have buys_computer = yes)
• P(X|buys_computer = yes) = P(age = youth|yes) * P(income = medium|yes) * P(student = yes|yes) * P(credit_rating = fair|yes)
  = 0.222 * 0.444 * 0.667 * 0.667 = 0.044
• P(X|yes) * P(yes) = 0.044 * 0.643 = 0.028
• The corresponding product for buys_computer = no is smaller, so the classifier predicts
buys_computer = yes for instance X (age = youth, income = medium, student = yes, credit_rating = fair).
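As a minimal sketch (using only the numeric values quoted in the solution above), the hand calculation can be checked as follows:

# Check of the hand calculation above, using the values quoted on the slide.
p_yes = 9 / 14                                    # P(buys_computer = yes) = 0.643
p_x_given_yes = 0.222 * 0.444 * 0.667 * 0.667     # product of the per-attribute likelihoods
print(round(p_x_given_yes, 3))                    # 0.044
print(round(p_x_given_yes * p_yes, 3))            # 0.028, the un-normalised score for "yes"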
Multinomial Naïve Bayes
• Feature vectors represent the frequencies with which certain events
have been generated by a multinomial distribution. This is the event
model typically used for document classification.
• Application –
https://www.upgrad.com/blog/multinomial-naive-bayes-explained/
Multinomial Naïve Bayes
• Application – evaluating a fitted Multinomial Naïve Bayes model with an accuracy score:
# Accuracy score
from sklearn.metrics import accuracy_score
accuracy_score(y_test, y_pred)
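As a hedged illustration of the document-classification use case described above, here is a small self-contained sketch; the toy documents, labels, and variable names are assumptions made for illustration, not part of the slides.

# Sketch: Multinomial Naive Bayes for document classification with word-count features.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

docs = ["free offer win money", "meeting schedule for monday",
        "win a free prize now", "project review meeting notes"]
labels = ["spam", "ham", "spam", "ham"]           # toy labels, for illustration only

vectorizer = CountVectorizer()                    # turn each document into word-count features
X = vectorizer.fit_transform(docs)

clf = MultinomialNB()                             # the multinomial event model suits count features
clf.fit(X, labels)

X_new = vectorizer.transform(["free money prize"])
print(clf.predict(X_new))                         # expected: ['spam'] on this toy data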
Gaussian Naïve Bayes
• Good Example
• https://www.youtube.com/watch?v=kufuBE6TJew
• https://levelup.gitconnected.com/classification-using-gaussian-naive-bayes-from-scratch-6b8ebe830266
Gaussian Naïve Bayes using sklearn
# Fitting Gaussian Naive Bayes to the training set
from sklearn.naive_bayes import GaussianNB
classifier = GaussianNB()                 # create the Gaussian Naive Bayes classifier
classifier.fit(X_train, y_train)          # fit it to the training data
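To make the snippet above runnable end-to-end, here is a self-contained sketch; the choice of the Iris dataset and the train/test split parameters are assumptions for illustration, not from the slides.

# Self-contained sketch: Gaussian Naive Bayes on the Iris dataset.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)                 # 150 samples, 4 continuous features
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

classifier = GaussianNB()                         # Gaussian event model for continuous features
classifier.fit(X_train, y_train)                  # fit to the training set
y_pred = classifier.predict(X_test)               # predict the test set results
print("Gaussian Naive Bayes model accuracy (in %):", accuracy_score(y_test, y_pred) * 100)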