Naive Bayes Algorithm: A Complete Guide for Data Science Enthusiasts


In this article, we will discuss the mathematical intuition behind Naive Bayes classifiers and the role of the Naive Bayes algorithm in machine learning, and we’ll also see how to implement it in Python.

This model is easy to build and scales well to large datasets. It is a probabilistic machine learning model used for classification problems. The core of the classifier is Bayes’ theorem, combined with an assumption of independence among predictors. That means that, within a given class, the value of one feature is assumed to tell us nothing about the value of another feature.

Why is it called Naive?

It is called Naive because it assumes that the features are independent of one another even when they may not be. In real-world data, features are rarely fully independent.

Naive Bayes does seem to be a simple yet powerful algorithm. But why is it so popular?

Because a prediction only requires multiplying a few precomputed probabilities, predictions can be made very quickly. It can be used for both binary and multi-class classification problems.

Before we dive deeper into this topic, we need to understand what “conditional probability” is, what “Bayes’ theorem” is, and how conditional probability helps us in Bayes’ theorem.

This article was published as a part of the Data Science Blogathon


What is Naive Bayes Algorithm?

The Naive Bayes algorithm is a popular and simple classification algorithm used in machine learning. It works by calculating the probability of an item belonging to a certain class based on its features.

Naive Bayes Algorithm in Machine Learning

Naive Bayes is a simple but powerful machine learning method for assigning items to categories. Imagine sorting emails into spam or inbox. Naive Bayes looks at each word (like a clue) and predicts how likely the email is to be spam based on past emails. It assumes these words aren’t connected (not always true!), but it’s fast and works well, making it a popular choice for many tasks.

Conditional Probability for Naive Bayes

Conditional probability is defined as the likelihood of an event or outcome occurring, given that another event or outcome has already occurred. It is calculated by dividing the probability that both events occur by the probability of the conditioning event.

Let’s start understanding this definition with examples.

Suppose I ask you to pick a card from a deck and find the probability of getting a king given that the card is a club.

Observe carefully that here I have mentioned a condition: the card is a club.

Now, while calculating the probability, my denominator will not be 52. Instead, it will be 13, because the total number of clubs in the deck is 13.

Since there is only one king among the clubs, the probability of getting a king given that the card is a club will be 1/13 ≈ 0.077.
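The card example can be checked by counting. The snippet below is a minimal sketch that builds a standard 52-card deck (the rank and suit labels are illustrative choices, not something the article defines) and restricts the denominator to the clubs before counting kings.

```python
from fractions import Fraction
from itertools import product

# Illustrative labels for a standard 52-card deck.
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = list(product(ranks, suits))

# Conditioning on "the card is a club" shrinks the sample space to 13 cards.
clubs = [card for card in deck if card[1] == "clubs"]
kings_in_clubs = [card for card in clubs if card[0] == "K"]

# P(king | club) = (number of kings among clubs) / (number of clubs)
p_king_given_club = Fraction(len(kings_in_clubs), len(clubs))
print(p_king_given_club)  # 1/13
```

Using `Fraction` keeps the result exact, so it can be compared directly with the 1/13 in the text.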

Let’s take one more example,

Consider a random experiment of tossing 2 coins. The sample space here will be:

S = {HH, HT, TH, TT}

If a person is asked to find the probability of getting at least one tail, the answer would be 3/4 = 0.75.

Now suppose this same experiment is performed by another person, but this time we give the condition that both the coins show heads. If event A: ‘Both the coins show heads’ has happened, then the elementary outcomes HT, TH, and TT could not have happened. Hence, in this situation, the probability of getting a tail drops from 0.75 to 0. Without any condition, the probability of getting heads on both the coins is 1/4 = 0.25.
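The coin example can be reproduced by enumerating the sample space. This is a small sketch, reading “getting a tail” as “at least one tail”; the helper name `probability` and the event functions are made up for illustration.

```python
from fractions import Fraction
from itertools import product

# Sample space for tossing two coins: HH, HT, TH, TT.
sample_space = ["".join(outcome) for outcome in product("HT", repeat=2)]

def probability(event, given=None):
    """P(event | given), found by counting equally likely outcomes."""
    # Keep only the outcomes allowed by the condition (all of them if none).
    allowed = [o for o in sample_space if given is None or given(o)]
    favourable = [o for o in allowed if event(o)]
    return Fraction(len(favourable), len(allowed))

def at_least_one_tail(outcome):
    return "T" in outcome

def both_heads(outcome):
    return outcome == "HH"

print(probability(at_least_one_tail))                    # 3/4
print(probability(both_heads))                           # 1/4
print(probability(at_least_one_tail, given=both_heads))  # 0
```

The last line shows how extra information changes the answer: once we know both coins show heads, a tail is no longer possible.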

From the above examples, we observe that the probability may change if some additional information is given to us. This is exactly the case when building any machine learning model: we need to find the output given some features.

Mathematically, the conditional probability of event A given event B has already happened is given by:

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$

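Since P(A ∩ B) = P(A | B) P(B) = P(B | A) P(A), we get Bayes’ rule:

$$P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}$$

Using Bayes’ rule for classification, with y the class and X the features:

$$P(y \mid X) = \frac{P(X \mid y)\, P(y)}{P(X)}$$

Here X stands for the n features:

$$X = (x_1, x_2, \ldots, x_n)$$

With the naive assumption of independence among the features, this becomes:

$$P(y \mid x_1, \ldots, x_n) = \frac{P(x_1 \mid y)\, P(x_2 \mid y) \cdots P(x_n \mid y)\, P(y)}{P(x_1, \ldots, x_n)}$$

The denominator is the same for every class, so:

$$P(y \mid x_1, \ldots, x_n) \propto P(y) \prod_{i=1}^{n} P(x_i \mid y)$$

The predicted class is the one with the highest value:

$$\hat{y} = \arg\max_{y} P(y) \prod_{i=1}^{n} P(x_i \mid y)$$

[Table: example dataset]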

Assumptions of Naive Bayes

· All the variables are independent. That is, knowing that the animal is a Dog tells us nothing about whether its Size is Medium.

· All the predictors have an equal effect on the outcome. That is, the animal being a dog does not carry more weight in deciding whether we can pet it or not. All the features have equal importance.

Let’s apply the Naive Bayes classifier formula to the above dataset. Before that, however, we need to do some precomputations on our dataset.

We need to find P(xi|yj) for each xi in X and each yj in Y. All these calculations have been demonstrated below:

[Tables: computed probabilities P(xi|yj) for each feature value and class]
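The precomputation step amounts to counting. The sketch below estimates the class priors and every P(xi|yj) from a small table. The rows are hypothetical and invented purely for illustration; they are not the article’s dataset, and the column names `animal`, `size`, and `color` are assumptions.

```python
from collections import Counter, defaultdict
from fractions import Fraction

# Hypothetical rows (animal, size, color, can_pet); NOT the article's table.
rows = [
    ("Dog", "Medium", "Black", "Yes"),
    ("Dog", "Big", "White", "No"),
    ("Rat", "Small", "White", "Yes"),
    ("Cow", "Big", "White", "Yes"),
    ("Cow", "Small", "Brown", "No"),
    ("Cow", "Big", "Black", "Yes"),
    ("Rat", "Big", "Black", "No"),
    ("Dog", "Small", "Brown", "Yes"),
    ("Dog", "Medium", "Brown", "Yes"),
    ("Cow", "Medium", "White", "No"),
]
feature_names = ["animal", "size", "color"]

# Class priors P(y): fraction of rows carrying each label.
class_counts = Counter(row[-1] for row in rows)
priors = {label: Fraction(n, len(rows)) for label, n in class_counts.items()}

# value_counts[feature][label][value] = number of rows with that combination.
value_counts = {name: defaultdict(Counter) for name in feature_names}
for *features, label in rows:
    for name, value in zip(feature_names, features):
        value_counts[name][label][value] += 1

# Likelihoods P(x_i | y): each count divided by the size of its class.
likelihoods = {
    name: {
        label: {value: Fraction(count, class_counts[label])
                for value, count in counter.items()}
        for label, counter in per_class.items()
    }
    for name, per_class in value_counts.items()
}

print(priors)
print(likelihoods["animal"]["Yes"])
```

A feature value that never occurs with a class has no entry here, which corresponds to an estimated probability of zero for that combination.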

Now suppose we send in our test data, test = (Cow, Medium, Black).

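Probability of petting an animal:

$$P(\text{Yes} \mid \text{Test}) \propto P(\text{Cow} \mid \text{Yes})\, P(\text{Medium} \mid \text{Yes})\, P(\text{Black} \mid \text{Yes})\, P(\text{Yes})$$

[Figure: computed value of P(Yes | Test)]

Probability of not petting an animal:

$$P(\text{No} \mid \text{Test}) \propto P(\text{Cow} \mid \text{No})\, P(\text{Medium} \mid \text{No})\, P(\text{Black} \mid \text{No})\, P(\text{No})$$

[Figure: computed value of P(No | Test)]

Since P(Yes | Test) + P(No | Test) = 1, we normalize the results, writing P(Test | y) for the product of the three per-feature probabilities:

$$P(\text{Yes} \mid \text{Test}) = \frac{P(\text{Test} \mid \text{Yes})\, P(\text{Yes})}{P(\text{Test} \mid \text{Yes})\, P(\text{Yes}) + P(\text{Test} \mid \text{No})\, P(\text{No})}$$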

We see here that P(Yes|Test) > P(No|Test), so the prediction that we can pet this animal is “Yes”.
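The prediction step can be sketched in a few lines: multiply the prior by the per-feature likelihoods for each class, normalize, and pick the larger posterior. The priors and likelihoods below are hypothetical numbers chosen for illustration, not the values from the article’s figures.

```python
from fractions import Fraction

# Hypothetical probabilities, assumed already estimated from training data.
priors = {"Yes": Fraction(3, 5), "No": Fraction(2, 5)}
likelihoods = {
    "Yes": {"Cow": Fraction(1, 3), "Medium": Fraction(1, 3), "Black": Fraction(1, 3)},
    "No": {"Cow": Fraction(1, 2), "Medium": Fraction(1, 4), "Black": Fraction(1, 4)},
}

test = ("Cow", "Medium", "Black")

# Unnormalized score per class: P(y) * product of P(x_i | y).
scores = {}
for label, prior in priors.items():
    score = prior
    for value in test:
        score *= likelihoods[label][value]
    scores[label] = score

# Normalize so that the posteriors sum to 1.
total = sum(scores.values())
posteriors = {label: score / total for label, score in scores.items()}

# Predict the class with the highest posterior probability.
prediction = max(posteriors, key=posteriors.get)
print(posteriors)
print(prediction)  # Yes
```

With these assumed numbers the posteriors come out to 16/25 for “Yes” and 9/25 for “No”. Normalizing does not change which class wins; it only turns the scores into probabilities.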

Gaussian Naive Bayes

So far, we have discussed how to predict probabilities if the predictors take up discrete values. But what if they are continuous? For this, we need to make some more assumptions regarding the distribution of each feature. The different naive Bayes classifiers differ mainly by the assumptions they make regarding the distribution of P(xi | y). Here we’ll discuss Gaussian Naïve Bayes.

Gaussian Naïve Bayes is used when we assume that the continuous values of each feature are distributed according to a Gaussian distribution within each class. The Gaussian distribution is also called the normal distribution.

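Since the features now take continuous values, the conditional probability P(xi | y) is computed from a probability density function (PDF) rather than from counts. Using the PDF of a normal distribution, it is given by:

$$P(x_i \mid y) = \frac{1}{\sqrt{2\pi\sigma_y^2}} \exp\left(-\frac{(x_i - \mu_y)^2}{2\sigma_y^2}\right)$$

where μ_y and σ_y² are the mean and variance of feature x_i among the training samples of class y.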

We can use this formula to compute the likelihoods when our data is continuous.
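As a rough illustration of the Gaussian case, the sketch below estimates a mean and standard deviation per class for a single continuous feature and uses the normal PDF as the likelihood. The training values, class labels “A” and “B”, and the equal priors are all made up for the example.

```python
import math
from statistics import mean, stdev

def gaussian_pdf(x, mu, sigma):
    """Density at x of a normal distribution with mean mu and std dev sigma."""
    coefficient = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coefficient * math.exp(exponent)

# Hypothetical training values of one continuous feature, grouped by class.
training = {
    "A": [150.0, 160.0, 170.0],
    "B": [175.0, 180.0, 185.0],
}

# Per-class mean and sample standard deviation of the feature.
params = {label: (mean(values), stdev(values)) for label, values in training.items()}

# Equal priors are assumed purely for illustration.
priors = {"A": 0.5, "B": 0.5}

# Score a new value: prior times Gaussian likelihood, then take the best class.
x_new = 165.0
scores = {
    label: priors[label] * gaussian_pdf(x_new, mu, sigma)
    for label, (mu, sigma) in params.items()
}
prediction = max(scores, key=scores.get)
print(prediction)  # A
```

With several continuous features, the same per-feature likelihoods would be multiplied together for each class, exactly as in the discrete case.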

Endnotes

Naive Bayes algorithms are mostly used in face recognition, weather prediction, medical diagnosis, news classification, sentiment analysis, etc. In this article, we learned the mathematical intuition behind the Naive Bayes algorithm in machine learning. You have already taken your first step to master this algorithm, and from here all you need is practice.

Frequently Asked Questions

Q1. Why is Naive Bayes algorithm used?

A. The Naive Bayes algorithm is used due to its simplicity, efficiency, and effectiveness in certain types of classification tasks. It’s particularly suitable for text classification, spam filtering, and sentiment analysis. It assumes independence between features, making it computationally efficient with minimal data. Despite its “naive” assumption, it often performs well in practice, making it a popular choice for various applications.

Q2. What is the Naive Bayes algorithm?

A. The Naive Bayes algorithm is a probabilistic classification technique based on Bayes’ theorem. It assumes that all features in the data are independent of each other, given the class label. It calculates the probability of a particular class for a given set of features and selects the class with the highest probability as the predicted class. It’s commonly used in text classification and spam filtering tasks.

Q3. What is Naive Bayes Classifier?

A. Naive Bayes is a simple yet powerful machine learning algorithm for classification. It uses Bayes’ theorem to predict the class of something (like spam or not spam) based on its features (like words in an email). It’s popular for its speed and accuracy, especially in text classification tasks.

Q4. What is naive Bayes classification algorithm in R?

A. In R, Naive Bayes classification is commonly used for tasks like spam filtering. It predicts the probability of an instance belonging to a class based on Bayes’ theorem.
Here’s the gist of implementing it:
· Load libraries (mlbench, caret, e1071).
· Prepare data (load, explore, pre-process).
· Split data into training and testing sets.
· Build the model using the naiveBayes() function from e1071.
· Make predictions using predict() on the model and testing data.
· Evaluate the model’s performance (accuracy, precision, etc.).