Naive Bayes
Last updated
Naive Bayes is an algorithm for finding a decision surface. It works well for text learning. See the example below: who wrote the email that contains the words Love and Life? It is Sara. If the words were Life and Deal, it would be Chris. Why is it called naive? Because it treats each word independently, classifying based only on probabilities (e.g. word frequencies) and ignoring word order.
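The original email data is not shown here, but the idea can be sketched with scikit-learn's `MultinomialNB` on word counts. The training emails below are hypothetical, invented so that Sara's emails favor "love" and Chris's favor "deal":

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical training emails (not the original dataset):
# Sara tends to write "love", Chris tends to write "deal".
emails = ["love life love", "life love", "life deal deal", "deal life"]
authors = ["Sara", "Sara", "Chris", "Chris"]

vectorizer = CountVectorizer()
counts = vectorizer.fit_transform(emails)  # word-frequency features

clf = MultinomialNB()
clf.fit(counts, authors)

print(clf.predict(vectorizer.transform(["love life"])))  # ['Sara']
print(clf.predict(vectorizer.transform(["life deal"])))  # ['Chris']
```

Note that the classifier only counts how often each word appears per author; it never looks at word order, which is exactly the "naive" independence assumption.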
Here is another example with data that is easier to understand. We have drawn a line dividing two sets of data points (the decision boundary a Gaussian Naive Bayes classifier would produce). When we add a new point to the chart and predict which category it falls into, the prediction is determined by which side of the line the point lands on; here it lands in the region below the line.
Have a look at this code (training data and labels), then check the chart and the result of the prediction.
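The original code block was lost from this page; a minimal sketch with scikit-learn's `GaussianNB` looks like this (the six training points are assumed, chosen as three per category on opposite sides of the boundary):

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Assumed training data: category 1 in the lower left, category 2 in the upper right.
features = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
labels = np.array([1, 1, 1, 2, 2, 2])

clf = GaussianNB()
clf.fit(features, labels)

# Predict the category of a new point.
print(clf.predict([[-0.8, -1]]))  # [1]
```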
The output is the following. We can see that the point with coordinates [-0.8, -1] falls into category 1.
We should verify the accuracy of our classifier. For the first example, we create features and labels, then create test data, then compute the accuracy. For the following example, the output is 1.0 (100% accuracy). That is because we tested on the same data points we used for training.
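A sketch of this accuracy check, reusing the same assumed training points from before and scoring the classifier on that identical data:

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.naive_bayes import GaussianNB

# Same assumed training data as before.
features = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
labels = np.array([1, 1, 1, 2, 2, 2])

clf = GaussianNB()
clf.fit(features, labels)

# Evaluate on the training points themselves -- every one is classified correctly.
pred = clf.predict(features)
print(accuracy_score(labels, pred))  # 1.0
```

Testing on the training set like this is misleading: it tells us the model memorized the data, not how well it generalizes.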
We should challenge the algorithm by giving it features and labels that were not used for training, like here.
For this one, the accuracy is 0.5: we labeled the first test point as category 1, but it actually falls into category 2.
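The original held-out data is not shown; the test set below is hypothetical, constructed so that it reproduces the described result (two points labeled category 1, of which the first actually lies on category 2's side of the boundary):

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.naive_bayes import GaussianNB

# Same assumed training data as before.
features_train = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
labels_train = np.array([1, 1, 1, 2, 2, 2])

clf = GaussianNB()
clf.fit(features_train, labels_train)

# Hypothetical held-out points: both labeled category 1,
# but the first one sits in category 2's region.
features_test = np.array([[2.5, 1.5], [-1.5, -1]])
labels_test = np.array([1, 1])

pred = clf.predict(features_test)
print(accuracy_score(labels_test, pred))  # 0.5
```

Only one of the two labels matches the prediction, so the accuracy is 0.5.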