We will start with supervised classification; self-driving cars are a well-known result of it. In machine learning, computers learn much the way humans do: you give them a lot of examples and they start figuring out the underlying patterns.
When looking at new examples, you need to figure out which parts to pay attention to. We call these parts features.
In machine learning you are given many examples, and each example has many features (attributes) that describe it. You need to pick the right features; if those features reliably give the right answers, you can use them to classify new examples.
In machine learning we take features and try to produce labels.
With music, we don't take the raw audio; instead we take attributes of it that we call features, e.g. intensity, tempo, genre, etc.
Using these features we produce one of two labels: like or don't like.
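As a rough illustration (the tempo and intensity numbers below are made up for the sketch, not taken from the lecture), each song becomes a row of feature values paired with a label:

```python
# Hypothetical songs described by two features: tempo (BPM) and intensity (0-1)
features = [
    [120, 0.8],   # fast, intense
    [70,  0.3],   # slow, mellow
    [128, 0.9],
    [65,  0.2],
]
labels = ["like", "don't like", "like", "don't like"]
```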
Given a set of examples, a machine learning algorithm builds a decision surface. This decision surface lets us classify new points based on the previous examples.
What we want to do is take all the training data and learn a decision boundary, as in the example below.
In [1]:
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Six labelled training points in two dimensions; the labels are 1 or 2
X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
Y = np.array([1, 1, 1, 2, 2, 2])

clf = GaussianNB()   # create the classifier
clf.fit(X, Y)        # fit (train) it on the labelled examples
print(clf.predict([[-0.8, -1], [4, 1]]))   # classify two new, unseen points
The terms fit and train are used interchangeably.
In [2]:
prediction = clf.predict([[-0.8, -1]])

from sklearn.metrics import accuracy_score
# The true label for [-0.8, -1] is 1, so compare the prediction against it
print(accuracy_score([1], prediction))
In machine learning you need to train and test your algorithm on different data. If you use the same data for both, you can get 100% accuracy, but the model may still fail on new data.
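A minimal sketch of this idea using scikit-learn's train_test_split on the toy data above (the split proportion and random_state are arbitrary choices for the sketch):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
Y = np.array([1, 1, 1, 2, 2, 2])

# Hold out a third of the data; the model never sees it during training
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.33, random_state=42)

clf = GaussianNB()
clf.fit(X_train, y_train)            # train only on the training split
pred = clf.predict(X_test)           # evaluate on the held-out test split
print(accuracy_score(y_test, pred))
```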
Let's use a cancer example. Say we have a specific cancer C that occurs in 1% of the population:
P(C) = 0.01
Test: the test is positive for 90% of people who have the cancer, but also for 10% of people who do not (a 10% false-positive rate).
Question: if someone's test comes back positive, what is the probability that they actually have the cancer?
In [7]:
P_C = 0.01                                  # prior: P(C)
positive_when_cancer = P_C * 0.9            # P(Pos | C) * P(C): true positives
positive_when_no_cancer = 0.1 * (1 - P_C)   # P(Pos | not C) * P(not C): false positives
cancer_probability_given_test_positive = positive_when_cancer / (positive_when_cancer + positive_when_no_cancer)
cancer_probability_given_test_positive
Out[7]:
In essence, we have some prior probability, and then we get some evidence from the test. Based on this new information we arrive at a posterior probability.
So Bayes' rule incorporates the evidence from the test into the prior probability to arrive at the posterior probability.
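Written out in the same notation as above, this is just Bayes' rule applied to the numbers from the code:

P(C | Pos) = P(Pos | C) · P(C) / [ P(Pos | C) · P(C) + P(Pos | ¬C) · P(¬C) ]
           = (0.9 · 0.01) / (0.9 · 0.01 + 0.1 · 0.99)
           ≈ 0.083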
## Why is Naive Bayes Naive?
Because it assumes the features are independent of one another given the class; for text, this means it ignores word order and only looks at the frequency of each word.
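A quick way to see this (a sketch using scikit-learn's CountVectorizer, not part of the notes above): two sentences containing the same words in a different order produce identical bag-of-words feature vectors, so a Naive Bayes classifier cannot tell them apart.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Two "documents" with the same words in a different order
docs = ["machine learning is great fun", "fun is great machine learning"]

vectorizer = CountVectorizer()
counts = vectorizer.fit_transform(docs).toarray()

print(vectorizer.get_feature_names_out())  # the vocabulary (one column per word)
print(counts)                              # both rows are identical: word order is lost
```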
Pros
Cons