Homework #3.2: One vs. Rest
This homework is about multinomial classification using the one-vs-rest (OVR) technique. Do the following:
- Get the iris dataset from sklearn using sklearn.datasets.load_iris().
- There are 3 classes (labels): 0, 1 and 2. Pick class 0 and build a new $ y $ vector in which every 0 becomes 1 and all other labels (1s and 2s) become 0.
- Create a LogisticRegression model and fit it on $ X $ and the new $ y $. Get predictions (the probabilities) from the model with the .predict_proba() method.
- Repeat steps 2 and 3 for class 1 and again for class 2, each time treating the chosen class as 1 and the other two as 0.
- Now you have 3 probabilities for each data item: one for class 0, one for class 1 and one for class 2. The predicted class for each data item is the one with the highest of its 3 probabilities. For example, if data item number 5 has probability 0.1 for class 0, 0.7 for class 1 and 0.4 for class 2, its predicted class is 1.
- Draw a 2D plot using matplotlib to show that the OVR classification works well.
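
The steps above could be sketched roughly as follows. This is one possible outline rather than a reference solution: the helper names fit_ovr_probabilities and plot_ovr, the max_iter=1000 setting, and the choice of petal length and petal width as the two plot axes are all assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression


def fit_ovr_probabilities(X, y, classes):
    """Fit one binary LogisticRegression per class (hypothetical helper).

    Returns an array of shape (n_samples, n_classes) whose column k holds
    the probability that each sample belongs to classes[k].
    """
    columns = []
    for c in classes:
        # Step 2: the chosen class becomes 1, every other class becomes 0.
        y_binary = (y == c).astype(int)
        # Step 3: fit a binary model. max_iter=1000 is an assumed setting,
        # chosen only to avoid convergence warnings.
        model = LogisticRegression(max_iter=1000)
        model.fit(X, y_binary)
        # predict_proba orders its columns like model.classes_ ([0, 1]),
        # so column 1 is the probability of the label 1.
        columns.append(model.predict_proba(X)[:, 1])
    return np.column_stack(columns)


def plot_ovr(X, y, y_pred, feature_names):
    """Scatter two features colored by predicted class (hypothetical helper)."""
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    # Assumed axes: feature 2 (petal length) and feature 3 (petal width).
    ax.scatter(X[:, 2], X[:, 3], c=y_pred, cmap="viridis", s=30)
    # Circle the points whose predicted class differs from the real label.
    wrong = y_pred != y
    ax.scatter(X[wrong, 2], X[wrong, 3], facecolors="none",
               edgecolors="red", s=120, label="misclassified")
    ax.set_xlabel(feature_names[2])
    ax.set_ylabel(feature_names[3])
    ax.set_title("OVR predictions on iris")
    ax.legend()
    plt.show()


# Step 1: load the data.
iris = load_iris()
X, y = iris.data, iris.target
classes = np.unique(y)

# Steps 2 to 4: one probability column per class.
probabilities = fit_ovr_probabilities(X, y, classes)

# Step 5: the predicted class is the one with the highest probability.
y_pred = classes[np.argmax(probabilities, axis=1)]
accuracy = np.mean(y_pred == y)
print(f"Training accuracy: {accuracy:.3f}")

# Step 6: uncomment to draw the plot.
# plot_ovr(X, y, y_pred, iris.feature_names)
```

Because each of the 3 models is trained independently, the 3 probabilities for one data item do not need to sum to 1, which is why only their maximum is used. Note that the accuracy here is measured on the same data the models were fitted on.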
For more information:
scikit-learn example about OVR (Note: this example uses LogisticRegression only once for OVR, but you should use it 3 times.)
OVR - Wikipedia
Coursera Machine Learning Course - Andrew Ng
Good luck!