Homework #3.2: One vs. Rest

This homework is about multiclass classification using the one-vs-rest (OVR) technique. Do the following:

  1. Load the Iris dataset from scikit-learn using sklearn.datasets.load_iris().
  2. There are 3 classes (labels): 0, 1 and 2. Pick class 0 and convert all of the zeros in the $ y $ vector to one. Convert the remaining labels (1s and 2s) to zero.
  3. Create a LogisticRegression model and fit it on $ X $ and the new $ y $. Get the predicted probabilities of the positive class (label 1) from the model with the .predict_proba() method.
  4. Repeat steps 2 and 3 for class 1 and then for class 2, each time starting from the original $ y $.
  5. Now you have 3 probabilities for each data item: one for class 0, one for class 1 and one for class 2. The predicted class for each data item is the one with the highest of its 3 probabilities. For example, if data item number 5 has probability 0.1 for class 0, 0.7 for class 1 and 0.4 for class 2, its predicted class is 1. (The three probabilities come from separate binary models, so they do not have to sum to 1.)
  6. Draw a 2D plot using matplotlib to show that the OVR classification works well.
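
Steps 1 to 5 above can be sketched roughly as follows. This is only one possible outline, not a reference solution: the variable names (probs, y_binary, y_pred) and the max_iter=1000 setting are assumptions made for illustration, and the model is evaluated on the same data it was trained on, so the reported accuracy is a training accuracy.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Step 1: load the data (X has 150 rows and 4 features, y holds labels 0, 1, 2).
X, y = load_iris(return_X_y=True)
classes = np.unique(y)

# One column of "this class vs. the rest" probabilities per class.
probs = np.zeros((X.shape[0], classes.size))

for k, cls in enumerate(classes):
    # Step 2: the chosen class becomes 1, every other class becomes 0.
    y_binary = (y == cls).astype(int)

    # Step 3: fit a separate binary model (max_iter is an assumed setting
    # chosen to avoid convergence warnings).
    model = LogisticRegression(max_iter=1000)
    model.fit(X, y_binary)

    # Column 1 of predict_proba is the probability of the positive label 1.
    probs[:, k] = model.predict_proba(X)[:, 1]

# Step 5: the predicted class is the one with the highest probability.
y_pred = classes[np.argmax(probs, axis=1)]

accuracy = np.mean(y_pred == y)
print(f"Training accuracy: {accuracy:.3f}")
```

For step 6, y_pred could then be used to color a scatter plot of any two of the four features, for instance next to the same plot colored by the original y.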

For more information:
scikit-learn example about OVR (Note: this example uses LogisticRegression once for OVR, but you should use it 3 times, once per class)

OVR - Wikipedia

Coursera Machine Learning Course - Andrew Ng

Good luck!