Exercise: "Human learning" with iris data

Question: Can you predict the species of an iris using petal and sepal measurements?

  1. Read the iris data into a pandas DataFrame, including column names.
  2. Gather some basic information about the data.
  3. Use sorting, split-apply-combine, and/or visualization to look for differences between species.
  4. Write down a set of rules that could be used to predict species based on iris measurements.

BONUS: Define a function that accepts a row of data and returns a predicted species. Then, use that function to make predictions for all existing rows of data, and check the accuracy of your predictions.


In [1]:
import pandas as pd
import matplotlib.pyplot as plt

# display plots in the notebook
%matplotlib inline

# increase default figure and font sizes for easier viewing
plt.rcParams['figure.figsize'] = (8, 6)
plt.rcParams['font.size'] = 14

Task 1

Read the iris data into a pandas DataFrame, including column names.


In [ ]:
# define a list of column names (as strings)
col_names = 

# define the URL from which to retrieve the data (as a string)
url = 'http://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'

# retrieve the CSV file and add the column names
iris =

In [ ]:
# observe first five rows of data
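
One possible way to approach Task 1 is sketched here. The column names are an assumption (the file at the URL has no header row, so any descriptive names will do), and `read_iris` is a hypothetical helper, not part of the exercise. To keep the sketch runnable offline, it reads six rows copied from the dataset out of an in-memory buffer; passing the `url` string defined above to the same function should work the same way.

```python
import io

import pandas as pd

# Assumed column names: the source file has no header row.
col_names = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species']


def read_iris(source, names=col_names):
    """Read headerless iris CSV data from a URL, path, or file-like object."""
    return pd.read_csv(source, header=None, names=names)


# Six rows copied from the dataset (two per species), used as an offline stand-in.
sample = io.StringIO(
    "5.1,3.5,1.4,0.2,Iris-setosa\n"
    "4.9,3.0,1.4,0.2,Iris-setosa\n"
    "7.0,3.2,4.7,1.4,Iris-versicolor\n"
    "6.4,3.2,4.5,1.5,Iris-versicolor\n"
    "6.3,3.3,6.0,2.5,Iris-virginica\n"
    "5.8,2.7,5.1,1.9,Iris-virginica\n"
)
iris_sample = read_iris(sample)

# observe the first five rows
print(iris_sample.head())
```

Without `header=None`, pandas would treat the first data row as the header, so that row would be lost from the data.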

Task 2

Gather some basic information about the data.


In [ ]:
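
A minimal sketch of the kind of summary Task 2 asks for, using a six-row stand-in for the full DataFrame so it runs without a download. The column names are assumed, as before.

```python
import pandas as pd

# Six rows copied from the iris data (two per species); column names are assumed.
col_names = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species']
rows = [
    (5.1, 3.5, 1.4, 0.2, 'Iris-setosa'),
    (4.9, 3.0, 1.4, 0.2, 'Iris-setosa'),
    (7.0, 3.2, 4.7, 1.4, 'Iris-versicolor'),
    (6.4, 3.2, 4.5, 1.5, 'Iris-versicolor'),
    (6.3, 3.3, 6.0, 2.5, 'Iris-virginica'),
    (5.8, 2.7, 5.1, 1.9, 'Iris-virginica'),
]
iris = pd.DataFrame(rows, columns=col_names)

print(iris.shape)       # (number of rows, number of columns)
print(iris.dtypes)      # data type of each column
print(iris.describe())  # summary statistics for the numeric columns

# number of observations per species
species_counts = iris['species'].value_counts()
print(species_counts)

# number of missing values in each column
missing_per_column = iris.isnull().sum()
print(missing_per_column)
```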

Task 3

Use sorting, split-apply-combine, and/or visualization to look for differences between species.


In [ ]:

sorting


In [ ]:
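
As a sketch of the sorting approach: ordering the rows by one measurement and reading off the species column shows whether that measurement separates the species. The six-row sample and the column names are assumptions standing in for the full data.

```python
import pandas as pd

# Six rows copied from the iris data (two per species); column names are assumed.
col_names = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species']
rows = [
    (5.1, 3.5, 1.4, 0.2, 'Iris-setosa'),
    (4.9, 3.0, 1.4, 0.2, 'Iris-setosa'),
    (7.0, 3.2, 4.7, 1.4, 'Iris-versicolor'),
    (6.4, 3.2, 4.5, 1.5, 'Iris-versicolor'),
    (6.3, 3.3, 6.0, 2.5, 'Iris-virginica'),
    (5.8, 2.7, 5.1, 1.9, 'Iris-virginica'),
]
iris = pd.DataFrame(rows, columns=col_names)

# sort by petal length and look at how the species line up
by_petal = iris.sort_values('petal_length')
print(by_petal[['petal_length', 'species']])
```

If the species appear in contiguous runs after sorting, the sorted column is a good candidate for a rule.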

split-apply-combine


In [ ]:
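
Split-apply-combine can be sketched with `groupby`: split the rows by species, apply an aggregate to each group, and combine the results into one table. The sample rows and column names are the same assumed stand-ins for the full data.

```python
import pandas as pd

# Six rows copied from the iris data (two per species); column names are assumed.
col_names = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species']
rows = [
    (5.1, 3.5, 1.4, 0.2, 'Iris-setosa'),
    (4.9, 3.0, 1.4, 0.2, 'Iris-setosa'),
    (7.0, 3.2, 4.7, 1.4, 'Iris-versicolor'),
    (6.4, 3.2, 4.5, 1.5, 'Iris-versicolor'),
    (6.3, 3.3, 6.0, 2.5, 'Iris-virginica'),
    (5.8, 2.7, 5.1, 1.9, 'Iris-virginica'),
]
iris = pd.DataFrame(rows, columns=col_names)

# the four numeric measurement columns
feature_cols = col_names[:-1]

# mean of every measurement, by species
species_means = iris.groupby('species')[feature_cols].mean()
print(species_means)

# min and max of a single measurement, by species
species_ranges = iris.groupby('species')['petal_length'].agg(['min', 'max'])
print(species_ranges)
```

Non-overlapping min/max ranges between two species suggest a threshold that separates them.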

visualization


In [ ]:
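
One way to visualize the differences is a scatter plot of two measurements with one color per species. This is only a sketch: the choice of petal length against petal width, the sample rows, and the column names are all assumptions.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Six rows copied from the iris data (two per species); column names are assumed.
col_names = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species']
rows = [
    (5.1, 3.5, 1.4, 0.2, 'Iris-setosa'),
    (4.9, 3.0, 1.4, 0.2, 'Iris-setosa'),
    (7.0, 3.2, 4.7, 1.4, 'Iris-versicolor'),
    (6.4, 3.2, 4.5, 1.5, 'Iris-versicolor'),
    (6.3, 3.3, 6.0, 2.5, 'Iris-virginica'),
    (5.8, 2.7, 5.1, 1.9, 'Iris-virginica'),
]
iris = pd.DataFrame(rows, columns=col_names)

# draw one scatter series per species so each gets its own color and legend entry
fig, ax = plt.subplots()
for species, group in iris.groupby('species'):
    ax.scatter(group['petal_length'], group['petal_width'], label=species)
ax.set_xlabel('petal_length')
ax.set_ylabel('petal_width')
ax.legend()
```

With `%matplotlib inline` active, as in the setup cell, the figure is displayed when the cell finishes running.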

Task 4

Write down a set of rules that could be used to predict species based on iris measurements.


In [ ]:

Bonus

Define a function that accepts a row of data and returns a predicted species. Then, use that function to make predictions for all existing rows of data, and check the accuracy of your predictions.


In [ ]:
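
A minimal sketch of the bonus, assuming two illustrative thresholds (petal length below 2.5 for setosa, petal width below 1.8 for versicolor). These cutoffs are assumptions, not the exercise's answer; the rules you derive from your own exploration may differ. `classify_iris` is a hypothetical name, and the six sample rows stand in for the full data.

```python
import pandas as pd

# Six rows copied from the iris data (two per species); column names are assumed.
col_names = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species']
rows = [
    (5.1, 3.5, 1.4, 0.2, 'Iris-setosa'),
    (4.9, 3.0, 1.4, 0.2, 'Iris-setosa'),
    (7.0, 3.2, 4.7, 1.4, 'Iris-versicolor'),
    (6.4, 3.2, 4.5, 1.5, 'Iris-versicolor'),
    (6.3, 3.3, 6.0, 2.5, 'Iris-virginica'),
    (5.8, 2.7, 5.1, 1.9, 'Iris-virginica'),
]
iris = pd.DataFrame(rows, columns=col_names)


def classify_iris(row):
    """Predict a species from one row using two assumed threshold rules."""
    if row['petal_length'] < 2.5:
        return 'Iris-setosa'
    elif row['petal_width'] < 1.8:
        return 'Iris-versicolor'
    else:
        return 'Iris-virginica'


# apply the function to every row (axis=1 passes each row as a Series)
iris['prediction'] = iris.apply(classify_iris, axis=1)

# accuracy is the fraction of rows where the prediction matches the true species
accuracy = (iris['prediction'] == iris['species']).mean()
print(accuracy)
```

Because the rules are checked against the same rows used to devise them, the resulting accuracy is likely an optimistic estimate of how well they would work on new flowers.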