This is a classic dataset that contains four measurements for each of 150 iris flowers. Three species are represented, with 50 flowers from each. There are no missing values. The following fields are present:
The following code shows 10 sample rows.
In [1]:
import os
import numpy as np
import pandas as pd

path = "./data/"
filename = os.path.join(path, "iris.csv")
# Treat 'NA' and '?' as missing values when parsing the file
df = pd.read_csv(filename, na_values=['NA', '?'])

# Shuffle rows
np.random.seed(42)  # Fixed seed gives the same shuffle each time; comment out to vary it
df = df.reindex(np.random.permutation(df.index))
df.reset_index(inplace=True, drop=True)
df[0:10]
Out[1]:
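The shuffle and the claims about the data (equal species counts, no missing values) can be checked without the CSV file. The following is a minimal sketch on a small hand-made frame; the column names `sepal_l` and `species` and all the values are assumptions for illustration, not the contents of iris.csv.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for iris.csv: column names and values are assumed.
toy = pd.DataFrame({
    "sepal_l": [5.1, 4.9, 7.0, 6.4, 6.3, 5.8],
    "species": ["setosa", "setosa", "versicolor",
                "versicolor", "virginica", "virginica"],
})

# Same shuffle pattern as above: permute the index, reorder, renumber.
np.random.seed(42)
shuffled = toy.reindex(np.random.permutation(toy.index))
shuffled = shuffled.reset_index(drop=True)

# Rows per species, and total number of missing cells.
species_counts = shuffled["species"].value_counts()
missing_total = int(shuffled.isna().sum().sum())

print(shuffled)
print(species_counts)
print("missing values:", missing_total)
```

Run against the real `df`, the same two checks should report 50 rows per species and zero missing values if the description of the dataset holds.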