This is a classic dataset that contains four measurements for each of 150 iris flowers. Three species are represented, with 50 flowers from each. There are no missing values. The following fields are present: sepal length, sepal width, petal length, petal width, and species.
The following code loads the data, shuffles the rows with a fixed seed, and shows the first 10 rows as a sample.
In [3]:
import os
import pandas as pd
import numpy as np
path = "./data/"
filename = os.path.join(path,"iris.csv")  # assumed file name for the iris data
df = pd.read_csv(filename,na_values=['NA','?'])
# Shuffle
np.random.seed(42)
df = df.reindex(np.random.permutation(df.index))
df.reset_index(inplace=True, drop=True)
df[0:10]
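The description above makes three claims that are easy to check once the data is loaded: the row count, the balance across species, and the absence of missing values. The sketch below is one possible way to do that, not part of the original notebook. The helper `check_dataset`, the column names `sepal_l` and `species`, and the small `toy` frame are all hypothetical stand-ins; for the real data you would pass the loaded `df` with `expected_rows=150` and `expected_per_class=50`, using whatever the label column is actually called.

```python
import pandas as pd

def check_dataset(df, label_col, expected_rows, expected_per_class):
    """Return simple sanity checks for a labeled dataset (hypothetical helper)."""
    counts = df[label_col].value_counts()
    return {
        "row_count_ok": len(df) == expected_rows,
        "classes": sorted(counts.index),
        "balanced": bool((counts == expected_per_class).all()),
        "missing_values": int(df.isnull().sum().sum()),
    }

# Tiny made-up frame standing in for the real file; column names are assumptions.
toy = pd.DataFrame({
    "sepal_l": [5.1, 7.0, 6.3, 4.9, 6.4, 5.8],
    "species": ["setosa", "versicolor", "virginica",
                "setosa", "versicolor", "virginica"],
})

report = check_dataset(toy, "species", expected_rows=6, expected_per_class=2)
print(report)
```

Running the same check on the shuffled frame should give the same report, since shuffling changes only the row order.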