Import NumPy, pandas, and matplotlib (scikit-learn is imported further down, where it is first used). Also set visualizations to be shown inline in the notebook.
In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
Set NumPy's random seed to 101.
In [2]:
np.random.seed(101)
Create a NumPy array of 100 rows by 5 columns consisting of random integers from 1 to 100. (Keep in mind that the upper limit of np.random.randint is exclusive.)
In [3]:
random_integers = np.random.randint(low=1,
                                    high=101,
                                    size=(100, 5))
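The exclusive upper limit is easy to check empirically. The sketch below is not part of the exercise: it uses a separate generator (the names `rng` and `sample` are illustrative) so the notebook's global seed is left untouched, and draws enough values that both endpoints are practically certain to appear.

```python
import numpy as np

# A local generator, so the notebook's global random state is not affected.
rng = np.random.RandomState(101)

# `low` is inclusive and `high` is exclusive: values fall in [1, 100].
sample = rng.randint(low=1, high=101, size=100_000)

print(sample.min(), sample.max())
```

With `high=100` instead, the value 100 would never be drawn.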
Create a 2-D visualization using plt.imshow of the numpy matrix with a colorbar. Add a title to your plot. Bonus: Figure out how to change the aspect of the imshow() plot.
In [4]:
fig = plt.figure(figsize = (12, 12))
plt.imshow(random_integers, aspect = 0.05)
plt.colorbar()
plt.title("2D visualisation")
Out[4]:
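For the bonus, a numeric `aspect` is the displayed height of one data unit relative to its width, so the value 0.05 used above is simply columns divided by rows for a 100 x 5 array, which makes the image roughly square. A minimal sketch of that calculation follows, assuming stand-in data and the hypothetical names `data`, `aspect` and `image`:

```python
import numpy as np
import matplotlib.pyplot as plt

# Stand-in data with the same shape as the exercise array.
rng = np.random.RandomState(101)
data = rng.randint(low=1, high=101, size=(100, 5))

# Columns / rows gives the aspect that makes the image square.
n_rows, n_cols = data.shape
aspect = n_cols / n_rows  # 5 / 100 = 0.05

fig, ax = plt.subplots(figsize=(6, 6))
image = ax.imshow(data, aspect=aspect)
fig.colorbar(image, ax=ax)
ax.set_title("2D visualisation")
```

Passing `aspect='auto'` instead lets matplotlib stretch the image to fill the axes, whatever the array shape.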
Now use pd.DataFrame() to read in this NumPy array as a DataFrame. Simply pass the NumPy array into that function to get back a DataFrame. pandas will automatically label the columns 0-4.
In [5]:
df = pd.DataFrame(random_integers)
df.head()
Out[5]:
Now create a scatter plot using pandas of column 0 vs. column 1.
In [6]:
df.plot(x=0,
        y=1,
        kind='scatter',
        figsize=(12, 8))
Out[6]:
Now scale the data to have a minimum of 0 and a maximum value of 1 using scikit-learn.
In [7]:
from sklearn.preprocessing import MinMaxScaler
In [8]:
minmax = MinMaxScaler()
In [9]:
scaled_random_int = minmax.fit_transform(df)
type(scaled_random_int)
Out[9]:
In [10]:
scaled_df = pd.DataFrame(scaled_random_int)
scaled_df.head()
Out[10]:
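To see what the scaler is doing, the same result can be reproduced by hand: with default settings, MinMaxScaler maps each column x to (x - min) / (max - min). The sketch below compares the two on stand-in data; `demo` and `manual` are illustrative names, not part of the exercise.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Stand-in data with the same shape as the exercise DataFrame.
rng = np.random.RandomState(101)
demo = pd.DataFrame(rng.randint(low=1, high=101, size=(100, 5)))

# fit_transform returns a NumPy array, not a DataFrame.
scaled = MinMaxScaler().fit_transform(demo)

# Column-wise min-max scaling written out by hand.
manual = (demo - demo.min()) / (demo.max() - demo.min())

print(type(scaled))
print(np.allclose(scaled, manual.to_numpy()))
```

Because the output is a plain array, the column labels are lost, which is why the result is wrapped in pd.DataFrame() again above.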
Using your previously created DataFrame, use df.columns = [...] to rename the pandas columns to ['f1','f2','f3','f4','label']. Then perform a train/test split with scikit-learn.
In [11]:
from sklearn.model_selection import train_test_split
In [12]:
df.columns = ['f1','f2','f3','f4','label']
df.head()
Out[12]:
In [13]:
# Features: every column except 'label'
X = df.drop(columns='label')
# Target: the 'label' column
Y = df['label']
In [14]:
X.shape
Out[14]:
In [15]:
Y.shape
Out[15]:
In [16]:
X_train, X_test, Y_train, Y_test = train_test_split(X, Y,
                                                    test_size=0.3,
                                                    random_state=101)
In [17]:
X_train.shape
Out[17]:
In [18]:
X_test.shape
Out[18]:
In [19]:
Y_train.shape
Out[19]:
In [20]:
Y_test.shape
Out[20]:
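The four shape checks can be collapsed into one cell. The sketch below rebuilds a stand-in DataFrame (the name `demo` is illustrative) and confirms that test_size = 0.3 on 100 rows yields a 70/30 split with no row shared between the two sets.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Stand-in data with the same shape and column names as the exercise.
rng = np.random.RandomState(101)
demo = pd.DataFrame(rng.randint(low=1, high=101, size=(100, 5)),
                    columns=['f1', 'f2', 'f3', 'f4', 'label'])

X = demo.drop(columns='label')
Y = demo['label']

X_train, X_test, Y_train, Y_test = train_test_split(X, Y,
                                                    test_size=0.3,
                                                    random_state=101)

# Expect 70 training rows and 30 test rows, with 4 feature columns.
print(X_train.shape, X_test.shape, Y_train.shape, Y_test.shape)

# Training and test rows should not overlap.
print(set(X_train.index).isdisjoint(X_test.index))
```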