Numerical Python, or "NumPy" for short, is a foundational package on which many of the most common data science packages are built. NumPy provides high-performance multi-dimensional arrays that we can use as vectors or matrices.
The key features of NumPy are:
Additional Recommended Resources:
In this brief tutorial, I will demonstrate some of the common NumPy operations you will see during the rest of the week.
In [ ]:
from __future__ import print_function  # __future__ imports must come before any other statement
import numpy as np
A common convention is to import NumPy under the np alias, since otherwise you will find yourself typing numpy over and over. Two letters are easier on your fingers.
In [ ]:
np.arange(-1.0, 1.0, 0.1)
In [ ]:
print(np.random.randint(0, 5, size=10))
print(np.ones(10))
print(np.zeros(10))
In [ ]:
rank1_array = np.array([3, 33, 333])
print(type(rank1_array))
print(rank1_array.shape)
print(rank1_array.size)
print(rank1_array.dtype)
print(rank1_array[0], rank1_array[1], rank1_array[2])
print(rank1_array[:], rank1_array[1:], rank1_array[:2])
In [ ]:
np.ones((10,2)) # 10 rows, 2 columns
In [ ]:
np.zeros((2,10)) # 2 rows, 10 columns
In [ ]:
np.eye(10,10)*3 # diagonal of 1s but multiplied by 3
In [ ]:
rank2_array = np.array([[11,12,13],[21,22,23],[31,32,33]])
print(type(rank2_array))
print(rank2_array.shape)
print(rank2_array.size)
print(rank2_array.dtype)
print(rank2_array[0], rank2_array[1], rank2_array[2])
In [ ]:
print(rank2_array[:]) # print everything in array
In [ ]:
print(rank2_array[1:]) # slice from 2nd row and on
In [ ]:
print(rank2_array[:,0]) # all rows, 1st column only
In [ ]:
print(rank2_array[:,1]) # all rows, 2nd column only
In [ ]:
print(rank2_array[:,2]) # all rows, 3rd column only
In [ ]:
print(rank2_array[0,1]) # i=0, j=1 of the 3x3 matrix we just made
In [ ]:
np.random.randint(0, 5, (2,5,5)) # 2 x 5 x 5 [3D array!]
In [ ]:
np.random.randint(0, 5, (2,5,5)).shape
In [ ]:
np.arange(72).reshape(3,24)
In [ ]:
np.arange(72).reshape(24,3).T # transpose; this is not the same as above! beware
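Note that the transpose is just the .T attribute of an ndarray. But remember, things are not always what they seem. The two examples above have exactly the same shape, (3, 24), but they do not hold the same values in the same positions: reshape fills rows in order, while the transpose swaps the axes of an array that was filled in a different order. Be careful!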
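To see the difference on something small enough to read at a glance, here is a minimal sketch using a 6-element vector instead of 72. The names a and b are only for illustration.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)    # rows are filled in order: [[0, 1, 2], [3, 4, 5]]
b = np.arange(6).reshape(3, 2).T  # fill a 3x2 array first, then swap its axes

print(a.shape == b.shape)         # same shape...
print(np.array_equal(a, b))       # ...but the elements sit in different positions
print(b)                          # [[0 2 4], [1 3 5]]
```

Both arrays are 2 x 3, yet only the first keeps consecutive numbers next to each other within a row.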
In [ ]:
np.arange(72).reshape(3, 2, -1) # -1 means to let NumPy figure out the size of the remaining dimension
In [ ]:
np.arange(72).reshape(3, -1, 12) # -1 means to let NumPy figure out the size of the remaining dimension
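As a quick check on what -1 resolves to, the sketch below prints the resulting shapes and also shows what happens when no whole-number size fits. The variable names v and reshape_failed are only for illustration.

```python
import numpy as np

v = np.arange(72)
print(v.reshape(3, 2, -1).shape)   # 72 / (3 * 2) = 12, so the shape is (3, 2, 12)
print(v.reshape(3, -1, 12).shape)  # 72 / (3 * 12) = 2, so the shape is (3, 2, 12)

reshape_failed = False
try:
    v.reshape(5, -1)               # 72 is not divisible by 5
except ValueError as err:
    reshape_failed = True
    print("reshape failed:", err)
```

Only one dimension can be left as -1, since NumPy needs every other size to work out the missing one.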
In [ ]:
np.arange(36).reshape(6, 6)
We can even combine multiple indices with Python slicing!
In [ ]:
np.arange(36).reshape(6,6)[2:4,:3]
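To make it clearer which elements that expression picks out, here is a small sketch that names the intermediate pieces. The names grid, block, and row are only for illustration.

```python
import numpy as np

grid = np.arange(36).reshape(6, 6)

block = grid[2:4, :3]   # rows 2 and 3, columns 0 through 2
print(block)
# [[12 13 14]
#  [18 19 20]]

row = grid[2, :3]       # a plain integer index drops that axis
print(block.shape, row.shape)   # (2, 3) (3,)
```

A slice keeps the axis it is applied to, while a single integer index removes it, which is why row comes back one-dimensional.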
In [ ]:
unfiltered_arr = np.arange(72).reshape(3, -1, 12)
unfiltered_arr
In [ ]:
condition = unfiltered_arr % 3 == 0 # divisible by 3
condition # this is a boolean mask with the same shape as the array!
In [ ]:
unfiltered_arr[condition] # boolean indexing returns a flattened copy of the matching values, not a view
In [ ]:
unfiltered_arr[condition] = 0 # only change the values matching the condition
unfiltered_arr
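It is worth being careful about when a mask gives you new data and when it changes the original. The following is a minimal sketch on a smaller array. The names arr, mask, picked, and window are only for illustration, and np.shares_memory is used here just to check whether two arrays overlap in memory.

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)
mask = arr % 3 == 0

picked = arr[mask]      # reading through a mask builds a new 1-D array
picked[:] = -1          # so changing it leaves arr untouched
print(np.shares_memory(arr, picked))   # False
print(arr[0, 0])                       # still 0

window = arr[:2, :2]    # basic slicing, by contrast, returns a view
print(np.shares_memory(arr, window))   # True

arr[mask] = -1          # assigning through the mask writes into arr itself
print(arr)
```

So reading with a mask copies, while assigning with a mask on the left-hand side modifies the original array in place.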
In [ ]:
unfiltered_arr.reshape(-1) # flatten it back!