NumPy

http://numpy.org

NumPy provides high-performance vector, matrix, and higher-dimensional array data structures for Python. It supplies the core array type used throughout the scientific Python stack.

For a comprehensive tutorial, refer to this material:

http://wiki.scipy.org/Tentative_NumPy_Tutorial



In [3]:
import numpy as np

Q. What is an array in NumPy? A compact representation of like-kind data: every element has the same type.


In [4]:
array_data = [1, 2, 3, 4]

In [5]:
array = np.array(array_data)
array


Out[5]:
array([1, 2, 3, 4])

In [6]:
array.dtype


Out[6]:
dtype('int32')
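
The dtype shown above is whatever default integer type the platform and NumPy version choose, so it may print as int64 elsewhere. A minimal sketch of requesting a dtype explicitly and of how mixed input is promoted; the variable names are illustrative and not part of the original notebook:

```python
import numpy as np

# Request a specific element type instead of relying on the platform default.
ints = np.array([1, 2, 3, 4], dtype=np.int64)
floats = np.array([1, 2, 3, 4], dtype=np.float64)

# Mixed int/float input is promoted to a single common type.
mixed = np.array([1, 2.5, 3])

# astype returns a converted copy; the original array is unchanged.
as_float = ints.astype(np.float32)

print(ints.dtype, floats.dtype, mixed.dtype, as_float.dtype)
print(ints.itemsize, ints.nbytes)  # bytes per element, total bytes
```

Fixing the dtype up front keeps results reproducible across machines.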

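NumPy has a large number of methods and functions with direct equivalents in MATLAB, R, and IDL. These include reductions such as sum, min, mean, and max, and element-wise universal functions (ufuncs).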


In [7]:
array.sum()


Out[7]:
10

In [8]:
array.min(), array.mean(), array.max()


Out[8]:
(1, 2.5, 4)

In [9]:
np.median(array)


Out[9]:
2.5
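
The reductions above collapse the whole array to a single number. On a multi-dimensional array the same methods accept an axis argument, while a ufunc such as np.sqrt works element by element and keeps the shape. A small sketch with made-up data:

```python
import numpy as np

data = np.array([[1, 2, 3],
                 [4, 5, 6]])

total = data.sum()              # reduce over every element
col_sums = data.sum(axis=0)     # collapse the rows: one value per column
row_means = data.mean(axis=1)   # collapse the columns: one value per row

# A ufunc is applied element by element and preserves the shape.
roots = np.sqrt(data)

print(total, col_sums, row_means)
print(roots.shape)
```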

Q. How do you index data quickly with NumPy? Fancy indexing, here with a boolean mask.


In [10]:
matrix = np.random.randint(0, 100, (5, 5))
matrix


Out[10]:
array([[68, 30, 68, 51, 96],
       [92, 99, 79, 55, 92],
       [91, 55,  2, 22, 60],
       [24, 62,  5, 73, 44],
       [87, 23, 82, 16, 35]])

In [11]:
matrix.shape


Out[11]:
(5, 5)

In [12]:
matrix > 50


Out[12]:
array([[ True, False,  True,  True,  True],
       [ True,  True,  True,  True,  True],
       [ True,  True, False, False,  True],
       [False,  True, False,  True, False],
       [ True, False,  True, False, False]], dtype=bool)

Fancy indexing can select data in complex ways, for example with a boolean expression as the index. Unlike basic slicing, which returns a view, fancy indexing returns a copy of the selected data.



In [13]:
matrix[matrix>50]


Out[13]:
array([68, 68, 51, 96, 92, 99, 79, 55, 92, 91, 55, 60, 62, 73, 87, 82])
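
Boolean masks can also be combined and used as assignment targets. The sketch below uses a small hand-written array rather than the random matrix above, so the names and values are illustrative only:

```python
import numpy as np

values = np.array([[68, 30, 68],
                   [92, 99, 79],
                   [91, 55,  2]])

# Combine conditions with & (and), | (or), ~ (not); the parentheses are required.
mid_range = (values > 50) & (values < 90)
selected = values[mid_range]

# Assigning through a mask modifies the target array in place.
clipped = values.copy()
clipped[clipped > 90] = 90

# np.where chooses element-wise between two alternatives and keeps the shape.
flags = np.where(values > 50, 1, 0)

print(selected)
print(clipped)
print(flags)
```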

Q. What if the coordinates are important? Use np.nonzero, which returns the row and column indices of the elements that satisfy the condition.


In [14]:
rows, cols = np.nonzero(matrix>50)

In [15]:
matrix[rows, cols]


Out[15]:
array([68, 68, 51, 96, 92, 99, 79, 55, 92, 91, 55, 60, 62, 73, 87, 82])
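
When the positions are needed as pairs rather than as separate index arrays, np.argwhere gives the same information in a different layout. A short sketch on a hypothetical 2x2 array:

```python
import numpy as np

grid = np.array([[68, 30],
                 [ 2, 99]])

rows, cols = np.nonzero(grid > 50)   # one index array per axis
coords = np.argwhere(grid > 50)      # the same positions as (row, col) pairs

# Walk the matching positions together with their values.
for r, c in zip(rows, cols):
    print(f"grid[{r}, {c}] = {grid[r, c]}")
```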

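Broadcasting lets NumPy apply an operation to arrays of different but compatible shapes, stretching the smaller operand across the larger one so that no explicit iteration is needed.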



In [16]:
matrix = np.ones((3,3))
matrix


Out[16]:
array([[ 1.,  1.,  1.],
       [ 1.,  1.,  1.],
       [ 1.,  1.,  1.]])

In [17]:
5.0 * matrix


Out[17]:
array([[ 5.,  5.,  5.],
       [ 5.,  5.,  5.],
       [ 5.,  5.,  5.]])

Q. Multiply a matrix by a row of data, without iterating through the matrix rows.


In [18]:
np.r_[[1, 2, 3]] * matrix


Out[18]:
array([[ 1.,  2.,  3.],
       [ 1.,  2.,  3.],
       [ 1.,  2.,  3.]])

Q. Multiply a matrix by a column of data, without iterating through the matrix columns.


In [19]:
np.c_[[1, 2, 3]] * matrix


Out[19]:
array([[ 1.,  1.,  1.],
       [ 2.,  2.,  2.],
       [ 3.,  3.,  3.]])
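
The np.r_ and np.c_ helpers above are shorthand for building a row-shaped and a column-shaped operand. The same results can be written with plain arrays and np.newaxis, which makes the shapes involved explicit. This is a sketch; weights and data are illustrative names:

```python
import numpy as np

matrix = np.ones((3, 3))
weights = np.array([1, 2, 3])

by_row = weights * matrix                     # shape (3,) is treated as (1, 3)
by_col = weights[:, np.newaxis] * matrix      # shape (3, 1) stretches across columns
by_col_alt = weights.reshape(-1, 1) * matrix  # equivalent way to build the column

# A common use: subtract each column's mean from that column.
data = np.array([[1.0, 2.0],
                 [3.0, 6.0]])
centered = data - data.mean(axis=0)

print(by_row)
print(by_col)
print(centered)
```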

NumPy performs these operations in compiled code rather than in Python-level loops, an approach called vectorization.
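
To make the contrast concrete, here is the same arithmetic written once as an explicit Python loop and once as a single array expression. The array size and the formula are arbitrary choices for illustration:

```python
import numpy as np

values = np.arange(1_000, dtype=np.float64)

# Explicit Python loop: one interpreted iteration per element.
looped = np.empty_like(values)
for i in range(values.size):
    looped[i] = values[i] * 2.0 + 1.0

# Vectorized form: the same arithmetic as a single array expression.
vectorized = values * 2.0 + 1.0

print(np.array_equal(looped, vectorized))
```

Both forms produce the same values. Timing them, for instance with the standard timeit module, is a simple way to see how much the loop costs on a given machine.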


