Tutorial Brief

NumPy is a powerful set of tools for performing mathematical operations on arrays of numbers. Its operations run faster than the equivalent operations on regular Python lists, and it can also manipulate high-dimensional arrays.

Finding Help:

SciPy (pronounced “Sigh Pie”) is a Python-based ecosystem of open-source software for mathematics, science, and engineering.

http://www.scipy.org/

NumPy is part of a larger ecosystem of libraries that build on the optimized performance of the NumPy ndarray.

The SciPy ecosystem contains these core packages:

	NumPy: base N-dimensional array package
	SciPy: fundamental library for scientific computing
	Matplotlib: comprehensive 2D plotting
	IPython: enhanced interactive console
	SymPy: symbolic mathematics
	Pandas: data structures & analysis

Importing the library

Import the numpy library as np.

The short alias makes code easier to write, and it is almost a standard in scientific work.



In [18]:

    
import numpy as np

Working with ndarray

We will generate an ndarray with the np.arange function.

np.arange([start,] stop[, step,], dtype=None)



In [19]:

    
np.arange(10)









    Out[19]:





array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])



In [20]:

    
np.arange(1,10)









    Out[20]:





array([1, 2, 3, 4, 5, 6, 7, 8, 9])



In [21]:

    
np.arange(1,10, 0.5)









    Out[21]:





array([ 1. ,  1.5,  2. ,  2.5,  3. ,  3.5,  4. ,  4.5,  5. ,  5.5,  6. ,
        6.5,  7. ,  7.5,  8. ,  8.5,  9. ,  9.5])



In [22]:

    
np.arange(1,10, 3)









    Out[22]:





array([1, 4, 7])



In [23]:

    
np.arange(1,10, 2, dtype=np.float64)









    Out[23]:





array([ 1.,  3.,  5.,  7.,  9.])
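The number of elements np.arange returns follows from its arguments: it is ceil((stop - start) / step), and stop itself is never included. A small sketch that checks this against the calls above; the helper expected_length is our own illustration, not part of NumPy:

```python
import math
import numpy as np

def expected_length(start, stop, step):
    # Hypothetical helper: how many elements np.arange(start, stop, step) yields
    return int(math.ceil((stop - start) / float(step)))

half_steps = np.arange(1, 10, 0.5)
third_steps = np.arange(1, 10, 3)

print(len(half_steps))                # 18
print(expected_length(1, 10, 0.5))    # 18
print(third_steps.tolist())           # [1, 4, 7]
print(expected_length(1, 10, 3))      # 3
```

With non-integer steps, floating-point rounding can occasionally change the length, so np.linspace is often the safer choice when the number of points matters.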

Examining ndarray



In [24]:

    
ds = np.arange(1,10,2)
ds.ndim









    Out[24]:





1



In [25]:

    
ds.shape









    Out[25]:





(5,)



In [26]:

    
ds.size









    Out[26]:





5



In [27]:

    
ds.dtype









    Out[27]:





dtype('int32')



In [28]:

    
ds.itemsize









    Out[28]:





4



In [29]:

    
x=ds.data
list(x)









    Out[29]:





['\x01',
 '\x00',
 '\x00',
 '\x00',
 '\x03',
 '\x00',
 '\x00',
 '\x00',
 '\x05',
 '\x00',
 '\x00',
 '\x00',
 '\x07',
 '\x00',
 '\x00',
 '\x00',
 '\t',
 '\x00',
 '\x00',
 '\x00']
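The listing above is the raw memory behind the array: each int32 element occupies four bytes, least significant byte first on a little-endian machine ('\t' is simply the byte 0x09). A minimal sketch that makes this layout explicit, assuming a little-endian 32-bit integer dtype ('<i4') so that the result does not depend on the platform:

```python
import numpy as np

# '<i4' = little-endian 4-byte integer (an assumption made for reproducibility)
ds = np.arange(1, 10, 2, dtype='<i4')

raw = ds.tobytes()                          # copy of the underlying bytes
restored = np.frombuffer(raw, dtype='<i4')  # reinterpret the bytes as integers

print(len(raw))            # 20
print(restored.tolist())   # [1, 3, 5, 7, 9]
```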



In [30]:

    
ds









    Out[30]:





array([1, 3, 5, 7, 9])



In [31]:

    
# Memory Usage
ds.size * ds.itemsize









    Out[31]:





20
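NumPy also exposes this product directly as the nbytes attribute. The sketch below sets the dtype explicitly, because the default integer type (int32 in the output above) depends on the platform:

```python
import numpy as np

ds32 = np.arange(1, 10, 2, dtype=np.int32)
ds64 = np.arange(1, 10, 2, dtype=np.int64)

# nbytes is the same quantity as size * itemsize
print(ds32.nbytes)                  # 20
print(ds32.size * ds32.itemsize)    # 20
print(ds64.nbytes)                  # 40
```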

Why use NumPy?

We will compare the time it takes to create two lists and do some basic operations on them.

Generate a list



In [32]:

    
%%capture timeit_results
# Regular Python
%timeit python_list_1 = range(1,1000)
python_list_1 = range(1,1000)
python_list_2 = range(1,1000)

#Numpy
%timeit numpy_list_1 = np.arange(1,1000)
numpy_list_1 = np.arange(1,1000)
numpy_list_2 = np.arange(1,1000)



In [33]:

    
print(timeit_results)









    



100000 loops, best of 3: 15 us per loop
The slowest run took 5.51 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 2.25 us per loop



In [34]:

    
# Function to calculate time in seconds
def return_time(timeit_result):
    temp_time = float(timeit_result.split(" ")[5])
    temp_unit = timeit_result.split(" ")[6]
    if temp_unit == "ms":
        temp_time = temp_time * 1e-3
    elif temp_unit == "us":
        temp_time = temp_time * 1e-6
    elif temp_unit == "ns":
        temp_time = temp_time * 1e-9
    return temp_time



In [35]:

    
# %timeit may print a "slowest run" warning between its results,
# so keep only the lines that report a timing
result_lines = [line for line in timeit_results.stdout.split("\n") if "per loop" in line]
python_time = return_time(result_lines[0])
numpy_time = return_time(result_lines[1])

print("Python/NumPy: %.1f" % (python_time/numpy_time))



Python/NumPy: 6.7
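Parsing the text that %timeit prints ties the code to one exact output format. As an alternative sketch, the standard-library timeit module returns the elapsed time as a float, so no string handling is needed. The variable names and the number of runs are our own choices, and the measured ratio will differ from machine to machine:

```python
import timeit
import numpy as np

n_runs = 2000  # arbitrary number of repetitions

# Total seconds taken by n_runs executions of each statement
python_time = timeit.timeit(lambda: list(range(1, 1000)), number=n_runs)
numpy_time = timeit.timeit(lambda: np.arange(1, 1000), number=n_runs)

print("Python/NumPy: %.1f" % (python_time / numpy_time))
```

list(range(...)) is used so that a real list is built on both Python 2 and Python 3.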

Basic Operations



In [ ]:

    
%%capture timeit_python
%%timeit
# Regular Python
[(x + y) for x, y in zip(python_list_1, python_list_2)]
[(x - y) for x, y in zip(python_list_1, python_list_2)]
[(x * y) for x, y in zip(python_list_1, python_list_2)]
[(x / y) for x, y in zip(python_list_1, python_list_2)];



In [ ]:

    
print(timeit_python)



In [ ]:

    
%%capture timeit_numpy
%%timeit
#Numpy
numpy_list_1 + numpy_list_2
numpy_list_1 - numpy_list_2
numpy_list_1 * numpy_list_2
numpy_list_1 / numpy_list_2;



In [ ]:

    
print(timeit_numpy)



In [ ]:

    
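# Parse only the timing line, in case %%timeit also printed a warning
python_line = [line for line in timeit_python.stdout.split("\n") if "per loop" in line][0]
numpy_line = [line for line in timeit_numpy.stdout.split("\n") if "per loop" in line][0]
python_time = return_time(python_line)
numpy_time = return_time(numpy_line)

print("Python/NumPy: %.1f" % (python_time/numpy_time))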
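Apart from speed, the two approaches compute the same values. A short sketch with small inputs (the names py_a, py_b, np_a and np_b are illustrative) shows that an arithmetic operator applied to arrays works element by element, just like the list comprehension:

```python
import numpy as np

py_a, py_b = [1, 2, 3, 4], [10, 20, 30, 40]
np_a, np_b = np.array(py_a), np.array(py_b)

list_sum = [x + y for x, y in zip(py_a, py_b)]
array_sum = np_a + np_b
array_product = np_a * np_b

print(list_sum)                 # [11, 22, 33, 44]
print(array_sum.tolist())       # [11, 22, 33, 44]
print(array_product.tolist())   # [10, 40, 90, 160]

# On plain lists, + concatenates instead of adding
print(py_a + py_b)              # [1, 2, 3, 4, 10, 20, 30, 40]
```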

Most Common Functions

Array Creation

array(object, dtype=None, copy=True, order=None, subok=False, ndmin=0)

Parameters
----------
object : array_like
    An array, any object exposing the array interface, an
    object whose __array__ method returns an array, or any
    (nested) sequence.
dtype : data-type, optional
    The desired data-type for the array.  If not given, then
    the type will be determined as the minimum type required
    to hold the objects in the sequence.  This argument can only
    be used to 'upcast' the array.  For downcasting, use the
    .astype(t) method.
copy : bool, optional
    If true (default), then the object is copied.  Otherwise, a copy
    will only be made if __array__ returns a copy, if obj is a
    nested sequence, or if a copy is needed to satisfy any of the other
    requirements (`dtype`, `order`, etc.).
order : {'C', 'F', 'A'}, optional
    Specify the order of the array.  If order is 'C' (default), then the
    array will be in C-contiguous order (last-index varies the
    fastest).  If order is 'F', then the returned array
    will be in Fortran-contiguous order (first-index varies the
    fastest).  If order is 'A', then the returned array may
    be in any order (either C-, Fortran-contiguous, or even
    discontiguous).
subok : bool, optional
    If True, then sub-classes will be passed-through, otherwise
    the returned array will be forced to be a base-class array (default).
ndmin : int, optional
    Specifies the minimum number of dimensions that the resulting
    array should have.  Ones will be pre-pended to the shape as
    needed to meet this requirement.



In [ ]:

    
np.array([1,2,3,4,5])

Multidimensional Array



In [ ]:

    
np.array([[1,2],[3,4],[5,6]])
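The nesting depth of the input determines the number of dimensions, and when dtype is not given NumPy picks a type that can hold every element. A brief sketch of these rules, with illustrative variable names:

```python
import numpy as np

nested = np.array([[1, 2], [3, 4], [5, 6]])
print(nested.shape)   # (3, 2)
print(nested.ndim)    # 2

# One float in the input makes the whole array float64
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)    # float64

# ndmin prepends dimensions of size one
row = np.array([1, 2, 3], ndmin=2)
print(row.shape)      # (1, 3)
```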

zeros(shape, dtype=float, order='C') and ones(shape, dtype=float, order='C')

Parameters
----------
shape : int or sequence of ints
    Shape of the new array, e.g., ``(2, 3)`` or ``2``.
dtype : data-type, optional
    The desired data-type for the array, e.g., `numpy.int8`.  Default is
    `numpy.float64`.
order : {'C', 'F'}, optional
    Whether to store multidimensional data in C- or Fortran-contiguous
    (row- or column-wise) order in memory.



In [ ]:

    
np.zeros((3,4))



In [ ]:

    
np.zeros((3,4), dtype=np.int64)



In [ ]:

    
np.ones((3,4))
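A common use of these functions is to allocate an array first and fill it in afterwards. A minimal sketch; the array names and the value 7 are only for illustration:

```python
import numpy as np

counts = np.zeros((3, 4), dtype=np.int64)  # integer zeros
counts[1, 2] = 7                           # fill one cell

weights = np.ones((3, 4))                  # float64 by default

print(counts.dtype)    # int64
print(counts.sum())    # 7
print(weights.dtype)   # float64
print(weights.sum())   # 12.0
```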

np.linspace(start, stop, num=50, endpoint=True, retstep=False)

Parameters
----------
start : scalar
    The starting value of the sequence.
stop : scalar
    The end value of the sequence, unless `endpoint` is set to False.
    In that case, the sequence consists of all but the last of ``num + 1``
    evenly spaced samples, so that `stop` is excluded.  Note that the step
    size changes when `endpoint` is False.
num : int, optional
    Number of samples to generate. Default is 50.
endpoint : bool, optional
    If True, `stop` is the last sample. Otherwise, it is not included.
    Default is True.
retstep : bool, optional
    If True, return (`samples`, `step`), where `step` is the spacing
    between samples.



In [ ]:

    
np.linspace(1,5)



In [ ]:

    
np.linspace(0,2,num=4)



In [ ]:

    
np.linspace(0,2,num=4,endpoint=False)
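The effect of endpoint is easiest to see by comparing the two calls above side by side. This sketch also passes retstep=True so that the spacing is returned along with the samples:

```python
import numpy as np

closed, closed_step = np.linspace(0, 2, num=4, retstep=True)
half_open, open_step = np.linspace(0, 2, num=4, endpoint=False, retstep=True)

# endpoint=True: 4 samples from 0 to 2 inclusive, spacing 2/3
print(closed[-1])           # 2.0

# endpoint=False: 4 samples, stop excluded, spacing 2/4
print(half_open.tolist())   # [0.0, 0.5, 1.0, 1.5]
print(open_step)            # 0.5
```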

random_sample(size=None)

Parameters
----------
size : int or tuple of ints, optional
    Defines the shape of the returned array of random floats. If None
    (the default), returns a single float.



In [ ]:

    
np.random.random((2,3))



In [ ]:

    
np.random.random_sample((2,3))
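np.random.random is an alias of np.random.random_sample; both draw floats from the half-open interval [0, 1). Seeding the generator makes the draws repeatable, which the sketch below uses to show that the two calls behave identically. The seed value 0 is arbitrary:

```python
import numpy as np

np.random.seed(0)
first = np.random.random_sample((2, 3))

np.random.seed(0)  # reset to the same state
second = np.random.random((2, 3))

print(np.array_equal(first, second))             # True
print(bool(((first >= 0) & (first < 1)).all()))  # True
```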

Statistical Analysis



In [ ]:

    
data_set = np.random.random((2,3))
data_set

np.max(a, axis=None, out=None, keepdims=False)

Parameters
----------
a : array_like
    Input data.
axis : int, optional
    Axis along which to operate.  By default, flattened input is used.
out : ndarray, optional
    Alternative output array in which to place the result.  Must
    be of the same shape and buffer length as the expected output.
    See `doc.ufuncs` (Section "Output arguments") for more details.
keepdims : bool, optional
    If this is set to True, the axes which are reduced are left
    in the result as dimensions with size one. With this option,
    the result will broadcast correctly against the original `arr`.



In [ ]:

    
np.max(data_set)



In [ ]:

    
np.max(data_set, axis=0)



In [ ]:

    
np.max(data_set, axis=1)
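Because data_set is random, the effect of axis is easier to follow on fixed values. In this sketch (the array scores is made up for illustration), axis=0 reduces down the rows and gives one result per column, while axis=1 gives one result per row:

```python
import numpy as np

scores = np.array([[1, 9, 3],
                   [7, 2, 8]])

print(np.max(scores))                  # 9
print(np.max(scores, axis=0).tolist()) # [7, 9, 8]
print(np.max(scores, axis=1).tolist()) # [9, 8]

# keepdims keeps the reduced axis with size one
print(np.max(scores, axis=1, keepdims=True).shape)  # (2, 1)
```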

np.min(a, axis=None, out=None, keepdims=False)



In [ ]:

    
np.min(data_set)

np.mean(a, axis=None, dtype=None, out=None, keepdims=False)



In [ ]:

    
np.mean(data_set)

np.median(a, axis=None, out=None, overwrite_input=False)



In [ ]:

    
np.median(data_set)

np.std(a, axis=None, dtype=None, out=None, ddof=0, keepdims=False)



In [ ]:

    
np.std(data_set)

np.sum(a, axis=None, dtype=None, out=None, keepdims=False)



In [ ]:

    
np.sum(data_set)
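The same functions on a small fixed array make the results easy to verify by hand. Note that np.std defaults to ddof=0, the population standard deviation; ddof=1 gives the sample estimate. The array values is made up for illustration:

```python
import numpy as np

values = np.array([[1.0, 2.0, 3.0],
                   [4.0, 5.0, 6.0]])

print(np.sum(values))                   # 21.0
print(np.mean(values))                  # 3.5
print(np.median(values))                # 3.5
print(np.mean(values, axis=0).tolist()) # [2.5, 3.5, 4.5]

population_std = np.std(values)         # sqrt(17.5 / 6), about 1.708
sample_std = np.std(values, ddof=1)     # sqrt(17.5 / 5), about 1.871
```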

Reshaping

np.reshape(a, newshape, order='C')



In [ ]:

    
np.reshape(data_set, (3,2))



In [ ]:

    
np.reshape(data_set, (6,1))



In [ ]:

    
np.reshape(data_set, (6,))

np.ravel(a, order='C')



In [ ]:

    
np.ravel(data_set)
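Reshaping keeps the elements in the same order and only changes how they are grouped, so the total number of elements must stay the same. A sketch on a small deterministic array (grid is an illustrative name); passing -1 lets NumPy work out one dimension, and for a contiguous array like this one the result shares memory with the original rather than copying it:

```python
import numpy as np

grid = np.arange(6).reshape(2, 3)     # [[0, 1, 2], [3, 4, 5]]

tall = np.reshape(grid, (3, 2))
column = np.reshape(grid, (-1, 1))    # -1: infer the number of rows
flat = np.ravel(grid)

print(tall.tolist())    # [[0, 1], [2, 3], [4, 5]]
print(column.shape)     # (6, 1)
print(flat.tolist())    # [0, 1, 2, 3, 4, 5]

# No data was copied for this contiguous array
print(np.shares_memory(grid, flat))   # True
```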

Slicing



In [ ]:

    
data_set = np.random.random((5,10))
data_set



In [ ]:

    
data_set[1]



In [ ]:

    
data_set[1][0]



In [ ]:

    
data_set[1,0]
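data_set[1][0] first builds the row data_set[1] and then indexes into it, while data_set[1,0] looks the element up in a single step; both return the same value. A sketch on a deterministic array of the same shape (table is an illustrative name):

```python
import numpy as np

table = np.arange(50).reshape(5, 10)  # row r holds 10*r ... 10*r + 9

print(table[1][0])     # 10
print(table[1, 0])     # 10

# Negative indices count from the end
print(table[-1, -1])   # 49
```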

Slicing a range



In [ ]:

    
data_set[2:4]



In [ ]:

    
data_set[2:4,0]



In [ ]:

    
data_set[2:4,0:2]



In [ ]:

    
data_set[:,0]

Stepping



In [ ]:

    
data_set[2:4:1]



In [ ]:

    
data_set[::]



In [ ]:

    
data_set[::2]



In [ ]:

    
data_set[2:4]



In [ ]:

    
data_set[2:4,::2]
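With random data it is hard to tell which elements a slice picked. The sketch below repeats the slices above on a deterministic array (table is an illustrative name) so that each result can be read off directly. It also shows that a slice is a view: writing to it changes the original array.

```python
import numpy as np

table = np.arange(50).reshape(5, 10)  # row r holds 10*r ... 10*r + 9

print(table[2:4].shape)          # (2, 10)  rows 2 and 3
print(table[2:4, 0].tolist())    # [20, 30]
print(table[2:4, 0:2].tolist())  # [[20, 21], [30, 31]]
print(table[::2, 0].tolist())    # [0, 20, 40]  every second row
print(table[2:4, ::2].tolist())  # [[20, 22, 24, 26, 28], [30, 32, 34, 36, 38]]

# A slice is a view on the same memory
rows = table[2:4]
rows[0, 0] = -1
print(table[2, 0])               # -1
```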

Matrix Operations



In [1]:

    
import numpy as np
# Matrix A 
A = np.array([[1,2],[3,4]])
# Matrix B
B = np.array([[3,4],[5,6]])

Addition



In [ ]:

    
A+B

Subtraction



In [ ]:

    
A-B

Multiplication (Element by Element)



In [ ]:

    
A*B

Multiplication (Matrix Multiplication)



In [2]:

    
A.dot(B)









    Out[2]:





array([[13, 16],
       [29, 36]])
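To see where these numbers come from, the sketch below computes the matrix product by hand, as row-times-column sums, and compares it with A.dot(B) and with the element-by-element product. The variable names are illustrative:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[3, 4], [5, 6]])

# Entry (i, j) is the sum over k of A[i, k] * B[k, j]
manual = np.array([[sum(A[i, k] * B[k, j] for k in range(2))
                    for j in range(2)]
                   for i in range(2)])

matrix_product = A.dot(B)
elementwise = A * B

print(manual.tolist())          # [[13, 16], [29, 36]]
print(matrix_product.tolist())  # [[13, 16], [29, 36]]
print(elementwise.tolist())     # [[3, 8], [15, 24]]
```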

Division (Element by Element)



In [ ]:

    
A/B
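For integer arrays the result of / depends on the Python version: Python 2 (without from __future__ import division) performs floor division, while Python 3 performs true division. As a sketch, np.true_divide and np.floor_divide make the choice explicit on either version:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[3, 4], [5, 6]])

true_quotient = np.true_divide(A, B)    # always floating point
floor_quotient = np.floor_divide(B, A)  # always rounds down

print(true_quotient.dtype)       # float64
print(true_quotient[0, 1])       # 0.5
print(floor_quotient.tolist())   # [[3, 2], [1, 1]]
```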

Square (Element by Element)



In [ ]:

    
np.square(A)

Power (Element by Element)



In [ ]:

    
np.power(A,3) # cube of each element (not the matrix product A·A·A)
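np.power raises each element to the given power; it does not multiply the matrix by itself. For the matrix power, NumPy provides np.linalg.matrix_power. A short sketch comparing the two on the same matrix:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])

elementwise_cube = np.power(A, 3)            # each element cubed
matrix_cube = np.linalg.matrix_power(A, 3)   # A.dot(A).dot(A)

print(elementwise_cube.tolist())  # [[1, 8], [27, 64]]
print(matrix_cube.tolist())       # [[37, 54], [81, 118]]
```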

Transpose



In [3]:

    
A.transpose()









    Out[3]:





array([[1, 3],
       [2, 4]])

Inverse



In [5]:

    
np.linalg.inv(A)









    Out[5]:





array([[-2. ,  1. ],
       [ 1.5, -0.5]])
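An inverse can be checked by multiplying it back: A times its inverse should give the identity matrix, up to floating-point rounding. The inverse exists only when the determinant is non-zero. A minimal sketch of both checks:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
A_inv = np.linalg.inv(A)

# Compare with a tolerance, because the product is computed in floating point
print(np.allclose(A.dot(A_inv), np.eye(2)))   # True

# det(A) = 1*4 - 2*3 = -2, so A is invertible
determinant = np.linalg.det(A)
print(np.isclose(determinant, -2.0))          # True
```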