# NumPy: creating and manipulating numerical data

(See Ch. 3, "Python Scientific lecture notes", Release 2012.3 (EuroScipy 2012)

by the EuroScipy tutorial team

Editors: Valentin Haenel, Emmanuelle Gouillart, Gaël Varoquaux

## The numpy array object

Python has built-in:

• containers: lists (costless insertion and append), dictionaries (fast lookup)
• high-level number objects: integers, floating point

Numpy is:

• extension package to Python for multi-dimensional arrays
• closer to hardware (efficiency)
• designed for scientific computation (convenience)
``````

In [5]:

import numpy as np
a = np.array([0, 1, 2, 3, 4])
a
a = np.arange(5)
a

``````
``````

Out[5]:

array([0, 1, 2, 3, 4])

``````

For example...

...an array containing:

• values of an experiment/simulation at discrete time steps
• signal recorded by a measurement device, e.g. sound wave
• pixels of an image, grey-level or colour
• 3-D data measured at different X-Y-Z positions, e.g. MRI scan
• ...

Why is it useful?

efficient numerical operations

``````

In [6]:

l = range(1000)
l**2
%timeit [i**2 for i in l]

``````
``````

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-6-7aed15c9ab3b> in <module>()
1 l = range(1000)
----> 2 l**2
3 get_ipython().magic(u'timeit [i**2 for i in l]')

TypeError: unsupported operand type(s) for ** or pow(): 'list' and 'int'

``````
``````

In [8]:

a = np.arange(10)
%timeit a**2
a**2

``````
``````

1000000 loops, best of 3: 472 ns per loop

Out[8]:

array([ 0,  1,  4,  9, 16, 25, 36, 49, 64, 81])

``````

Reference documentation:

``````

In [9]:

help(np.array)

``````
``````

Help on built-in function array in module numpy.core.multiarray:

array(...)
array(object, dtype=None, copy=True, order=None, subok=False, ndmin=0)

Create an array.

Parameters
----------
object : array_like
An array, any object exposing the array interface, an
object whose __array__ method returns an array, or any
(nested) sequence.
dtype : data-type, optional
The desired data-type for the array.  If not given, then
the type will be determined as the minimum type required
to hold the objects in the sequence.  This argument can only
be used to 'upcast' the array.  For downcasting, use the
.astype(t) method.
copy : bool, optional
If true (default), then the object is copied.  Otherwise, a copy
will only be made if __array__ returns a copy, if obj is a
nested sequence, or if a copy is needed to satisfy any of the other
requirements (`dtype`, `order`, etc.).
order : {'C', 'F', 'A'}, optional
Specify the order of the array.  If order is 'C' (default), then the
array will be in C-contiguous order (last-index varies the
fastest).  If order is 'F', then the returned array
will be in Fortran-contiguous order (first-index varies the
fastest).  If order is 'A', then the returned array may
be in any order (either C-, Fortran-contiguous, or even
discontiguous).
subok : bool, optional
If True, then sub-classes will be passed-through, otherwise
the returned array will be forced to be a base-class array (default).
ndmin : int, optional
Specifies the minimum number of dimensions that the resulting
array should have.  Ones will be pre-pended to the shape as
needed to meet this requirement.

Returns
-------
out : ndarray
An array object satisfying the specified requirements.

--------
empty, empty_like, zeros, zeros_like, ones, ones_like, fill

Examples
--------
>>> np.array([1, 2, 3])
array([1, 2, 3])

Upcasting:

>>> np.array([1, 2, 3.0])
array([ 1.,  2.,  3.])

More than one dimension:

>>> np.array([[1, 2], [3, 4]])
array([[1, 2],
[3, 4]])

Minimum dimensions 2:

>>> np.array([1, 2, 3], ndmin=2)
array([[1, 2, 3]])

Type provided:

>>> np.array([1, 2, 3], dtype=complex)
array([ 1.+0.j,  2.+0.j,  3.+0.j])

Data-type consisting of more than one element:

>>> x = np.array([(1,2),(3,4)],dtype=[('a','<i4'),('b','<i4')])
>>> x['a']
array([1, 3])

Creating an array from sub-classes:

>>> np.array(np.mat('1 2; 3 4'))
array([[1, 2],
[3, 4]])

>>> np.array(np.mat('1 2; 3 4'), subok=True)
matrix([[1, 2],
[3, 4]])

``````
``````

In [10]:

np.array?

``````
Type: builtin_function_or_method String Form: Docstring: array(object, dtype=None, copy=True, order=None, subok=False, ndmin=0) Create an array. Parameters ---------- object : array_like An array, any object exposing the array interface, an ... dtype : data-type, optional The desired data-type for the array. If not given, then ... copy : bool, optional If true (default), then the object is copied. Otherwise, a copy ...
• Looking for something:
``````

In [11]:

np.lookfor('create array')

``````
``````

Search results for 'create array'
---------------------------------
numpy.array
Create an array.
numpy.memmap
Create a memory-map to an array stored in a *binary* file on disk.
numpy.diagflat
Create a two-dimensional array with the flattened input as a diagonal.
numpy.fromiter
Create a new 1-dimensional array from an iterable object.
numpy.partition
Return a partitioned copy of an array.
numpy.ma.diagflat
Create a two-dimensional array with the flattened input as a diagonal.
numpy.ctypeslib.as_array
Create a numpy array from a ctypes array or a ctypes POINTER.
Create a boolean mask from an array.
numpy.ctypeslib.as_ctypes
Create and return a ctypes object from a numpy array.  Actually
numpy.ma.mrecords.fromarrays
Creates a mrecarray from a (flat) list of masked arrays.
numpy.lib.format.open_memmap
Open a .npy file as a memory-mapped array.
Create a new masked array from scratch.
numpy.lib.arrayterator.Arrayterator
Buffered iterator for big arrays.
numpy.ma.mrecords.fromtextfile
Creates a mrecarray from data stored in the file `filename`.
numpy.asarray
Convert the input to an array.
numpy.ndarray
ndarray(shape, dtype=float, buffer=None, offset=0,
numpy.recarray
Construct an ndarray that allows field access using attributes.
numpy.chararray
chararray(shape, itemsize=1, unicode=False, buffer=None, offset=0,
numpy.sum
Sum of array elements over a given axis.
numpy.asanyarray
Convert the input to an ndarray, but pass ndarray subclasses through.
numpy.copy
Return an array copy of the given object.
numpy.diag
Extract a diagonal or construct a diagonal array.
Load arrays or pickled objects from ``.npy``, ``.npz`` or pickled files.
numpy.sort
Return a sorted copy of an array.
numpy.array_equiv
Returns True if input arrays are shape consistent and all elements equal.
numpy.dtype
Create a data type object.
numpy.choose
Construct an array from an index array and a set of arrays to choose from.
numpy.nditer
Efficient multi-dimensional iterator object to iterate over arrays.
numpy.swapaxes
Interchange two axes of an array.
numpy.full_like
Return a full array with the same shape and type as a given array.
numpy.ones_like
Return an array of ones with the same shape and type as a given array.
numpy.empty_like
Return a new array with the same shape and type as a given array.
numpy.zeros_like
Return an array of zeros with the same shape and type as a given array.
numpy.asarray_chkfinite
Convert the input to an array, checking for NaNs or Infs.
numpy.diag_indices
Return the indices to access the main diagonal of an array.
numpy.ma.choose
Use an index array to construct a new array from a set of choices.
numpy.chararray.tolist
a.tolist()
numpy.matlib.rand
Return a matrix of random values with given shape.
numpy.savez_compressed
Save several arrays into a single file in compressed ``.npz`` format.
numpy.ma.empty_like
Return a new array with the same shape and type as a given array.
Return a boolean mask of the given shape, filled with False.
numpy.ma.mrecords.fromrecords
Creates a MaskedRecords from a list of records.
numpy.around
Evenly round to the given number of decimals.
numpy.source
Print or write to a file the source code for a Numpy object.
numpy.diagonal
Return specified diagonals.
numpy.histogram2d
Compute the bi-dimensional histogram of two data samples.
numpy.fft.ifft
Compute the one-dimensional inverse discrete Fourier Transform.
numpy.fft.ifftn
Compute the N-dimensional inverse discrete Fourier Transform.
numpy.busdaycalendar
A business day calendar object that efficiently stores information

``````
``````

In [12]:

np.con*?

``````
np.concatenate np.conj np.conjugate np.convolve

### Creating arrays

``````

In [13]:

a = np.array([0, 1, 2, 3])
a

``````
``````

Out[13]:

array([0, 1, 2, 3])

``````
``````

In [15]:

a.ndim

``````
``````

Out[15]:

1

``````
``````

In [16]:

a.shape

``````
``````

Out[16]:

(4,)

``````
``````

In [ ]:

len(a)

``````
``````

In [23]:

b = np.array([(0, 1, 2), [3, 4, 5]])
b

``````
``````

Out[23]:

array([[0, 1, 2],
[3, 4, 5]])

``````
``````

In [25]:

b.ndim

``````
``````

Out[25]:

2

``````
``````

In [19]:

len(b) # returns the size of the first dimension

``````
``````

Out[19]:

2

``````
``````

In [20]:

b.shape

``````
``````

Out[20]:

(2, 3)

``````
``````

In [26]:

c = np.array([[[1, 2], [3, 4], [5, 6]], [[7, 8], [9, 10], [11, 12]]])
c

``````
``````

Out[26]:

array([[[ 1,  2],
[ 3,  4],
[ 5,  6]],

[[ 7,  8],
[ 9, 10],
[11, 12]]])

``````
``````

In [27]:

c.shape

``````
``````

Out[27]:

(2, 3, 2)

``````

In practice, we rarely enter items one by one...

• Evenly spaced:
``````

In [28]:

a = np.arange(10) # 0 .. n-1 (!)
a

``````
``````

Out[28]:

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

``````
``````

In [31]:

b = np.arange(1, 9, 1) # start, end (exlusive), step
b

``````
``````

Out[31]:

array([1, 2, 3, 4, 5, 6, 7, 8])

``````
• or by number of points:
``````

In [33]:

c = np.linspace(0, 1, 6, endpoint=False) # start, end, num-points
c

``````
``````

Out[33]:

array([ 0.        ,  0.16666667,  0.33333333,  0.5       ,  0.66666667,
0.83333333])

``````
``````

In [ ]:

d = np.linspace(0, 1, 5, endpoint=False)
d

``````
• Common arrays:
``````

In [37]:

a = np.ones((3, 3)) # reminder: (3, 3) is a tuple
a

``````
``````

Out[37]:

array([[ 1.,  1.,  1.],
[ 1.,  1.,  1.],
[ 1.,  1.,  1.]])

``````
``````

In [38]:

b = np.zeros((2, 2))
b

``````
``````

Out[38]:

array([[ 0.,  0.],
[ 0.,  0.]])

``````
``````

In [39]:

c = np.eye(3)
c

``````
``````

Out[39]:

array([[ 1.,  0.,  0.],
[ 0.,  1.,  0.],
[ 0.,  0.,  1.]])

``````
``````

In [40]:

d = np.diag(np.array([1, 2, 3, 4]))
d

``````
``````

Out[40]:

array([[1, 0, 0, 0],
[0, 2, 0, 0],
[0, 0, 3, 0],
[0, 0, 0, 4]])

``````
• np.random: random numbers (Mersenne Twister PRNG):
``````

In [41]:

a = np.random.rand(4) # uniform in [0, 1]
a

``````
``````

Out[41]:

array([ 0.09739491,  0.13817063,  0.67998996,  0.83138239])

``````
``````

In [55]:

b = np.random.randn(4) # Gaussian
b

``````
``````

Out[55]:

array([ 0.47143516, -1.19097569,  1.43270697, -0.3126519 ])

``````
``````

In [54]:

np.random.seed(1234) # Setting the random seed

``````

### Basic data types

You may have noticed that, in some instances, array elements are displayed with a trailing dot (e.g. 2. vs 2).

This is due to a difference in the data-type used:

``````

In [62]:

a = np.array([1, 2, 3]).astype(float)
a[0] = 2121.2
a

``````
``````

Out[62]:

array([  2.12120000e+03,   2.00000000e+00,   3.00000000e+00])

``````
``````

In [57]:

b = np.array([1., 2., 3.])
b.dtype

``````
``````

Out[57]:

dtype('float64')

``````

Different data-types allow us to store data more compactly in memory, but most of the time we simply work with floating point numbers.

Note that, in the example above, NumPy auto-detects the data-type from the input.

• You can explicitly specify which data-type you want:
``````

In [ ]:

c = np.array([1, 2, 3], dtype=int)
c.dtype

``````
• The default data type is floating point:
``````

In [ ]:

a = np.ones((3, 3))
a.dtype

``````

There are also other types:

• Complex
``````

In [ ]:

d = np.array([1+2j, 3+4j, 5+6*1j])
d.dtype

``````
• Bool
``````

In [ ]:

e = np.array([True, False, False, True])
e.dtype

``````
• Strings
``````

In [ ]:

f = np.array(['a','happy','world'])
f.dtype # <--- strings containing max. 5 letters

``````
``````

In [ ]:

f[:] = 'abcdefg'
f

``````
• Much more, ... int32/int64 ...

### Indexing and slicing

The items of an array can be accessed and assigned to the same way as other Python sequences (e.g. lists)

``````

In [63]:

a = np.arange(10)
a

``````
``````

Out[63]:

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

``````
``````

In [66]:

a[-4]

``````
``````

Out[66]:

6

``````

Warning: Indices begin at 0, like other Python sequences (and C/C++).

In contrast, in Fortran or Matlab, indices begin at 1.

• For multidimensional arrays, indices are tuples of integers:
``````

In [67]:

a = np.diag(np.arange(3))
a

``````
``````

Out[67]:

array([[0, 0, 0],
[0, 1, 0],
[0, 0, 2]])

``````
``````

In [68]:

a[1, 1]

``````
``````

Out[68]:

1

``````
``````

In [69]:

a[2, 1] = 10 # third line, second column
a

``````
``````

Out[69]:

array([[ 0,  0,  0],
[ 0,  1,  0],
[ 0, 10,  2]])

``````
``````

In [70]:

a[1]

``````
``````

Out[70]:

array([0, 1, 0])

``````

Note that:

• In 2D, the first dimension corresponds to rows, the second to columns.
• for multidimensional a, a[0] is interpreted by taking all elements in the unspecified dimensions.

Slicing

• Arrays, like other Python sequences can also be sliced:
``````

In [71]:

a = np.arange(10)
a

``````
``````

Out[71]:

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

``````
``````

In [72]:

a[2:9:3] # [start:end:step]

``````
``````

Out[72]:

array([2, 5, 8])

``````

Note that:

• last index is not included!
``````

In [73]:

a[:4]

``````
``````

Out[73]:

array([0, 1, 2, 3])

``````
• All three slice components are optional
• by default, start is 0, end is the last and step is 1:
``````

In [ ]:

a[1:3]

``````
``````

In [74]:

a[::2]

``````
``````

Out[74]:

array([0, 2, 4, 6, 8])

``````
``````

In [ ]:

a[3:]

``````

### Copies and views

• A slicing operation creates a view on the original array.

• This is just a way of accessing array data. Thus the original array is not copied in memory.

• When modifying the view, the original array is modified as well:

``````

In [75]:

a = np.arange(10)
a

``````
``````

Out[75]:

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

``````
``````

In [76]:

b = a[::2]
b

``````
``````

Out[76]:

array([0, 2, 4, 6, 8])

``````
``````

In [77]:

b[0] = 12
b

``````
``````

Out[77]:

array([12,  2,  4,  6,  8])

``````
``````

In [83]:

x = [1,2,3,4]
b = x[0:2]
b[0]=100000
x

``````
``````

Out[83]:

[1, 2, 3, 4]

``````
``````

In [79]:

a = np.arange(10)
b = a[::2].copy() # force a copy
b[0] = 12
a

``````
``````

Out[79]:

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

``````
• This behavior can be surprising at first sight...

• but it allows to save both memory and time.

Warning: The transpose is a view

As a result, a matrix cannot be made symmetric in-place:

``````

In [86]:

a = np.ones((100, 100))
a += a.T
a

``````
``````

Out[86]:

array([[ 2.,  2.,  2., ...,  2.,  2.,  2.],
[ 2.,  2.,  2., ...,  2.,  2.,  2.],
[ 2.,  2.,  2., ...,  2.,  2.,  2.],
...,
[ 3.,  3.,  3., ...,  2.,  2.,  2.],
[ 3.,  3.,  3., ...,  2.,  2.,  2.],
[ 3.,  3.,  3., ...,  2.,  2.,  2.]])

``````
``````

In [87]:

a = np.ones((100,100))
a += a.copy().T
a

``````
``````

Out[87]:

array([[ 2.,  2.,  2., ...,  2.,  2.,  2.],
[ 2.,  2.,  2., ...,  2.,  2.,  2.],
[ 2.,  2.,  2., ...,  2.,  2.,  2.],
...,
[ 2.,  2.,  2., ...,  2.,  2.,  2.],
[ 2.,  2.,  2., ...,  2.,  2.,  2.],
[ 2.,  2.,  2., ...,  2.,  2.,  2.]])

``````

# A worked Exercise: Prime number sieve

Compute prime numbers in 0–99, with a "sieve"

• Construct a shape (100,) boolean array is_prime, filled with True in the beginning

• Cross out 0 and 1 which are not primes

• For each integer j starting from 2, cross out its higher multiples

• Construct a shape (100,) boolean array is_prime, filled with True in the beginning:
``````

In [ ]:

``````
• Cross out 0 and 1 which are not primes:
``````

In [ ]:

``````
• For each integer j starting from 2, cross out its higher multiples:
``````

In [ ]:

``````
• Skim through help(np.nonzero), and print the prime numbers
• Follow-up:
• Move the above code into a script file named prime_sieve.py
• Run it to check it works
``````

In [ ]:

help(np.nonzero)

``````
``````

In [ ]:

``````
• Convert the simple sieve to the sieve of Eratosthenes:
1. Skip j which are already known to not be primes
2. The first number to cross out is j^2
``````

In [ ]:

``````

Indexing with the np.newaxis object allows us to add an axis to an array:

``````

In [88]:

z = np.array([1, 2, 3])
z

``````
``````

Out[88]:

array([1, 2, 3])

``````
``````

In [89]:

z[:, np.newaxis]

``````
``````

Out[89]:

array([[1],
[2],
[3]])

``````
``````

In [90]:

z[np.newaxis, :]

``````
``````

Out[90]:

array([[1, 2, 3]])

``````

### Fancy indexing

• Numpy arrays can be indexed with slices

• but also with boolean or integer arrays (masks)

• This method is called fancy indexing.

• It creates copies not views.

``````

In [96]:

np.random.seed(3)
a = np.random.random_integers(0, 20, 15)
a

``````
``````

Out[96]:

array([10,  3,  8,  0, 19, 10, 11,  9, 10,  6,  0, 20, 12,  7, 14])

``````
``````

In [92]:

(a % 3 == 0)

``````
``````

Out[92]:

array([False,  True, False,  True, False, False, False,  True, False,
True,  True, False,  True, False, False], dtype=bool)

``````
``````

In [97]:

mask = (a % 3 == 0)
extract_from_a = a[mask] # or, a[a%3==0]
extract_from_a # extract a sub-array with the mask
extract_from_a[0] = 9999999
a

``````
``````

Out[97]:

array([10,  3,  8,  0, 19, 10, 11,  9, 10,  6,  0, 20, 12,  7, 14])

``````
• Indexing with a mask can be very useful to assign a new value to a sub-array:
``````

In [98]:

a[a % 3 == 0] = -1
a

``````
``````

Out[98]:

array([10, -1,  8, -1, 19, 10, 11, -1, 10, -1, -1, 20, -1,  7, 14])

``````

Indexing with an array of integers

``````

In [102]:

a = np.arange(2,12)
a

``````
``````

Out[102]:

array([ 2,  3,  4,  5,  6,  7,  8,  9, 10, 11])

``````
• Indexing can be done with an array of integers, where the same index is repeated several time:
``````

In [103]:

a[[2, 3, 2, 4, 2]] # note: [2, 3, 2, 4, 2] is a Python list

``````
``````

Out[103]:

array([4, 5, 4, 6, 4])

``````
• New values can be assigned with this kind of indexing:
``````

In [104]:

a[[9, 7]] = -10
a

``````
``````

Out[104]:

array([  2,   3,   4,   5,   6,   7,   8, -10,  10, -10])

``````
• When a new array is created by indexing with an array of integers, the new array has the same shape as the array of integers:
``````

In [ ]:

a = np.arange(10)
idx = np.array([[3, 4], [9, 7]])
a[idx]

``````

Consider the following examples of fancy indexing:

``````

In [ ]:

a = np.arange(36).reshape(6,6)
a

``````
``````

In [ ]:

a[(0,1,2,3,4),(1,2,3,4,5)]

``````
``````

In [ ]:

a[:3,[0,2,5]]

``````
``````

In [ ]:

``````
• We can even use fancy indexing and broadcasting (later) at the same time:
``````

In [ ]:

a = np.arange(12).reshape(3,4)
a

``````
``````

In [ ]:

i = np.array([[0, 1], [1, 2]])
a[i, 2] # same as a[i, 2*np.ones((2, 2), dtype=int)]

``````

### Numerical operations on arrays

Most of numpy and scipy's mathematical operations are vectorized. Applying these operations to arrays means they are applied elementwise. However, similar to MATLAB, internally the elementwise operations are implemented in C and C++. Thus vectorized computations are much faster than if you would simply loop through each element yourself and apply your operation manually.

So alway try think vectorized instead of manually looping through arrays.

Elementwise operations

• With scalars:
``````

In [105]:

a = np.array([1, 2, 3, 4])
a + 1

``````
``````

Out[105]:

array([2, 3, 4, 5])

``````
``````

In [106]:

2**a

``````
``````

Out[106]:

array([ 2,  4,  8, 16])

``````
``````

In [107]:

"""Speed up of vectorization"""
def twice_each_element(a):
b=np.zeros(a.shape)

for idx, elem in enumerate(a):
b[idx] = 2*elem

return b

a = np.arange(100)

%timeit twice_each_element(a)
%timeit 2*a

``````
``````

10000 loops, best of 3: 35.3 µs per loop
1000000 loops, best of 3: 944 ns per loop

``````
• All arithmetic operates elementwise:
``````

In [109]:

a = np.array([1, 2, 3, 4])
b = np.ones(4) + 1
print a
print b
a - b

``````
``````

[1 2 3 4]
[ 2.  2.  2.  2.]

Out[109]:

array([-1.,  0.,  1.,  2.])

``````
``````

In [112]:

a*b

``````
``````

Out[112]:

array([ 2.,  4.,  6.,  8.])

``````
``````

In [113]:

j = np.arange(5)
2**(j + 1) - j

``````
``````

Out[113]:

array([ 2,  3,  6, 13, 28])

``````

Warning:

• Array multiplication is not matrix multiplication:
``````

In [114]:

c = np.ones((3, 3))
c * c # NOT matrix multiplication!

``````
``````

Out[114]:

array([[ 1.,  1.,  1.],
[ 1.,  1.,  1.],
[ 1.,  1.,  1.]])

``````
• Note: Matrix multiplication:
``````

In [115]:

c.dot(c)

``````
``````

Out[115]:

array([[ 3.,  3.,  3.],
[ 3.,  3.,  3.],
[ 3.,  3.,  3.]])

``````
• Comparisons:
``````

In [116]:

a = np.array([1, 2, 3, 4])
b = np.array([4, 2, 2, 4])
a == b

``````
``````

Out[116]:

array([False,  True, False,  True], dtype=bool)

``````
``````

In [117]:

a > b

``````
``````

Out[117]:

array([False, False,  True, False], dtype=bool)

``````
• Logical operations:
``````

In [118]:

a = np.array([1, 1, 0, 0], dtype=bool)
b = np.array([1, 0, 1, 0], dtype=bool)
np.logical_or(a, b)

``````
``````

Out[118]:

array([ True,  True,  True, False], dtype=bool)

``````
``````

In [ ]:

np.logical_and(a, b)

``````
• Shape mismatches:
``````

In [119]:

a = np.arange(4)
a

``````
``````

Out[119]:

array([0, 1, 2, 3])

``````
``````

In [120]:

a + np.array([1, 2])

``````
``````

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-120-033edaa49cc8> in <module>()
----> 1 a + np.array([1, 2])

ValueError: operands could not be broadcast together with shapes (4,) (2,)

``````
• Transposition:
``````

In [121]:

a = np.triu(np.ones((3, 3)), 1) # see help(np.triu)
a

``````
``````

Out[121]:

array([[ 0.,  1.,  1.],
[ 0.,  0.,  1.],
[ 0.,  0.,  0.]])

``````
``````

In [122]:

a.T

``````
``````

Out[122]:

array([[ 0.,  0.,  0.],
[ 1.,  0.,  0.],
[ 1.,  1.,  0.]])

``````

Note: Linear algebra

• The sub-module np.linalg implements basic linear algebra

• solving linear systems
• singular value decomposition
• etc
• However,

• it is not guaranteed to be compiled using efficient routines
• thus the use of scipy.linalg is recommended...

Basic reductions

• Computing sums:
``````

In [123]:

x = np.array([1, 2, 3, 4])
np.sum(x)

``````
``````

Out[123]:

10

``````
``````

In [ ]:

x.sum()

``````
• Sum by rows and by columns:
``````

In [124]:

x = np.array([[1, 1], [2, 2]])
x

``````
``````

Out[124]:

array([[1, 1],
[2, 2]])

``````
``````

In [125]:

x.sum(axis=0) # columns (first dimension)

``````
``````

Out[125]:

array([3, 3])

``````
``````

In [126]:

x[:, 0].sum(), x[:, 1].sum()

``````
``````

Out[126]:

(3, 3)

``````
``````

In [ ]:

x.sum(axis=1) # rows (second dimension)

``````
``````

In [ ]:

x[0, :].sum(), x[1, :].sum()

``````
• Same idea in higher dimensions:
``````

In [ ]:

x = np.random.rand(2, 2, 2)
x.sum(axis=2)[0, 1]

``````
``````

In [ ]:

x[0, 1, :].sum()

``````

Other reductions— works the same way (and take axis=)

• Statistics:
``````

In [ ]:

x = np.array([1, 2, 3, 1])
y = np.array([[1, 2, 3], [5, 6, 1]])
x.mean()

``````
``````

In [ ]:

np.median(x)

``````
``````

In [ ]:

np.median(y, axis=-1) # last axis

``````
``````

In [ ]:

x.std() # full population standard dev.

``````
• Extrema:
``````

In [ ]:

x = np.array([1, 3, 2])
x.min()

``````
``````

In [ ]:

x.max()

``````
``````

In [ ]:

x.argmin() # index of minimum

``````
``````

In [ ]:

x.argmax() # index of maximum

``````
• Logical operations:
``````

In [ ]:

np.all([True, True, False])

``````
``````

In [ ]:

np.any([True, True, False])

``````

Note: Can be used for array comparisons:

``````

In [ ]:

a = np.zeros((100, 100))
np.any(a != 0)

``````
``````

In [ ]:

np.all(a == a)

``````
``````

In [ ]:

a = np.array([1, 2, 3, 2])
b = np.array([2, 2, 3, 2])
c = np.array([6, 4, 4, 5])
((a <= b) & (b <= c)).all()

``````
• ... and many more (best to learn as you go).

• Basic operations on numpy arrays (addition, etc.) are elementwise
• This works on arrays of the same size.

Nevertheless

• It’s also possible to do operations on arrays of different sizes
• if Numpy can transform these arrays so that they all have the same size
• this conversion is called broadcasting.
• Consider the following example of broadcasting:
``````

In [134]:

a = np.tile(np.arange(0, 40, 10), (3, 1)).T
a

``````
``````

Out[134]:

array([[ 0,  0,  0],
[10, 10, 10],
[20, 20, 20],
[30, 30, 30]])

``````
``````

In [131]:

b = np.array([0, 1, 2])
b

``````
``````

Out[131]:

array([0, 1, 2])

``````
``````

In [135]:

a+b

``````
``````

Out[135]:

array([[ 0,  1,  2],
[10, 11, 12],
[20, 21, 22],
[30, 31, 32]])

``````
• A useful trick:
``````

In [136]:

a = np.arange(0, 40, 10)
a.shape

``````
``````

Out[136]:

(4,)

``````
``````

In [137]:

a = a[:, np.newaxis] # adds a new axis -> 2D array
a.shape

``````
``````

Out[137]:

(4, 1)

``````
``````

In [140]:

b

``````
``````

Out[140]:

array([0, 1, 2])

``````
``````

In [141]:

a

``````
``````

Out[141]:

array([[ 0],
[10],
[20],
[30]])

``````
``````

In [142]:

a + b

``````
``````

Out[142]:

array([[ 0,  1,  2],
[10, 11, 12],
[20, 21, 22],
[30, 31, 32]])

``````
``````

In [ ]:

a = np.ones((4, 5))
a[0] = 2 # we assign an array of dimension 0 to an array of dimension 1
a

``````
• Broadcasting seems a bit magical
• but it is actually quite natural to use it
• e.g., when we want to solve a problem whose output data is an array with more dimensions than input data.

## Array shape manipulation

### Flattening

``````

In [ ]:

import numpy as np
a = np.array([[1, 2, 3], [4, 5, 6]])
a.ravel()

``````
``````

In [ ]:

a.T.ravel()

``````
• Higher dimensions: last dimensions ravel out “first”.

### Reshaping

• The inverse operation to flattening:
``````

In [ ]:

a.shape

``````
``````

In [ ]:

b = a.ravel()
b.reshape((2, 3))

``````
• Creating an array with a different shape, from another array:
``````

In [ ]:

a = np.arange(36)
b = a.reshape((6, 6))
b

``````
• Or,
``````

In [ ]:

b = a.reshape((6, -1)) # unspecified (-1) value is inferred
b

``````

### Views and copies

• ndarray.reshape may return a view (cf help(np.reshape))), not a copy:
``````

In [ ]:

b[0, 0] = 99
a

``````
• Beware: reshape may also return a copy!
``````

In [ ]:

a = np.zeros((3, 2))
b = a.T.reshape(3*2)
b[0] = 9
a

``````
• To understand, one would need to take a deeper look into the memory layout of an array.

### Dimension shuffling

``````

In [ ]:

a = np.arange(4*3*2).reshape(4, 3, 2)
a.shape

``````
``````

In [ ]:

a[0, 2, 1]

``````
``````

In [ ]:

b = a.transpose(1, 2, 0)
b.shape

``````
``````

In [ ]:

b[2, 1, 0]

``````
• Also creates a view:
``````

In [ ]:

b[2, 1, 0] = -1
a[0, 2, 1]

``````

### Resizing

• Size of an array can be changed with ndarray.resize:
``````

In [ ]:

a = np.arange(4)
a.resize((8,))
a

``````

However, it must not be referred to somewhere else:

``````

In [ ]:

b = a
a.resize((4,))

``````

### Sorting data

• Sorting along an axis:
``````

In [ ]:

a = np.array([[4, 3, 5], [1, 2, 1]])
b = np.sort(a, axis=1)
b

``````
• Note: Sorts each row separately!

• In-place sort:

``````

In [ ]:

a.sort(axis=1)
a

``````
• Sorting with fancy indexing:
``````

In [ ]:

a = np.array([4, 3, 1, 2])
j = np.argsort(a)
j

``````
``````

In [ ]:

a[j]

``````
• Finding minima and maxima:
``````

In [ ]:

a = np.array([4, 3, 1, 2])
j_max = np.argmax(a)
j_min = np.argmin(a)
j_max, j_min

``````

# Basic Visualization

(See Ch. 3, "Python Scientific lecture notes", Release 2012.3 (EuroScipy 2012)

by the EuroScipy tutorial team

Editors: Valentin Haenel, Emmanuelle Gouillart, Gaël Varoquaux

You may also need to visualize your data arrays.

Matplotlib is a Python plotting library. Its capabilities and interface are quite similar to the ones offered by Matlab. It can be used to display the results of your calculations, create publication quality gures and can be embedded in graphical user interfaces.

• Matplotlib is a 2D plotting package.

• We can import its functions as below:

``````

In [ ]:

import matplotlib.pyplot as plt # the tidy way
import numpy as np

``````

## 1D plotting

``````

In [ ]:

x = np.linspace(0, 3, 20)
y = np.linspace(0, 9, 20)
plt.figure()
plt.plot(x, y) # line plot
plt.plot(x, y,'o') # dot plot
plt.show() # <-- shows the plot (not needed with Ipython)

``````

### Plotting into the notebook

``````

In [ ]:

# Add the following line to make plots *appear* within the iypthon notebook
%matplotlib inline

``````

## Labels

``````

In [ ]:

x = np.linspace(0, 3, 20)
y = np.linspace(0, 9, 20)

plt.figure()
plt.plot(x, y) # line plot
plt.plot(x, y,'o') # dot plot

plt.xlabel('input')
plt.ylabel('output')
plt.title('Input-Output-Relationship')

``````

## 2D plotting

(such as images)

``````

In [ ]:

image = np.random.rand(30, 30)
plt.imshow(image, cmap=plt.cm.gray)
plt.colorbar()

``````

## Subplots

``````

In [ ]:

x = np.linspace(0, 3, 20)
y = np.linspace(0, 9, 20)
image = np.random.rand(30, 30)

plt.figure()

plt.subplot(1,2,1)
plt.plot(x, y) # line plot
plt.plot(x, y,'o') # dot plot

plt.subplot(1,2,2)
plt.imshow(image, cmap=plt.cm.gray)
plt.colorbar()

``````

## An example of broadcasting in grid-based computations

• A lot of grid-based or network-based problems can also use broadcasting.
• For instance, if we want to compute the distance from the origin of points on a 5 x 5 grid, we can do:
``````

In [ ]:

x, y = np.arange(5), np.arange(5)
distance = np.sqrt(x ** 2 + y[:, np.newaxis] ** 2)
distance

``````
``````

In [ ]:

plt.pcolor(distance)
plt.colorbar()
plt.axis('equal')

``````
• or for a set of more general distances...
``````

In [ ]:

Mileposts = np.random.random_integers(1,50,10)
Mileposts[0] = 0
Mileposts

``````
``````

In [ ]:

``````
``````

In [ ]:

PtPtDist

``````
``````

In [ ]:

plt.pcolor(PtPtDist)
plt.colorbar()
plt.axis('equal')

``````
``````

In [ ]:

``````
``````

In [ ]:

``````