## Why NumPy?

#### Python can be slow:

``````

In [ ]:

from __future__ import division, print_function
import itertools

``````
``````

In [ ]:

lst1 = list(range(1000000))  #An explicit list, on both Python 2 and Python 3
lst2 = lst1[::-1]  #Reversed copy of the list

``````
``````

In [ ]:

%timeit [l1 + l2 for l1, l2 in zip(lst1, lst2)]

``````
``````

In [ ]:

#Cycles per addition (approx): time per loop (s) * clock rate (Hz) / number of additions
((109e-3)*(2.4e9))/1000000

``````

## Why is Python so slow?

``````

In [ ]:

from IPython.core.display import Image
Image(filename="python_memory_model.png")

``````

### Can we do any better?

``````

In [ ]:

import numpy as np

``````
``````

In [ ]:

arr1 = np.array(lst1)
arr2 = np.array(lst2)

``````
``````

In [ ]:

%timeit arr1+arr2

``````
``````

In [ ]:

#Cycles per addition (approx), same calculation for the NumPy version
((1.97e-3)*(2.4e9))/1000000

``````
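The two back-of-the-envelope figures above multiply a time by a clock rate and divide by the number of additions. A minimal sketch of the same arithmetic as a helper follows; `cycles_per_op` is a hypothetical name, and the 109 ms, 1.97 ms and 2.4 GHz values are simply the ones used above, so the result on another machine will differ.

```python
def cycles_per_op(seconds, clock_hz, n_ops):
    """Approximate CPU cycles spent per operation (hypothetical helper)."""
    return seconds * clock_hz / n_ops


N_ADDITIONS = 1000000   # length of the lists/arrays used above
CLOCK_HZ = 2.4e9        # clock rate assumed in the cells above

list_cycles = cycles_per_op(109e-3, CLOCK_HZ, N_ADDITIONS)    # pure-Python loop
array_cycles = cycles_per_op(1.97e-3, CLOCK_HZ, N_ADDITIONS)  # NumPy addition
speedup = list_cycles / array_cycles

print(list_cycles, array_cycles, speedup)
```

With these inputs the list version costs a few hundred cycles per addition and the array version only a handful.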
``````

In [ ]:

Image(filename="numpy_memory_model.png")

``````
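Assuming the two figures contrast a list of references to separate Python objects with one contiguous buffer of raw values, the difference can be observed from code. This is a rough sketch: `sys.getsizeof` reports CPython's per-object size, which varies by version and platform.

```python
import sys

import numpy as np

values = list(range(1000))
arr = np.array(values, dtype=np.int64)

per_int_object = sys.getsizeof(values[-1])  # bytes for one boxed Python int
per_array_item = arr.itemsize               # bytes for one raw int64 value
buffer_bytes = arr.nbytes                   # size of the contiguous data buffer

print(per_int_object, per_array_item, buffer_bytes)
```

The list additionally stores one pointer per element, which this sketch does not count.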

#### But there are some tradeoffs.

``````

In [ ]:

a = np.array([(2<<30)-1],dtype=np.int32)  #(2<<30)-1 == 2**31 - 1, the largest int32 value

``````
``````

In [ ]:

a

``````
``````

In [ ]:

a+1

``````

#### Uh-oh.
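Fixed-width integers wrap around silently when an array operation overflows. A minimal sketch of what happened above, using `np.iinfo` to look up the limits and a wider dtype as one possible workaround:

```python
import numpy as np

info = np.iinfo(np.int32)                 # limits of the fixed-width type
a = np.array([info.max], dtype=np.int32)  # same value as (2<<30)-1

wrapped = a + 1                    # result stays int32, so the value wraps
widened = a.astype(np.int64) + 1   # a wider dtype has room for the result

print(wrapped, widened)
```

Choosing the dtype up front is the usual design decision here: NumPy does not promote integers automatically the way Python's own `int` does.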

``````

In [ ]:

a = np.array([-.5])

``````
``````

In [ ]:

a

``````
``````

In [ ]:

a/0

``````
``````

In [ ]:

a**.5

``````
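By default these floating-point operations return `inf` or `nan` and only emit a warning. A short sketch, assuming you would rather get an exception, using the `np.errstate` context manager:

```python
import numpy as np

a = np.array([-0.5])

# Silence the warnings and inspect the special values that come back.
with np.errstate(divide="ignore", invalid="ignore"):
    quotient = a / 0     # division by zero gives -inf
    root = a ** 0.5      # square root of a negative number gives nan

# Ask NumPy to raise instead of returning -inf.
try:
    with np.errstate(divide="raise"):
        a / 0
    raised = False
except FloatingPointError:
    raised = True

print(quotient, root, raised)
```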

### Performance is not a free lunch either

``````

In [ ]:

temperatures_f = np.arange(32, 212)  #32..211 degrees Fahrenheit (the end point is excluded)

``````
``````

In [ ]:

temperatures_c = (temperatures_f - 32) * 5 / 9.0

``````
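The conversion above applies one arithmetic expression to every element at once. A minimal sketch that checks it against an explicit Python loop (the variable names `vectorized_c` and `looped_c` are only for illustration):

```python
import numpy as np

temperatures_f = np.arange(32, 212)

# Whole-array expression: no Python-level loop.
vectorized_c = (temperatures_f - 32) * 5 / 9.0

# The same conversion, one element at a time.
looped_c = [(f - 32) * 5 / 9.0 for f in range(32, 212)]

same = np.allclose(vectorized_c, looped_c)
print(same, vectorized_c[:3])
```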

### NumPy offers a number of convenient ways to create arrays

``````

In [ ]:

a = np.arange(0, 20, 2, dtype=None)

``````
``````

In [ ]:

np.empty((4,5), dtype=float, order='C')  #Uninitialized memory: the contents are arbitrary

``````
``````

In [ ]:

np.zeros((4,5), dtype=float, order='C')

``````
``````

In [ ]:

np.ones((4,5), dtype=float, order='C')

``````
``````

In [ ]:

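#The two inner lists have different lengths, so the result is a 1-D array of two list objects;
#recent NumPy versions require dtype=object for such ragged input
np.asarray([list(range(20)), list(range(10))], dtype=object)
#These are left commented out because "collection" and "iterable" are not defined
#np.array(collection, dtype=None, copy=True, order=None)
#np.fromiter(iterable, dtype, count=-1)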

``````
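The two commented-out constructors work once their inputs exist. A small sketch with made-up inputs; `collection` and `iterable` here are hypothetical stand-ins, not values from the original notebook:

```python
import numpy as np

collection = [[1, 2, 3], [4, 5, 6]]            # hypothetical nested list
from_list = np.array(collection, dtype=float)  # always builds a new array

iterable = (i * i for i in range(5))           # hypothetical generator
from_gen = np.fromiter(iterable, dtype=np.int64, count=5)

no_copy = np.asarray(from_list)  # input is already an array: returned as-is

print(from_list.shape, from_gen, no_copy is from_list)
```

`np.fromiter` needs an explicit `dtype` because it cannot inspect a generator's items in advance; passing `count` lets it allocate the output once.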
``````

In [ ]:

a = np.arange(12)
a

``````
``````

In [ ]:

a = a.reshape(3,4)
a

``````
``````

In [ ]:

(a*10).reshape(2,6)

``````
``````

In [ ]:

a

``````
``````

In [ ]:

a * [2,4,6,8] #The 4-element list is broadcast across each of the 3 rows

``````
``````

In [ ]:

a * [2,3,4] #This raises a ValueError: the trailing dimensions (4 and 3) do not match, so the 3-vector cannot be broadcast

``````
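Broadcasting compares shapes from the last axis backwards. A minimal sketch of the two cases above, plus one way to scale the rows by a 3-vector: give it shape `(3, 1)` so it lines up with the first axis instead.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
row = np.array([2, 4, 6, 8])   # shape (4,): matches the last axis of a
col = np.array([2, 3, 4])      # shape (3,): does not match the last axis

by_row = a * row               # every row is multiplied element-wise by row

try:
    a * col
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

by_col = a * col[:, np.newaxis]  # shape (3, 1): one factor per row of a

print(by_row)
print(mismatch_raised)
print(by_col)
```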

#### We can get views of the data by slicing

``````

In [ ]:

a = np.arange(12)
a

``````
``````

In [ ]:

b = a[::2]
b

``````
``````

In [ ]:

b[2] = -1
b

``````
``````

In [ ]:

a

``````
``````

In [ ]:

b.flags['OWNDATA']

``````
``````

In [ ]:

b.base is a

``````
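Because a slice is a view, writing to it changes the original array. A short sketch, assuming you sometimes want an independent array instead, using `.copy()` and `np.shares_memory` to tell the two apart:

```python
import numpy as np

a = np.arange(12)
view = a[::2]                # shares a's buffer
independent = a[::2].copy()  # owns its own buffer

view[0] = -1          # writes through to a[0]
independent[1] = -2   # leaves a[2] untouched

print(a)
print(np.shares_memory(a, view), np.shares_memory(a, independent))
```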

#### We can index by a list of ints, and get an array of those items

``````

In [ ]:

a = np.arange(10)*10
a

``````
``````

In [ ]:

b = a[[4,3,-2]]
b

``````
``````

In [ ]:

b.flags["OWNDATA"] #Note that this gives us a copy of the data

``````
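Since indexing with a list of ints returns a copy, changing the result leaves the original alone. Assigning through the same index expression, however, does modify the original. A minimal sketch of both behaviours:

```python
import numpy as np

a = np.arange(10) * 10
picked = a[[4, 3, -2]]       # a new array holding 40, 30, 80

picked[0] = -1               # only the copy changes
a_unchanged = (a[4] == 40)

a[[4, 3, -2]] = 0            # assignment through the index writes into a

print(picked, a_unchanged, a)
```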

#### We can index by a boolean array (a mask) as well

``````

In [ ]:

a = a.reshape((5,2))

``````
``````

In [ ]:

a[(a%3)==0].shape

``````
``````

In [ ]:

b = ((a%3)==0)

``````
``````

In [ ]:

a[b]

``````
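A boolean mask has the same shape as the array, while the selection it produces is flattened to one dimension. A small sketch that also shows where the selected items sit and how a mask can be used for assignment:

```python
import numpy as np

a = (np.arange(10) * 10).reshape((5, 2))
mask = (a % 3) == 0            # same shape as a, True where divisible by 3

selected = a[mask]             # 1-D copy of the matching elements
positions = np.argwhere(mask)  # (row, column) index of each True entry

a[mask] = -1                   # masks also work on the left-hand side

print(selected)
print(positions)
print(a)
```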

### Let's dive a bit deeper into the memory layout

``````

In [ ]:

a = np.arange(3000000)
a.shape = (5,3,200000)
a

``````
``````

In [ ]:

b = a.swapaxes(1,2) #Swap the last two axes
print(b.shape)
print(a.shape)
print(b.flags["OWNDATA"])

``````

#### When changing the shape, it helps to remember how arrays are laid out in memory

``````

In [ ]:

a.shape = (5,600000)

``````
``````

In [ ]:

a.shape = (5,3,200000) #Reset the shape of a, before we reshape a different way

``````
``````

In [ ]:

a.shape = (1000000,3)

``````
``````

In [ ]:

print(b.shape)
b.shape = (1000000,3)

``````

#### Looking at the flags of the two arrays shows why assigning `b.shape` raised an error

``````

In [ ]:

a.flags

``````
``````

In [ ]:

b.flags

``````
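The flags reflect how each array steps through the shared buffer. A sketch on a smaller array of the same kind (the 2x3x4 shape is chosen only to keep the numbers readable): `strides` shows the byte step per axis, and `reshape` falls back to a copy when no set of strides can describe the requested shape.

```python
import numpy as np

a = np.arange(24, dtype=np.int64).reshape(2, 3, 4)
b = a.swapaxes(1, 2)   # same buffer, last two axes exchanged

print(a.strides)       # bytes to step along each axis of a
print(b.strides)       # same steps, with the last two swapped

flat_a = a.reshape(8, 3)   # a is contiguous, so this is a view
flat_b = b.reshape(8, 3)   # b is not, so reshape has to copy the data

print(np.shares_memory(a, flat_a), np.shares_memory(b, flat_b))
```

Assigning to `.shape` never copies, which is why it fails where `reshape` quietly succeeds.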

### We can also do some fun graphing with matplotlib

``````

In [ ]:

import random
%pylab inline --no-import-all

``````
``````

In [ ]:

pylab.ion()
pylab.figure()
pylab.plot([random.gauss(10, 3) for i in range(30)], 'g')
pylab.ioff()

``````

#### SymPy lets us do symbolic manipulation

``````

In [ ]:

from sympy import symbols, limit, log, integrate, Integral, sqrt

``````
``````

In [ ]:

x, y = symbols('x y')

``````
``````

In [ ]:

limit(x*log(x), x, 0)

``````
``````

In [ ]:

limit(x*log(x), x, 20)

``````
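The first limit is taken at a point where `x*log(x)` is not defined, the second at an ordinary point. A minimal sketch of both; by default `limit` approaches from the right, which is written out explicitly here.

```python
from sympy import symbols, limit, log

x = symbols('x')

at_zero = limit(x * log(x), x, 0, dir='+')  # one-sided limit as x -> 0+
at_twenty = limit(x * log(x), x, 20)        # ordinary point: plain substitution

print(at_zero, at_twenty)
```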
``````

In [ ]:

integrate(x/(x**2+2*x+1), x)

``````
``````

In [ ]:

from sympy import latex, init_printing

``````
``````

In [ ]:

integrate(x/(x**2+2*x+1), x)

``````

#### That works, but it's a bit ugly. Can we do any better?

``````

In [ ]:

init_printing()

``````
``````

In [ ]:

integrate(x/(x**2+2*x+1), x)

``````
``````

In [ ]:

Integral(sqrt(1/x), x)

``````
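`integrate` evaluates immediately, while `Integral` builds an unevaluated object that is evaluated with `.doit()`. A short sketch using the rational integrand from above, with the result checked by differentiating it again:

```python
from sympy import symbols, Integral, diff, simplify

x = symbols('x')
integrand = x / (x**2 + 2*x + 1)

unevaluated = Integral(integrand, x)   # displays as an integral sign
antiderivative = unevaluated.doit()    # performs the integration

# Differentiating the result should give back the integrand.
residual = simplify(diff(antiderivative, x) - integrand)

print(unevaluated, antiderivative, residual)
```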

### We can solve equations symbolically

``````

In [ ]:

from sympy import solve, Eq
solve(Eq(x**2, 1), x)

``````
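A solution can be checked by substituting it back into the equation. A minimal sketch for the equation above:

```python
from sympy import symbols, solve, Eq

x = symbols('x')
equation = Eq(x**2, 1)
solutions = solve(equation, x)

# Each root should satisfy the left-hand side minus the right-hand side.
residuals = [(equation.lhs - equation.rhs).subs(x, s) for s in solutions]

print(solutions, residuals)
```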

### DiffEqs? No sweat!

``````

In [ ]:

from sympy import dsolve, Function, sin
f, g = symbols('f g', cls=Function)
diffeq = Eq(f(x).diff(x, x) - 2*f(x).diff(x) + f(x), sin(x))
diffeq

``````
``````

In [ ]:

dsolve(diffeq, f(x))

``````
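SymPy can also verify an ODE solution. A sketch using `checkodesol`, which substitutes the solution into the equation and reports whether the residual simplifies to zero:

```python
from sympy import symbols, Function, Eq, sin, dsolve, checkodesol

x = symbols('x')
f = Function('f')
diffeq = Eq(f(x).diff(x, x) - 2*f(x).diff(x) + f(x), sin(x))

solution = dsolve(diffeq, f(x))               # contains constants C1 and C2
ok, residual = checkodesol(diffeq, solution)  # (True, 0) when it checks out

print(solution, ok, residual)
```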
``````

In [ ]:

from sympy import Matrix
M = Matrix(( [1, 2, 1], [6, -1, 0], [-1, -2, -1] ))
M.eigenvals()

``````
``````

In [ ]:

M.eigenvects()

``````
``````

In [ ]:

M = Matrix(( [1, 2, 3], [3, 6, 2], [2, 0, 1] ))
M.eigenvals()

``````
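`eigenvects` returns triples of (eigenvalue, multiplicity, list of eigenvectors). A minimal sketch for the first matrix above, checking that each vector satisfies `M*v == value*v`:

```python
from sympy import Matrix

M = Matrix(([1, 2, 1], [6, -1, 0], [-1, -2, -1]))
eigenvalues = M.eigenvals()   # maps eigenvalue -> algebraic multiplicity

checks = []
for value, multiplicity, vectors in M.eigenvects():
    for v in vectors:
        checks.append(M * v == value * v)   # defining property of an eigenvector

print(eigenvalues, checks)
```

The second matrix has irrational eigenvalues, so the same check there would need `simplify` before comparing.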