A Jupyter notebook is a browser-based environment that integrates:

  • A kernel (Python)
  • Text
  • Executable code
  • Plots and images
  • Rendered mathematical equations

Cell

The basic unit of a Jupyter notebook is a cell. A cell can contain any of the above elements.

In a notebook, to run a cell of code, hit Shift-Enter. This executes the cell and puts the cursor in the next cell below, or makes a new one if you are at the end. Alternatively, you can use:

  • Alt-Enter to execute the cell and force the creation of a new cell unconditionally (useful when inserting new content in the middle of an existing notebook).
  • Control-Enter to execute the cell and keep the cursor in the same cell (useful for quick experimentation with snippets that you don't need to keep permanently).

Hello World


In [ ]:
print("Hello World!")

In [ ]:
# lines that begin with a # are treated as comment lines and not executed

# print("This line is not printed")

print("This line is printed")

Create a variable


In [ ]:
g = 3.0 * 2.0

In [ ]:
print(g)

or even easier:


In [ ]:
g

Datatypes

In computer programming, a data type is a classification identifying one of various types that data can have.

The most common data types we will see in this class are:

  • Integers (int): Integers are whole numbers, which may be negative, zero, or positive: ... -3, -2, -1, 0, 1, 2, 3, 4, ...

  • Floating Point (float): Floating point values are numbers with a decimal point: 1.2, 34.98, -67.23354435, ...

    • Floating point values can also be expressed in scientific notation: 1e3 = 1000
  • Booleans (bool): Boolean types can only have one of two values: True or False. In many languages 0 is considered False, and any other value is considered True.

  • Strings (str): Strings can be composed of one or more characters: 'a', 'spam', 'spam spam eggs and spam'. Usually quotes (') are used to specify a string. For example '12' would refer to the string, not the integer.
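
As a quick illustration of the four types described above, the following sketch creates one value of each and checks a few of the stated properties. The variable names are made up for this example and are not used elsewhere in the notebook.

```python
# Hypothetical names, chosen only for this illustration.
n_int = 42              # int
f_sci = 1e3             # float written in scientific notation
flag = bool(0)          # zero converts to False
s_num = '12'            # str: the characters 1 and 2, not the integer 12

print(type(n_int), type(f_sci), type(flag), type(s_num))
print(f_sci == 1000.0)      # True
print(flag, bool(-3))       # False True (any non-zero number converts to True)
print(int(s_num) + 1)       # 13 (the string must be converted before doing math)
```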

Collections of Data Types

  • Scalar: A single value of any data type.

  • List: A collection of values. May be mixed data types. [1, 2.34, 'Spam', True] including lists of lists: [1, [1,2,3], [3,4]]

  • Array: A collection of values. Must be same data type. [1,2,3,4] or [1.2, 4.5, 2.6] or [True, False, False] or ['Spam', 'Eggs', 'Spam']

  • Matrix: A multi-dimensional array: [[1,2], [3,4]] (an array of arrays).
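
In Python these collections might look as follows. This is a minimal sketch that assumes a Python list for the mixed-type case and NumPy arrays for the single-type cases; the variable names are illustrative only.

```python
import numpy as np

mixed_list = [1, 2.34, 'Spam', True]       # Python list: mixed data types are allowed
nested_list = [1, [1, 2, 3], [3, 4]]       # a list can contain other lists

int_array = np.array([1, 2, 3, 4])         # NumPy array: every element has the same data type
matrix = np.array([[1, 2], [3, 4]])        # two-dimensional array (an array of arrays)

print(len(mixed_list), len(nested_list))   # 4 3
print(matrix.shape)                        # (2, 2)
print(np.array([1, 2.5, 3]).dtype)         # float64: the integers are converted to floats
```

The last line shows what "must be same data type" means in practice: NumPy converts the values to a common type instead of storing a mixture.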


In [ ]:
a = 1
b = 2.3
c = 2.3e4
d = True
e = "Spam"

In [ ]:
type(a), type(b), type(c), type(d), type(e)

In [ ]:
a + b, type(a + b)

In [ ]:
c + d, type(c + d)    # True = 1

In [ ]:
a + e    # raises a TypeError: an int and a str cannot be added

In [ ]:
str(a) + e

NumPy (Numerical Python) is the fundamental package for scientific computing with Python.

Load the numpy library:


In [ ]:
import numpy as np

  • + - * / : standard math operations (add, subtract, multiply, divide)
  • ** : exponentiation (e.g. 2 ** 4 = 16)
  • % : remainder or modulo (e.g. 400 % 360 = 40)
  • == : equal to
  • != : not equal to
  • > : greater than
  • < : less than
  • >= : greater than or equal to
  • <= : less than or equal to
  • x += 2 : same as x = x + 2; you can also use -=, *=, /=, **=
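
The operators listed above can be tried out directly. The short sketch below uses a throwaway variable named count, which is an assumption made for this example only.

```python
print(2 ** 4)            # 16
print(400 % 360)         # 40
print(3 != 4, 3 >= 4)    # True False

count = 5                # hypothetical variable for the in-place operators
count += 2               # same as count = count + 2  -> 7
count *= 3               # -> 21
count -= 1               # -> 20
count /= 4               # -> 5.0 (division always gives a float)
count **= 2              # -> 25.0
print(count)             # 25.0
```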

pi and e are available as NumPy constants:


In [ ]:
np.pi, np.e

Arrays

  • Each element of the array has a Value
  • The position of each Value is called its Index

Our basic unit will be the NumPy array


In [ ]:
np.random.seed(42)                 # set the seed - everyone gets the same random numbers
x = np.random.randint(1,10,20)     # 20 random ints from 1 to 9 (the upper bound 10 is excluded)
x

Indexing


In [ ]:
x[0]    # The Value at Index = 0

In [ ]:
x[-1]    # The last Value in the array x

Slices

x[start:stop:step]

  • start is the first Index that you want [default = first element]
  • stop is the first Index that you do not want [default = end of the array, so the last element is included]
  • step defines size of step and whether you are moving forwards (positive) or backwards (negative) [default = 1]

In [ ]:
x

In [ ]:
x[0:4]           # first 4 items

In [ ]:
x[:4]            # same

In [ ]:
x[0:4:2]         # first four items, step = 2

In [ ]:
x[3::-1]         # first four items backwards, step = -1

In [ ]:
x[::-1]          # Reverse the array x

In [ ]:
print(x[-5:])    # last 5 elements of the array x

There are lots of different methods and attributes available on a NumPy array


In [ ]:
x.size                   # Number of elements in x

In [ ]:
x.mean()                 # Average of the elements in x

In [ ]:
x.sum()                  # Total of the elements in x

In [ ]:
x[-5:].sum()              # Total of last 5 elements in x

In [ ]:
x.cumsum()                # Cumulative sum

In [ ]:
x.cumsum()/x.sum()        # Cumulative fraction of the total

x.[TAB] will give you all of the possibilities as a drop-down menu.

In [ ]:
x.

Help about a function:


In [ ]:
?x.min

NumPy math works over an entire array:


In [ ]:
y = x * 2
y

In [ ]:
sin(x)     # NameError: plain sin is not defined; you need to use NumPy's math functions

In [ ]:
np.sin(x)

Masking - The key to fast programs


In [ ]:
mask1 = np.where(x>5)
x, mask1

In [ ]:
x[mask1], y[mask1]

In [ ]:
mask2 = np.where((x>3) & (x<7))
x[mask2]
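
np.where with a single condition returns a tuple of index arrays. As an alternative sketch, the comparison itself produces a boolean array that can be used as a mask directly. The array vals below is a stand-in created for this example so that the notebook's x is left untouched.

```python
import numpy as np

np.random.seed(42)
vals = np.random.randint(1, 10, 20)        # stand-in array for this example

bool_mask = (vals > 3) & (vals < 7)        # boolean array, one True/False per element
index_mask = np.where(bool_mask)           # tuple holding the indices where the mask is True

print(bool_mask.dtype)                     # bool
print(vals[bool_mask])                     # values between 3 and 7 (exclusive)
print(np.array_equal(vals[bool_mask], vals[index_mask]))   # True: both forms select the same values
```

Both forms select the same elements; the boolean form skips the extra np.where call, while the index form is handy when you need the positions themselves.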

Fancy masking


In [ ]:
mask3 = np.where(x >= 8)
x[mask3]

In [ ]:
# Set all values of x that match mask3 to 0

x[mask3] = 0
x

In [ ]:
mask4 = np.where(x != 0)
mask4

In [ ]:
# Add 100 to every value of x that matches mask4:

x[mask4] += 100
x

Sorting


In [ ]:
np.random.seed(13)                 # set the seed - everyone gets the same random numbers
z = np.random.randint(1,10,20)     # 20 random ints from 1 to 9 (the upper bound 10 is excluded)
z

In [ ]:
np.sort(z)

In [ ]:
np.sort(z)[0:4]

In [ ]:
# Returns the indices that would sort an array

np.argsort(z)

In [ ]:
z, z[np.argsort(z)]

In [ ]:
maskS = np.argsort(z)

z, z[maskS]
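
One common use of np.argsort is to reorder a second array so that it follows the sorted order of the first. The sketch below uses two small made-up arrays, scores and labels, which are assumptions for this example.

```python
import numpy as np

scores = np.array([7, 2, 9, 4])            # hypothetical values to sort by
labels = np.array(['a', 'b', 'c', 'd'])    # hypothetical companion array

order = np.argsort(scores)                 # indices that would sort scores

print(order)                               # [1 3 0 2]
print(scores[order])                       # [2 4 7 9]
print(labels[order])                       # ['b' 'd' 'a' 'c']
print(scores[order][::-1])                 # [9 7 4 2] (descending order via a reversed slice)
```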

Control Flow

Like most programming languages, Python supports the standard types of control flow, including:

  • IF statements
  • FOR loops

In [ ]:
xx = -1

if xx > 0:
    print("This number is positive")
else:
    print("This number is NOT positive")

In [ ]:
xx = 0

if xx > 0:
    print("This number is positive")
elif xx == 0:
    print("This number is zero")
else:
    print("This number is negative")

For loops work differently in Python than in many other languages.

You do not need to specify the beginning and end values of the loop


In [ ]:
z

In [ ]:
for value in z:
    print(value)

In [ ]:
for idx,val in enumerate(z):
    print(idx,val)

In [ ]:
for idx,val in enumerate(z):
    if (val > 5):
        z[idx] = 0

In [ ]:
for idx,val in enumerate(z):
    print(idx,val)

Loops are slow in Python. Do not use them if you do not have to!


In [ ]:
np.random.seed(42)
BigZ = np.random.random(10000)    # 10,000 value array
BigZ[:10]

In [ ]:
# This is slow!

for Idx,Val in enumerate(BigZ):
    if (Val > 0.5):
        BigZ[Idx] = 0

BigZ[:10]

In [ ]:
%%timeit

for Idx,Val in enumerate(BigZ):
    if (Val > 0.5):
        BigZ[Idx] = 0

In [ ]:
# Masks are MUCH faster

mask = np.where(BigZ>0.5)
BigZ[mask] = 0

BigZ[:10]

In [ ]:
%%timeit -o

mask = np.where(BigZ>0.5)
BigZ[mask] = 0
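
One caveat with the timing cells above: if the cells are run in order, BigZ has already had its values above 0.5 set to 0 before the timing starts, so the timed code never performs the assignment. The following sketch is one possible way to compare the two approaches on a fresh copy each time, using Python's standard timeit module so it also works outside a notebook. The function names are made up for this example.

```python
import timeit
import numpy as np

np.random.seed(42)
original = np.random.random(10000)         # 10,000 value array, never modified

def zero_with_loop(values):
    out = values.copy()                    # work on a copy so every run starts from the same data
    for idx, val in enumerate(out):
        if val > 0.5:
            out[idx] = 0
    return out

def zero_with_mask(values):
    out = values.copy()
    out[out > 0.5] = 0                     # boolean mask, no explicit loop
    return out

# Both approaches should give exactly the same result
print(np.array_equal(zero_with_loop(original), zero_with_mask(original)))

loop_time = timeit.timeit(lambda: zero_with_loop(original), number=20)
mask_time = timeit.timeit(lambda: zero_with_mask(original), number=20)
print(f"loop: {loop_time:.4f} s   mask: {mask_time:.4f} s   (20 runs each)")
```

The exact timings depend on the machine, so no expected numbers are given here.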

Functions

In computer science, a function (also called a procedure, method, subroutine, or routine) is a portion of code within a larger program that performs a specific task and is relatively independent of the remaining code. The big advantage of a function is that it breaks a program into smaller, easier to understand pieces. It also makes debugging easier. A function can also be reused in another program.

The basic idea of a function is that it will take various values, do something with them, and return a result. Variables assigned inside a function are local. That means that they do not change variables of the same name outside the function.
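
A small sketch of what "local" means in practice. The functions double_it and zero_first are hypothetical names used only here. Note the caveat in the second half: assigning to a name inside a function is local, but modifying the contents of an array that was passed in changes the caller's array.

```python
import numpy as np

def double_it(value):
    result = value * 2          # this result exists only inside the function
    return result

result = 'unchanged'            # a different variable that happens to share the name
doubled = double_it(21)
print(doubled)                  # 42
print(result)                   # unchanged

def zero_first(arr):
    arr[0] = 0                  # in-place change: this modifies the array that was passed in

data = np.array([1, 2, 3])
zero_first(data)
print(data)                     # [0 2 3]
```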

Below is a simple example of a function that evaluates the equation:

$f(x,y) = x^2 \sin(y)$

In the example the name of the function is find_f (you can name functions whatever you want). The function find_f takes two arguments x and y, and returns the value of the equation to the main program. In the main program a variable named value_f is assigned the value returned by find_f. Notice that in the main program the function find_f is called using the arguments array_x and array_y. Since the variables in the function are local, you do not have to name them x and y in the main program.


In [ ]:
def find_f(x,y):
    
    result = (x ** 2) * np.sin(y)           # assign the variable result the value of the function
    return result                           # return the value of the function to the main program

In [ ]:
np.random.seed(42)

array_x = np.random.rand(10) * 10
array_y = np.random.rand(10) * 2.0 * np.pi

In [ ]:
array_x, array_y

In [ ]:
value_f = find_f(array_x,array_y)

value_f

The results of one function can be used as the input to another function


In [ ]:
def find_g(z):
    
    result = z / np.e
    return result

In [ ]:
find_g(value_f)

In [ ]:
find_g(find_f(array_x,array_y))

Creating Arrays

NumPy has a wide variety of ways of creating arrays; see the "Array creation routines" section of the NumPy documentation.


In [ ]:
# a new array filled with zeros

array_0 = np.zeros(10)

array_0

In [ ]:
# a new array filled with ones

array_1 = np.ones(10)

array_1

In [ ]:
# a new array filled with evenly spaced values within a given interval

array_2 = np.arange(10,20)

array_2

In [ ]:
# a new array filled with evenly spaced numbers over a specified interval (start, stop, num)

array_3 = np.linspace(10,20,5)

array_3

In [ ]:
# a new array filled with evenly spaced numbers over a log scale (start, stop, num, base=...)

array_4 = np.logspace(1, 2, 5, base=10)    # pass base by keyword: the 4th positional argument is endpoint

array_4
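
One detail that is easy to miss with these routines is how the stop value is treated. The sketch below, with illustrative variable names, shows that np.arange excludes stop while np.linspace and np.logspace include it by default.

```python
import numpy as np

stepped = np.arange(10, 20, 5)             # start, stop, step: stop is excluded
spaced = np.linspace(10, 20, 3)            # start, stop, num: stop is included
logged = np.logspace(1, 2, 3, base=10)     # base ** start up to base ** stop, stop included

print(stepped)                             # [10 15]
print(spaced)                              # [10. 15. 20.]
print(logged)                              # approximately 10, 31.6, 100
```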