Class 3: NumPy (and a quick string example)

Brief introduction to the NumPy module.

Preliminary example

I recently found myself needing to copy and paste names and email addresses from an email header. I required the names and emails to be formatted like this:

Name 1    Email 1
Name 2    Email 2
Name 3    Email 3

But what I had was this:

"Carl Friedrich Gauss" <approximatelynormal@email.com>, "Leonhard Euler" <e@email.com>, "Bernhard Riemann" <zeta@email.com>

Sure, I could manually go through and delete the characters that aren't required. The manual approach would be fine for a small list, but the exercise would quickly become obnoxious as the list of names grows.

Python is great for modifying strings. The string method that we want to use is replace(). replace() has two required arguments: old and new. old is the substring that is to be replaced and new is what replaces it. The replace() method does not change the value of the original string; it returns a new string.

For example, suppose that we want to remove every 'p' from the string 'apple'.



In [ ]:

    
# Create a variable that stores the string 'apple'
a = 'apple'

# Create a copy of a with the ps removed and reassign the value of a
a = a.replace('p','')
print(a)

You can apply the replace() method multiple times:



In [ ]:

    
# Create a variable that stores the string 'apple'
a = 'apple'

# Create a copy of a with the ps, l, and e removed and reassign the value of a
a = a.replace('p','').replace('l','').replace('e','')
print(a)

Now we have the tools to solve the email problem.



In [ ]:

    
# Original character string
string = '"Carl Friedrich Gauss" <approximatelynormal@email.com>, "Leonhard Euler" <e@email.com>, "Bernhard Riemann" <zeta@email.com>'

# Remove <, >, and " from string and overwrite and print the result


# Create a new variable called string_formatted with the commas replaced by the new line character '\n'


# Print string_formatted
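
One possible way to fill in the cell above is sketched here. It follows the steps in the comments, with one assumption that is not in the original: each comma is replaced together with the space after it (', '), so that the new lines do not begin with a stray space. The name and the email on each line end up separated by a single space.

```python
# Original character string
string = '"Carl Friedrich Gauss" <approximatelynormal@email.com>, "Leonhard Euler" <e@email.com>, "Bernhard Riemann" <zeta@email.com>'

# Remove <, >, and " from string and overwrite and print the result
string = string.replace('<', '').replace('>', '').replace('"', '')
print(string)

# Replace each comma (and the space after it) with the new line character '\n'
# (assumption: ', ' is replaced rather than ',' to avoid leading spaces)
string_formatted = string.replace(', ', '\n')

# Print string_formatted
print(string_formatted)
```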

A related problem might be to extract only the email addresses from the original string. To do this, we can use the replace() method to remove the '<', '>', and ',' characters. Then we use the split() method to break the string apart at the spaces. Then we loop over the resulting list of strings and keep only the strings that contain an '@' character.



In [ ]:

    
string = '"Carl Friedrich Gauss" <approximatelynormal@email.com>, "Leonhard Euler" <e@email.com>, "Bernhard Riemann" <zeta@email.com>'
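
A minimal sketch of the approach described above. The variable names cleaned, pieces, and emails are illustrative choices, not part of the original notes.

```python
string = '"Carl Friedrich Gauss" <approximatelynormal@email.com>, "Leonhard Euler" <e@email.com>, "Bernhard Riemann" <zeta@email.com>'

# Remove the '<', '>', and ',' characters
cleaned = string.replace('<', '').replace('>', '').replace(',', '')

# Break the string apart at the spaces
pieces = cleaned.split(' ')

# Keep only the pieces that contain an '@' character
emails = []
for piece in pieces:
    if '@' in piece:
        emails.append(piece)

print(emails)
```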

NumPy

NumPy is a powerful Python module for scientific computing. Among other things, NumPy defines an N-dimensional array object that is especially convenient to use for plotting functions and for simulating and storing time series data. NumPy also defines many useful mathematical functions, such as the sine, cosine, and exponential functions, and has functions for probability and statistics, including random number generators that draw samples from many probability distributions.

Importing NumPy

The standard way to import NumPy is to give it the alias np, so that everything in the module is accessed through the np namespace. This is for the sake of brevity.



In [ ]:
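
The import described above would typically look like the following. The call to np.sqrt() is only there to illustrate the np. prefix, and the variable root is an illustrative name.

```python
# Import NumPy under the conventional alias np
import numpy as np

# NumPy functions are then accessed with the np. prefix
root = np.sqrt(16)
print(root)
```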

NumPy arrays

A NumPy ndarray is a homogeneous multidimensional array. Here, homogeneous means that all of the elements of the array have the same type. An ndarray is a table of numbers (like a matrix but with possibly more dimensions) indexed by a tuple of non-negative integers. The dimensions of NumPy arrays are called axes and the number of axes is called the rank (stored in the ndim attribute). For this course, we will work almost exclusively with 1-dimensional arrays that are effectively vectors. Occasionally, we might run into a 2-dimensional array.

Basics

The most straightforward way to create a NumPy array is to call the array() function which takes as an argument a list. For example:



In [ ]:

    
# Create a variable called a1 equal to a numpy array containing the numbers 1 through 5


# Find the type of a1


# find the shape of a1


# Use ndim to find the rank or number of dimensions of a1
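
A possible completion of the cell above, shown as a sketch:

```python
import numpy as np

# Create a variable called a1 equal to a numpy array containing the numbers 1 through 5
a1 = np.array([1, 2, 3, 4, 5])
print(a1)

# Find the type of a1
print(type(a1))

# Find the shape of a1 (a tuple with one entry per axis)
print(a1.shape)

# Use ndim to find the rank or number of dimensions of a1
print(a1.ndim)
```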



In [ ]:

    
# Create a variable called a2 equal to a 2-dimensional numpy array containing the numbers 1 through 4


# find the shape of a2


# Use ndim to find the rank or number of dimensions of a2
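
One way to build the 2-dimensional array is to pass array() a list of lists, one inner list per row. The 2x2 layout is an assumption; the notes only say that the array holds the numbers 1 through 4.

```python
import numpy as np

# Create a 2-dimensional array from a list of rows (assumed 2x2 layout)
a2 = np.array([[1, 2], [3, 4]])
print(a2)

# Find the shape of a2: (number of rows, number of columns)
print(a2.shape)

# Use ndim to find the rank or number of dimensions of a2
print(a2.ndim)
```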



In [ ]:

    
# Create a variable called a3 equal to an empty numpy array


# find the shape of a3


# Use ndim to find the rank or number of dimensions of a3
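
A sketch of the empty-array case, assuming the array is created by passing an empty list to array(). Note that the result is still 1-dimensional; it simply has zero elements along its only axis.

```python
import numpy as np

# Create an empty array by passing an empty list to array()
a3 = np.array([])
print(a3)

# Find the shape of a3
print(a3.shape)

# Use ndim to find the rank or number of dimensions of a3
print(a3.ndim)
```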

Special functions for creating arrays

NumPy has several built-in functions that can assist you in creating certain types of arrays: arange(), zeros(), and ones(). Of these, arange() is probably the most useful because it allows you to create an array of numbers by specifying the initial value in the array, an upper limit that the values stay below, and a step size between elements. arange() has three arguments: start, stop, and step:

arange([start,] stop[, step,])

The stop argument is required. The default for start is 0 and the default for step is 1. Note that stop itself is never included: the values in the created array end before reaching stop. That is, if arange() is called with stop equal to 9 and step equal to 0.5, then the last value in the returned array will be 8.5.



In [ ]:

    
# Create a variable called b that is equal to a numpy array containing the numbers 1 through 5
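
A possible completion using arange(). Because stop is excluded, stop is set to 6 so that the array ends at 5.

```python
import numpy as np

# start = 1, stop = 6 (excluded), default step = 1
b = np.arange(1, 6)
print(b)
```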



In [ ]:

    
# Create a variable called c that is equal to a numpy array containing the numbers 0 through 10
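
A sketch for this cell, followed by the step-size example from the text above. The name half_steps is illustrative, and its start value of 0 is an assumption, since the text only specifies stop and step.

```python
import numpy as np

# Only stop is given, so start defaults to 0 and step defaults to 1
c = np.arange(11)
print(c)

# stop = 9 and step = 0.5: the last value is 8.5 because 9 is excluded
half_steps = np.arange(0, 9, 0.5)
print(half_steps[-1])
```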

The zeros() and ones() functions take as an argument the desired shape of the array to be returned and fill that array with either zeros or ones.



In [ ]:

    
# Construct a 1x5 array of zeros
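
A sketch for this cell. Here "1x5" is read literally as a 2-dimensional array with 1 row and 5 columns, so the shape is passed as the tuple (1, 5). If a 1-dimensional array of five zeros is wanted instead, np.zeros(5) would do. The variable name z is illustrative.

```python
import numpy as np

# The shape is given as a tuple: 1 row, 5 columns
z = np.zeros((1, 5))
print(z)
print(z.shape)
```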



In [ ]:

    
# Construct a 2x2 array of ones
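
A sketch for this cell, with the illustrative variable name o:

```python
import numpy as np

# The shape is given as a tuple: 2 rows, 2 columns
o = np.ones((2, 2))
print(o)
```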

Math with NumPy arrays

A nice aspect of NumPy arrays is that they are optimized for mathematical operations. The standard Python arithmetic operators +, -, *, /, and ** operate element-wise on NumPy arrays, as the following examples indicate.



In [ ]:

    
# Define three 1-dimensional arrays
A = np.array([2,4,6])
B = np.array([3,2,1])
C = np.array([-1,3,2,-4])



In [ ]:

    
# Multiply A by a constant



In [ ]:

    
# Exponentiate A



In [ ]:

    
# Add A and B together



In [ ]:

    
# Exponentiate A with B
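
The element-wise operations in the preceding cells could be sketched as follows. The constant 3 and the exponent 2 are arbitrary choices for illustration, and "exponentiate A" is read here as raising each element to a power with **. Result names such as scaled and squared are illustrative.

```python
import numpy as np

A = np.array([2, 4, 6])
B = np.array([3, 2, 1])

# Multiply A by a constant: every element is multiplied by 3
scaled = 3 * A
print(scaled)

# Exponentiate A: every element is squared
squared = A**2
print(squared)

# Add A and B together: elements are added pairwise
total = A + B
print(total)

# Exponentiate A with B: each element of A is raised to the matching element of B
powered = A**B
print(powered)
```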



In [ ]:

    
# Add A and C together

The error in the preceding example arises because addition is element-wise and A and C don't have the same shape.
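
To see the error without stopping the notebook, the addition can be wrapped in try/except. This is only a sketch: the flag error_raised is an illustrative name, and the exact wording of the message may differ between NumPy versions.

```python
import numpy as np

A = np.array([2, 4, 6])         # shape (3,)
C = np.array([-1, 3, 2, -4])    # shape (4,)

error_raised = False
try:
    # Element-wise addition needs compatible shapes, so this raises ValueError
    result = A + C
except ValueError as err:
    error_raised = True
    print('ValueError:', err)
```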



In [ ]:

    
# Compute the sine of the values in A
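
A sketch using np.sin(), which is applied to each element of the array and treats the values as radians. The name sine_of_A is illustrative.

```python
import numpy as np

A = np.array([2, 4, 6])

# np.sin() returns a new array containing the sine of each element
sine_of_A = np.sin(A)
print(sine_of_A)
```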

Iterating through NumPy arrays

NumPy arrays are iterable objects, just like lists, strings, tuples, and dictionaries, which means that you can use for loops to iterate through their elements.



In [ ]:

    
# Use a for loop with a NumPy array to print the numbers 0 through 4
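
A possible completion, assuming the array is built with arange(). The names values and x are illustrative.

```python
import numpy as np

# arange(5) contains 0, 1, 2, 3, 4
values = np.arange(5)

# Each pass through the loop assigns the next element of the array to x
for x in values:
    print(x)
```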

Example: Basel problem

One of my favorite math equations is:

\begin{align} \sum_{n=1}^{\infty} \frac{1}{n^2} & = \frac{\pi^2}{6} \end{align}

We can use an iteration through a NumPy array to approximate the left-hand side and verify the validity of the expression.



In [ ]:

    
# Set N equal to the number of terms to sum


# Initialize a variable called summation equal to 0


# loop over the numbers 1 through N


# Print the approximation and the exact solution
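
One way to fill in the cell above is sketched here. The choice N = 1000 is arbitrary. Since every term is positive, the partial sum always falls slightly below the exact value, and the gap shrinks as N grows.

```python
import numpy as np

# Set N equal to the number of terms to sum (arbitrary choice)
N = 1000

# Initialize a variable called summation equal to 0
summation = 0

# Loop over the numbers 1 through N (stop is N + 1 because stop is excluded)
for n in np.arange(1, N + 1):
    summation = summation + 1 / n**2

# Print the approximation and the exact solution
print('approximation:', summation)
print('exact:        ', np.pi**2 / 6)
```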