Python packages

We do specialized tasks in Python with packages. A package is a collection of Python functions that someone wrote and bundled together for you to use. Some of the Python packages that we'll learn to use include:

Package       Uses
numpy         Math with arrays (more on this below)
scipy         A math toolkit built for use by scientists
matplotlib    Visualization (plotting!)
astropy       Astronomy-specific functions of all kinds

Numpy

Numpy is the most important package that we're going to teach you about, because it allows you to do calculations very quickly with Python. Below, we'll discover why it's useful.


Let's say you want to take the $\sin$ or $\cos$ of an angle. There are numpy functions that do this for you.

To gain access to numpy's functions, you always need to do this command first:

import numpy as np

Run the line above in the cell below:


In [ ]:

Now there's a package stored in the variable called np that we can access anywhere in this notebook. There are functions for $\sin$ and $\cos$ that live within numpy. The way to access a function within a package is by writing the package name with a period after it, then the name of the function you want. So for $\sin$, you can do:

np.sin(0)

The np. part says "give me this function from numpy". The sin() part says "the function that I want to use is $\sin$", and the 0 is the angle that we want to take the $\sin$ of, in units of radians. Run that line in the cell below, and experiment with different angles. Try np.cos too.


In [ ]:

Numpy also has some built-in numbers that you might use. For example, $\pi$ is stored (to high precision!) in np.pi. Print out numpy's $\pi$ in the cell below:


In [ ]:
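
As a small sketch of how np.pi and the trig functions fit together, here are a few angles whose answers you already know. The variable names are just for illustration, and np.deg2rad (which converts degrees to radians) is a numpy function that hasn't come up yet in this notebook:

```python
import numpy as np

# Angles go in as radians, so a right angle is pi/2.
sin_of_right_angle = np.sin(np.pi / 2)   # exact answer: 1
cos_of_half_turn = np.cos(np.pi)         # exact answer: -1

# If your angle is in degrees, convert it to radians first.
sin_of_90_degrees = np.sin(np.deg2rad(90))

print(sin_of_right_angle, cos_of_half_turn, sin_of_90_degrees)
```

If you forget the conversion and type np.sin(90), numpy will happily compute the sine of 90 radians, which is probably not what you meant.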

Now let's say you had a list of angles, like angles$= [0, \pi/2, \pi, 3\pi/2, 2\pi]$. You could call np.sin(angles[0]) to get the $\sin$ of the first angle, then np.sin(angles[1]) on the second angle, etc. But that would be a really slow way to do it!

Arrays

The quick way is to create a numpy array. A numpy array is a vector or matrix of numbers which numpy can act on more efficiently than Python can with ordinary lists.

Let's make a numpy array filled with the angles above:

# First, here's the list that we want to have an array of: 
angle_list = [0, 1/2 * np.pi, np.pi, 3/2 * np.pi, 2 * np.pi]

# Here's how we make a numpy array out of the list
angle_array = np.array(angle_list)

Write out those lines in the cell below.

Let's break down the command np.array(angle_list). The np. says we're going to use a function from numpy, the array() says we're going to make an array out of the thing in the parentheses, and the angle_list is the input or the argument of the function.
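
Here is a short sketch of what np.array gives you back for a flat list (a vector) and for a list of lists (a matrix). The names vector and matrix are made up for this example:

```python
import numpy as np

# A flat list becomes a one-dimensional array (a vector).
vector = np.array([1.0, 2.0, 3.0])

# A list of equal-length lists becomes a two-dimensional array (a matrix).
matrix = np.array([[1, 2, 3], [4, 5, 6]])

print(vector.shape)   # (3,)
print(matrix.shape)   # (2, 3)
print(type(vector))   # <class 'numpy.ndarray'>
```

The .shape attribute tells you how many elements the array has along each dimension, which is handy for checking that an array looks the way you expect.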

Now you can do things with the numpy array that you couldn't do with a Python list. Here are some of them, which you should experiment with in the cell below:

# Sum of all elements in the array:
angle_array.sum()

# Mean of all elements in the array:
angle_array.mean()

# Maximum of the elements in the array:
angle_array.max()

# Minimum of the elements in the array: 
angle_array.min()

# Standard deviation of the elements in the array: 
angle_array.std()

In [ ]:
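
To check that these methods do what you expect, it can help to try them on a tiny array where you can work out the answers by hand. The array sample below is a made-up example:

```python
import numpy as np

sample = np.array([1, 2, 3, 4])

print(sample.sum())    # 10
print(sample.mean())   # 2.5
print(sample.max())    # 4
print(sample.min())    # 1

# By default, std() divides by N (the "population" standard deviation).
population_std = sample.std()        # about 1.118

# Passing ddof=1 divides by N - 1 instead (the "sample" standard deviation).
sample_std = sample.std(ddof=1)      # about 1.291
print(population_std, sample_std)
```

Which version of the standard deviation you want depends on your data, so it's worth knowing that numpy's default divides by N.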

Exercise 1: Calculations with arrays

What are the $\sin$ and $\cos$ of each angle? Use the numpy array angle_array as the argument to the np.sin and np.cos functions in the cell below:


In [ ]:

Now you might be saying: wait a minute, $\sin(\pi) = 0$, not $\approx$1e-16, what's that about? The short answer is that computers often get very, very close to the numbers that we actually want, but not all of the way there. You can get better precision if you tell the computer to use more memory, which we can talk about after class if you like.
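
To see the rounding for yourself, here is a sketch using np.sin(np.pi), whose exact value is 0. It also shows np.isclose, a numpy function not covered elsewhere in this notebook, which checks whether two numbers are equal to within a small tolerance:

```python
import numpy as np

almost_zero = np.sin(np.pi)
print(almost_zero)   # a tiny number near 1e-16, not exactly 0

# Testing for exact equality fails because of the rounding...
is_exactly_zero = (almost_zero == 0)
print(is_exactly_zero)    # False

# ...but np.isclose compares within a small tolerance.
is_nearly_zero = np.isclose(almost_zero, 0)
print(is_nearly_zero)     # True
```

This is why it's usually safer to ask "are these two floats close?" than "are these two floats equal?".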


Creating arrays

There are lots of situations where you'll want to create a certain kind of array, and numpy has functions to help.

You can make an array of consecutive integers from zero to nine with the function np.arange:

consecutive_integers = np.arange(10)

This function returns different things depending on the number of arguments that you give it. If there's one number in the parentheses, it tells np.arange what number to stop at (exclusive), and the array starts at zero. If there are two numbers, they signify np.arange(start, stop). If there are three numbers, they signify np.arange(start, stop, step). For example, np.arange(1, 9, 2) would start at 1, stop before 9, with a step size of 2, so it would return [1, 3, 5, 7].
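
As a quick sketch, here is np.arange called with one, two, and three arguments. The variable names are made up for this example:

```python
import numpy as np

# One argument: start at 0, stop before 5.
one_argument = np.arange(5)
print(one_argument)        # [0 1 2 3 4]

# Two arguments: start at 2, stop before 6.
two_arguments = np.arange(2, 6)
print(two_arguments)       # [2 3 4 5]

# Three arguments: start at 1, stop before 9, step by 2.
three_arguments = np.arange(1, 9, 2)
print(three_arguments)     # [1 3 5 7]
```

In every case the stop value itself is left out of the array, just like with Python's built-in range.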

In the cell below, make an array with 10,000 sequential integers, starting with zero. Save the array into a variable called consecutive_integers, and print it out:


In [ ]:

You'll see that numpy is polite. It knows you probably don't want to see all ten thousand integers, so it prints just the beginning and end of the array.

Exercise 2: Indexing/slicing numpy arrays

The same indexing and slicing rules that we learned for lists work on arrays. In the cell below, print the 42nd element of the consecutive_integers array, and print the 101st through 103rd (inclusive) elements of the array:


In [ ]:
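
As a reminder of how indexing and slicing count, here is a sketch on a small made-up array called small_array. Since counting starts at zero, the n-th element lives at index n - 1:

```python
import numpy as np

small_array = np.arange(10, 20)   # [10 11 12 13 14 15 16 17 18 19]

# The 3rd element is at index 2.
third_element = small_array[2]
print(third_element)              # 12

# A slice includes the start index but stops before the end index,
# so [3:6] picks out the 4th, 5th, and 6th elements.
fourth_to_sixth = small_array[3:6]
print(fourth_to_sixth)            # [13 14 15]

# Negative indices count from the end.
last_element = small_array[-1]
print(last_element)               # 19
```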

Unlike lists, you can do arithmetic with numpy arrays. For example, if you had the following list:

heights = [162, 185, 174, 191]

and you tried to add one to the list, this is what you would get:

print(heights + 1)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-23-ca23c21090bd> in <module>()
      1 heights = [162, 185, 174, 191]
----> 2 print(heights + 1)

TypeError: can only concatenate list (not "int") to list

However, this does work if heights is a numpy array.
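
Here is a minimal sketch of that difference, using a made-up array of lengths rather than the heights, so you can still work through the exercise below yourself:

```python
import numpy as np

lengths_cm = np.array([10, 20, 30])

# Arithmetic on an array is applied to every element at once.
plus_one = lengths_cm + 1
print(plus_one)          # [11 21 31]

squared = lengths_cm ** 2
print(squared)           # [100 400 900]

in_meters = lengths_cm / 100
print(in_meters)

# Compare with a plain list, where * means "repeat", not "multiply".
repeated_list = [10, 20, 30] * 2
print(repeated_list)     # [10, 20, 30, 10, 20, 30]
```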

Exercise 3: Array arithmetic

In the cell below:

  1. Create a variable called heights_array, which contains a numpy array of heights (using the np.array function we learned above)
  2. Try adding, multiplying, subtracting, dividing, exponentiating the array

In [ ]:

Exercise 4: Inequalities

You can also evaluate inequalities with whole arrays at once. Find which values of the array above are greater than 180:


In [ ]:

Notice - numpy arrays don't have to contain numbers (floats and integers). They can be booleans, and other things too!

A boolean is a special type with the value True or False.
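
For instance, here is a sketch of a comparison producing an array of booleans. The array temperatures and its values are invented for this example:

```python
import numpy as np

temperatures = np.array([12, 25, 18, 30])

# Comparing an array to a number checks every element at once.
is_warm = temperatures > 20
print(is_warm)         # [False  True False  True]
print(is_warm.dtype)   # bool

# True counts as 1 and False as 0, so summing counts the True values.
number_warm = is_warm.sum()
print(number_warm)     # 2
```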

Exercise 5: Fancy indexing

We now want to print only the heights in the array that are greater than 180. Given what you know so far, print the heights greater than 180, by accessing them with their indices, i.e. something like

print(heights_array[  ])
                    ^^
             put an index here

In [ ]:

In Exercise 4, you found that you can figure out which numbers in the array were greater than 180 all at once. It turns out, if you save that array of booleans:

heights_gt_180 = heights_array > 180

You can use heights_gt_180 like a group of indices on heights_array to get just the heights where heights_gt_180 == True.

In the cell below, try:

heights_gt_180 = heights_array > 180
print(heights_array[heights_gt_180])

Did it print out the right heights? What if you flip the greater than to a less than? What if you try == instead?


In [ ]:

Putting it all together

Using all of these skills together, let's do something that we couldn't do easily with a scientific calculator. Let's find the sum of all of the positive, even numbers less than 10,000.

We'll do that in a few steps below:
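
One possible version of those steps is sketched here. This is not necessarily the way we'll do it in class. It assumes the % operator (the remainder after division), which hasn't been introduced in this notebook, and the variable names are made up:

```python
import numpy as np

# Step 1: make an array of every positive integer less than 10,000.
numbers = np.arange(1, 10000)

# Step 2: make a boolean array that is True wherever the number is even.
# A number is even when its remainder after dividing by 2 is zero.
is_even = numbers % 2 == 0

# Step 3: use the booleans as indices to keep only the even numbers.
even_numbers = numbers[is_even]

# Step 4: add them all up.
total = even_numbers.sum()
print(total)   # 24995000
```

You could also skip the boolean step by asking np.arange for a step size of 2, as in np.arange(2, 10000, 2).sum(). The version above is longer, but it works for any condition you can write as an inequality or equality.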


In [ ]:


In [ ]:


In [ ]:


In [ ]:


Exercise 6

Using the above steps as a template, figure out the sum of the odd numbers less than 100,000:


In [ ]:


In [ ]: