We do specialized tasks in Python with packages. A package is a collection of Python functions that someone wrote and bundled together for you to use. Some of the Python packages that we'll learn to use include:
Package | Uses |
---|---|
numpy | Math with arrays (more on this below) |
scipy | A math toolkit built for use by scientists |
matplotlib | Visualization (plotting!) |
astropy | Astronomy-specific functions of all kinds |
Numpy is the most important package that we're going to teach you about, because it allows you to do calculations very quickly with Python. Below, we'll discover why it's useful.
Let's say you want to take the $\sin$ or $\cos$ of an angle. There are numpy functions that do this for you.
To gain access to numpy's functions, you always need to do this command first:
import numpy as np
Run the line above in the cell below:
In [1]:
import numpy as np
Now there's a package stored in the variable called np that we can access anywhere in this notebook. There are functions for $\sin$ and $\cos$ that live within numpy. The way to access a function within a package is to write the package name with a period after it, then the name of the function you want. So for $\sin$, you can do:
np.sin(0)
The np. part says "give me this function from numpy". The sin() part says "the function that I want to use is $\sin$", and the 0 is the angle that we want to take the $\sin$ of, in units of radians. Run that line in the cell below, and experiment with different angles. Try np.cos too.
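As a quick sketch of what those calls look like, here are a few angles where the answer is easy to check by hand. The variable names are made up for illustration, and np.pi is numpy's stored value of $\pi$:

```python
import numpy as np

# Angles are always given in radians, not degrees.
sin_of_zero = np.sin(0)                   # sine of 0 radians
cos_of_zero = np.cos(0)                   # cosine of 0 radians
sin_of_quarter_turn = np.sin(np.pi / 2)   # pi/2 radians is 90 degrees

print(sin_of_zero)          # 0.0
print(cos_of_zero)          # 1.0
print(sin_of_quarter_turn)  # 1.0
```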
In [ ]:
Numpy also has some built-in numbers that you might use. For example, $\pi$ is stored (to high precision!) in np.pi. Print out numpy's $\pi$ in the cell below:
In [ ]:
Now let's say you had a list of angles, like angles $= [0, \pi/2, \pi, 3\pi/2, 2\pi]$. You could call np.sin(angles[0]) to get the $\sin$ of the first angle, then np.sin(angles[1]) on the second angle, etc. But that would be a really slow way to do it!
The quick way is to create a numpy array. A numpy array is a vector or matrix of numbers which numpy can act on more efficiently than Python can with ordinary lists.
Let's make a numpy array filled with the angles above:
# First, here's the list that we want to have an array of:
angle_list = [0, 1/2 * np.pi, np.pi, 3/2 * np.pi, 2 * np.pi]
# Here's how we make a numpy array out of the list
angle_array = np.array(angle_list)
Write out those lines in the cell below.
Let's break down the command np.array(angle_list). The np. says we're going to use a function from numpy, the array() says we're going to make an array out of the thing in the parentheses, and the angle_list is the input or the argument of the function.
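To see why the array is the quick way, here is a small sketch comparing the one-angle-at-a-time approach with a single call on the whole array. The names sines_one_by_one and sines_all_at_once are made up for this example:

```python
import numpy as np

angle_list = [0, 1/2 * np.pi, np.pi, 3/2 * np.pi, 2 * np.pi]
angle_array = np.array(angle_list)

# Slow way: call np.sin once for every angle in the list
sines_one_by_one = [np.sin(angle) for angle in angle_list]

# Quick way: call np.sin once on the whole array.
# The result is a new array with the sine of each element.
sines_all_at_once = np.sin(angle_array)

print(sines_all_at_once)
```

Both approaches give the same five numbers. Entries that should be exactly zero may print as very tiny numbers instead.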
Now you can do things with the numpy array that you couldn't do with a Python list. Here are some of them, which you should experiment with in the cell below:
# Sum of all elements in the array:
angle_array.sum()
# Mean of all elements in the array:
angle_array.mean()
# Maximum of the elements in the array:
angle_array.max()
# Minimum of the elements in the array:
angle_array.min()
# Standard deviation of the elements in the array:
angle_array.std()
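If you'd like to check these against numbers you can work out by hand, here is a sketch with a tiny array. The array small_array and its values are made up for illustration:

```python
import numpy as np

small_array = np.array([1, 2, 3, 4])  # hypothetical example values

print(small_array.sum())   # 10
print(small_array.mean())  # 2.5
print(small_array.max())   # 4
print(small_array.min())   # 1
print(small_array.std())   # about 1.118
```

By default, std() divides by the number of elements (not the number of elements minus one), which is why the result here is the square root of 1.25.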
In [ ]:
So what happens when you forget the name of a numpy function, or how to use a particular function? IPython has some cool built-in features that you should take advantage of!
For example, say you forgot the name of the sum function -- you can type np. and then press tab, and you'll see that the notebook lists what functions are available in numpy! Spoiler alert, it's a loooong list since numpy has a lot of functionality.
The point is you can use the tab tool (recall the tab completion trick) to help you remember or recognize the function you're looking for.
Just like how in bash environments you can read details on how to use a function, you can do that in Python as well. For example, say you forgot how to use the np.sin function -- you can type np.sin? + return and the notebook will open an inline window that tells you almost everything you need to know about the function.
Try playing around with np. + tab and np.cos?, or any function with a ? at the end, below.
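The tab and ? tricks only work inside the notebook (or IPython). If you're ever in plain Python, here is a rough sketch of how you might get similar information with the built-in dir function and a function's docstring. The variable arc_names is made up for this example:

```python
import numpy as np

# Roughly like typing np.arc and pressing tab:
# list every name in numpy that starts with "arc"
arc_names = [name for name in dir(np) if name.startswith("arc")]
print(arc_names)

# Roughly like np.sin? : print the documentation text attached to np.sin.
# The built-in help(np.sin) shows the same kind of information.
print(np.sin.__doc__)
```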
In [7]:
In [ ]:
Now you might be saying - wait a minute, $\sin(\pi) = 0$, not $\approx$1e-16, what's that about? The short answer is - computers store numbers like $\pi$ with a limited number of digits, so they often get very, very close to the numbers that we actually want, but not all of the way there. You can get better precision if you tell the computer to use more memory, which we can talk about after class if you like.
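Here is a small sketch of that effect, and one common way to deal with it. Instead of asking whether a result is exactly zero, you can ask whether it is close to zero with np.isclose. The variable name sin_of_pi is made up for this example:

```python
import numpy as np

sin_of_pi = np.sin(np.pi)
print(sin_of_pi)                  # a tiny number near 1e-16, not exactly 0

# Comparing with == asks for an exact match, which fails here
print(sin_of_pi == 0)             # False

# np.isclose asks "are these equal to within a small tolerance?"
print(np.isclose(sin_of_pi, 0))   # True
```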
There are lots of situations where you'll want to create a certain kind of array, and numpy has functions to help.
You can make an array of consecutive integers from zero to nine with the function np.arange:
consecutive_integers = np.arange(10)
This function returns different things depending on the number of arguments that you give it. If there's one number in the parentheses, it tells np.arange what number to stop at (exclusive). If there are three numbers, they signify np.arange(start, stop, step). For example, np.arange(1, 9, 2) would start at 1, stop before 9, with a step size of 2, so it would return [1, 3, 5, 7].
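As a sketch, here are a few calls side by side. The variable names are made up for illustration, and the two-number form np.arange(start, stop) isn't described above, but it works too and uses a step of 1:

```python
import numpy as np

stop_only = np.arange(5)               # one number: stop (exclusive)
start_and_stop = np.arange(2, 6)       # two numbers: start, stop
start_stop_step = np.arange(1, 9, 2)   # three numbers: start, stop, step

print(stop_only)         # [0 1 2 3 4]
print(start_and_stop)    # [2 3 4 5]
print(start_stop_step)   # [1 3 5 7]
```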
In the cell below, make an array with 10,000 sequential integers, starting with zero. Save the array into a variable called consecutive_integers, and print it out:
In [ ]:
You'll see that numpy is polite. It knows you probably don't want to see all ten thousand integers, so it prints just the beginning and end of the array.
The same indexing and slicing rules that we learned for lists work on arrays. In the cell below, print the 42nd element of the consecutive_integers array, and print the 101st through 103rd (inclusive) elements of the array:
In [ ]:
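If you want a reminder of how indexing and slicing behave, here is a sketch on a much smaller array. The array sample is made up for this example. Remember that counting starts at zero, and that a slice stops just before its second number:

```python
import numpy as np

sample = np.arange(10, 20, 1)   # the integers 10 through 19

first = sample[0]        # first element
fourth = sample[3]       # fourth element (index 3)
middle = sample[2:5]     # elements at indices 2, 3 and 4
last = sample[-1]        # negative indices count from the end

print(first)    # 10
print(fourth)   # 13
print(middle)   # [12 13 14]
print(last)     # 19
```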
Unlike lists, you can do arithmetic with numpy arrays. For example, if you had the following list:
heights = [162, 185, 174, 191]
and you tried to add one to the list, this is what you would get:
print(heights + 1)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-23-ca23c21090bd> in <module>()
1 heights = [162, 185, 174, 191]
----> 2 print(heights + 1)
TypeError: can only concatenate list (not "int") to list
However, this does work if heights is a numpy array.
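Here is a sketch of array arithmetic on a different example, so you can still try the heights yourself. The array celsius and its values are made up for illustration. Each operation is applied to every element of the array:

```python
import numpy as np

celsius = np.array([0.0, 21.5, 37.0, 100.0])  # hypothetical temperatures

# Multiply, divide and add act on every element at once
fahrenheit = celsius * 9 / 5 + 32
print(fahrenheit)   # roughly 32, 70.7, 98.6 and 212

# Two arrays of the same length are combined element by element
doubled = celsius + celsius
print(doubled)
```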
In the cell below, make a variable called heights_array, which contains a numpy array of heights (using the np.array function we learned above):
In [ ]:
In [ ]:
Notice - numpy arrays don't have to contain numbers (floats and integers). They can be booleans, and other things too!
A boolean is a special type with the value True or False.
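As a small sketch, here is an array of booleans and an array of text. The names flags and star_names, and their contents, are made up for this example. Every array has a dtype attribute that tells you what kind of thing it holds:

```python
import numpy as np

flags = np.array([True, False, True, True])
print(flags.dtype)   # bool

# True counts as 1 and False as 0, so sum() counts the True values
number_true = flags.sum()
print(number_true)   # 3

# Arrays can hold text as well
star_names = np.array(["Vega", "Sirius"])
print(star_names.dtype)
```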
In [ ]:
In Exercise 4, you found that you can figure out which numbers in the array were greater than 180 all at once. It turns out, if you save that array of booleans:
heights_gt_180 = heights_array > 180
you can use heights_gt_180 like a group of indices on heights_array to get just the heights where heights_gt_180 == True.
In the cell below, try:
heights_gt_180 = heights_array > 180
print(heights_array[heights_gt_180])
Did it print out the right heights? What if you flip the greater than to a less than? What if you try == instead?
In [ ]:
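Here is a sketch of the same idea on a different array. The array scores, its values, and the cutoff of 75 are made up for illustration. It also shows two extras that aren't covered above: the ~ operator, which flips every True to False and vice versa, and np.where, which gives you the positions of the True values rather than the values themselves:

```python
import numpy as np

scores = np.array([55, 90, 72, 88])  # hypothetical data

# An array of booleans: one True or False per score
passed = scores >= 75
print(passed)             # False, True, False, True

# Use the booleans to pick out just the matching scores
passing_scores = scores[passed]
print(passing_scores)     # [90 88]

# ~ flips the booleans, so this picks out everything else
failing_scores = scores[~passed]
print(failing_scores)     # [55 72]

# np.where returns the indices where the booleans are True
passing_indices = np.where(passed)[0]
print(passing_indices)    # [1 3]
```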