Much of this material is based on notebooks from Jake VanderPlas's Intro to Scientific Computing in Python course.

What is Python?

Python is an open source, interpreted, and object-oriented programming language. What does this mean? It means it can be freely used and modified by others, that it does not need to be compiled to run, and that it uses the concept of data structures called "objects" that have both attributes (data) and methods (procedures).
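
To make "attributes" and "methods" a bit more concrete, here is a minimal sketch using two built-in object types (a complex number and a string, both of which are covered in more detail further down). The variable names are made up for illustration, and print is written with parentheses so that the snippet runs under both Python 2.7 and Python 3.

```python
# Everything in Python is an object, including plain numbers and strings.
z = 3.0 + 4.0j              # a complex number object

real_part = z.real          # attribute: data stored on the object (no parentheses)
conjugate = z.conjugate()   # method: a procedure the object can run (parentheses)

word = "star"
shout = word.upper()        # strings are objects with methods too

print(real_part)            # 3.0
print(conjugate)            # (3-4j)
print(shout)                # STAR
```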

Why Python?

Python is a great first language to learn because you do not have to worry about things like assigning types to your variables or memory allocation, and the syntax is generally very readable for first-time users. Python is also a portable skill set: there is a LOT that Python can do both within the field of astronomy and outside of it. Note that there are some "cons" to Python which require some clever workarounds. For example, because Python is an interpreted language it can be slow compared to something like C.

Ways to Use Python:

  • IPython Notebook (type ipython notebook to start in terminal and ctrl-c to exit)
    • In a given cell, type Enter to add new lines, Ctrl-Return or Shift-Return to run the cell, and Alt-Return to run the cell and add a new cell
  • Python command line interpreter (type python to start in terminal and ctrl-d to exit)
  • IPython command line interpreter (type ipython to start in terminal and exit to exit)
  • Making and editing .py files in your favorite text editor (gedit, vim, emacs, nano, et al.)
    • You can also use Python IDEs (integrated development environment) once you get more comfortable, like Spyder or PyCharm

For this class we will be using Python 2.7.

Hello, World!

This is one of the simplest programs to write in any language, so it's what we'll learn first. In Python it's extra easy, and there are a couple of different ways you could go about it. Let's first try it interactively, and then see how we might run it from a .py file instead.


In [ ]:
print "hello, world!"

In [ ]:
%%bash
echo "print 'hello, world!'" > hello.py # write our .py file 
cat hello.py # print the contents of this file to the screen
python hello.py # run the python script

Basic Cheat Sheet

Variable Types in Python

  • Numbers
    • int (integers, typically limited to a 64-bit representation), e.g. 10
    • float (floating point real values), e.g. 6.2
    • long (long integers, limited only by available memory), e.g. 10L
    • complex (complex numbers), e.g. 0.5j
  • String, e.g. "hello, world!"
  • List, e.g. [1, 'star', 3, 'planet', 7]
  • Tuple, e.g. (1, 'star', 3, 'planet', 7)
    • think of this as a "read-only" list
  • Dictionary, e.g. {'star': 'sun', 'planet': 'earth'}
    • think of dictionaries as "key-value" pairs

Arithmetic in Python

  • +: addition
  • -: subtraction
  • /: division
  • *: multiplication
  • %: modulus (remainder)
  • **: exponentiation

Comparisons and Boolean Operators in Python

  • ==, !=: equal to, not equal to
  • <, <=: less than, less than or equal to
  • >, >=: greater than, greater than or equal to
  • or, e.g. A or B: true if either A or B or both are true
  • and, e.g. A and B: true only if both are true
  • not, e.g. not A: true only if A is false

Basic Data Types in Python


In [ ]:
# Beginning a line with "#" is a comment that is ignored by python
myint = 8 # assigns the integer value to the variable myint
print myint # shows the value assigned to myint on the screen
print type(myint) # shows the variable type of myint, note that we did not have to specify this! python got it.

NOTE: in an IPython Notebook, or in IPython itself, you don't need to use print to get the value of a variable to show on the screen; you only need to type the variable itself. Try it!


In [ ]:
myfloat = 8. # note the difference in the trailing "." for a float. This is important for calculations!
print myfloat
print type(myfloat)

In [ ]:
mycomplex = 3.0 + 4.1j # note that the imaginary part of a complex number gets a trailing "j"
print mycomplex
print type(mycomplex)

In [ ]:
mystring = "stars are so cool" # assigns the string to the variable mystring
print mystring 
print type(mystring)

In [ ]:
mylist = [1,5,'star','9', 'planet'] # lists can have mixed variable types. 
                                    # they should be comma separated and in square brackets
print mylist
print type(mylist)

In [ ]:
otherlist = [mylist, 'earth', 'mars'] # you can put lists inside lists!
print otherlist # you can see from the output that mylist stands in for the list above and acts as shorthand

In [ ]:
mylist.append(42) # lists can be appended to. we call this a method of the list object 
print mylist

In [ ]:
mylist.pop(0) # lists can have items deleted from them, in this case the first item in the list
print mylist

In [ ]:
mylist.sort() # lists can also be sorted or reverse sorted 
print mylist

In [ ]:
mytuple = (1,5,'star','9','planet') # tuples are like read-only lists, and use parentheses, NOT square brackets 
print mytuple
print type(mytuple)
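
What "read-only" means in practice can be sketched as follows. This is only an illustration with made-up variable names (demo_list, demo_tuple), and it uses try/except, which we have not covered yet, simply to catch the error so that the cell keeps running. print is written with parentheses so that it works in both Python 2.7 and Python 3.

```python
demo_list = [1, 5, 'star']
demo_tuple = (1, 5, 'star')

demo_list[0] = 100          # fine: a list can be changed in place

try:
    demo_tuple[0] = 100     # not allowed: a tuple cannot be changed in place
    tuple_changed = True
except TypeError as err:
    tuple_changed = False
    print("TypeError: " + str(err))

print(demo_list)            # the list now starts with 100
print(demo_tuple)           # the tuple is unchanged
```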

In [ ]:
mydictionary = {'star': 'sun', 'planet': 'earth', 'satellite': 'moon'} # dictionaries are enclosed in curly brackets 
                                                                        # and are made up of key-value pairs 
print mydictionary
print type(mydictionary)

In [ ]:
print mydictionary['star'] # look up a value in a dictionary by key. here the key is "star" and the value is "sun"
print mydictionary.keys() # print all keys in dictionary
print mydictionary.values() # print all values in dictionary
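
Dictionaries can also be changed after they are created. The following is a small sketch, using a hypothetical dictionary called bodies, of adding a new key-value pair, overwriting an existing value, and checking whether a key is present. The keys are sorted before printing because the order of keys in a Python 2.7 dictionary is not guaranteed.

```python
bodies = {'star': 'sun', 'planet': 'earth'}

bodies['satellite'] = 'moon'    # assigning to a new key adds a key-value pair
bodies['planet'] = 'mars'       # assigning to an existing key overwrites its value

has_comet = 'comet' in bodies   # "in" tests whether a key is present

print(sorted(bodies.keys()))    # ['planet', 'satellite', 'star']
print(bodies['planet'])         # mars
print(has_comet)                # False
```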

In [ ]:
mydictionary = {'list1': mylist, 'list2': otherlist} # we can assign lists to be values in a dictionary!
print mydictionary['list1'] # finds the value (in this case a list) associated with key "list1"

Arithmetic Operations and Variable Assignment


In [ ]:
print 2 + 2 # basic addition of integers in python

In [ ]:
print 2*3 # basic multiplication of integers in python

In [ ]:
print 19 - 7 # basic subtraction of integers in python

In [ ]:
print 6 / 3 # basic division of integers in python

In [ ]:
print 6 / 5 # in python 2.7 you have to be careful about using integers for division! integer operations give 
            # you back an integer, NOT a float.

In [ ]:
# It is usually safest to use floats when doing division if you don't want a truncated result like the one above.
print 6 / 5. # remember the trailing "." tells python we want to use a float
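
If your numbers are already stored in integer variables, you cannot just add a trailing "." to them. One possible approach, sketched here with made-up variable names, is to convert one of them with float(). When you really do want the whole-number part of a division, the // operator and the modulus operator give it to you explicitly, and both behave the same way in Python 2.7 and Python 3.

```python
a, b = 6, 5

as_float = float(a) / b     # convert one operand to a float first
floor_div = a // b          # "floor" division: the whole-number part only
remainder = a % b           # what is left over after floor division

print(as_float)             # 1.2
print(floor_div)            # 1
print(remainder)            # 1
```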

In [ ]:
print 10 % 6 # this is the modulus (remainder) operator

In [ ]:
print 2**2 # basic exponentiation

In [ ]:
print 3.154e+7 # note that you can do scientific notation as well
               # this is the same as 3.154*10**7

We can also do arithmetic on strings!


In [ ]:
subject = "ASTRO"
course = "192"
print subject+course

In [ ]:
print "there are" + 3.154e+7 + "seconds in a year" # why won't this work?

In [ ]:
print "\t there are \n " + str(3.154e+7) + " seconds in a year" # need to make it a string. escape characters
                                                            # like \n and \t do things like "newline" and "tab"
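
Another way to mix numbers into strings, without calling str() and adding the pieces together yourself, is the string format method (we will see it again below when we look at floats). This is just a sketch with made-up variable names: {0} is replaced by the first argument of format, and an optional format code after the colon controls how the number is written.

```python
seconds = 3.154e+7

# {0} is replaced by the first argument given to format
plain = "there are {0} seconds in a year".format(seconds)

# ".3e" asks for scientific notation with three digits after the decimal point
scientific = "there are {0:.3e} seconds in a year".format(seconds)

print(plain)
print(scientific)           # there are 3.154e+07 seconds in a year
```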

In [ ]:
print "o*"*50

Instead of doing all our calculations like this in Python, let's look at how we can make things easier using variable assignment.


In [ ]:
c = 3.0*10**5 # speed of light (c) in km/s
diameter_lyr = 120000 # diameter of the Milky Way in light-years 
s_per_yr = 3.154e+7 # number of seconds in a year 
diameter_km = diameter_lyr*c*s_per_yr # diameter of the Milky Way in km 
print diameter_km

Or, we can make this look even simpler with a nice trick in Python.


In [ ]:
c, diameter_lyr, s_per_yr = 3.0*10**5, 120000, 3.154e+7 # define all variables on one line 
diameter_km = diameter_lyr*c*s_per_yr # diameter of the Milky Way in km 
print diameter_km

Another neat arithmetic trick is doing "operate-and-assign," which will likely become more important for loops.


In [ ]:
y = 3
y += 3 # y = y + 3
print y

Comparison Operators and Boolean Variables


In [ ]:
mass_of_sun = 1.989*10**30 # mass of sun in kg
mass_of_earth = 5.972*10**24 # mass of earth in kg
mass_of_earth < mass_of_sun # here we are making a comparison that evaluates to True

In [ ]:
mass_of_sun == mass_of_earth # this comparison operator means "equal to," note that it is NOT the same as "=" which
                            # assigns a value to a variable

In [ ]:
mass_of_sun != mass_of_earth # this comparison operator means "not equal to"

We can also string together multiple inequalities:


In [ ]:
mass_of_moon = 7.35*10**22 # mass of moon in kg
mass_of_moon < mass_of_earth < mass_of_sun

Note that you should be careful about doing comparisons on floating point values, since most decimal fractions (like 0.1) cannot be stored exactly in binary and are rounded to the nearest value that can be.


In [ ]:
0.1 + 0.2 == 0.3 # this equality returns False, why?

In [ ]:
print "{0:.20f}".format(0.1 + 0.2) # don't worry about the print statements now; they tell us how many decimals to print 
print "{0:.20f}".format(0.3) # clearly these two are not equal! careful with floats.
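
A common workaround, sketched below, is to ask whether two floats are "close enough" rather than exactly equal. The tolerance of 1e-9 is an arbitrary choice made for this illustration; what counts as close enough depends on the problem you are solving.

```python
a = 0.1 + 0.2
b = 0.3
tolerance = 1e-9                            # assumed tolerance, pick one that suits your data

exactly_equal = (a == b)                    # exact comparison
close_enough = abs(a - b) < tolerance       # comparison within a tolerance

print(exactly_equal)                        # False
print(close_enough)                         # True
```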

Now that we've seen the "Boolean" variables True and False in Python let's take a look at logical operators that can test these Boolean variables.


In [ ]:
(mass_of_moon < mass_of_earth) and (mass_of_earth < mass_of_sun) # both MUST be true for this to return true

In [ ]:
(mass_of_moon > mass_of_earth) or (mass_of_earth < mass_of_sun) # only one must be true for this to return true

In [ ]:
(mass_of_moon > mass_of_earth) or not (mass_of_earth > mass_of_sun) # what do you expect this to return?

In [ ]:
(mass_of_moon > mass_of_earth) or (mass_of_earth > mass_of_sun) # what do you expect this to return?

Importing Modules and Using Built-In Functions:

Using Arrays in Python

A lot of functionality in Python comes from being able to use pre-existing modules and the functions therein to perform specific operations. We will go through the syntax for doing this below, and also talk about some of the most useful modules for astronomy that exist in Python. We will talk about building our own functions in a later lesson. A module is just an organized piece of code (.py files are treated as modules, for example).

To import a module you simply type the following:

import module

To then use a function from this module you would do the following:

module.function(x)

Where x is going to be whatever the function takes as its argument. There may be more than one argument that the function takes, in which case you could have function(x,y,z).

Modules may also have sub-modules that in turn have their own functions you want to use, in which case you would type this:

from module1 import module2

module2.function(x)

Lastly, you can change the name of a module in your code for ease of typing if it is something you use quite often. For example:

import module as mod

mod.function(x)
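
Here is how the three import patterns above might look with modules from Python's standard library (math and os). The file path is a made-up example, and the variable names are only for illustration.

```python
# 1. import a module, then call one of its functions
import math
root = math.sqrt(16.0)

# 2. import a sub-module from a module, then call one of its functions
from os import path
name = path.basename("/home/user/hello.py")   # hypothetical file path

# 3. import a module under a shorter name
import math as m
circle_area = m.pi * 2.0**2                   # area of a circle of radius 2

print(root)                                   # 4.0
print(name)                                   # hello.py
print(circle_area)
```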

Let's take a look at a concrete example for one of the most useful modules you will come across in python. This module is called "numpy" and makes using arrays (which we will discuss momentarily) and doing mathematical operations on these arrays very simple.


In [ ]:
import numpy as np # import the numpy module so that we can use all its built-in functions. shorten the name for ease.

In [ ]:
myarray = np.zeros(5) # this is a built-in function from the module numpy that creates an array of zeros
                      # the size of the array is the argument to zeros, in this case 5

In [ ]:
print myarray # the default type for values IN the array from np.zeros is float 
print myarray.dtype

In [ ]:
myarray = np.zeros(5, dtype='int') # but we can change the type of value inside an array in this way
print myarray.dtype

In [ ]:
print myarray.shape # you can also get other properties of your array in this way (like dtype above)
print myarray.size

In [ ]:
print myarray.sum() # there are also methods like sum, mean, min, and max for arrays
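
Since an array full of zeros does not make for very interesting sums, here is a sketch of the same methods on an array with some actual values in it. It uses np.array, which builds an array from a list; the array name masses and its values are made up for illustration.

```python
import numpy as np

masses = np.array([1.0, 2.0, 3.0, 4.0])    # build an array from a list of floats

print(masses.sum())     # 10.0
print(masses.mean())    # 2.5
print(masses.min())     # 1.0
print(masses.max())     # 4.0
```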

Sidebar: how are arrays different from the lists that we have seen above? We have seen that lists can store heterogeneous data (data that contains different types). Arrays, on the other hand, should store homogeneous data (data of the same type), and are usually used for storing things that you want to perform fast mathematical operations on. Arrays can speed things up a lot in Python, as we will see later. When printed, arrays are NOT comma separated like a list, but the elements of an array can be accessed just like the individual elements of a list can (again, as we will see later).
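
One quick way to see the difference is to multiply a list and an array by the same number. This is a small sketch with made-up variable names: the list gets repeated, while the array gets multiplied element by element.

```python
import numpy as np

numbers_list = [1, 2, 3]
numbers_array = np.array([1, 2, 3])     # np.array builds an array from a list

doubled_list = numbers_list * 2         # repeats the list
doubled_array = numbers_array * 2       # multiplies every element by 2

print(doubled_list)                     # [1, 2, 3, 1, 2, 3]
print(doubled_array)                    # [2 4 6]
```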


In [ ]:
myarray + 2 # we can do mathematical operations on arrays for the WHOLE array at once! very powerful.

In [ ]:
myarray = myarray + 10 # note that the above cell did not change the value of the whole array because we didn't 
                       # assign that operation to any variable. now we have done this

In [ ]:
np.log10(myarray) # this is another built-in function from numpy that allows you to take the log
                  # of the whole array (in base 10)

In [ ]:
help(np.zeros) # using this help function is much like "man" in bash. it will tell us more about this function

You can also make arrays in Python which are not strictly 1-d. For example:


In [ ]:
mymatrix = np.zeros((5,3)) # python uses row-column notation, so this creates a matrix of five rows and three columns
print mymatrix

So we have now seen how to create arrays and lists and how to do operations on them as a whole, but how do we access individual elements of these arrays or lists? To do this we need to learn about array and list indexing and how to "slice" arrays in Python. (Figure: a pictorial representation of how indexing and slicing work with Python arrays.) Notice how indexing starts with ZERO. Python is a "zero-indexing" language.
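
As a quick sketch of the idea, here is indexing and slicing on a plain list (the list of planet names is made up for illustration). The same square-bracket syntax works on arrays.

```python
planets = ['mercury', 'venus', 'earth', 'mars', 'jupiter']

first = planets[0]      # index 0 is the FIRST element
last = planets[-1]      # negative indices count backwards from the end
inner = planets[:3]     # a slice from the start up to, but NOT including, index 3
middle = planets[1:4]   # a slice from index 1 up to, but NOT including, index 4

print(first)            # mercury
print(last)             # jupiter
print(inner)            # ['mercury', 'venus', 'earth']
print(middle)           # ['venus', 'earth', 'mars']
```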


In [ ]:
myarray[2] = 80. # this assigns a value to the THIRD element of my array. you can see that it's now different! 
myarray

In [ ]:
myarray[:3] = 0. # this assigns the value to the first through third (NOT including fourth) elements of my array
                 # this is an example of array slicing
myarray

Let's look at a more complicated way to slice arrays:


In [ ]:
bins = np.zeros(5) + np.arange(5) # this creates an array of zeros, and then uses another built in function of numpy
                               # called arange to populate each element of this array with different values
bins # notice how arange gives you the "range" of the input value 5, but starts again with zero.

In [ ]:
bins = np.zeros(5) + np.arange(6) # why won't this work?
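
If you want to check your answer to the question above, one way is to compare the shapes of the two arrays. The sketch below (with made-up variable names) catches the error using try/except, which we have not covered yet, so that the rest of the cell still runs. Element-by-element addition needs arrays whose shapes are compatible, and a 5-element array and a 6-element array are not.

```python
import numpy as np

five = np.zeros(5)
six = np.arange(6)

try:
    total = five + six          # the shapes (5,) and (6,) do not match
    shapes_match = True
except ValueError as err:
    shapes_match = False
    print("ValueError: " + str(err))

print(five.shape)               # (5,)
print(six.shape)                # (6,)
```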

In [ ]:
len(bins) # you can always check the size of your array using "len"

In [ ]:
bins = np.arange(5) # note that there are multiple ways to create this array, all of which work fine
bins

In [ ]:
bincenters = (bins[1:] + bins[:-1])*0.5 # slicing the array in this way gives me the "centers" of my previous values
bincenters

In [ ]:
# let's try to see in detail how this slicing works, step-by-step:
print "first bin slice: {0}".format(str(bins[1:]))
print "second bin slice: {0}".format(str(bins[:-1]))
print "bin slices added: {0}".format(str(bins[1:] + bins[:-1]))
print "bin centers: {0}".format(str(bincenters))

Now that we know more about indexing and slicing, we can even access individual elements of matrices!


In [ ]:
mymatrix # recall we made this 5x3 matrix earlier

In [ ]:
mymatrix[0] = 6. # assigns the value 6 to first ROW of the matrix 
mymatrix[0,0] = 10. # assigns the value 10 to the element in the first row and first column
mymatrix

In [ ]:
print mymatrix[:,0] # accesses first column of matrix 
print mymatrix[0,:] # accesses first row of matrix

Now that we know more about array slicing and indexing, let's look at another powerful function of numpy called "where":


In [ ]:
arr = np.linspace(0,35) # another built-in numpy function for making an evenly spaced array over a given interval (50 points by default)
print np.where(arr > 10.) # here we have used where to find the INDICES where the array is greater than 10

In [ ]:
indices = np.where(arr > 10.) # assign the variable indices with these values from above 
arr[indices] # now reindex our array with these indices and we get the VALUES at those indices

In [ ]:
arr[np.where(arr > 10.)] # you can also skip a step above and write it like this

In [ ]:
arr[np.where(arr >= np.max(arr))] # what do you think this will do? combines the operators we learned before
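
As a side note, the comparison arr > 10. is itself an array of True and False values (often called a "mask"), and numpy lets you index with it directly. The sketch below, with made-up variable names, checks that this gives the same values as going through where.

```python
import numpy as np

arr = np.linspace(0, 35)

mask = arr > 10.                    # an array of True/False, one per element of arr

via_where = arr[np.where(mask)]     # index using the indices returned by where
via_mask = arr[mask]                # index using the True/False array directly

same_values = np.array_equal(via_where, via_mask)
print(same_values)                  # True
```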

We simply do not have enough time to go over all the different methods you can use on arrays or all the different functions built in to numpy, but you can look up plenty of documentation for these yourself online. See the online numpy documentation or use help() as I showed before.

