Intro to Python

Basic commands

Hello and welcome to the wonderful world of Python. This notebook contains code cells and text cells. The code cells execute Python commands, and the text cells give some helpful advice along the way. Copy and paste, or retype, the code cells into your own notebook and run them.

This notebook is meant to be a quick introduction to Python. We have included some helpful links to other resources along the way. Here are some other tutorials that you can use in your own time: http://introtopython.org/hello_world.html https://www.datacamp.com/courses/intro-to-python-for-data-science

Below is the first code cell.


In [ ]:
# This line is a comment -- it does nothing
# you can add comments using the '#' symbol

You can do math. The value of the last line in a cell is displayed on the screen.


In [ ]:
1+1

In [ ]:
3*5  # this will not print
14 % 3  # modulo (remainder) operator - this will print

You can print anything by passing it to the print function.

Functions: a function takes some arguments and returns one or more values. Functions let you write a complicated task once and use it over and over again. Call a function by typing its name followed by parentheses. Arguments for the function go inside the parentheses.


In [ ]:
print(3*5)
print(2**4)  # powers use the double star symbol
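
Other built-in functions are called the same way as print. The cell below is a small sketch using max, abs, and round; the variable names are made up for illustration.

```python
# built-in functions are called like print: the name, then the arguments in parentheses
largest = max(3, 7, 5)        # the largest of the arguments
distance = abs(-4)            # the absolute value
rounded = round(3.14159, 2)   # round to two decimal places
print(largest, distance, rounded)
```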

Save results to a variable. Variables work like a math equation. The variable name is on the left and whatever it is equal to is on the right.


In [ ]:
output = 1+1

We can see the type of the output by passing it to the type function.


In [ ]:
type(output)

There are many types in Python, but the main ones for values are: 'int' for integers, 'float' for any number with a decimal place, and 'str' for a collection of characters or words. Groups of values can also take on different types, but more on that later.


In [ ]:
type(1.+1.2)

In [ ]:
1.0+1.2

In [ ]:
# we can compare numbers using comparison operators - these return a value of either True or False (boolean type)
1 > 2  # is one greater than two?
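
To see the main value types side by side, here is a short sketch that passes one example of each to the type function. The values and variable names are arbitrary choices for illustration.

```python
whole = 7            # an integer
decimal = 7.5        # a number with a decimal place
text = 'seven'       # a string
comparison = 1 > 2   # the result of a comparison is a boolean

print(type(whole), type(decimal), type(text), type(comparison))
```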

Exercise:

What type do you get when you add together a float and an int?

Strings and lists


In [ ]:
# and we can use 'strings' - text 
poem = 'Spiral galaxy; Plane splashed across the night sky; Gives us perspective'

In [ ]:
#We can collect together a bunch of numbers or strings into a list
favorite_primes = [2, 3, 5, 7, 11]

In [ ]:
type(favorite_primes)

In [ ]:
#and access them using square brackets
favorite_primes[0]  # <-- '[0]' will select the first number

In [ ]:
# [-1] will select the last number
favorite_primes[-1]

In [ ]:
# we can also select a range of numbers
favorite_primes[::2]  # <-- select every other element

In [ ]:
favorite_primes[:2]  # select the first two elements

In [ ]:
favorite_primes[2:]  # select from the 3rd element to the last element
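
The general pattern inside the square brackets is start:stop:step, where the stop index is not included. The sketch below uses a made-up list of letters so that the positions are easy to follow. Strings can be indexed and sliced in the same way.

```python
letters = ['a', 'b', 'c', 'd', 'e', 'f']  # a made-up list for illustration

middle = letters[1:4]        # start at index 1, stop before index 4
every_third = letters[::3]   # the whole list in steps of 3
last_two = letters[-2:]      # negative indices count from the end
print(middle, every_third, last_two)

word = 'galaxy'
first_three = word[:3]       # strings are sliced just like lists
print(first_three)
```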

Loops


In [ ]:
# we can do things multiple times in a loop:
for prime in favorite_primes:  # loop through and get each element of the list
  print(prime, prime**2)  # print each element and the square of that element

In [ ]:
# for loops are one way to loop - while loops are another
# careful! while loops can sometimes loop forever - check that they have a stopping condition
i = 0  # start at some value
while i < 10:  # keeps looping while this condition is True and stops once it is False
  print(i)
  i = i + 1
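
A while loop is handy when you do not know in advance how many times to loop. As a sketch, the cell below doubles a number until it reaches 100 and counts the steps it took. The variable names and the limit of 100 are arbitrary.

```python
value = 1
steps = 0
while value < 100:      # stop as soon as value reaches 100 or more
    value = value * 2   # the value changes every pass, so the loop must end
    steps = steps + 1

print(value, steps)
```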

We can also loop through two lists by using the zip command. More about zip here: https://www.programiz.com/python-programming/methods/built-in/zip (somewhat technical).


In [ ]:
# let's first make a second list
favorite_largenumbers = [10, 300, 5e+5, 7000, 2**32]  # note that 5e+5 is a float, while Python ints such as 2**32 can be very large with no problem

In [ ]:
for large_number, prime in zip(favorite_largenumbers, favorite_primes):
    print(large_number, prime)
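
To see what zip actually hands to the loop, it can help to collect its output into a list. This is a small sketch with two made-up lists of the same length; each element that zip produces is a pair (a tuple) holding one item from each list.

```python
labels = ['x', 'y', 'z']     # made-up example lists
values = [1.0, 2.5, 4.0]

pairs = list(zip(labels, values))  # list() collects the pairs so we can look at them
print(pairs)

for label, value in pairs:   # each pair is unpacked into two variables
    print(label, 'has value', value)
```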

Exercise:

Modify the above code to loop through a list of my least favorite numbers AND the primes. What is the largest number that will print below?


In [ ]:
# make a new list that has only four numbers
least_favorite_numbers = [-1, 0, 1, 2]

for bad_number, prime in zip(<MODIFY THIS PART>):
    print(bad_number, prime)

Functions

If we want to do something more complicated we can define a function.


In [ ]:
def square(number):
  # this function will take a number and return its square
  return number**2

print(square(3))

In [ ]:
# to make the function more general we will include a keyword argument - this argument has a default value and can be changed by the user
def raise_to_power(number, power=2):
  return number**power

print(raise_to_power(3))  # with default arguments this will square it
print(raise_to_power(3, power=3))  # with a new argument this will return the cube
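
A function can have more than one keyword argument, and you can set only the ones you care about. The function scale_and_shift below is a hypothetical helper written just for this sketch.

```python
def scale_and_shift(value, scale=1.0, shift=0.0):
    # hypothetical helper: multiply by scale, then add shift
    return value * scale + shift

unchanged = scale_and_shift(3)                   # both defaults are used
scaled = scale_and_shift(3, scale=2.0)           # change only the scale
shifted = scale_and_shift(3, shift=1.0)          # change only the shift
both = scale_and_shift(3, shift=1.0, scale=2.0)  # keyword arguments can come in any order
print(unchanged, scaled, shifted, both)
```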

Exercise:

What happens when you use the above function, but with a string as the argument?


In [ ]:
print(raise_to_power(<MODIFY THIS SOMEHOW>))

Dictionaries

In addition to lists, Python also has a collection type called a 'dictionary'. These hold what are called key-value pairs. Each key maps to a value (although note that a value here could be a single number or a list or even another dictionary). These are very useful when you have a bunch of data you want to store.


In [ ]:
definitions = {}  # here we are using the curly braces to make an empty dictionary

We can add to dictionaries by using the square brackets.


In [ ]:
# add an entry for cosmology
definitions['cosmology'] = 'the branch of astronomy that deals with the general structure and evolution of the universe.'
# and for universe
definitions['universe'] = 'the totality of known or supposed objects and phenomena throughout space; the cosmos; macrocosm.'

We can get values out of a dictionary (accessing) by using the square brackets again.


In [ ]:
definitions['cosmology']

We are not limited to strings. Dictionaries can have many types as their keys, and have many types as their values.


In [ ]:
# here we are using the curly braces to make a dictionary of constants. The 'e' syntax is shorthand for 'x10^', so 1e-1 is 0.1
constants_cgs = {'G': 6.67259e-8, 'h': 6.6260756e-27, 'k': 1.380658e-16}
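
Dictionaries can also be looped over. The sketch below uses a made-up dictionary and the items method, which gives each key together with its value. Entries come out in the order they were added (in Python 3.7 and later).

```python
sizes = {'small': 1, 'medium': 10, 'large': 100}  # made-up values for illustration
sizes['huge'] = 1000                              # add one more entry

for name, size in sizes.items():   # items() gives each key with its value
    print(name, size)

names = list(sizes.keys())         # just the keys, collected into a list
print(len(sizes), names)
```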

Exercise:

What happens when you try to access an entry in a dictionary that doesn't exist?

Numpy and packages

Sometimes we want to go beyond what Python can do by default. We can do this by importing 'packages'. More resources on numpy: https://docs.scipy.org/doc/numpy-dev/user/quickstart.html


In [ ]:
import numpy as np  # now we have access to a range of new functions that work with numerical data

In [ ]:
# we can make an 'array' -- this is similar to a list but has some advantages
array_of_primes = np.array([2, 3, 5, 7, 11])

In [ ]:
# you can do math on the entire array
array_of_primes + 1

In [ ]:
# CAREFUL: this only works with numpy arrays. This will not work with lists!! Pay attention to the type that you are working with.
# Illustris data uses numpy arrays mostly, but it is always good to check. 
print(type(array_of_primes), type(favorite_primes))
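
The difference is easy to miss because the same symbols are valid for lists, they just mean something else. This sketch uses made-up variable names: with lists, '+' joins two lists and '*' repeats a list, while with arrays both do math on every element.

```python
import numpy as np

number_list = [1, 2, 3]
number_array = np.array(number_list)

joined = number_list + [4, 5]     # lists: '+' sticks two lists together
repeated = number_list * 2        # lists: '*' repeats the list
doubled = number_array * 2        # arrays: '*' multiplies every element
print(joined, repeated, doubled)
```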

In [ ]:
# We can see some info on the size and shape of the array:
print(array_of_primes.shape, array_of_primes.ndim)

In [ ]:
# and generate arrays with values
array_of_evens = np.arange(2, 12, 2)  # array starting at 2, ending at 12 (exclusive) in steps of 2
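
numpy has a few other functions for generating arrays. As a sketch, the cell below uses np.arange with a single argument, np.linspace (which includes the end point and takes the number of values instead of a step), and np.zeros. The variable names are made up.

```python
import numpy as np

counting = np.arange(5)               # 0, 1, 2, 3, 4 (starts at 0 when only the end is given)
evenly_spaced = np.linspace(0, 1, 5)  # 5 values from 0 to 1, with both ends included
empty_start = np.zeros(3)             # an array of three zeros
print(counting, evenly_spaced, empty_start)
```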

We can add arrays together - this will add each element of each array to the corresponding element in the other array. This type of operation is called an 'element-wise' operation and can save you from having to write loops.


In [ ]:
array_of_primes + array_of_evens

In [ ]:
# we can also compare element-wise:
array_of_evens > array_of_primes

In [ ]:
# we can use these arrays of boolean values to select values of interest from an array
array_of_evens[array_of_evens > array_of_primes]  # select only the even numbers that are greater than corresponding prime numbers

In [ ]:
#Or we can use a 'where' function to get the corresponding indices.
np.where(array_of_evens > array_of_primes)

In [ ]:
indices = np.where(array_of_evens > array_of_primes)
array_of_evens[indices]
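
The same selection also works when comparing an array to a single number. Here is a self-contained sketch with made-up speeds. Note that np.where returns a tuple with one array of indices per dimension, so '[0]' picks out the indices for a 1D array.

```python
import numpy as np

speeds = np.array([120.0, 45.0, 300.0, 80.0, 210.0])  # made-up values

fast = speeds > 100               # an array of True/False values
fast_speeds = speeds[fast]        # keep only the elements where fast is True
fast_indices = np.where(fast)[0]  # the positions of those elements
print(fast, fast_speeds, fast_indices)
```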

Numpy arrays can be multi-dimensional. Let's focus on 2D arrays, which are used in Illustris. Note: numpy arrays are row-major by default, meaning that the first index selects the row (the vertical dimension) and the second index selects the column.


In [ ]:
velocities = np.random.rand(15).reshape(5, 3)  # make an array of 15 random values and reshape them into a 2D array with five rows and three columns

In [ ]:
# let's examine the results! This will be different for each person
velocities

We can use indexing to get a certain row or column


In [ ]:
velocities[:, 0]  # get all values (the ':' character) from the first column

In [ ]:
velocities[1, :]  # get all values from the second row
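
Because the random values are different every time, indexing is easier to follow on an array where you know what to expect. This sketch builds a made-up 4 by 3 array holding the numbers 0 to 11 and picks out a column, a row, and a single element.

```python
import numpy as np

grid = np.arange(12).reshape(4, 3)  # four rows, three columns, filled row by row
print(grid)

first_column = grid[:, 0]   # every row, column 0
second_row = grid[1, :]     # row 1, every column
single_value = grid[2, 1]   # row 2, column 1
print(first_column, second_row, single_value)
```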

We can compute statistics for the whole array or along a particular dimension.


In [ ]:
# show the mean value, the min value, and the max value of the array
velocities.mean(), velocities.min(), velocities.max()

In [ ]:
# print the mean in each of the columns - should be a 1D array with three values
velocities.mean(axis=0)
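
One way to think about 'axis=0' is that it collapses the rows, leaving one value per column. The sketch below checks this on a small made-up array where the answers can be worked out by hand.

```python
import numpy as np

small = np.array([[1.0, 2.0, 3.0],
                  [5.0, 6.0, 7.0]])  # two rows, three columns (made-up values)

column_means = small.mean(axis=0)  # average down each column
column_maxes = small.max(axis=0)   # the same axis argument works for max and min
overall_mean = small.mean()        # no axis: one number for the whole array
print(column_means, column_maxes, overall_mean)
```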

Exercise:

How would you modify the above line to see the average for each row?

Plotting

There are many ways to plot in Python. I will show the basics of matplotlib - more information can be found here: https://matplotlib.org/users/pyplot_tutorial.html


In [ ]:
import matplotlib.pyplot as plt  # let's import the plotting tools and give them a short name
# this next fancy 'magic line' lets us plot right here in the notebook
%matplotlib inline

In [ ]:
# making a simple plot is easy - just tell the 'plot' function what your x and y values are - by default it makes a line
x = np.arange(10)
y = x**2
plt.plot(x, y)

In [ ]:
# or we can make them points by setting some options
plt.plot(x, y, marker='.', linestyle='none')  # turning the line to none and the marker to a period
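
A plot is much easier to read with labels. As a sketch, the cell below repeats the plot of points and adds axis labels, a title, and a legend. The label and title text are just examples, and in a notebook you would still run the '%matplotlib inline' line first.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.arange(10)
y = x**2

plt.plot(x, y, marker='.', linestyle='none', label='y = x**2')  # 'label' is the text used by the legend
plt.xlabel('x')               # name the horizontal axis
plt.ylabel('y')               # name the vertical axis
plt.title('A simple plot')    # put a title above the plot
plt.legend()                  # draw the legend box
```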