Introduction to "Doing Science" in Python for REAL Beginners

Python is one of many languages you can use for research and HW purposes. In the next few days, we will work through many of the tool, tips, and tricks that we as graduate students (and PhD researchers) use on a daily basis. We will NOT attempt to teach you all of Python--there isn't time. We will however build up a set of code(s) that will allow you to read and write data, make beautiful publish-worthy plots, fit a line (or any function) to data, and set up algorithms. You will also begin to learn the syntax of Python and can hopefuly apply this knowledge to your current and future work.

Before we begin, a few words on navigating the iPython Notebook:

  • There are two main types of cells : Code and Text
  • In "code" cells "#" at the beginning of a line marks the line as comment
  • In "code" cells every non commented line is intepreted
  • In "code" cells, commands that are preceded by % are "magics" and are special commands in Ipython to add some functionality to the runtime interactive environment.
  • Shift+Return shortcut to execute a cell
  • Alt+Return shortcut to execute a cell and create another one below

Here you can find a complete documentation about the notebook. http://ipython.org/ipython-doc/1/interactive/notebook.html In particular have a look at the section about the keyboard shortcuts.

And remember that :

  • Indentation has a meaning ( we'll talk about this when we cover loops)
  • Indexes start from 0 ( similar to C )

We will discuss more about these concepts while doing things. Let's get started now!!!!

A. Numbers and Calculations

Note: To insert comments to yourself (this is always a great idea), use the # symbol.


In [ ]:
## You can use Python as a calculator:
5*7  #This is a comment and does not affect your code.
#You can have as many as you want. 
#No worries.

In [ ]:
5+7

In [ ]:
5-7

In [ ]:
5/7

These simple operations on numbers in Python 3 works exactly as you'd expect, but that's not true across all programming languages.

For example, in Python 2, an older version of Python that is still used often in scientific programming:

$5/7$ $\neq$ $5./7$

The two calculations below would be equal on most calculators, but they are not equal to Python 2 and many other languages. The first, $5/7$ is division between integers and the answer will be an integer. The second, $5./7$ is division between a float (a number with a decimal) and an integer.

In Python 2,

$5/7$ = $0$

Since division of integers must return an integer, in this case 0. This is not the same for 5./7, which is

$5./7$ = $0.7142857142857143$

This is something to keep in mind when working with programming languages, but Python 3 takes care of this for you.

However, for the sake of consistency, it is best to use float division rather than integer division.

Let's assign some variables and print() them to the screen.


In [ ]:
a = 5
b = 7

In [ ]:
print(a)

In [ ]:
print(b)

In [ ]:
print(a*b , a+b, a/b)

In [ ]:
a = 5.
b = 7
print(a*b, a+b, a/b)

Next, let's create a list of numbers and do math to that list.


In [ ]:
c = [0,1,2,3,4,5,6,7,8,9]
print(c)

How many elements or numbers does the list c contain? Yes, this is easy to count now, but you will eventually work with lists that contains MANY numbers. To get the length of a list (or array), use len().


In [ ]:
len(c)

What exactly is c? It looks like an array because of the square brackets, but it isn't. To see what any variable is, use type().


In [ ]:
type(c) #pick a variable: a, b, or c and type it in the parentheses

Now, some math... Let's square each value in c and put those values in a new list called d. To square a variable (or number), you use **. So $3^{**}2=9$. The rest of the math operations (+ / - x) are 'normal.'


In [ ]:
d = c**2

This should not have worked. Why? The short answer is that a list is very useful, but it is not an array. However, you can convert your lists to arrays (and back again if you feel you need to). In order to do this conversion (and just about anything else), we need something extra.

Python is a fantastic language because it is very powerful and flexible. Also, it is like modular furniture or modular building. You have the Python foundation and choose which modules you want/need and load them before you start working. One of the most loved here at UMD is the NumPy module (pronounced num-pie). This is the something extra (the module), that we need. For more information, see www.numpy.org/.

When we plot this data below, we will also need a module for plotting.

First, let us import NumPy.


In [ ]:
import numpy

To convert our list $c = [0,1,2,3,4,5,6,7,8,9]$ to an array we use numpy.array(),


In [ ]:
c = numpy.array(c)

In [ ]:
d = c**2
print(d)

In [ ]:
type(d)

Great! However, typing numpy over and over again can get tiresome, so we can import it and give it a shorter name. It is common to use the following:


In [ ]:
import numpy as np

In this notation, converting a list to an array would be np.array(c).

B. Our first plot!

Now lets make a quick plot! We'll use the values from our list turned array, $c$, and plot $c^2$.


In [ ]:
%matplotlib inline
import matplotlib
import matplotlib.pyplot as plt

In [ ]:
x = c
y = d

p = plt.plot(x,y)

In [ ]:
p = plt.plot(x,y**2)

C. Arrays of numbers

You can also create arrays of numbers using NumPy. Two very useful ways to create arrays are by selecting the interval you care about (say, 0 to 10) and either specifying how far apart you want your values to be (numpy.arange), OR the total number of values you want (np.linspace). Let's look at some examples.


In [ ]:
np.arange(0,10,2) #here the step size is 2. So you'll get even numbers.

In [ ]:
np.linspace(0,10,2) #here you're asking for 2 values. Guess what they'll be!

Next make an array with endpoints 0 and 1 (include 0 and 1), that has 50 values in it. You can use either (both?) np.arange or np.linspace. Which is easier to you? How many numbers do you get? Are these numbers integers or floats (decimal place)?


In [ ]:


In [ ]:

Next make an array with endpoints 0 and 2.5 (include 0 and 2.5), that has values spaced out in increments of 0.05. For example: 0, 0.05, 0.1, 0.15... You can use either np.arange or np.linspace. Which is easier to you? How many numbers do you get? Are these numbers integers or floats (decimal place)?


In [ ]:


In [ ]:

Next, let's plot these two arrays. Call them $a$ and $b$, or $x$ and $y$ (whichever you prefer--this is your code!), for example: a = np.linspace(). Fill in the missing bits in the code below.


In [ ]:
import ___ as np

%matplotlib inline
import matplotlib
import matplotlib.pyplot as plt

___ = np.linspace(____)
___ = np.arange(____)

# Clear the plotting field. 
plt.clf() # No need to add anything inside these parentheses. 

plt.plot(__,__,'ro')  # The 'ro' says you want to use Red o plotting symbols.

For all the possible plotting symbols, see: http://matplotlib.org/api/markers_api.html. Next, let's plot the positive half of a circle.


In [ ]:
import ___ as np

%matplotlib inline
import matplotlib
import matplotlib.pyplot as plt

x = np.linspace(-1,1,100)
y = np.sqrt(______)

# Clear the plotting field. 
plt.clf() # No need to add anything inside these parentheses. 

plt.plot(x,y,'ro')  # The 'ro' says you want to use Red o plotting symbols. 
plt.xlim([-2,2])
plt.ylim([-2,2])

D. A more complicated function: The Planck Blackbody Curve

Any object at a temperature above absolute zero (at 0 Kelvin, atoms stop moving), emits light of all wavelengths with varying degrees of efficiency. Recall that light is electromagnetic radiation. We see this radiation with our eyes when the wavelength is 400-700 nm, and we feel it as heat when the wavelength is 700 nm - 1mm. A blackbody is an object that is a perfect emitter. This means that it perfectly absorbs any energy that hits it and re-emits this energy over ALL wavelengths. The spectrum of a blackbody looks like a rainbow to our eyes.

The expression for this "rainbow" over all wavelengths is given by,

$$ B(\lambda,T) = \dfrac{2hc^2}{\lambda^5} \dfrac{1}{e^{\frac{hc}{\lambda kT}}-1} $$

where $\lambda$ is the wavelength, $T$ is the temperature of the object, $h$ is Planck's constant, $c$ is the speed of light, and $k$ is the Boltzmann constant.

https://en.wikipedia.org/wiki/Black-body_radiation

Plotting a blackbody curve:

First, let us look at code that might be confusing to read later on.


In [ ]:
import numpy as np
import matplotlib
import matplotlib.pyplot as plt

x = np.linspace(100,2000,10000)*1e-9  #wavelength, we want a range of 100 nm to 10000 nm, but in METERS

Blam = 2.0*6.626e-34*2.998e8**2/x**5/(np.exp(6.626e-34*2.998e8/(x*1.381e-23*5800.0))-1.0)

plt.clf()
p = plt.plot(x*1e9,Blam)  #we multiply by 1e9 so that the x axis shows nm
xl = plt.xlabel('Wavelength (nm)')
yl = plt.ylabel('Spectral Radiance ()')  #What are the units?

Would this code be easy to edit for other temperatures?

Now, let's look at code that does the same thing, but is more documented:


In [ ]:
import numpy as np
import matplotlib
import matplotlib.pyplot as plt

# Constants in MKS (meters, kilograms, & seconds)

h = 6.626e-34   # J s 
c = 2.998e8    # m/s
k = 1.381e-23   # J/K

# Let's pick the sun. YOU will need to PICK the temperature and then a range of
# frequencies or wavelengths that "make sense" for that temperature. 
# We know that the sun peaks in visible part of the spectrum. This wavelength 
# is close to 500 nm. Let's have the domain (x values) go from 100 nm to 2000 nm. 
# 1 nm = 10^-9 m = 10^-7 cm. 

lam = np.linspace(100,2000,10000)*1e-9 #wavelength in nm
nu = c/lam
T = 5800.0

exp = np.exp(h*c/(lam*k*T))
num = 2.0 * h * c**2 
denom = lam**5 * (exp - 1.0)

Blam = num/denom

plt.clf()
p = plt.plot(lam*1e9,Blam)
xl = plt.xlabel('Wavelength (nm)')
yl = plt.ylabel(r'Spectral Radiance (W m$^{-3}$)')  #What are the units?

# Try a log-log plot. 
#p = plt.loglog(wav,Bnu)

Plotting Multiple Curves:


In [ ]:
import numpy as np
import matplotlib
import matplotlib.pyplot as plt

# Constants in MKS (meters, kilograms, & seconds)

h = 6.626e-34   # 
c = 2.998e8    # m/s
k = 1.381e-23   # J/K

# Let's try to recreate the plot above. 
# Pick temperatures: T1 = 7000 K , T2= 5800 K, and T3 = 4000 K. 
# Let's have the domain (x values) go from 100 nm to 2000 nm. 
# 1 nm = 10^-9 m. 

wav = np.linspace(100,2000,10000)*1e-9  #in meters

T1 = 7000.
T2 = 5800.
T3 = 4000.

num = 2.0 * h * c**2 

exp1 = np.exp(h*c/(wav*k*T1))
denom1 = wav**5 * (exp1 - 1.0)

exp2 = np.exp(h*c/(wav*k*T2))
denom2 = wav**5 * (exp2 - 1.0)

exp3 = np.exp(h*c/(wav*k*T3))
denom3 = wav**5 * (exp3 - 1.0)

Bnu1 = num/denom1
Bnu2 = num/denom2
Bnu3 = num/denom3

plt.clf()
p1 = plt.plot(wav*1e9,Bnu1,label='T =7000 K')
p2 = plt.plot(wav*1e9,Bnu2,label='T = 5800 K')
p3 = plt.plot(wav*1e9,Bnu3,label='T = 4000 K')
xl = plt.xlabel('Wavelength (nm)')
yl = plt.ylabel(r'Spectral Radiance (W m$^{-3}$)')
l = plt.legend()

Next, let's have you try an example. We mentioned above that wavelength and frequency are related by the speed of light, $$ c = \lambda \nu. $$

We also described a blackbody in terms of the wavelength. However, we can also describe a blackbody in terms of frequency,

$$ B(\nu,T) = \dfrac{2h\nu^3}{c^2} \dfrac{1}{e^{\frac{h\nu}{kT}}-1}. $$

We can do this because $$ B_\nu d\nu = B_\lambda d\lambda $$ where $$ \nu = \frac{c}{\lambda} \quad \text{and} \quad \frac{d\nu}{d\lambda} = \left| - \frac{c}{\lambda^2}\right|. $$

EXERCISE: Make the same plot above (for 3 separate temperatures) with the x axis showing frequency (in Hertz).


In [ ]:
import numpy as np
import matplotlib
import matplotlib.pyplot as plt

# Constants in MKS (meters, kilograms, & seconds)

h = 6.626e-34   # 
c = 2.998e8    # m/s
k = 1.381e-23   # J/K

# Let's try to recreate the plot above. 
# Pick three temperatures. 
# Decide on a domain in Hertz (frequency) that makes sense. 
# c = nu x lambda, nu = c/lambda

#### Put your code here ###