Introduction to "Doing Science" in Python for REAL Beginners

Python is one of many languages you can use for research and homework purposes. In the next few days, we will work through many of the tools, tips, and tricks that we as graduate students (and PhD researchers) use on a daily basis. We will NOT attempt to teach you all of Python, as there isn't time. We will, however, build up a set of codes that will allow you to read and write data, make beautiful publication-worthy plots, fit a line (or any function) to data, and set up algorithms. You will also begin to learn the syntax of Python and can hopefully apply this knowledge to your current and future work.

Before we begin, a few words on navigating the IPython Notebook:

  • There are two main types of cells: Code and Text
  • In "code" cells, "#" at the beginning of a line marks the line as a comment
  • In "code" cells, every non-commented line is interpreted
  • In "code" cells, commands that are preceded by % are "magics": special IPython commands that add functionality to the interactive environment
  • Shift+Return is the shortcut to execute a cell
  • Alt+Return (Option+Return on Macs) is the shortcut to execute a cell and create another one below

Complete documentation for the notebook is available at http://ipython.org/ipython-doc/1/interactive/notebook.html (have a look at the section about the keyboard shortcuts in particular). You can also access the keyboard shortcuts from the $\textbf{Help}$ menu above.

And remember that:

  • Indentation has a meaning (we'll talk about this when we cover loops)
  • Indexing starts from 0

We will discuss these concepts in more detail as we go. Let's get started!

A. Numbers, Calculations, and Lists

Before we start coding, let's play around with the Jupyter environment. Make a new cell below using the Alt+Return shortcut.


In [ ]:

Take your newly created cell and write something in it. Switch the type of the cell between a code cell and a text/markdown cell by using the selection box in the top of the screen. See how it changes?

Insert a comment to yourself (this is always a great idea) by using the # symbol.


In [ ]:
## You can use Python as a calculator:
5*7  #This is a comment and does not affect your code.
#You can have as many as you want. 
#Comments help explain your code to others and yourself.
#No worries.

In [ ]:
5+7

In [ ]:
5-7

In [ ]:
5/7

Unfortunately, the output of your calculations won't be saved anywhere, so you can't use them later in your code.

There's a way to get around this: by assigning them to variables. A variable is a way of referring to a memory location used by a computer program that can contain values, text, or even more complicated types. Think of variables as containers to store something so you can use or change it later. Variables can be a single letter (like x or y) but they are usually more helpful when they have descriptive names (like age, stars, total_sum). You want to have a descriptive variable name (so you don't have to keep looking up what it is) but also one that is not a pain to type repeatedly.

Let's assign some variables and print() them to the screen.


In [ ]:
a = 10
b = 7

In [ ]:
print(a)

In [ ]:
print(b)

In [ ]:
print(a*b , a+b, a/b)

You can also write over variables with new values, but your previous values will be gone.


In [ ]:
a = 5
b = 7
print(a*b, a+b, a/b)
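
As a small sketch of why descriptive names help, the cell below uses made-up names and values (distance_km, time_hr, and speed are illustrative only, not part of the lesson). It also shows that overwriting a variable does not update results that were computed earlier.

```python
# Hypothetical example values; the names are for illustration only
distance_km = 150
time_hr = 2

speed = distance_km / time_hr   # computed from the current values
print(speed)

# Overwrite distance_km; the old value (150) is gone
distance_km = 300
print(distance_km)

# speed was computed earlier, so it does NOT change automatically
print(speed)

# Recompute speed to use the new value
speed = distance_km / time_hr
print(speed)
```

If you want to keep an old value, store it under a different name before overwriting it.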

Next, let's create a list of numbers. A list is a way to store items in a group.


In [ ]:
numList = [0,1,2,3,4,5,6,7,8,9]
print(numList)

How many elements or numbers does the list numList contain? Yes, this is easy to count now, but you will eventually work with lists that contain MANY items. To get the length of a list, use len().


In [ ]:
L = len(numList)
print(L)

You can also access particular elements in a list by indexing. The syntax for this is the following:

numList[index_number]

This will return the value in the list that corresponds to the index number.

Lists are indexed starting from 0, such that

  • First position = 0 (or 0th item)
  • Second position = 1
  • Third position = 2
  • etc.

It is a bit confusing, but after a bit of time, this becomes quite natural. For example, to get the item at index 4 (which is the 5th item in the list), you would type:

numList[4]
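
To make the counting concrete, here is a short sketch with a made-up list (primes is a hypothetical name, chosen so that the values differ from their index numbers):

```python
primes = [2, 3, 5, 7, 11]   # hypothetical list, for illustration only

print(primes[0])   # first item (index 0)
print(primes[1])   # second item (index 1)
print(primes[4])   # fifth item (index 4), which is also the last one here

# The last valid index is always one less than the length of the list
last_index = len(primes) - 1
print(last_index)
print(primes[last_index])
```

Asking for an index past the end of the list (here, primes[5]) makes Python stop with an IndexError.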

Try accessing elements of the list you just created:


In [ ]:
# Insert your code below:

How would you access the number 5 in numList?


In [ ]:
# Insert your code below:

Let's try making a more complicated list:


In [ ]:
fibList = [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]

In [ ]:
fibList[5]

Now let's say you create a list of numbers, but later on you want to add more numbers to the list. We can do that! We can quite literally add them to the list like so:


In [ ]:
addList = [1, 1, 5, 4, 6, 7, 3, 2, 8]
print(addList)

# Now let's add some new numbers to the list
addList = addList + [4, 3, 2, 6]
print(addList)

See how the list changed to now include the new numbers?
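
One detail worth seeing for yourself: the + operator builds a new list and leaves the originals alone, which is why the cell above assigns the result back to addList. A minimal sketch, using hypothetical list names:

```python
first = [1, 2, 3]    # hypothetical lists, for illustration only
second = [4, 5]

combined = first + second   # a brand new list with all five numbers
print(combined)
print(len(combined))

# The original lists are unchanged because we never reassigned them
print(first)
print(second)
```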

Now that you know the basics of Python, let's see how it can be used as a graphing calculator.

B. Our first plot!

Python is a fantastic language because it is very powerful and flexible. It is like modular furniture or a modular building. You have the Python foundation and choose which modules you want/need and load them before you start working. One of the most loved here at UMD is matplotlib (https://matplotlib.org/), which provides lots of functionality for making beautiful, publishable plots.


In [ ]:
# Run this code
%matplotlib inline  
# this "magic" command puts the plots right in the jupyter notebook
import matplotlib

When using modules (also sometimes called libraries or packages), you can use a nickname through the as keyword so you don't have to type the long module name every time. For example, matplotlib.pyplot is typically shortened to plt, as shown below.


In [ ]:
# Run this code
import matplotlib.pyplot as plt
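
The as keyword works the same way for any module. As a sketch, here it is with Python's built-in math module, which is not otherwise used in this lesson (the nickname m is an arbitrary choice):

```python
import math as m   # "m" is just a nickname we picked for this example

root = m.sqrt(16)  # same as math.sqrt(16), with less typing
print(root)
print(m.pi)        # modules can also provide constants
```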

Now let's do a quick simple plot using the list we defined earlier!


In [ ]:
x = numList
y = numList

p = plt.plot(x, y)

You can change a lot of attributes about plots, like the style of the line, the color, and the thickness of the line. You can add titles, axis labels, and legends. You can also put more than one line on the same plot. This link includes all the ways you can modify plots: https://matplotlib.org/api/_as_gen/matplotlib.pyplot.plot.html.

Let's take a quick look at the stuff we can find in the matplotlib documentation, as it can be a little overwhelming to navigate.

Here is a quick example showing a few of the things you can do with matplotlib:


In [ ]:
# Clear the plotting field. 
plt.clf() # No need to add anything inside these parentheses. 

# First line
plt.plot(x, y, color='blue', linestyle='-', linewidth=1, label='num')

# Second line
z = fibList
# you can shorten the keywords like "color" to be just "c" for quicker typing
plt.plot(x, z, c='r', ls='--', lw=3, label='fib')

# add the labels and titles
plt.xlabel('x values')
plt.ylabel('y values')
plt.title('My First Plot')
plt.legend(loc='best')

#Would you like to save your plot? Uncomment the below line. Here, we use savefig('nameOffigure')
#It should save to the folder you are currently working out of.
#plt.savefig('MyFirstFigure.jpg')

EXERCISE 1:

Create two lists of numbers: list1 will be the integers from 0 to 9 and list2 will be the elements of list1 squared.

Plot the two lists with matplotlib and make some changes to the color, linestyle, or linewidth.

Add labels, a title, and a legend to your plot.

Save the plot once you are done.

Be creative and feel free to look up the different linestyles using the link above.


In [ ]:
# Insert your code below:

C. Logic, If/Else, and Loops

Let's now switch gears a bit and discuss logic in Python. Conditional (logic) statements form the backbone of programming. In Python these statements evaluate to either True or False, and values of this kind have a special name in programming: Booleans. Sometimes this type of logic is also called Boolean logic.


In [ ]:
#Example conditional statements
x = 1
y = 2
x < y #x is less than y

Think of the statement $x<y$ as asking the question "is x less than y?" If it is, then it returns True and if x is not less than y it returns False.


In [ ]:
#x is greater than y
x > y

In [ ]:
#x is less-than or equal to y
x <= y

In [ ]:
#x is greater-than or equal to y
x >= y

If you let a and b be conditional statements (like the above statements, e.g. a = x < y), then you can combine the two together using logical operators, which can be thought of as functions for conditional statements.

There are three logical operators that are handy to know:

  • And operator: a and b
    • outputs True only if both a and b are True
  • Or operator: a or b
    • outputs True if at least one of a and b is True
  • Not operator: not(a)
    • outputs the negation of a

In [ ]:
#Example of and operator
(1 < 2) and (2 < 3)

In [ ]:
#Example of or operator
(1 < 2) or (2 > 3)

In [ ]:
#Example of not operator
not(1 < 2)
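
The results of comparisons can also be stored in variables and combined afterwards. A minimal sketch, where a and b are simply the placeholder names used in the text above:

```python
x = 1
y = 2

a = x < y      # True, because 1 is less than 2
b = x >= y     # False, because 1 is not greater than or equal to 2

print(a, b)
print(a and b)   # True only if both are True
print(a or b)    # True if at least one is True
print(not a)     # flips True to False
```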

Now, these might not seem especially useful at first, but they're the bread and butter of programming. Even more importantly, they are used when we are doing if/else statements or loops, which we will now cover.

An if/else statement (or simply an if statement) is a segment of code that has a conditional statement built into it, such that the code within that segment doesn't run unless the conditional statement is True.

Here's an example. Play around with the variables x and y to see what happens.


In [ ]:
x = 1
y = 2
if (x < y):
    print("Yup, totally true!")
else:
    print("Nope, completely wrong!")

The idea here is that Python checks to see if the statement (in this case "x < y") is True. If it is, then it will do what is below the if statement. The else statement tells Python what to do if the condition is False.

Note that Python requires you to indent these segments of code, and WILL NOT like it if you don't. Some languages don't require it, but Python is very particular when it comes to this point. (The parentheses around the conditional statement, however, are optional.)
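
As a sketch of both points, here is the same test as above written without parentheses. The variable message is a hypothetical name added for this example; every indented line under if (or else) belongs to that segment.

```python
x = 1
y = 2

if x < y:   # no parentheses needed around the condition
    message = "Yup, totally true!"
    print(message)    # still indented, so still part of the if segment
else:
    message = "Nope, completely wrong!"
    print(message)

print("done")   # not indented, so this runs no matter what
```

Removing the indentation from one of the lines inside the if segment makes Python stop with an IndentationError or a SyntaxError, depending on which line you change.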

You also do not always need an "else" segment, which effectively means that if the condition isn't True, then that segment of code doesn't do anything, and Python will just continue on past the if statement.

Here is an example of such a case. Play around with it to see what happens when you change the values of x and y.


In [ ]:
x = 2
y = 1
if (x > y):
    print("x is greater than y")

Here's a more complicated case. Here, we introduce some logic that helps you figure out if two objects are equal or not.

There's the == operator and the != operator. Can you figure out what they mean?


In [ ]:
x = 2
y = 2
if (x == y):
    print("x and y are equal")
if (x != y):
    print("x and y are not equal")
if (x > y or x < y):
    print("x and y are not equal (again!)")

Loops

While-loops are similar to if statements, in the sense that they also have a conditional statement built into them. The code inside the loop will execute when the conditional is True. And then it will check the conditional again and, if it evaluates to True, the code will execute... again. And so on and so forth...

The funny thing about while-loops is that they will KEEP executing that segment of code until the conditional statement evaluates to False...which hopefully will happen...right?

Although this seems a bit strange, you can get the hang of it!

For example, let's say we want Python to count from 1 to 10.


In [ ]:
x = 1
while (x <= 10):
    print(x)
    x = x + 1

Note here that we tell Python to print the number x (x starts at 1) and then redefine x as itself +1 (so, x=1 gets redefined to x = x+1 = 1+1 = 2). Python then executes the loop again, but now x has been incremented by 1. We continue this process from x = 1 to x = 10, printing out x every time. Thus, with a fairly compact bit of code, you get 10 lines of output.
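
The same counting pattern can do more than print. As a sketch, the loop below adds up the numbers from 1 to 10 (total is a hypothetical variable name introduced for this example):

```python
total = 0   # running sum
x = 1
while x <= 10:
    total = total + x   # add the current value of x to the running sum
    x = x + 1           # move on to the next number

print(total)   # 1 + 2 + ... + 10 = 55
print(x)       # x is 11 here, which is why the loop stopped
```

If you forget the line x = x + 1, the condition never becomes False and the loop runs forever. In the notebook you can stop a runaway cell by interrupting the kernel from the Kernel menu.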

It is sometimes handy to define what is known as a DUMMY VARIABLE, whose only job is to count the number of times the loop has been executed. Let's call this dummy variable i.


In [ ]:
x = 2
i = 0 #dummy variable
while (i<10):
    x = 2*x
    print(x)
    i = i+1

Now we want to combine lists with loops! You can use the dummy variable as a way to access a value in the list through its index. In Exercise 1 we asked you to square the elements in a given list by hand. Let's now do it by using a loop. The setup for the loop is provided below, but try completing the code on your own!

In Python, the operator for raising a number to a power is **. So 3**2 (3 squared) will give you 9.
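
A few quick examples of ** on single numbers (base is a hypothetical variable name used only here):

```python
print(3**2)     # 3 squared is 9
print(2**3)     # 2 cubed is 8

base = 5
squared = base**2   # works with variables too
print(squared)

# Careful with negative numbers: ** is applied before the minus sign
print(-3**2)    # this is -(3**2), which is -9
print((-3)**2)  # parentheses give 9
```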


In [ ]:
myList = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
# we want to end the loop at the end of the list i.e., the length of the list
end = len(myList)

# your code here

Isn't that much easier than squaring everything by hand? Loops are your friends in programming and will make menial, repetitive tasks go by very quickly.

All this information with logic, loops, and lists may be confusing, but you will get the hang of it with practice! And by combining these concepts, your programming in Python can be very powerful. Let's try an example where we use an if statement nested inside of a loop to find how many times the number 2 shows up in the following list. Remember that indentation is very important in Python!


In [ ]:
twoList = [2, 5, 6, 2, 4, 1, 5, 7, 3, 2, 5, 2]
count = 0  # this variable will count up how many times the number 2 appears in the above list
end = len(twoList)
i = 0

while i < end:
    if twoList[i] == 2:
        count = count + 1
        
    i = i + 1
        
print(count)

Notice how the indentation is set up. What happens if you indent the print statement? How about removing the indentation on the if statement? Play around with it so you get the hang of indentation in nested code.

If you are ever lost when writing or understanding a loop, just think through each iteration one by one. Think about how the variables, especially the dummy variables, are changing. Printing them to the screen can also help you figure out any problems you may have. With some more practice, you will be a loop master soon!
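
Here is a sketch of that debugging trick: a temporary print inside the loop shows how each variable changes on every pass. The list values and the variable total are made up for this example.

```python
values = [4, 8, 15]   # hypothetical list, for illustration only
total = 0
i = 0

while i < len(values):
    total = total + values[i]
    # Temporary print to trace the loop; remove it once things work
    print("i =", i, "values[i] =", values[i], "total =", total)
    i = i + 1

print("final total:", total)
```

Reading the printed lines from top to bottom is the same as thinking through each iteration one by one.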

EXERCISE 2: Truth Table

A truth table is a way of showing the True and False values of different operations. This may seem abstract and not very useful, but lists of True and False values can be really great ways to 'mask', or block out, data that is not needed. Truth tables typically start with values for two variables and then show the result when they are combined with different operators. Using two separate while-loops, generate two columns of a truth table for the lists x and y, defined below. That is, in one loop find the value of "x and y" for each pair of elements in the two lists, and in another loop find the value of "x or y". Note: you can always check your answer by doing it in your head! Checking your work is a good habit to have for the future :)
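
To illustrate what masking means (this is not the solution to the exercise), here is a minimal sketch in which a list of True/False values decides which data points to keep. The names data, mask, and kept and all the values are hypothetical.

```python
data = [3, 8, 1, 9, 4]                    # hypothetical measurements
mask = [True, False, True, False, True]   # True means "keep this value"

kept = []   # start with an empty list
i = 0
while i < len(data):
    if mask[i]:
        kept = kept + [data[i]]   # add the value to the list of kept items
    i = i + 1

print(kept)
```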


In [ ]:
x = [True, True, False, False]
y = [True, False, True, False]

# Insert your code here below: