Python Crash Course

Ondrej Lexa 2017

Hello! This is a quick intro to programming in Python to help you hit the ground running with Advanced Structural Geology.

Why Python?

Python is a modern, general-purpose, object-oriented, high-level programming language. It is the language of choice for many scientists, largely because it offers a great deal of power to analyze and model scientific data with relatively little overhead in terms of learning, installation or development time. It is a language you can pick up in a weekend and use for the rest of your life.

General characteristics of Python

  • clean and simple language: Easy-to-read and intuitive code, easy-to-learn minimalistic syntax, maintainability scales well with size of projects.
  • expressive language: Fewer lines of code, fewer bugs, easier to maintain.

Advantages

  • The main advantage is ease of programming, minimizing the time required to develop, debug and maintain the code.
  • A well-designed language that encourages many good programming practices:
    • Modular and object-oriented programming, good system for packaging and re-use of code. This often results in more transparent, maintainable and bug-free code.
    • Documentation tightly integrated with the code.
  • A large standard library, and a large collection of add-on packages.

The first step in using Python for stress and strain analyses in geology is to get Python installed with the appropriate scientific dependencies.

Installation

Configuring Python for scientific computing has historically been a huge pain, but it has recently become much easier with the advent of two scientific distributions for Python that come with package managers: Canopy by Enthought, and Anaconda by Continuum Analytics. I have used Anaconda extensively, and find it a little easier because it is lighter-weight and self-contained: it stays in the folder where you place it, doesn't touch the system Python, and doesn't require root/sudo access to install or modify packages. This makes it easy to place on shared computing resources. Best of all, it uses a package manager called conda that is extremely easy to use.

Which one to choose?

This guide will deal exclusively with Anaconda Python. Anaconda is one of several Python distributions that provide the Python interpreter, together with a list of Python packages and sometimes other related tools, such as editors. The Anaconda Python distribution is easy to install, but other distributions provide similar functionality.

The packages provided by the Anaconda Python distribution include all of those that we need, and for that reason we suggest using Anaconda here.

A key part of the Anaconda Python distribution is Spyder, an integrated development environment for Python, including an editor.

Installing Anaconda Python

Installation of the Python interpreter is fairly straightforward, but installation of additional packages can be a bit tedious.

Instead, we suggest installing the Anaconda Python distribution using these installation instructions. It provides the Python interpreter itself and all the packages we need.

It is free and available for download for Windows, OS X and Linux.

For Windows and OS X you are given a choice between the graphical installer and the text-based installer. If you don't know what the terminal (OS X) or command prompt (Windows) is, you are better advised to choose the graphical version. There are two branches of current Python releases: the older-syntax Python 2 and the newer-syntax Python 3. Python 3 is now mature, so install the Python 3 version.

Download the installer, start it, and follow instructions. Accept default values as suggested.

(If you are using Linux and you are happy to use the package manager of your distribution (you will know who you are), you may be better advised to install the required packages individually rather than installing the whole Anaconda distribution.)

Running Python code

The standard Python interpreter

Start it by typing python on the command line:

$ python
Python 3.5.2 |Continuum Analytics, Inc.| (default, Jul  2 2016, 17:53:06) 
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>

  • It shows an interpreter prompt.
  • You can give it Python code to interpret.

The IPython interpreter

  • Similar to the standard Python interpreter, but with syntax highlighting, tab completion, cross-session history, and more.

Start it by typing ipython on the command line:

$ ipython
Python 3.5.2 |Continuum Analytics, Inc.| (default, Jul  2 2016, 17:53:06) 
Type "copyright", "credits" or "license" for more information.

IPython 5.1.0 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.

In [1]:

Python as a calculator

Position your cursor in the code cell below and hit [shift][enter]. The output should be 42 (by the way, the answer to the ultimate question of life, the universe and everything) (-:


In [1]:
2 * (1 + 2 + 3 + 4 + 5 + 6)


Out[1]:
42

In [2]:
3.2 * 18 - 2.1


Out[2]:
55.5

Scientific notation:


In [3]:
1.5e-10 * 1000


Out[3]:
1.5e-07

Python has a number of defined operators for handling numbers through arithmetic calculations, logic operations (that test whether a condition is true or false) or bitwise processing (where the numbers are processed in binary form).

Arithmetic Operations

  • Sum (+)
  • Difference (-)
  • Multiplication (*)
  • Division (/)
  • Integer Division (//)
  • Modulo (%): returns the remainder of the division.
  • Power (**): can also be used to calculate roots, through fractional exponents (e.g. 100 ** 0.5).
  • Positive (+)
  • Negative (-)
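
The operators listed above can be tried out with a short sketch like the following. The two integers are arbitrary values chosen only for illustration.

```python
# Arithmetic operators applied to two arbitrary integers
a, b = 17, 5

total = a + b        # sum
difference = a - b   # difference
product = a * b      # multiplication
quotient = a / b     # true division: the result is always a float
floor_quot = a // b  # integer (floor) division
remainder = a % b    # modulo: remainder of the division
power = 2 ** 10      # power
root = 100 ** 0.5    # a fractional exponent gives the square root
negated = -a         # unary negative

print(total, difference, product, quotient, floor_quot, remainder)
print(power, root, negated)
```

Note that in Python 3 the / operator returns a float even when both operands are integers; use // when an integer result is wanted.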

Logical Operations

  • Less than (<)
  • Greater than (>)
  • Less than or equal to (<=)
  • Greater than or equal to (>=)
  • Equal to (==)
  • Not equal to (!=)
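
Each of these operators returns one of the Boolean values True or False. A minimal sketch, where dip and limit are hypothetical variable names used only for this illustration:

```python
# Comparison operators return the Boolean values True or False
dip = 35      # hypothetical dip angle in degrees
limit = 45    # hypothetical threshold

print(dip < limit)     # less than
print(dip >= limit)    # greater than or equal to
print(dip == 35)       # equal to (two equals signs, not one)
print(dip != 35)       # not equal to

# Comparisons can be chained, or combined with and, or, not
in_range = 0 <= dip <= 90
steep = dip > 40 and dip < limit
print(in_range, steep)
```

A single = assigns a value, while == tests for equality; mixing them up is a common beginner mistake.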

Libraries and modules

Python has a huge number of libraries included with the distribution. To keep things simple, most of their variables and functions are not accessible from a normal Python interactive session. Instead, you have to import them. For example, there is a math module containing many useful functions. To access, say, the square root function, you have to import the math library and then call the function from the imported library.


In [4]:
import math
math.sqrt(2)


Out[4]:
1.4142135623730951
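
There is more than one way to write an import. The sketch below shows three common forms using the math module from the standard library; the alias m is an arbitrary name chosen for this example.

```python
import math                  # import the whole module, use the math. prefix
from math import sqrt, pi    # import selected names directly
import math as m             # import the module under a shorter alias

print(math.sqrt(2))   # call through the module name
print(sqrt(2))        # call the directly imported name
print(m.cos(pi))      # call through the alias
```

Keeping the module prefix makes it obvious where a function comes from, which helps once a program uses several libraries.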

Variables

You can define variables using the equals (=) sign:


In [5]:
width = 20
length = 30
area = length*width
area


Out[5]:
600

You can name a variable almost anything you want. The name must start with a letter or an underscore ("_") and can contain letters, digits and underscores. Certain words, however, are reserved for the language. In Python 3 these are:

False, None, True, and, as, assert, break, class, continue,
def, del, elif, else, except, finally, for, from, global,
if, import, in, is, lambda, nonlocal, not, or, pass,
raise, return, try, while, with, yield
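
The exact list of reserved words depends on the Python version. Rather than memorizing it, you can ask the interpreter directly with the keyword module from the standard library. The name "strain" below is just an example of an ordinary variable name.

```python
import keyword

# The full list of reserved words for the running interpreter
print(keyword.kwlist)

# Check individual names before using them as variables
is_reserved = keyword.iskeyword("lambda")   # reserved word
is_free = not keyword.iskeyword("strain")   # ordinary name, fine to use
print(is_reserved, is_free)
```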

Strings

Strings are Python builtins for handling text. A new string can be created:

  • With single or double quotes.
  • Across several consecutive lines, provided that the text is enclosed in three single or three double quotes.

In [6]:
'I love Structural Geology!'


Out[6]:
'I love Structural Geology!'

In [7]:
"I love Structural Geology!"


Out[7]:
'I love Structural Geology!'

In [8]:
'''I love
Structural
Geology'''


Out[8]:
'I love\nStructural\nGeology'

The opening and closing quotes must be of the same kind. The other kind can then appear inside the string as an ordinary character.


In [9]:
"He's a geologist"


Out[9]:
"He's a geologist"

In [10]:
'She asked, "Are you crazy?"'


Out[10]:
'She asked, "Are you crazy?"'

Just like the numbers we're familiar with, you can assign a string to a variable:


In [11]:
greeting = "I love Structural Geology!"

The print function is often used for printing character strings:


In [12]:
print(greeting)


I love Structural Geology!

But it can also print data types other than strings:


In [13]:
print("The area is", area)


The area is 600

You can use the + operator to concatenate strings together:


In [14]:
"I " + "love " + "Structural " + "Geology!"


Out[14]:
'I love Structural Geology!'

The operator % is used for string interpolation. Interpolation builds the result in a single step, which avoids the intermediate strings created by repeated concatenation.

Symbols used in the interpolation:

  • %s: string.
  • %d: integer.
  • %o: octal.
  • %x: hexadecimal.
  • %f: real.
  • %e: real exponential.
  • %%: percent sign.

Symbols can be used to display numbers in various formats.


In [15]:
# Zero-padded on the left
print('Now is %02d:%02d.' % (16, 30))

# Real (the number after the decimal point specifies how many decimal digits to show)
print('Percent: %.1f%%, Exponential: %.2e' % (5.333, 0.00314))

# Octal and hexadecimal
print('Decimal: %d, Octal: %o, Hexadecimal: %x' % (10, 10, 10))


Now is 16:30.
Percent: 5.3%, Exponential: 3.14e-03
Decimal: 10, Octal: 12, Hexadecimal: a

In addition to the interpolation operator %, there are the string method format() and the builtin function format(). Examples:


In [16]:
# Parameters are identified by order
print('The area of square with side {0} is {1}'.format(5, 5*5))

# Parameters are identified by name
print('{greeting}, it is {hour:02d}:{minute:02d}AM'.format(greeting='Hi', hour=7, minute=30))

# Builtin function format()
print('Pi =', format(math.pi, '.15f'))


The area of square with side 5 is 25
Hi, it is 07:30AM
Pi = 3.141592653589793

Slicing strings

Slices of strings can be obtained by adding indexes between brackets after a string.

Python indexes:

  • Start with zero.
  • Count from the end if they are negative.
  • Can be defined as sections (slices), in the form [start:stop:step], where stop is one past the last index included. If start is omitted, it defaults to zero. If stop is omitted, it defaults to the size of the object. The step (between characters), if omitted, is 1.

In [17]:
greeting


Out[17]:
'I love Structural Geology!'

In [18]:
greeting[7]


Out[18]:
'S'

In [19]:
greeting[2:6]


Out[19]:
'love'

In [20]:
greeting[18:]


Out[20]:
'Geology!'

In [21]:
greeting[7:17:2]


Out[21]:
'Srcua'

It is possible to reverse strings by using a negative step:


In [22]:
greeting[::-1]


Out[22]:
'!ygoloeG larutcurtS evol I'

Types

Every value or variable in Python has a type. There are several predefined (builtin) simple data types in Python, such as:

  • Numbers: Integer (int), Floating Point real (float), Complex (complex)
  • Text: String (str)

You can view the type of a variable using the function type:


In [23]:
type(area)


Out[23]:
int

In [24]:
type(math.sqrt(2))


Out[24]:
float

In [25]:
type(greeting)


Out[25]:
str

Furthermore, there are types that function as collections. The main ones are:

  • List
  • Tuple
  • Dictionary
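
Lists are covered in their own section; as a brief sketch of the other two collection types, the following creates a tuple and a dictionary. All names and values here (orientation, density, the numbers) are hypothetical and serve only as illustration.

```python
# Tuple: a fixed-length, ordered group of values
orientation = (120, 35)        # hypothetical strike and dip in degrees
strike, dip = orientation      # unpack into separate variables
print(strike, dip)

# Dictionary: values are looked up by key rather than by position
density = {'granite': 2700, 'basalt': 3000}   # illustrative values, kg/m^3
density['quartzite'] = 2650                   # add a new key-value pair
print(density['basalt'])
print(sorted(density))                        # sorted list of the keys
```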

Python types can be:

  • Mutable: allow the contents of the variables to be changed.
  • Immutable: do not allow the contents of variables to be changed.
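
The difference can be seen in a small sketch. Lists are mutable, while strings (and tuples) are immutable; the variable names are arbitrary examples.

```python
# Lists are mutable: an element can be replaced in place
layers = ['sandstone', 'shale', 'limestone']
layers[1] = 'marl'
print(layers)

# Strings are immutable: item assignment raises a TypeError
rock = 'gneiss'
try:
    rock[0] = 'G'
except TypeError as err:
    print('Cannot modify a string:', err)

# Build a new string instead of changing the old one
rock = 'G' + rock[1:]
print(rock)
```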

The most common types and routines are implemented in the form of builtins, i.e., they are always available at runtime, without the need to import any library.

Lists

Very often in a programming language, one wants to keep a group of similar items together. Python does this using a data type called a list.


In [26]:
planets = ['Mercury', 'Venus', 'Earth', 'Mars', 'Jupiter', 'Saturn', 'Uranus', 'Neptune']

You can access members of the list using the index of that item:


In [27]:
planets[2]


Out[27]:
'Earth'

The -1 element of a list is the last element:


In [28]:
planets[-1]


Out[28]:
'Neptune'

Lists can be sliced in the same way as strings.


In [29]:
planets[:4]


Out[29]:
['Mercury', 'Venus', 'Earth', 'Mars']

Lists can hold objects of any type, including other lists.
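
As a minimal sketch, here is a list that mixes several types and contains another list. The sample record is made up for this example.

```python
# A list mixing a string, a float, a Boolean and another list
sample = ['KV-12', 48.5, True, [120, 35]]   # hypothetical sample record

print(len(sample))      # number of top-level items
print(sample[3])        # the inner list
print(sample[3][0])     # first item of the inner list

# Lists can grow after they are created
sample.append('gneiss')
print(sample)
```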

Whitespace in Python

Python uses indents and whitespace to group statements together. To write a short loop in C, you might use:

for (i = 0; i < 3; i++) {
   printf("Stress and strain\n");
}

Python does not use curly braces like C, so the same program as above is written in Python as follows:


In [30]:
for i in range(3):
    print("Stress and strain")


Stress and strain
Stress and strain
Stress and strain

If you have nested for-loops, there is a further indent for the inner loop.


In [31]:
for i in range(3):
    for j in range(3):
        print('i:{} j:{}'.format(i, j))
    
    print("This statement is within the outer i-loop, but not the inner j-loop")


i:0 j:0
i:0 j:1
i:0 j:2
This statement is within the outer i-loop, but not the inner j-loop
i:1 j:0
i:1 j:1
i:1 j:2
This statement is within the outer i-loop, but not the inner j-loop
i:2 j:0
i:2 j:1
i:2 j:2
This statement is within the outer i-loop, but not the inner j-loop
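
Indentation groups statements in the same way for other constructs, such as if and else. The sketch below assumes a shortened planets list and uses the builtin enumerate to get both the index and the item.

```python
planets = ['Mercury', 'Venus', 'Earth', 'Mars']

for index, name in enumerate(planets):
    # Everything indented under "for" runs once per planet
    if name == 'Earth':
        # Indented one level further: runs only when the condition holds
        print(index, name, '<- home')
    else:
        print(index, name)

# Back at the left margin: runs once, after the loop has finished
print('Done')
```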

Scientific Python Environment

Python is a high-level open-source language, and the Scientific Python environment adds many packages or libraries that provide useful things like array operations, plotting functions, and much more. We can import libraries of functions to expand the capabilities of Python in our programs.

OK! We'll start by importing a few libraries to help us out. In most cases we will use the pylab environment, which contains the most important parts of numpy and matplotlib. To import it into our notebook, we can use the %pylab magic with the inline option for inline graphics.


In [32]:
%pylab inline


Populating the interactive namespace from numpy and matplotlib

So what just happened? We just imported most of numpy and matplotlib into the current workspace, so their functions are now available to use. If we want to use the numpy function linspace, for instance, we can call it by writing:


In [33]:
linspace(-10, 10, 11)


Out[33]:
array([-10.,  -8.,  -6.,  -4.,  -2.,   0.,   2.,   4.,   6.,   8.,  10.])
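
The %pylab magic only exists inside IPython and the notebook. In a plain Python script the same function is reached through an explicit import. A sketch, assuming numpy is installed and using the conventional alias np:

```python
import numpy as np

# Same call as above, but with the explicit np. prefix
x = np.linspace(-10, 10, 11)   # 11 evenly spaced values from -10 to 10
print(x)
print(x.size, x[1] - x[0])     # number of points and the spacing between them
```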

To learn new functions available to you, visit the NumPy Reference page. If you are a proficient MATLAB user, there is a wiki page that should prove helpful to you: NumPy for Matlab Users

Slicing Arrays

In NumPy, you can look at portions of arrays in the same way as in Matlab, with a few extra tricks thrown in. Let's take an array of values from 1 to 5.


In [34]:
vals = array([1, 2, 3, 4, 5])
vals


Out[34]:
array([1, 2, 3, 4, 5])

Python uses a zero-based index, so let's look at the first and last element in the array vals:


In [35]:
vals[0], vals[4]


Out[35]:
(1, 5)

There are 5 elements in the array vals, but if we try to look at vals[5], Python will be unhappy, as vals[5] is actually calling the nonexistent 6th element of that array.


In [36]:
vals[5]


---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-36-9f849f890acf> in <module>()
----> 1 vals[5]

IndexError: index 5 is out of bounds for axis 0 with size 5

Arrays can also be 'sliced', grabbing a range of values. Let's look at the first three elements:


In [37]:
vals[0:3]


Out[37]:
array([1, 2, 3])

Note here, the slice is inclusive on the front end and exclusive on the back, so the above command gives us the values of vals[0], vals[1] and vals[2], but not vals[3].
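
A few further slicing forms, sketched with an explicit numpy import so that the block also runs outside the pylab environment. Negative indexes and steps work as they do for strings, and an array can additionally be indexed with a Boolean condition.

```python
import numpy as np

vals = np.array([1, 2, 3, 4, 5])

last_two = vals[-2:]        # negative start counts from the end
every_second = vals[::2]    # step of 2
reversed_vals = vals[::-1]  # negative step reverses the order
large = vals[vals > 2]      # Boolean mask: keep elements greater than 2

print(last_two, every_second, reversed_vals, large)
```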

Assigning Array Variables

One of the strange little quirks/features in Python that often confuses people comes up when assigning and comparing arrays of values. Here is a quick example. Let's start by defining a 1-D array called a:


In [38]:
a = linspace(1,5,5)

In [39]:
a


Out[39]:
array([ 1.,  2.,  3.,  4.,  5.])

OK, so we have an array a, with the values 1 through 5. I want to make a copy of that array, called b, so I'll try the following:


In [40]:
b = a

In [41]:
b


Out[41]:
array([ 1.,  2.,  3.,  4.,  5.])

Great. So a has the values 1 through 5 and now so does b. Now that I have a backup of a, I can change its values without worrying about losing data (or so I may think!).


In [42]:
a[2] = 17

In [43]:
a


Out[43]:
array([  1.,   2.,  17.,   4.,   5.])

Here, the 3rd element of a has been changed to 17. Now let's check on b.


In [44]:
b


Out[44]:
array([  1.,   2.,  17.,   4.,   5.])

And that's how things go wrong! When you use a statement like b = a, rather than copying all the values of a into a new array called b, Python just creates an alias (or a pointer) called b and tells it to route us to a. So if we change a value in a, then b will reflect that change (technically, this is called assignment by reference). If you want to make a true copy of the array, you have to tell Python to copy every element of a into a new array. Let's call it c.


In [45]:
c = a.copy()

Now, we can try again to change a value in a and see if the changes are also seen in c.


In [46]:
a[2] = 3

In [47]:
a


Out[47]:
array([ 1.,  2.,  3.,  4.,  5.])

In [48]:
b


Out[48]:
array([ 1.,  2.,  3.,  4.,  5.])

In [49]:
c


Out[49]:
array([  1.,   2.,  17.,   4.,   5.])
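
The same behaviour applies to Python's builtin lists, not only to NumPy arrays. A minimal sketch using plain lists, where the is operator checks whether two names refer to the very same object:

```python
# The same aliasing applies to builtin lists
a = [1, 2, 3, 4, 5]
b = a            # alias: both names refer to one list
c = a.copy()     # independent copy; list(a) or a[:] would also work

a[2] = 17

print(b)                  # reflects the change made through a
print(c)                  # keeps the original values
print(b is a, c is a)     # identity check
```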

Plotting

For scientific plotting we will use matplotlib, most commonly its plot function.


In [50]:
x = linspace(-pi, pi, 150)
plot(x, sin(x))


Out[50]:
[<matplotlib.lines.Line2D at 0x7f9641631978>]

Learn More

There are a lot of resources online to learn more about using NumPy and other libraries. A well-made one is Lectures on scientific computing with Python.

