Why Python

For this comparison I'm going to assume most of you are primarily Matlab users. Matlab is great, especially in a university environment. It's an easy to use, interpreted, high-level language with automatic memory management and lots of supporting libraries. Python shares those same advantages. What's the difference then? Python has some benefits that may become more important to you as your problems get larger, or as you move into industry. Some of these benefits include:

  1. It's free. Matlab is cheap as a student, but it's definitely not cheap once you are out of the university setting.
  2. It's open source. Often as you advance in your research you will need to dive into the details of an algorithm and may need to make some modifications. That's not possible with most of Matlab's libraries.
  3. Performance. Matlab is not fast. To be clear, Python is also an interpreted language so it isn't fast either. But with Python it is easy to wrap C and Fortran code. In fact that is one of its primary uses in scientific omputing. It acts as a "glue" code between different codes. We've used it to connect a CFD code in C++, blade design tools in Fortran 95, a structural FEM code in C++, a cost tool in C, other cost tools in pure Python, and an optimizer in Fortran 77, etc. The workhorse code remains in a compiled langauge, but you can interact with and connect the tools in a simple, concise, scriptable language. You get both speed and ease of use! Matlab does have mex files, but that approach is much more complicated. A common workflow with Python is to write everything in pure Python, then once tested, profile and move bottlenecks in the code to C or Fortran if necessary.
  4. Parallelization. This is related to 2. Matlab does support parallel computing, but that requires an additional expensive license to the parallel computing toolbox, and is not as capable. Similar comment for cloud computing.
  5. Unlike Matlab, Python is a full featured programming langauge. In additional to procedural style it also supports object oriented and functional styles. (Matlab has some OO support, but it is very weak). Python has dictionaries, package managment, modern string support, cloud computing, error handling, connections to web servers, etc.

Are there any drawbacks? The only one I've come across is Simulink. I don't know of anything of comparable capability in the Python world. But, since this is a fluid dynamics conference, probably none of us is worried about that.

Installation

  • If you are on Linux or OS X you should already have Python. Using the system version is fine for testing the waters, but if you are going to upgrade versions or start using different packages you should install your own version of Python rather than rely on the system version.
  • There are lots of ways to install Python. I recommend using conda by downloading Anaconda or Miniconda. See here for helping deciding which one to select.

Editing

  • I just use a text editor, either ST3 or Atom. I like using a text editor for all my languages because I heavily rely on keyboard shortcuts.
  • But for getting started, I would probably recommend using an IDE. There are many. PyCharm seems to be well-liked by my students.
  • IPython notebooks (now Jupyter notebooks), of which this notebook is an example, is another great option. They are useful for combining code along with text (with LaTeX support). Many people love using it. I like it for demos like this, but don't like using it for development (again, because I like my keyboard shortcuts). If you want to go the notebook route you should definitely download it locally, rather than using this remotely hosted server that we are using for the demo.

Useful Packages

Some packages you may want to install include (you will definitely need the first three):

  • NumPy: array methods
  • SciPy: integration, root finding, optimization, linear algebra, etc.
  • Matplotlib: plotting package (also see Bokeh, Plotly, etc.)
  • IPython: interactive shells and notebook
  • pandas: data structure and data analysis tools (like R)
  • scikit-learn: machine learning
  • many others...

Tutorials

Many exist, and I'm not familiar enough with them all to recommend one over another. I like the one in the official Python docs. It's probably more detailed than you want for a first exposure, but it is a good resource.

Matlab Users

Here are two useful resources called NumPy for Matlab Users: one, two.

Examples

Today we will go through two simple examples to introduce you to the syntax and to show how to wrap an external Fortran/C code. If you are viewing this in nbviewer I suggest you download it first (button in upper right corner), then open it up at https://try.jupyter.org so you can edit and follow along. Later you can install IPython so you can make edits locally rather than on a remote server. First, evaluate the cell below. Press shift-enter to evaluate a cell. This cell imports some libraries we will need. The first line is only for IPython (you wouldn't use it in a normal Python script). It just tells IPython that we want to see plots inline. The next line imports numpy, which contains a lot of important array operations. The last imports matplotlib, which is a plotting package.


In [2]:
%matplotlib inline

import numpy as np
import matplotlib.pyplot as plt

Simple UAV drag curves

We are going to compute induced and parasite drag (in a very basic way) as an example to introduce scripting. \begin{align} q&= \frac{1}{2} \rho V_\infty^2 \\ D_i &= \frac{L^2}{q \pi b^2 e} \\ D_p &= {C_D}_p q S \end{align}

The first few lines make some imports. The first line is only for Jupyter notebooks. It just tells the notebook to show plots inline. You wouldn't need that in a normal python script.


In [3]:
from math import pi

# all in standard English units

# atmosphere
rho = 0.0024  # air density

# geometry
b = 8.0  # wing span
chord = 1.0

# mass properties
W = 2.4  # total weight of aircraft

# other parameters
e = 0.9  # Oswald efficiency factor
CDp = 0.02  # could compute but just input for simplicity

# an array of wind speeds
V = np.linspace(10, 30, 100)

# Induced drag
q = 0.5*rho*V**2  # dyamic pressure
L = W  # equilibrium flight
Di = L**2/(q*pi*b**2*e)

# parasite drag
S = b*chord
Dp = CDp*q*S

# these next 3 lines purely for style in the plot (loading a predefined styleshet)
# I have my own custom styles I use, but for this example let's use one of matplotlib's
plt.style.use('ggplot')
plt.rcParams.update({'font.size': 16})
colors = plt.rcParams['axes.color_cycle']  # grab the current color scheme

# plot it
plt.figure()
plt.plot(V, Di)
plt.plot(V, Dp)
plt.plot(V, Di+Dp)
plt.xlabel('V (ft/s)')
plt.ylabel('Drag (lbs)')

# label the plots
plt.text(25, 0.06, 'induced drag', color=colors[0])
plt.text(12, 0.06, 'parasite drag', color=colors[1])
plt.text(20, 0.17, 'total drag', color=colors[2])


Out[3]:
<matplotlib.text.Text at 0x1140c4550>

Try it yourself. Let's do the same calculation, but with reusable functions. In Python functions are easy to define. A simple example is below (note that unlike Matlab, you can have as many functions as you want in a file.)


In [4]:
def func(x, y):
    add = x + y
    mult = x * y
    return add, mult


a, m = func(1.0, 3.0)
print 'a =', a, 'm =', m

a, m = func(2.0, 7.0)
print 'a =', a, 'm =', m


a = 4.0 m = 3.0
a = 9.0 m = 14.0

In [5]:
def induced_drag():
    pass
    
def parasite_drag():
    pass
    

# atmosphere
rho = 0.0024  # air density

# geometry
b = 8.0  # wing span
chord = 1.0

# mass properties
W = 2.4  # total weight of aircraft

# other parameters
e = 0.9  # Oswald efficiency factor
CDp = 0.02  # could compute but just input for simplicity

# wind speeds
V = np.linspace(10, 30, 100)

Finally, let's do it once more, but in an object-oriented style.


In [6]:
class UAV(object):
    
    def __init__(self, b, chord, W, rho):
        self.b = b
        self.S = b*chord
        self.L = W
        self.rho = rho
        
    def induced_drag(self, V, e):
        q = 0.5*self.rho*V**2
        Di = self.L**2/(q*pi*self.b**2*e)
        return Di

    def parasite_drag(self, V, CDp):
        q = 0.5*self.rho*V**2
        Dp = CDp*q*self.S
        return Dp


    
# atmosphere
rho = 0.0024  # air density

# geometry
b = 8.0  # wing span
chord = 1.0

# mass properties
W = 2.4  # total weight of aircraft

# setup UAV object
uav = UAV(b, chord, W, rho)

# setup sweep
V = np.linspace(10, 30, 100)

# idrag
e = 0.9  # Oswald efficiency factor
Di = uav.induced_drag(V, e)

# pdrag
CDp = 0.02  
Dp = uav.parasite_drag(V, CDp)


# style
plt.style.use('fivethirtyeight')

# plot it
plt.figure()
plt.plot(V, Di)
plt.plot(V, Dp)
plt.plot(V, Di+Dp)
plt.xlabel('V (ft/s)')
plt.ylabel('Drag (lbs)')

# label the plots
colors = plt.rcParams['axes.color_cycle']
plt.text(25, 0.06, 'induced drag', color=colors[0])
plt.text(12, 0.06, 'parasite drag', color=colors[1])
plt.text(20, 0.17, 'total drag', color=colors[2])


Out[6]:
<matplotlib.text.Text at 0x114256a50>

Wrapping Fortran/C

This next example is going to solve Laplace's equation on a grid. First, we will do it in pure Python, then we will rewrite a portion of the code in Fortran and call it in Python for improved speed. Recall Laplace's equation:

$$ \nabla^2 \phi = 0 $$

where $\phi is some scalar. For a regular rectangular grid, with equal spacing in x and y, you might recall that a simple iterative method for solving this equation consists of the following update rule:

$$ \phi_{i, j} = \frac{1}{4} (\phi_{i+1, j} + \phi_{i-1, j} + \phi_{i, j+1} + \phi_{i, j-1})$$

In other words, each cell updates its value using the average value of all of its neighbors (note that they are much more efficient ways to solve Laplace's equation on a grid. For our purpose we just want to keep things simple). This process must be repeated for every cell in the domain, and repeated until converged.

We are going to run a simple case where boundary values are provided at the top, bottom, left, and right edges. You should iterate until the maximum change in $\phi$ is below some tolerance (tol) or until a maximum number of iterations is reached (iter_max). n is the number of cells (same discretization in x and y).

I've started a script for you below. See if you can fill in the details. I've not provided all the syntax you will need to know so you may have to look some things up. A full implementation is down below, but don't peek unless you are really stuck!


In [ ]:
from math import fabs


def laplace_grid_python(n, top, bottom, left, right, tol, iter_max):
    
    # initialize
    phi = np.zeros((n+1, n+1))
    iters = 0  # number of iterations
    err_max = 1e6  # maximum error in grid (start at some arbitrary number just to enter loop)
    
    # set boundary conditions
    

    # run while loop until tolerance reached or max iterations
    while ():
        
        # reset the maximum error to something small (I suggest something like -1)
        err_max = -1.0

        # loop over all *interior* cells
        for i in range():
            for j in range():  

                # save previous point for computing error later
                phi_prev = phi[i, j]
                
                # update point
                phi[i, j] = 
                
                # update maximum error
                err_max = 

        # update iteration count
        iters += 1

    return phi, err_max, iters


# run a sample case (50 x 50 grid with bottom and left and 1.0, top and right at 0.0)
n = 50
top = 0.0
bottom = 1.0
left = 1.0
right = 0.0
tol = 1e-5
iter_max = 10000

phi, err_max, iters = laplace_grid_python(n, top, bottom, left, right, tol, iter_max)

# plot it
x = np.linspace(0, 1, n+1)
y = np.linspace(0, 1, n+1)
[X, Y] = np.meshgrid(x, y)

plt.figure()
plt.contourf(X, Y, phi, 100, cmap=plt.cm.get_cmap("YlGnBu"))
plt.colorbar()
plt.show()

In our implementation below we are adding the IPython flag %%timeit at the top. This will run the whole block some number of times and report the best time back.

Adding some blank space below just to visually separate the answer.





















In [10]:
%%timeit
from math import fabs


def laplace_grid_python(n, top, bottom, left, right, tol, iter_max):
    
    # initialize
    phi = np.zeros((n+1, n+1))
    iters = 0  # number of iterations
    err_max = 1e6  # maximum error in grid (start at some arbitrary number just to enter loop)

    # set boundary conditions
    phi[0, :] = bottom
    phi[-1, :] = top
    phi[:, 0] = left
    phi[:, -1] = right
    
    # run while loop until tolerance reached or max iterations
    while (err_max > tol and iters < iter_max):
        
        # reset the maximum error to something small (I suggest something like -1)
        err_max = -1.0

        # loop over all interior cells
        for i in range(1, n):
            for j in range(1, n):

                # save previous point
                phi_prev = phi[i, j]
                
                # update point
                phi[i, j] = (phi[i-1,j] + phi[i+1,j] + phi[i,j-1] + phi[i,j+1])/4.0
                
                # update maximum error
                err_max = max(err_max, fabs(phi[i, j] - phi_prev))

        # update iteration count
        iters += 1

    return phi, err_max, iters

# run a sample case (50 x 50 grid with bottom and left and 1.0, top and right at 0.0)
n = 50
top = 0.0
bottom = 1.0
left = 1.0
right = 0.0
tol = 1e-5
iter_max = 10000

phi, err_max, iters = laplace_grid_python(n, top, bottom, left, right, tol, iter_max)

# plot it
x = np.linspace(0, 1, n+1)
y = np.linspace(0, 1, n+1)
[X, Y] = np.meshgrid(x, y)

plt.figure()
plt.contourf(X, Y, phi, 100, cmap=plt.cm.get_cmap("YlGnBu"))
plt.colorbar()
plt.show()


1 loops, best of 3: 5.12 s per loop

This takes a while. Let's move the double for loop computation to Fortran. I've supplied a file called laplace.f90 where I've done this for you. We just need to build this as a shared library so we can call it from Python. Open a terminal (you can do this in try.jupyter.org as well). We will compile the fortran code to a shared library with f2py, (can also use standard gfortran compilation but you get more flexibility and automatic setup with f2py). In all of the below I am using an O2 optimization flag. Note the underscore in the shared library name. This is just convention. Note that if you make a mistake in importing, IPython caches your modules so you'd need to restart the Kernel. You can even do all of this <try.jupyter.org> by opening a terminal.

Using f2py

f2py -c --opt=-O2 -m _laplace laplace.f90

Using setup script. Open up a file and call it setup.py. At a minimum this is all it needs:

from numpy.distutils.core import setup, Extension
setup(
    ext_modules=[Extension('_laplace', ['laplace.f90'], extra_compile_args=['-O2'])]
)

Usually a lot more information would be added (name, license, other python packages, etc.) You can read more about setuptools later.

To build it one normally just uses build or install commands, but we will built it inplace for testing.

python setup.py build_ext --inplace

Now we can call it from Python just as a regular method. An example is shown below doing the exact same thing as before, but calling the Fortran code in laplacegridfortran. This runs over 10x faster (and could be even faster if we used ifort instead of gfortran). How much faster the code is of course depends ont he problem as you change n the difference will either become more or less significant.


In [11]:
%%timeit
from _laplace import laplacegridfortran

n = 50
top = 0.0
bottom = 1.0
left = 1.0
right = 0.0
tol = 1e-5
iter_max = 10000

phi, err_max, iters = laplacegridfortran(n, top, bottom, left, right, tol, iter_max)

# plot it
x = np.linspace(0, 1, n+1)
y = np.linspace(0, 1, n+1)
[X, Y] = np.meshgrid(x, y)

plt.figure()
plt.contourf(X, Y, phi, 100, cmap=plt.cm.get_cmap("YlGnBu"))
plt.colorbar()
plt.show()


1 loops, best of 3: 362 ms per loop

In [ ]: