Python rehearsal

DS Data manipulation, analysis and visualisation in Python
December, 2019

© 2016, Joris Van den Bossche and Stijn Van Hoey (mailto:jorisvandenbossche@gmail.com, mailto:stijnvanhoey@gmail.com). Licensed under CC BY 4.0 Creative Commons


I measure air pressure


In [17]:
pressure_hPa = 1010 # hPa
REMEMBER: Use meaningful variable names

I'm measuring at sea level. What would the air pressure corresponding to this measured value be at other altitudes?

The barometric formula, sometimes called the exponential atmosphere or isothermal atmosphere, is a formula used to model how the pressure (or density) of the air changes with altitude. The pressure drops by approximately 11.3 Pa per meter in the first 1000 meters above sea level.

$$P=P_0 \cdot \exp \left[\frac{-g \cdot M \cdot h}{R \cdot T}\right]$$

see https://www.math24.net/barometric-formula/ or https://en.wikipedia.org/wiki/Atmospheric_pressure

where:

  • $T$ = standard temperature, 288.15 (K)
  • $R$ = universal gas constant, 8.3144598 (J/mol/K)
  • $g$ = gravitational acceleration, 9.81 (m/s$^2$)
  • $M$ = molar mass of Earth's air, 0.02896 (kg/mol)

and:

  • $P_0$ = sea level pressure (hPa)
  • $h$ = height above sea level (m)

Let's implement this...

To calculate the formula, I need the exponential function. Pure Python provides a number of mathematical functions in the math module of the standard library, e.g. math.exp (https://docs.python.org/3.7/library/math.html#math.exp).


In [18]:
import math

In [19]:
# ...modules and libraries...
DON'T: from os import *. Just don't!
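
As a small illustration of the import styles that are fine to use, the sketch below calls the same function in three ways. The variable names are only chosen for this example. The reason to avoid `from os import *` is that it copies many names into your namespace at once; `os`, for instance, has its own `open` function, which can silently shadow the built-in `open`.

```python
# Three common ways to import; all of them call the same function
import math                 # access as math.exp
import math as m            # alias, handy for long module names
from math import exp        # import a single name explicitly

value_module = math.exp(1.0)
value_alias = m.exp(1.0)
value_name = exp(1.0)
print(value_module == value_alias == value_name)
```

With any of these styles it stays visible in the code where `exp` comes from.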

In [20]:
standard_temperature = 288.15
gas_constant = 8.31446
gravit_acc = 9.81
molar_mass_earth = 0.02896
EXERCISE:
  • Calculate the equivalent air pressure at the altitude of 2500 m above sea level for our measured value of pressure_hPa (1010 hPa)

In [21]:
height = 2500
pressure_hPa * math.exp(-gravit_acc * molar_mass_earth * height / (gas_constant * standard_temperature))


Out[21]:
750.885560345225
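
The same expression can be used as a plausibility check of the 11.3 Pa per meter figure quoted with the formula. This is only a sketch: the standard sea level pressure of 1013.25 hPa and the helper name `pressure_at` are assumptions made for this illustration.

```python
import math

# Constants as listed with the formula
T = 288.15      # standard temperature (K)
R = 8.3144598   # universal gas constant (J/mol/K)
g = 9.81        # gravitational acceleration (m/s^2)
M = 0.02896     # molar mass of Earth's air (kg/mol)

def pressure_at(h, p0=1013.25):
    """Pressure (hPa) at height h (m); p0 is an assumed sea level pressure (hPa)."""
    return p0 * math.exp(-g * M * h / (R * T))

# Average drop over the first 1000 m, converted from hPa to Pa (1 hPa = 100 Pa)
drop_pa_per_m = (pressure_at(0) - pressure_at(1000)) * 100 / 1000
print(round(drop_pa_per_m, 1))
```

The average drop over the first kilometer comes out a little above 11 Pa per meter, in line with the quoted figure.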

In [22]:
# ...function/definition for barometric_formula...

In [23]:
def barometric_formula(pressure_sea_level, height=2500):
    """Apply barometric formula
    
    Apply the barometric formula to calculate the air pressure at a given height
    
    Parameters
    ----------
    pressure_sea_level : float
        pressure, measured at sea level
    height : float
        height above sea level (m)
    
    Returns
    -------
    float
        pressure at the given height, in the same unit as pressure_sea_level
    
    Notes
    -----
    see https://www.math24.net/barometric-formula/ or 
    https://en.wikipedia.org/wiki/Atmospheric_pressure
    """
    standard_temperature = 288.15
    gas_constant = 8.3144598
    gravit_acc = 9.81
    molar_mass_earth = 0.02896
    
    pressure_altitude = pressure_sea_level * math.exp(
        -gravit_acc * molar_mass_earth * height / (gas_constant * standard_temperature))
    return pressure_altitude

In [24]:
barometric_formula(pressure_hPa, 2000)


Out[24]:
796.752205935891

In [25]:
barometric_formula(pressure_hPa)


Out[25]:
750.8855549906549

In [26]:
# ...formula not valid above 11000m... 
# barometric_formula(pressure_hPa, 12000)

In [27]:
def barometric_formula(pressure_sea_level, height=2500):
    """Apply barometric formula
    
    Apply the barometric formula to calculate the air pressure at a given height
    
    Parameters
    ----------
    pressure_sea_level : float
        pressure, measured at sea level
    height : float
        height above sea level (m), at most 11000
    
    Returns
    -------
    float
        pressure at the given height, in the same unit as pressure_sea_level
    
    Notes
    -----
    see https://www.math24.net/barometric-formula/ or 
    https://en.wikipedia.org/wiki/Atmospheric_pressure
    """
    if height > 11000:
        # ValueError signals an invalid argument more precisely than Exception
        raise ValueError("Barometric formula only valid for heights lower than 11000m above sea level")
    
    standard_temperature = 288.15
    gas_constant = 8.3144598
    gravit_acc = 9.81
    molar_mass_earth = 0.02896
    
    pressure_altitude = pressure_sea_level * math.exp(
        -gravit_acc * molar_mass_earth * height / (gas_constant * standard_temperature))
    return pressure_altitude
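
A minimal sketch of what happens when such a check is triggered, and how calling code could react to it. The function `pressure_at_height` is a compact, hypothetical stand-in for the function above; raising a `ValueError` and catching it with `try`/`except` is one possible choice, not the only one.

```python
import math

def pressure_at_height(pressure_sea_level, height=2500):
    """Compact stand-in for barometric_formula, with the same height check."""
    if height > 11000:
        raise ValueError("Barometric formula only valid for heights "
                         "lower than 11000m above sea level")
    return pressure_sea_level * math.exp(-9.81 * 0.02896 * height / (8.3144598 * 288.15))

try:
    result = pressure_at_height(1010, 12000)
except ValueError as error:
    # The function stopped before calculating anything
    result = None
    print(f"Calculation skipped: {error}")
```

Without the `try`/`except`, the error would stop the cell and show the message in a traceback, which is often exactly what you want while exploring data.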

In [28]:
# ...combining logical statements...

In [29]:
height > 11000 or pressure_hPa < 9000


Out[29]:
True
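
For comparison, a short sketch of the other ways to combine conditions, using the same two values. The result names are only chosen for this illustration.

```python
height = 2500
pressure_hPa = 1010

# `or` is True when at least one condition holds (here: the second one)
either = height > 11000 or pressure_hPa < 9000
# `and` requires both conditions to hold (here: the first one fails)
both = height > 11000 and pressure_hPa < 9000
# `not` inverts a condition
not_too_high = not height > 11000
# Comparisons can be chained to test a range
in_valid_range = 0 <= height <= 11000

print(either, both, not_too_high, in_valid_range)
```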

In [30]:
# ...load function from file...

Instead of keeping functions in a notebook, you can store them in a file and import them in the same way as functions from an installed package. Save the function barometric_formula in a file called barometric_formula.py and add the required import statement import math at the top of the file. Next, run the following cell:


In [3]:
from barometric_formula import barometric_formula
REMEMBER:
  • Write functions to prevent copy-pasting of code and maximize reuse
  • Add documentation to functions for your future self
  • Keyword arguments can be given default values, which makes them optional when calling the function
  • Import functions from a file just as you would from other modules
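
The sketch below walks through the import-from-file step in a self-contained way. It writes the file from code into a temporary folder, which is an assumption made only so the example runs on its own; in the course setting you would simply save barometric_formula.py next to the notebook, where Python finds it without any changes to `sys.path`.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Source code that would normally be typed into barometric_formula.py by hand
module_source = '''\
import math


def barometric_formula(pressure_sea_level, height=2500):
    """Air pressure at a given height (m) above sea level."""
    return pressure_sea_level * math.exp(
        -9.81 * 0.02896 * height / (8.3144598 * 288.15))
'''

with tempfile.TemporaryDirectory() as folder:
    Path(folder, "barometric_formula.py").write_text(module_source)
    sys.path.insert(0, folder)          # make the folder importable
    importlib.invalidate_caches()       # the file was created after Python started
    try:
        module = importlib.import_module("barometric_formula")
    finally:
        sys.path.remove(folder)

print(module.barometric_formula(1010))               # default height of 2500 m
print(module.barometric_formula(1010, height=2000))  # height passed by keyword
```

The two calls also show the default value at work: leaving out `height` uses 2500, while passing it by name overrides the default.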

I measure air pressure multiple times

We can store these in a Python list:


In [31]:
pressures_hPa = [1013, 1003, 1010, 1020, 1032, 993, 989, 1018, 889, 1001]

In [32]:
# ...check methods of lists... append vs insert
Notice:
  • A list is a general container, so it can also consist of mixed data types.
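
A minimal sketch of the two list methods mentioned in the cell above, plus a list with mixed data types. The values are made up for this illustration.

```python
pressures = [1013, 1003, 1010]

pressures.append(1020)      # add at the end
pressures.insert(0, 995)    # add at a given position, here the start

# A list can hold mixed data types
mixed = [1013, 'Ghent - Sterre', 23.5, True]
types = [type(item).__name__ for item in mixed]

print(pressures)
print(types)
```

Both methods change the list in place and return `None`, so there is no need to assign the result to a new variable.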

In [34]:
# ...list is a container...

I want to apply my function to each of these measurements

I want to calculate the barometric formula for each of these measured values.


In [35]:
# ...for loop... dummy example
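
One possible dummy example of a for loop, with made-up values. The second loop uses `enumerate`, which also provides the position of each element.

```python
sites = ['Sterre', 'Coupure', 'Blandijn']

# Plain loop: one element per iteration
for site in sites:
    print(site)

# enumerate provides the index together with the element
labels = []
for idx, site in enumerate(sites):
    labels.append(f"{idx}: {site}")
print(labels)
```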
EXERCISE:
  • Write a for loop that prints the adjusted value for altitude 3000m for each of the pressures in pressures_hPa

In [36]:
for pressure in pressures_hPa:
    print(barometric_formula(pressure, 3000))


709.761268850131
702.7547410233775
707.659310502105
714.6658383288585
723.0736717209627
695.748213196624
692.9456020659226
713.2645327635078
622.8803237983875
701.3534354580269

In [37]:
# ...list comprehensions...
EXERCISE:
  • Write a for loop as a list comprehension to calculate the adjusted value for altitude 3000m for each of the pressures in pressures_hPa and store these values in a new variable pressures_hPa_adjusted

In [38]:
pressures_hPa_adjusted = [barometric_formula(pressure, 3000) for pressure in pressures_hPa]
pressures_hPa_adjusted


Out[38]:
[709.761268850131,
 702.7547410233775,
 707.659310502105,
 714.6658383288585,
 723.0736717209627,
 695.748213196624,
 692.9456020659226,
 713.2645327635078,
 622.8803237983875,
 701.3534354580269]

The power of numpy


In [39]:
import numpy as np

In [40]:
pressures_hPa = [1013, 1003, 1010, 1020, 1032, 993, 989, 1018, 889, 1001]

In [41]:
np_pressures_hPa = np.array([1013, 1003, 1010, 1020, 1032, 993, 989, 1018, 889, 1001])

In [42]:
# ...slicing/subselecting is similar...

In [43]:
print(np_pressures_hPa[0], pressures_hPa[0])


1013 1013

REMEMBER:
  • [] for accessing elements
  • [start:end:step]
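
A few slicing patterns, sketched on the list and the array from above. The variable names on the left are only chosen for this illustration.

```python
import numpy as np

pressures_hPa = [1013, 1003, 1010, 1020, 1032, 993, 989, 1018, 889, 1001]
np_pressures_hPa = np.array(pressures_hPa)

first_three = pressures_hPa[:3]          # start defaults to 0
every_second = pressures_hPa[::2]        # step of 2 over the whole list
last_two = np_pressures_hPa[-2:]         # negative indices count from the end
reversed_array = np_pressures_hPa[::-1]  # negative step reverses the order

print(first_three, every_second, last_two, reversed_array[:2])
```

One difference worth knowing: slicing a list gives a new list, while basic slicing of a numpy array gives a view on the same data, so changing the slice changes the original array as well.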

In [48]:
# ...original function using numpy array instead of list... do both

In [49]:
np_pressures_hPa * math.exp(-gravit_acc * molar_mass_earth * height / (gas_constant * standard_temperature))


Out[49]:
array([753.11591349, 745.681403  , 750.88556035, 758.32007084,
       767.24148344, 738.2468925 , 735.2730883 , 756.83316874,
       660.92798331, 744.1945009 ])
REMEMBER: Operations work on all elements of the array at once, so you don't need a `for` loop
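
As a sketch of how the function itself could accept both a single value and a whole collection, the hypothetical variant below (the name `barometric_formula_np` is an assumption of this example) converts its input with `np.asarray` before multiplying.

```python
import numpy as np

def barometric_formula_np(pressure_sea_level, height=2500):
    """Hypothetical numpy variant: accepts a single value, a list or an array."""
    standard_temperature = 288.15
    gas_constant = 8.3144598
    gravit_acc = 9.81
    molar_mass_earth = 0.02896
    factor = np.exp(-gravit_acc * molar_mass_earth * height
                    / (gas_constant * standard_temperature))
    return np.asarray(pressure_sea_level) * factor

pressures_hPa = [1013, 1003, 1010, 1020, 1032, 993, 989, 1018, 889, 1001]

adjusted_array = barometric_formula_np(pressures_hPa, 3000)   # all values in one call
adjusted_single = barometric_formula_np(1010, 3000)           # a single value also works
print(adjusted_array[:3], adjusted_single)
```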

It is also a matter of calculation speed:


In [50]:
lots_of_pressures = np.random.uniform(990, 1040, 1000)

In [51]:
%timeit [barometric_formula(pressure, 3000) for pressure in list(lots_of_pressures)]


510 µs ± 3.14 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [52]:
%timeit lots_of_pressures * np.exp(-gravit_acc * molar_mass_earth * height / (gas_constant * standard_temperature))


2.91 µs ± 65 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
REMEMBER: for numerical calculations on many values, numpy is much faster than a pure Python loop
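
The `%timeit` magic only exists in IPython and Jupyter. Outside a notebook, a similar comparison can be sketched with the `timeit` module of the standard library. The helper names `with_loop` and `with_numpy` and the seed are assumptions of this example, and the conversion factor is computed once so that both variants do the same work.

```python
import math
import timeit

import numpy as np

# Conversion factor for 3000 m, computed once
factor = math.exp(-9.81 * 0.02896 * 3000 / (8.3144598 * 288.15))

rng = np.random.default_rng(seed=0)            # seeded for reproducibility
lots_of_pressures = rng.uniform(990, 1040, 1000)
pressure_list = list(lots_of_pressures)

def with_loop():
    return [pressure * factor for pressure in pressure_list]

def with_numpy():
    return lots_of_pressures * factor

# Both approaches give the same numbers
same_result = np.allclose(with_loop(), with_numpy())

# Total time in seconds for 1000 repetitions of each approach
time_loop = timeit.timeit(with_loop, number=1000)
time_numpy = timeit.timeit(with_numpy, number=1000)
print(same_result, time_loop, time_numpy)
```

The absolute numbers depend on the machine, so only the ratio between the two timings is meaningful.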

Boolean indexing and filtering (!)


In [53]:
np_pressures_hPa


Out[53]:
array([1013, 1003, 1010, 1020, 1032,  993,  989, 1018,  889, 1001])

In [54]:
np_pressures_hPa > 1000


Out[54]:
array([ True,  True,  True,  True,  True, False, False,  True, False,
        True])

You can use this as a filter to select elements of an array:


In [55]:
boolean_mask = np_pressures_hPa > 1000
np_pressures_hPa[boolean_mask]


Out[55]:
array([1013, 1003, 1010, 1020, 1032, 1018, 1001])

or to change the values in the array that satisfy the condition:


In [56]:
boolean_mask = np_pressures_hPa < 900
np_pressures_hPa[boolean_mask] = 900
np_pressures_hPa


Out[56]:
array([1013, 1003, 1010, 1020, 1032,  993,  989, 1018,  900, 1001])
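
Conditions on arrays can also be combined, but not with the `and`, `or` and `not` keywords used for single values. A minimal sketch with the element-wise operators `&`, `|` and `~`, where the thresholds are made up for this illustration:

```python
import numpy as np

np_pressures_hPa = np.array([1013, 1003, 1010, 1020, 1032, 993, 989, 1018, 900, 1001])

# Element-wise "and": parentheses are required around each comparison
in_range = (np_pressures_hPa > 1000) & (np_pressures_hPa < 1015)
selected = np_pressures_hPa[in_range]

# Element-wise "or" and "not"
extremes = np_pressures_hPa[(np_pressures_hPa < 990) | (np_pressures_hPa > 1030)]
not_in_range = np_pressures_hPa[~in_range]

print(selected, extremes, not_in_range)
```

The parentheses matter because `&` and `|` bind more tightly than the comparison operators.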

Intermezzo: Exercises boolean indexing:


In [57]:
AR = np.random.randint(0, 20, 15)
AR


Out[57]:
array([ 6, 14, 14,  4, 12, 11,  4, 19, 15, 14,  3,  5, 17, 17,  8])
EXERCISE: Count the number of values in AR that are larger than 10. _Tip:_ You can count with True = 1 and False = 0 and take the sum of these values.

In [58]:
sum(AR > 10)


Out[58]:
9
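
The built-in `sum` works here, but numpy offers equivalents that stay inside numpy and are typically faster on large arrays. A short sketch using the values of AR shown above (the result names are chosen for this illustration):

```python
import numpy as np

AR = np.array([6, 14, 14, 4, 12, 11, 4, 19, 15, 14, 3, 5, 17, 17, 8])

count_sum = np.sum(AR > 10)                # True counts as 1, False as 0
count_method = (AR > 10).sum()             # the same, as an array method
count_nonzero = np.count_nonzero(AR > 10)  # counts the True values directly
fraction = np.mean(AR > 10)                # share of values larger than 10

print(count_sum, count_method, count_nonzero, fraction)
```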
EXERCISE: Change all even numbers of `AR` into zero-values.

In [59]:
AR[AR%2 == 0] = 0
AR


Out[59]:
array([ 0,  0,  0,  0,  0, 11,  0, 19, 15,  0,  3,  5, 17, 17,  0])
EXERCISE: Change all even positions of the array AR (the 2nd, 4th, ... element, i.e. indices 1, 3, ...) into the value 30

In [60]:
AR[1::2] = 30
AR


Out[60]:
array([ 0, 30,  0, 30,  0, 30,  0, 30, 15, 30,  3, 30, 17, 30,  0])
EXERCISE: Select all values above the 75th percentile of the following array AR2 and take the square root of these values. _Tip_: numpy provides a function `percentile` to calculate a given percentile

In [61]:
AR2 = np.random.random(10)
AR2


Out[61]:
array([0.43809786, 0.0854234 , 0.94759354, 0.47935186, 0.6412736 ,
       0.57084005, 0.32161626, 0.81288202, 0.44000385, 0.08868051])

In [62]:
np.sqrt(AR2[AR2 > np.percentile(AR2, 75)])


Out[62]:
array([0.97344416, 0.8007956 , 0.9015997 ])
EXERCISE: Convert all values -99 of the array AR3 into NaN values. _Tip_: NaN values can be provided in float arrays as `np.nan`, and numpy provides a specialized function to compare float values, i.e. `np.isclose()`

In [63]:
AR3 = np.array([-99., 2., 3., 6., 8, -99., 7., 5., 6., -99.])

In [64]:
AR3[np.isclose(AR3, -99)] = np.nan
AR3


Out[64]:
array([nan,  2.,  3.,  6.,  8., nan,  7.,  5.,  6., nan])
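
Once missing values are marked as NaN, they need some care in further calculations. A minimal sketch, starting from the resulting array (the result names are chosen for this illustration):

```python
import numpy as np

AR3 = np.array([np.nan, 2., 3., 6., 8., np.nan, 7., 5., 6., np.nan])

missing = np.isnan(AR3)          # AR3 == np.nan would be False everywhere
n_missing = missing.sum()        # number of NaN values
plain_mean = np.mean(AR3)        # a single NaN makes the result NaN
clean_mean = np.nanmean(AR3)     # ignores the NaN values
valid_values = AR3[~missing]     # boolean indexing to drop them

print(n_missing, plain_mean, clean_mean, valid_values)
```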

I also have measurement locations


In [65]:
location = 'Ghent - Sterre'

In [66]:
# ...check methods of strings... split, upper,...
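
A few string methods sketched on this location. The replacement text 'Gent' and the second example string are only illustrations.

```python
location = 'Ghent - Sterre'

upper = location.upper()                      # all characters in upper case
city, site = location.split(' - ')            # split on the separator
is_ghent = location.startswith('Ghent')       # test the start of the string
replaced = location.replace('Ghent', 'Gent')  # returns a new string

# Splitting on '-' and stripping whitespace also copes with irregular spacing
parts = [part.strip() for part in 'Brussels- Grand place'.split('-')]

print(upper, city, site, is_ghent, replaced, parts)
```

Strings are immutable, so each of these methods returns a new value and leaves `location` unchanged.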


In [67]:
locations = ['Ghent - Sterre', 'Ghent - Coupure', 'Ghent - Blandijn', 
             'Ghent - Korenlei', 'Ghent - Kouter', 'Ghent - Coupure',
             'Antwerp - Groenplaats', 'Brussels- Grand place', 
             'Antwerp - Justitipaleis', 'Brussels - Tour & taxis']
EXERCISE: Use a list comprehension to convert all locations to lower case. _Tip:_ check the available methods of strings by writing: `location.` + TAB button

In [68]:
[location.lower() for location in locations]


Out[68]:
['ghent - sterre',
 'ghent - coupure',
 'ghent - blandijn',
 'ghent - korenlei',
 'ghent - kouter',
 'ghent - coupure',
 'antwerp - groenplaats',
 'brussels- grand place',
 'antwerp - justitipaleis',
 'brussels - tour & taxis']

I also measure temperature


In [69]:
pressures_hPa = [1013, 1003, 1010, 1020, 1032, 993, 989, 1018, 889, 1001]
temperature_degree = [23, 20, 17, 8, 12, 5, 16, 22, -2, 16]
locations = ['Ghent - Sterre', 'Ghent - Coupure', 'Ghent - Blandijn', 
             'Ghent - Korenlei', 'Ghent - Kouter', 'Ghent - Coupure',
             'Antwerp - Groenplaats', 'Brussels- Grand place', 
             'Antwerp - Justitipaleis', 'Brussels - Tour & taxis']

Python dictionaries are a convenient way to store multiple types of data together, so you don't end up with too many separate variables:


In [70]:
measurement = {}
measurement['pressure_hPa'] = 1010
measurement['temperature'] = 23

In [71]:
measurement


Out[71]:
{'pressure_hPa': 1010, 'temperature': 23}

In [72]:
# ...select on name, iterate over keys or items...
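
A minimal sketch of the operations named in the cell above. The key 'humidity' is a hypothetical example of a key that is not in the dictionary.

```python
measurement = {'pressure_hPa': 1010, 'temperature': 23}

pressure = measurement['pressure_hPa']     # select on name (key)
humidity = measurement.get('humidity')     # missing key: None instead of a KeyError

keys = [key for key in measurement]        # iterating over a dict yields its keys
lines = [f"{key} = {value}" for key, value in measurement.items()]

print(pressure, humidity, keys, lines)
```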


In [73]:
measurements = {'pressure_hPa': pressures_hPa,
                'temperature_degree': temperature_degree,
                'location': locations}

In [74]:
measurements


Out[74]:
{'pressure_hPa': [1013, 1003, 1010, 1020, 1032, 993, 989, 1018, 889, 1001],
 'temperature_degree': [23, 20, 17, 8, 12, 5, 16, 22, -2, 16],
 'location': ['Ghent - Sterre',
  'Ghent - Coupure',
  'Ghent - Blandijn',
  'Ghent - Korenlei',
  'Ghent - Kouter',
  'Ghent - Coupure',
  'Antwerp - Groenplaats',
  'Brussels- Grand place',
  'Antwerp - Justitipaleis',
  'Brussels - Tour & taxis']}

But: I want to apply my barometric function to measurements taken in Ghent when the temperature was below 10 degrees...


In [75]:
for idx, pressure in enumerate(measurements['pressure_hPa']):
    if (measurements['location'][idx].startswith("Ghent")
            and measurements['temperature_degree'][idx] < 10):
        print(barometric_formula(pressure, 3000))


714.6658383288585
695.748213196624

when a table would be more appropriate... Pandas!


In [76]:
import pandas as pd

In [77]:
measurements = pd.DataFrame(measurements)
measurements


Out[77]:
   pressure_hPa  temperature_degree                 location
0          1013                  23           Ghent - Sterre
1          1003                  20          Ghent - Coupure
2          1010                  17         Ghent - Blandijn
3          1020                   8         Ghent - Korenlei
4          1032                  12           Ghent - Kouter
5           993                   5          Ghent - Coupure
6           989                  16    Antwerp - Groenplaats
7          1018                  22    Brussels- Grand place
8           889                  -2  Antwerp - Justitipaleis
9          1001                  16  Brussels - Tour & taxis

In [78]:
barometric_formula(measurements[(measurements["location"].str.contains("Ghent")) & 
                  (measurements["temperature_degree"] < 10)]["pressure_hPa"], 3000)


Out[78]:
3    714.665838
5    695.748213
Name: pressure_hPa, dtype: float64
REMEMBER: We can combine the speed of numpy with the convenience of dictionaries and much more!
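
The same selection can be written in smaller steps, which is often easier to read. The sketch below names the boolean mask first and then uses `.loc` to pick rows and a column in one go. The names `ghent_and_cold`, `factor_3000m` and the new column `pressure_hPa_3000m` are assumptions of this example.

```python
import math

import pandas as pd

measurements = pd.DataFrame({
    'pressure_hPa': [1013, 1003, 1010, 1020, 1032, 993, 989, 1018, 889, 1001],
    'temperature_degree': [23, 20, 17, 8, 12, 5, 16, 22, -2, 16],
    'location': ['Ghent - Sterre', 'Ghent - Coupure', 'Ghent - Blandijn',
                 'Ghent - Korenlei', 'Ghent - Kouter', 'Ghent - Coupure',
                 'Antwerp - Groenplaats', 'Brussels- Grand place',
                 'Antwerp - Justitipaleis', 'Brussels - Tour & taxis'],
})

# Conversion factor of the barometric formula for 3000 m
factor_3000m = math.exp(-9.81 * 0.02896 * 3000 / (8.3144598 * 288.15))

# Name the mask first, then select rows and one column with .loc
ghent_and_cold = (measurements['location'].str.startswith('Ghent')
                  & (measurements['temperature_degree'] < 10))
adjusted = measurements.loc[ghent_and_cold, 'pressure_hPa'] * factor_3000m

# Store the adjusted pressure for all rows as a new column
measurements['pressure_hPa_3000m'] = measurements['pressure_hPa'] * factor_3000m
print(adjusted)
```

Keeping the result as a column of the table keeps each adjusted value next to the location and temperature it belongs to.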