X LINES OF PYTHON

Physical units with pint

Have you ever wished you could carry units around with your quantities — and have the computer figure out the best units and multipliers to use?

pint is a nince, compact library for doing just this, handling all your dimensional analysis needs. It can also detect units from strings. We can define our own units, it knows about multipliers (kilo, mega, etc), and it even works with numpy and pandas.

Install pint with pip or conda, e.g.

pip install pint

NB If you are running this on Google Colaboratory, you must uncomment these lines (delete the initial #) and run this first:


In [ ]:
#!pip install pint
#!pip install git+https://github.com/hgrecco/pint-pandas#egg=Pint-Pandas-0.1.dev0

To use it in its typical mode, we import the library then instantiate a UnitRegistry object. The registry contains lots of physical units.


In [3]:
import pint

units = pint.UnitRegistry()

In [4]:
pint.__version__


Out[4]:
'0.9'

Attaching and printing units


In [5]:
thickness = 68 * units.m
thickness


Out[5]:
68 meter

In a Jupyter Notebook you see a 'pretty' version of the quantity. In the interpreter, you'll see something slightly different (the so-called repr of the class):

>>> thickness
<Quantity(68, 'meter')>

We can get at the magnitude, the units, and the dimensionality of this quantity:


In [6]:
thickness.magnitude, thickness.units, thickness.dimensionality


Out[6]:
(68, <Unit('meter')>, <UnitsContainer({'[length]': 1.0})>)

You can also use the following abbreviations for magnitude and units:

thickness.m, thickness.u

For printing, we can use Python's string formatting:


In [7]:
f'{thickness**2}'


Out[7]:
'4624 meter ** 2'

But pint extends the string formatting options to include special options for Quantity objects. The most useful option is P for 'pretty', but there's also L for $\LaTeX$ and H for HTML. Adding a ~ (tilde) before the option tells pint to use unit abbreviations instead of the full names:


In [8]:
print(f'{thickness**2:P}')
print(f'{thickness**2:~P}')
print(f'{thickness**2:~L}')
print(f'{thickness**2:~H}')


4624 meter²
4624 m²
4624\ \mathrm{m}^{2}
4624 m<sup>2</sup>

Doing maths

If we multiply by a scalar, pint produces the result you'd expect:


In [9]:
thickness * 2


Out[9]:
136 meter

Note that you must use units when you need them:


In [10]:
thickness + 10

# This is meant to produce an error...


---------------------------------------------------------------------------
DimensionalityError                       Traceback (most recent call last)
<ipython-input-10-ecd15f6b9f4c> in <module>
----> 1 thickness + 10
      2 
      3 # This is meant to produce an error...

~/anaconda3/envs/xlines/lib/python3.7/site-packages/pint/quantity.py in __add__(self, other)
    752             return self.to_timedelta() + other
    753         else:
--> 754             return self._add_sub(other, operator.add)
    755 
    756     __radd__ = __add__

~/anaconda3/envs/xlines/lib/python3.7/site-packages/pint/quantity.py in wrapped(self, *args, **kwargs)
     73         elif isinstance(other, list) and isinstance(other[0], type(self)):
     74             return NotImplemented
---> 75         result = f(self, *args, **kwargs)
     76         return result
     77     return wrapped

~/anaconda3/envs/xlines/lib/python3.7/site-packages/pint/quantity.py in _add_sub(self, other, op)
    663                                _to_magnitude(other, self.force_ndarray))
    664             else:
--> 665                 raise DimensionalityError(self._units, 'dimensionless')
    666             return self.__class__(magnitude, units)
    667 

DimensionalityError: Cannot convert from 'meter' to 'dimensionless'

Let's try defining an area of $60\ \mathrm{km}^2$, then multiplying it by our thickness. To make it more like a hydrocarbon volume, I'll also multiply by net:gross n2g, porosity phi, and saturation sat, all of which are dimensionless:


In [11]:
area = 60 * units.km**2
n2g = 0.5 * units.dimensionless  # Optional dimensionless 'units'...
phi = 0.2                        # ... but you can just do this.
sat = 0.7  

volume = area * thickness * n2g * phi * sat
volume


Out[11]:
285.59999999999997 kilometer2 meter

We can convert to something more compact:


In [12]:
volume.to_compact()


Out[12]:
285599999.99999994 meter3

Or be completely explicit about the units and multipliers we want:


In [13]:
volume.to('m**3')  # Or use m^3


Out[13]:
285599999.99999994 meter3

The to_compact() method can also take units, if you want to be more explicit; it applies multipliers automatically:


In [14]:
volume.to_compact('L')


Out[14]:
285.6 gigaliter

Oil barrels are already defined (careful, they are abbreviated as oil_bbl not bbl — that's a 31.5 gallon barrel, about the same as a beer barrel).


In [15]:
volume.to_compact('oil_barrel')


Out[15]:
1.7963699560354092 gigaoil_barrel

If we use string formatting (see above), we can get pretty specific:


In [16]:
f"The volume is {volume.to_compact('oil_barrel'):~0.2fL}"


Out[16]:
'The volume is 1.80\\ \\mathrm{Goil_bbl}'

Defining new units

pint defines hundreads of units (here's the list), and it knows about tonnes of oil equivalent... but it doesn't know about barrels of oil equivalent (for more on conversion to BOE). So let's define a custom unit, using the USGS's conversion factor:


In [17]:
units.define('barrel_of_oil_equivalent = 6000 ft**3 = boe')

Let's suspend reality for a moment and imagine we now want to compute our gross rock volume in BOEs...


In [18]:
volume.to('boe')


Out[18]:
1680978.135942857 barrel_of_oil_equivalent

In [19]:
volume.to_compact('boe')


Out[19]:
1.680978135942857 megabarrel_of_oil_equivalent

Getting units from strings

pint can also parse strings and attempt to convert them to Quantity instances:


In [20]:
units('2.34 km')


Out[20]:
2.34 kilometer

This looks useful! Let's try something less nicely formatted.


In [21]:
units('2.34*10^3 km')


Out[21]:
2340.0 kilometer

In [22]:
units('-12,000.ft')


Out[22]:
-12000.0 foot

In [23]:
units('3.2 m')


Out[23]:
3.2 meter

You can also use the Quantity constructor, like this:

>>> qty = pint.Quantity
>>> qty('2.34 km')
2.34 kilometer

But the UnitRegistry seems to do the same things and might be more convenient.

pint with uncertainties

Conveniently, pint works well with uncertainties. Maybe I'll do an X lines on that package in the future. Install it with conda or pip, e.g.

pip install uncertainties

In [24]:
from uncertainties import ufloat

area = ufloat(64, 5) * units.km**2  # 64 +/- 5 km**2
(thickness * area).to('Goil_bbl')


Out[24]:
27.4+/-2.1 gigaoil_barrel

pint with numpy

pint works fine with NumPy arrays:


In [25]:
import numpy as np

vp = np.array([2300, 2400, 2550, 3200]) * units.m/units.s
rho = np.array([2400, 2550, 2500, 2650]) * units.kg/units.m**3

z = vp * rho
z


Out[25]:
\[\begin{pmatrix}5520000.0 & 6120000.0 & 6375000.0 & 8480000.0\end{pmatrix} kilogram/(meter2 second)\]

For some reason, this sometimes doesn't render properly. But we can always do this:


In [26]:
print(z)


[5520000. 6120000. 6375000. 8480000.] kilogram / meter ** 2 / second

As expected, the magnitude of this quantity is just a NumPy array:


In [27]:
z.m


Out[27]:
array([5520000., 6120000., 6375000., 8480000.])

pint with pandas

Note that this functionality is fairly new and is still settling down. YMMV.

To use pint (version 0.9 and later) with pandas (version 0.24.2 works; 0.25.0 does not work at the time of writing), we must first install pintpandas, which must be done from source; get the code from GitHub. Here's how I do it:

cd pint-pandas
python setup.py sdist
pip install dist/Pint-Pandas-0.1.dev0.tar.gz

You could also do:

pip install git+https://github.com/hgrecco/pint-pandas#egg=Pint-Pandas-0.1.dev0

Once you have done that, the following should evaluate to True:


In [36]:
pint._HAS_PINTPANDAS


Out[36]:
True

To use this integration, we pass special pint data types to the pd.Series() object:


In [30]:
import pandas as pd

df = pd.DataFrame({
    "Vp": pd.Series(vp.m, dtype="pint[m/s]"),
    "Vs": pd.Series([1200, 1200, 1250, 1300], dtype="pint[m/s]"),
    "rho": pd.Series(rho.m, dtype="pint[kg/m**3]"),
})

df


Out[30]:
Vp Vs rho
0 2300.0 1200 2400.0
1 2400.0 1200 2550.0
2 2550.0 1250 2500.0
3 3200.0 1300 2650.0

In [31]:
import bruges as bg

df['E'] = bg.rockphysics.moduli.youngs(df.Vp, df.Vs, df.rho)

df.E


Out[31]:
0     9075366233.766233
1          9792000000.0
2     10483220521.25506
3    12550276023.391813
Name: E, dtype: pint[kilogram / meter / second ** 2]

We can't convert the units of a whole Series but we can do one:


In [32]:
df.loc[0, 'E'].to('GPa')


Out[32]:
9.075366233766236 gigapascal

So to convert a whole series, we can use Series.apply():


In [33]:
df.E.apply(lambda x: x.to('GPa'))


Out[33]:
0     9.075366233766236 gigapascal
1     9.792000000000003 gigapascal
2    10.483220521255063 gigapascal
3    12.550276023391817 gigapascal
Name: E, dtype: object

Bonus: dataframe display with units

We could subclass dataframes to tweak their _repr_html_() method, which would allow us to make units show up in the Notebook representation of the dataframe...


In [34]:
class UnitDataFrame(pd.DataFrame):
    def _repr_html_(self):
        """New repr for Jupyter Notebook."""
        html = super()._repr_html_()  # Get the old repr string.
        units = [''] + [f"{dtype.units:~H}" for dtype in self.dtypes]
        style = "text-align: right; color: gray;"
        new = f'<tr style="{style}"><th>' + "</th><th>".join(units) + "</th></tr></thead>"
        return html.replace('</thead>', new)

In [35]:
df = UnitDataFrame({
    "Vp": pd.Series(vp.m, dtype="pint[m/s]"),
    "Vs": pd.Series([1200, 1200, 1250, 1300], dtype="pint[m/s]"),
    "rho": pd.Series(rho.m, dtype="pint[kg/m**3]"),
})

df


Out[35]:
Vp Vs rho
m/sm/skg/m3
0 2300.0 1200 2400.0
1 2400.0 1200 2550.0
2 2550.0 1250 2500.0
3 3200.0 1300 2650.0

Cute.


© Agile Scientific 2019, licensed CC-BY