pint
Have you ever wished you could carry units around with your quantities — and have the computer figure out the best units and multipliers to use?
pint is a nice, compact library for doing just this, handling all your dimensional analysis needs. It can also parse units from strings. We can define our own units, it knows about multipliers (kilo, mega, etc.), and it even works with numpy and pandas.
Install pint with pip or conda, e.g.
pip install pint
NB: If you are running this on Google Colaboratory, you must uncomment these lines (delete the initial #) and run this first:
In [ ]:
#!pip install pint
#!pip install git+https://github.com/hgrecco/pint-pandas#egg=Pint-Pandas-0.1.dev0
To use it in its typical mode, we import the library, then instantiate a UnitRegistry object. The registry contains lots of physical units.
In [3]:
import pint
units = pint.UnitRegistry()
In [4]:
pint.__version__
Out[4]:
In [5]:
thickness = 68 * units.m
thickness
Out[5]:
In a Jupyter Notebook you see a 'pretty' version of the quantity. In the interpreter, you'll see something slightly different (the so-called repr of the class):
>>> thickness
<Quantity(68, 'meter')>
We can get at the magnitude, the units, and the dimensionality of this quantity:
In [6]:
thickness.magnitude, thickness.units, thickness.dimensionality
Out[6]:
You can also use the following abbreviations for magnitude and units:
thickness.m, thickness.u
For printing, we can use Python's string formatting:
In [7]:
f'{thickness**2}'
Out[7]:
But pint extends the string formatting options to include special options for Quantity objects. The most useful option is P for 'pretty', but there's also L for $\LaTeX$ and H for HTML. Adding a ~ (tilde) before the option tells pint to use unit abbreviations instead of the full names:
In [8]:
print(f'{thickness**2:P}')
print(f'{thickness**2:~P}')
print(f'{thickness**2:~L}')
print(f'{thickness**2:~H}')
In [9]:
thickness * 2
Out[9]:
Note that you must use units when you need them:
In [10]:
thickness + 10
# This is meant to produce an error...
Let's try defining an area of $60\ \mathrm{km}^2$, then multiplying it by our thickness. To make it more like a hydrocarbon volume, I'll also multiply by net:gross n2g, porosity phi, and saturation sat, all of which are dimensionless:
In [11]:
area = 60 * units.km**2
n2g = 0.5 * units.dimensionless # Optional dimensionless 'units'...
phi = 0.2 # ... but you can just do this.
sat = 0.7
volume = area * thickness * n2g * phi * sat
volume
Out[11]:
We can convert to something more compact:
In [12]:
volume.to_compact()
Out[12]:
Or be completely explicit about the units and multipliers we want:
In [13]:
volume.to('m**3') # Or use m^3
Out[13]:
The to_compact() method can also take units, if you want to be more explicit; it applies multipliers automatically:
In [14]:
volume.to_compact('L')
Out[14]:
Oil barrels are already defined (careful, they are abbreviated as oil_bbl, not bbl; the latter is a 31.5 gallon barrel, about the same as a beer barrel).
In [15]:
volume.to_compact('oil_barrel')
Out[15]:
If we use string formatting (see above), we can get pretty specific:
In [16]:
f"The volume is {volume.to_compact('oil_barrel'):~0.2fL}"
Out[16]:
pint defines hundreds of units, and it knows about tonnes of oil equivalent... but it doesn't know about barrels of oil equivalent. So let's define a custom unit, using the USGS's conversion factor:
In [17]:
units.define('barrel_of_oil_equivalent = 6000 ft**3 = boe')
Let's suspend reality for a moment and imagine we now want to compute our gross rock volume in BOEs...
In [18]:
volume.to('boe')
Out[18]:
In [19]:
volume.to_compact('boe')
Out[19]:
In [20]:
units('2.34 km')
Out[20]:
This looks useful! Let's try something less nicely formatted.
In [21]:
units('2.34*10^3 km')
Out[21]:
In [22]:
units('-12,000.ft')
Out[22]:
In [23]:
units('3.2 m')
Out[23]:
You can also use the Quantity constructor, like this:
>>> qty = pint.Quantity
>>> qty('2.34 km')
2.34 kilometer
But the UnitRegistry seems to do the same things and might be more convenient.
pint with uncertainties
Conveniently, pint works well with uncertainties. Maybe I'll do an X lines on that package in the future. Install it with conda or pip, e.g.
pip install uncertainties
In [24]:
from uncertainties import ufloat
area = ufloat(64, 5) * units.km**2 # 64 +/- 5 km**2
(thickness * area).to('Goil_bbl')
Out[24]:
In [25]:
import numpy as np
vp = np.array([2300, 2400, 2550, 3200]) * units.m/units.s
rho = np.array([2400, 2550, 2500, 2650]) * units.kg/units.m**3
z = vp * rho
z
Out[25]:
For some reason, this sometimes doesn't render properly. But we can always do this:
In [26]:
print(z)
As expected, the magnitude of this quantity is just a NumPy array:
In [27]:
z.m
Out[27]:
pint with pandas
Note that this functionality is fairly new and is still settling down. YMMV.
To use pint (version 0.9 and later) with pandas (version 0.24.2 works; 0.25.0 does not work at the time of writing), we must first install pint-pandas, which must be done from source; get the code from GitHub. Here's how I do it:
cd pint-pandas
python setup.py sdist
pip install dist/Pint-Pandas-0.1.dev0.tar.gz
You could also do:
pip install git+https://github.com/hgrecco/pint-pandas#egg=Pint-Pandas-0.1.dev0
Once you have done that, the following should evaluate to True:
In [36]:
pint._HAS_PINTPANDAS
Out[36]:
To use this integration, we pass special pint data types to the pd.Series() object:
In [30]:
import pandas as pd
df = pd.DataFrame({
    "Vp": pd.Series(vp.m, dtype="pint[m/s]"),
    "Vs": pd.Series([1200, 1200, 1250, 1300], dtype="pint[m/s]"),
    "rho": pd.Series(rho.m, dtype="pint[kg/m**3]"),
})
df
Out[30]:
In [31]:
import bruges as bg
df['E'] = bg.rockphysics.moduli.youngs(df.Vp, df.Vs, df.rho)
df.E
Out[31]:
We can't convert the units of a whole Series, but we can convert a single element:
In [32]:
df.loc[0, 'E'].to('GPa')
Out[32]:
So to convert a whole series, we can use Series.apply():
In [33]:
df.E.apply(lambda x: x.to('GPa'))
Out[33]:
In [34]:
class UnitDataFrame(pd.DataFrame):
    def _repr_html_(self):
        """New repr for Jupyter Notebook."""
        html = super()._repr_html_()  # Get the old repr string.
        units = [''] + [f"{dtype.units:~H}" for dtype in self.dtypes]
        style = "text-align: right; color: gray;"
        new = f'<tr style="{style}"><th>' + "</th><th>".join(units) + "</th></tr></thead>"
        return html.replace('</thead>', new)
In [35]:
df = UnitDataFrame({
    "Vp": pd.Series(vp.m, dtype="pint[m/s]"),
    "Vs": pd.Series([1200, 1200, 1250, 1300], dtype="pint[m/s]"),
    "rho": pd.Series(rho.m, dtype="pint[kg/m**3]"),
})
df
Out[35]:
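The trick in that class is just string surgery on the HTML: an extra header row is spliced in immediately before the closing </thead> tag. Here is a standalone sketch of the same idea that needs neither pandas nor pint. The function add_units_row and the sample table are hypothetical, written only to illustrate the replacement.

```python
def add_units_row(html, unit_labels):
    """Insert a row of unit labels just before the closing </thead> tag."""
    style = "text-align: right; color: gray;"
    # Leading empty cell lines up with the index column.
    cells = "</th><th>".join([""] + list(unit_labels))
    row = f'<tr style="{style}"><th>{cells}</th></tr></thead>'
    return html.replace("</thead>", row, 1)


# A tiny stand-in for what a DataFrame's HTML repr looks like.
table = (
    "<table><thead><tr><th></th><th>Vp</th><th>rho</th></tr></thead>"
    "<tbody></tbody></table>"
)
result = add_units_row(table, ["m/s", "kg/m<sup>3</sup>"])
print(result)
```

This depends on the HTML containing a </thead> tag, so it may need adjusting if the table markup is structured differently.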
Cute.
© Agile Scientific 2019, licensed CC-BY