pint
Have you ever wished you could carry units around with your quantities, and have the computer figure out the best units and multipliers to use?
pint is a nice, compact library for doing just this, handling all your dimensional analysis needs. It can also parse quantities and units from strings. We can define our own units, it knows about multipliers (kilo, mega, etc.), and it even works with numpy and pandas.
Install pint with pip or conda, e.g.
pip install pint
NB If you are running this on Google Colaboratory, you must uncomment these lines (delete the initial #) and run this first:
In [ ]:
    
#!pip install pint
#!pip install git+https://github.com/hgrecco/pint-pandas#egg=Pint-Pandas-0.1.dev0
    
To use it in its typical mode, we import the library then instantiate a UnitRegistry object. The registry contains lots of physical units.
In [3]:
    
import pint
units = pint.UnitRegistry()
    
In [4]:
    
pint.__version__
    
    Out[4]:
In [5]:
    
thickness = 68 * units.m
thickness
    
    Out[5]:
In a Jupyter Notebook you see a 'pretty' version of the quantity. In the interpreter, you'll see something slightly different (the so-called repr of the class):
>>> thickness
<Quantity(68, 'meter')>
We can get at the magnitude, the units, and the dimensionality of this quantity:
In [6]:
    
thickness.magnitude, thickness.units, thickness.dimensionality
    
    Out[6]:
You can also use the following abbreviations for magnitude and units:
thickness.m, thickness.u
For printing, we can use Python's string formatting:
In [7]:
    
f'{thickness**2}'
    
    Out[7]:
But pint extends the string formatting options to include special options for Quantity objects. The most useful option is P for 'pretty', but there's also L for $\LaTeX$ and H for HTML. Adding a ~ (tilde) before the option tells pint to use unit abbreviations instead of the full names:
In [8]:
    
print(f'{thickness**2:P}')
print(f'{thickness**2:~P}')
print(f'{thickness**2:~L}')
print(f'{thickness**2:~H}')
    
    
In [9]:
    
thickness * 2
    
    Out[9]:
Note that you must use units when you need them:
In [10]:
    
thickness + 10
# This is meant to produce an error...
    
    
Let's try defining an area of $60\ \mathrm{km}^2$, then multiplying it by our thickness. To make it more like a hydrocarbon volume, I'll also multiply by net:gross n2g, porosity phi, and saturation sat, all of which are dimensionless:
In [11]:
    
area = 60 * units.km**2
n2g = 0.5 * units.dimensionless  # Optional dimensionless 'units'...
phi = 0.2                        # ... but you can just do this.
sat = 0.7  
volume = area * thickness * n2g * phi * sat
volume
    
    Out[11]:
We can convert to something more compact:
In [12]:
    
volume.to_compact()
    
    Out[12]:
Or be completely explicit about the units and multipliers we want:
In [13]:
    
volume.to('m**3')  # Or use m^3
    
    Out[13]:
The to_compact() method can also take units, if you want to be more explicit; it applies multipliers automatically:
In [14]:
    
volume.to_compact('L')
    
    Out[14]:
Oil barrels are already defined. Careful: they are abbreviated as oil_bbl, not bbl, which is a 31.5 gallon barrel, about the same as a beer barrel.
In [15]:
    
volume.to_compact('oil_barrel')
    
    Out[15]:
If we use string formatting (see above), we can get pretty specific:
In [16]:
    
f"The volume is {volume.to_compact('oil_barrel'):~0.2fL}"
    
    Out[16]:
pint defines hundreds of units, and it knows about tonnes of oil equivalent... but it doesn't know about barrels of oil equivalent (BOE). So let's define a custom unit, using the USGS's conversion factor of 6000 cubic feet per BOE:
In [17]:
    
units.define('barrel_of_oil_equivalent = 6000 ft**3 = boe')
    
Let's suspend reality for a moment and imagine we now want to compute our gross rock volume in BOEs...
In [18]:
    
volume.to('boe')
    
    Out[18]:
In [19]:
    
volume.to_compact('boe')
    
    Out[19]:
In [20]:
    
units('2.34 km')
    
    Out[20]:
This looks useful! Let's try something less nicely formatted.
In [21]:
    
units('2.34*10^3 km')
    
    Out[21]:
In [22]:
    
units('-12,000.ft')
    
    Out[22]:
In [23]:
    
units('3.2 m')
    
    Out[23]:
You can also use the Quantity constructor, like this:
>>> qty = pint.Quantity
>>> qty('2.34 km')
2.34 kilometer
But the UnitRegistry seems to do the same things and might be more convenient.
pint with uncertainties
Conveniently, pint works well with uncertainties. Maybe I'll do an X lines on that package in the future. Install it with conda or pip, e.g.
pip install uncertainties
In [24]:
    
from uncertainties import ufloat
area = ufloat(64, 5) * units.km**2  # 64 +/- 5 km**2
(thickness * area).to('Goil_bbl')
    
    Out[24]:
In [25]:
    
import numpy as np
vp = np.array([2300, 2400, 2550, 3200]) * units.m/units.s
rho = np.array([2400, 2550, 2500, 2650]) * units.kg/units.m**3
z = vp * rho
z
    
    Out[25]:
For some reason, this sometimes doesn't render properly. But we can always do this:
In [26]:
    
print(z)
    
    
As expected, the magnitude of this quantity is just a NumPy array:
In [27]:
    
z.m
    
    Out[27]:
pint with pandas
Note that this functionality is fairly new and is still settling down. YMMV.
To use pint (version 0.9 and later) with pandas (version 0.24.2 works; 0.25.0 does not work at the time of writing), we must first install pint-pandas, which must be done from source; get the code from GitHub. Here's how I do it:
cd pint-pandas
python setup.py sdist
pip install dist/Pint-Pandas-0.1.dev0.tar.gz
You could also do:
pip install git+https://github.com/hgrecco/pint-pandas#egg=Pint-Pandas-0.1.dev0
Once you have done that, the following should evaluate to True:
In [36]:
    
pint._HAS_PINTPANDAS
    
    Out[36]:
To use this integration, we pass special pint data types to the pd.Series() object:
In [30]:
    
import pandas as pd
df = pd.DataFrame({
    "Vp": pd.Series(vp.m, dtype="pint[m/s]"),
    "Vs": pd.Series([1200, 1200, 1250, 1300], dtype="pint[m/s]"),
    "rho": pd.Series(rho.m, dtype="pint[kg/m**3]"),
})
df
    
    Out[30]:
In [31]:
    
import bruges as bg
df['E'] = bg.rockphysics.moduli.youngs(df.Vp, df.Vs, df.rho)
df.E
    
    Out[31]:
We can't convert the units of a whole Series directly, but we can convert a single element:
In [32]:
    
df.loc[0, 'E'].to('GPa')
    
    Out[32]:
So to convert a whole series, we can use Series.apply():
In [33]:
    
df.E.apply(lambda x: x.to('GPa'))
    
    Out[33]:
In [34]:
    
class UnitDataFrame(pd.DataFrame):
    def _repr_html_(self):
        """New repr for Jupyter Notebook."""
        html = super()._repr_html_()  # Get the old repr string.
        units = [''] + [f"{dtype.units:~H}" for dtype in self.dtypes]
        style = "text-align: right; color: gray;"
        new = f'<tr style="{style}"><th>' + "</th><th>".join(units) + "</th></tr></thead>"
        return html.replace('</thead>', new)
    
In [35]:
    
df = UnitDataFrame({
    "Vp": pd.Series(vp.m, dtype="pint[m/s]"),
    "Vs": pd.Series([1200, 1200, 1250, 1300], dtype="pint[m/s]"),
    "rho": pd.Series(rho.m, dtype="pint[kg/m**3]"),
})
df
    
    Out[35]:
Cute.
© Agile Scientific 2019, licensed CC-BY