x lines of Python

Reading and writing LAS files

This notebook goes with the Agile blog post of 23 October.

Set up a conda environment with:

conda create -n welly python=3.6 matplotlib=2.0 scipy pandas

You'll need welly in your environment:

conda install tqdm  # Should happen automatically but doesn't
pip install welly

This will also install the latest versions of striplog and lasio.


In [1]:
import welly

In [2]:
ls ../data/*.LAS


../data/P-129.LAS

1. Load the LAS file with lasio


In [3]:
import lasio

l = lasio.read('../data/P-129.LAS')  # Line 1.

That's it! But the object itself doesn't tell us much — it's really just a container:


In [4]:
l


Out[4]:
<lasio.las.LASFile at 0x1809dd3b70>

2. Look at the WELL section of the header


In [5]:
l.header['Well']  # Line 2.


Out[5]:
[HeaderItem(mnemonic=STRT, unit=M, value=1.0668, descr=START DEPTH, original_mnemonic=STRT),
 HeaderItem(mnemonic=STOP, unit=M, value=1939.1376, descr=STOP DEPTH, original_mnemonic=STOP),
 HeaderItem(mnemonic=STEP, unit=M, value=0.1524, descr=STEP, original_mnemonic=STEP),
 HeaderItem(mnemonic=NULL, unit=, value=-999.25, descr=NULL VALUE, original_mnemonic=NULL),
 HeaderItem(mnemonic=COMP, unit=, value=Elmworth Energy Corporation, descr=COMPANY, original_mnemonic=COMP),
 HeaderItem(mnemonic=WELL, unit=, value=Kennetcook #2, descr=WELL, original_mnemonic=WELL),
 HeaderItem(mnemonic=FLD, unit=, value=Windsor Block, descr=FIELD, original_mnemonic=FLD),
 HeaderItem(mnemonic=LOC, unit=, value=Lat = 45* 12' 34.237" N, descr=LOCATION, original_mnemonic=LOC),
 HeaderItem(mnemonic=PROV, unit=, value=Nova Scotia, descr=PROVINCE, original_mnemonic=PROV),
 HeaderItem(mnemonic=UWI, unit=, value=Long = 63* 45'24.460  W, descr=UNIQUE WELL ID, original_mnemonic=UWI),
 HeaderItem(mnemonic=LIC, unit=, value=P-129, descr=LICENSE NUMBER, original_mnemonic=LIC),
 HeaderItem(mnemonic=CTRY, unit=, value=CA, descr=COUNTRY (WWW code), original_mnemonic=CTRY),
 HeaderItem(mnemonic=DATE, unit=, value=10-Oct-2007, descr=LOG DATE {DD-MMM-YYYY}, original_mnemonic=DATE),
 HeaderItem(mnemonic=SRVC, unit=, value=Schlumberger, descr=SERVICE COMPANY, original_mnemonic=SRVC),
 HeaderItem(mnemonic=LATI, unit=DEG, value=, descr=LATITUDE, original_mnemonic=LATI),
 HeaderItem(mnemonic=LONG, unit=DEG, value=, descr=LONGITUDE, original_mnemonic=LONG),
 HeaderItem(mnemonic=GDAT, unit=, value=, descr=GeoDetic Datum, original_mnemonic=GDAT),
 HeaderItem(mnemonic=SECT, unit=, value=45.20 Deg N, descr=Section, original_mnemonic=SECT),
 HeaderItem(mnemonic=RANG, unit=, value=PD 176, descr=Range, original_mnemonic=RANG),
 HeaderItem(mnemonic=TOWN, unit=, value=63.75 Deg W, descr=Township, original_mnemonic=TOWN)]

You can go in and find the KB if you know what to look for:


In [6]:
l.header['Parameter']['EKB']


Out[6]:
HeaderItem(mnemonic=EKB, unit=M, value=94.8, descr=Elevation of Kelly Bushing, original_mnemonic=EKB)

3. Look at the curve data

The curves are all present one big NumPy array:


In [7]:
l.data


Out[7]:
array([[  1.06680000e+00,   2.44381547e+00,   4.39128494e+00, ...,
          2.39014983e+00,   4.66986504e+01,   1.20125000e+02],
       [  1.21920000e+00,   2.44381547e+00,   4.39128494e+00, ...,
          2.39014983e+00,   4.66986504e+01,   1.20125000e+02],
       [  1.37160000e+00,   2.44381547e+00,   4.39128494e+00, ...,
          2.39014983e+00,   4.66986504e+01,   1.20125000e+02],
       ..., 
       [  1.93883280e+03,   2.42026806e+00,              nan, ...,
                     nan,   9.22462235e+01,   7.30000000e+01],
       [  1.93898520e+03,   2.42026806e+00,              nan, ...,
                     nan,   9.22462235e+01,   7.39375000e+01],
       [  1.93913760e+03,   2.42026806e+00,              nan, ...,
                     nan,   9.22462235e+01,   7.42500000e+01]])

Or we can go after a single curve object:


In [8]:
l.curves.GR  # Line 3.


Out[8]:
CurveItem(mnemonic=GR, unit=gAPI, value=, descr=Gamma-Ray, original_mnemonic=GR, data.shape=(12718,))

And there's a shortcut to its data:


In [9]:
l['GR']  # Line 4.


Out[9]:
array([ 46.69865036,  46.69865036,  46.69865036, ...,  92.24622345,
        92.24622345,  92.24622345])

...so it's easy to make a plot against depth:


In [10]:
import matplotlib.pyplot as plt
%matplotlib inline

plt.figure(figsize=(15,3))
plt.plot(l['DEPT'], l['GR'])
plt.show()


4. Inspect the curves as a pandas dataframe


In [11]:
l.df().head()  # Line 5.


Out[11]:
CALI HCAL PEF DT DTS DPHI_SAN DPHI_LIM DPHI_DOL NPHI_SAN NPHI_LIM ... RLA1 RLA2 RXOZ RXO_HRLT RT_HRLT RM_HRLT DRHO RHOB GR SP
DEPT
1.0668 2.443815 4.391285 3.5864 NaN NaN 0.15748 0.19844 0.2591 0.4651 0.33647 ... 0.0321 0.02794 0.05761 0.02558 0.02558 0.05501 0.194233 2.39015 46.69865 120.125
1.2192 2.443815 4.391285 3.5864 NaN NaN 0.15748 0.19844 0.2591 0.4651 0.33647 ... 0.0321 0.02794 0.05761 0.02558 0.02558 0.05501 0.194233 2.39015 46.69865 120.125
1.3716 2.443815 4.391285 3.5864 NaN NaN 0.15748 0.19844 0.2591 0.4651 0.33647 ... 0.0321 0.02794 0.05761 0.02558 0.02558 0.05501 0.194233 2.39015 46.69865 120.125
1.5240 2.443815 4.391285 3.5864 NaN NaN 0.15748 0.19844 0.2591 0.4651 0.33647 ... 0.0321 0.02794 0.05761 0.02558 0.02558 0.05501 0.194233 2.39015 46.69865 120.125
1.6764 2.443815 4.391285 3.5864 NaN NaN 0.15748 0.19844 0.2591 0.4651 0.33647 ... 0.0321 0.02794 0.05761 0.02558 0.02558 0.05501 0.194233 2.39015 46.69865 120.125

5 rows × 24 columns

5. Load the LAS file with welly


In [12]:
from welly import Well

w = Well.from_las('../data/P-129.LAS')  # Line 6.

welly Wells know how to display some basics:


In [13]:
w


Out[13]:
Kennetcook #2
Long = 63* 45'24.460 W
td1935.0
crsCRS({})
locationLat = 45* 12' 34.237" N
countryCA
provinceNova Scotia
section45.20 Deg N
rangePD 176
township63.75 Deg W
kb94.8
gl90.3
tdd1935.0
tdl1935.0
dataCALI, DPHI_DOL, DPHI_LIM, DPHI_SAN, DRHO, DT, DTS, GR, HCAL, NPHI_DOL, NPHI_LIM, NPHI_SAN, PEF, RHOB, RLA1, RLA2, RLA3, RLA4, RLA5, RM_HRLT, RT_HRLT, RXOZ, RXO_HRLT, SP

And the Well object also has lasio's access to a pandas DataFrame:


In [14]:
w.df().head()


Out[14]:
CALI HCAL PEF DT DTS DPHI_SAN DPHI_LIM DPHI_DOL NPHI_SAN NPHI_LIM ... RLA1 RLA2 RXOZ RXO_HRLT RT_HRLT RM_HRLT DRHO RHOB GR SP
DEPT
1.0668 2.443815 4.391285 3.5864 NaN NaN 0.15748 0.19844 0.2591 0.4651 0.33647 ... 0.0321 0.02794 0.05761 0.02558 0.02558 0.05501 0.194233 2.39015 46.69865 120.125
1.2192 2.443815 4.391285 3.5864 NaN NaN 0.15748 0.19844 0.2591 0.4651 0.33647 ... 0.0321 0.02794 0.05761 0.02558 0.02558 0.05501 0.194233 2.39015 46.69865 120.125
1.3716 2.443815 4.391285 3.5864 NaN NaN 0.15748 0.19844 0.2591 0.4651 0.33647 ... 0.0321 0.02794 0.05761 0.02558 0.02558 0.05501 0.194233 2.39015 46.69865 120.125
1.5240 2.443815 4.391285 3.5864 NaN NaN 0.15748 0.19844 0.2591 0.4651 0.33647 ... 0.0321 0.02794 0.05761 0.02558 0.02558 0.05501 0.194233 2.39015 46.69865 120.125
1.6764 2.443815 4.391285 3.5864 NaN NaN 0.15748 0.19844 0.2591 0.4651 0.33647 ... 0.0321 0.02794 0.05761 0.02558 0.02558 0.05501 0.194233 2.39015 46.69865 120.125

5 rows × 24 columns

6. Look at welly's Curve object

Like the Well, a Curve object can report a bit about itself:


In [15]:
gr = w.data['GR']  # Line 7.
gr


Out[15]:
GR [gAPI]
1.0668 : 1939.2900 : 0.1524
run1
null-999.25
service_companySchlumberger
date10-Oct-2007
code
descriptionGamma-Ray
Stats
samples (NaNs)12718 (0)
min mean max3.89 78.986 267.94
DepthValue
1.066846.6987
1.219246.6987
1.371646.6987
1938.832892.2462
1938.985292.2462
1939.137692.2462

One important thing about Curves is that each one knows its own depths — they are stored as a property called basis. (It's not actually stored, but computed on demand from the start depth, the sample interval (which must be constant for the whole curve) and the number of samples in the object.)


In [16]:
gr.basis


Out[16]:
array([  1.06680000e+00,   1.21920000e+00,   1.37160000e+00, ...,
         1.93883280e+03,   1.93898520e+03,   1.93913760e+03])

7. Plot part of a curve

We'll grab the interval from 300 m to 1000 m and plot it.


In [17]:
gr.to_basis(start=300, stop=1000).plot()  # Line 8.


8. Smooth a curve

Curve objects are, fundamentally, NumPy arrays. But they have some extra tricks. We've already seen Curve.plot().

Using the Curve.smooth() method, we can easily smooth a curve, eg by 15 m (passing samples=True would smooth by 15 samples):


In [18]:
sm = gr.smooth(window_length=15, samples=False)  # Line 9.

sm.plot()


9. Export a set of curves as a matrix

You can get at all the data through the lasio l.data object:


In [19]:
print("Data shape: {}".format(w.las.data.shape))

w.las.data


Data shape: (12718, 25)
Out[19]:
array([[  1.06680000e+00,   2.44381547e+00,   4.39128494e+00, ...,
          2.39014983e+00,   4.66986504e+01,   1.20125000e+02],
       [  1.21920000e+00,   2.44381547e+00,   4.39128494e+00, ...,
          2.39014983e+00,   4.66986504e+01,   1.20125000e+02],
       [  1.37160000e+00,   2.44381547e+00,   4.39128494e+00, ...,
          2.39014983e+00,   4.66986504e+01,   1.20125000e+02],
       ..., 
       [  1.93883280e+03,   2.42026806e+00,              nan, ...,
                     nan,   9.22462235e+01,   7.30000000e+01],
       [  1.93898520e+03,   2.42026806e+00,              nan, ...,
                     nan,   9.22462235e+01,   7.39375000e+01],
       [  1.93913760e+03,   2.42026806e+00,              nan, ...,
                     nan,   9.22462235e+01,   7.42500000e+01]])

But we might want to do some other things, such as specify which curves you want (optionally using aliases like GR1, GRC, NGC, etc for GR), resample the data, or specify a start and stop depth — welly can do all this stuff. This method is also wrapped by Project.data_as_matrix() which is nice because it ensures that all the wells are exported at the same sample interval.

Here are the curves in this well:


In [20]:
w.data.keys()


Out[20]:
dict_keys(['CALI', 'HCAL', 'PEF', 'DT', 'DTS', 'DPHI_SAN', 'DPHI_LIM', 'DPHI_DOL', 'NPHI_SAN', 'NPHI_LIM', 'NPHI_DOL', 'RLA5', 'RLA3', 'RLA4', 'RLA1', 'RLA2', 'RXOZ', 'RXO_HRLT', 'RT_HRLT', 'RM_HRLT', 'DRHO', 'RHOB', 'GR', 'SP'])

In [21]:
keys=['CALI', 'DT', 'DTS', 'RHOB', 'SP']

In [22]:
w.plot(tracks=['TVD']+keys)



In [23]:
X, basis = w.data_as_matrix(keys=keys, start=275, stop=1850, step=0.5, return_basis=True)

In [24]:
w.data['CALI'].shape


Out[24]:
(12718,)

So CALI had 12,718 points in it... since we downsampled to 0.5 m and removed the top and tail, we should have substantially fewer points:


In [25]:
X.shape


Out[25]:
(3151, 5)

In [26]:
plt.figure(figsize=(15,3))
plt.plot(X.T[0])
plt.show()


10+. BONUS: fix the lat, lon

OK, we're definitely going to go over our budget on this one.

Did you notice that the location of the well did not get loaded properly?


In [27]:
w.location


Out[27]:
Location({'td': 1935.0, 'crs': CRS({}), 'location': 'Lat = 45* 12\' 34.237" N', 'country': 'CA', 'province': 'Nova Scotia', 'section': '45.20 Deg N', 'range': 'PD 176', 'township': '63.75 Deg W', 'kb': 94.8, 'gl': 90.3, 'tdd': 1935.0, 'tdl': 1935.0, 'deviation': None, 'position': None})

Let's look at some of the header:

# LAS format log file from PETREL
# Project units are specified as depth units
#==================================================================
~Version information
VERS.   2.0:
WRAP.   YES:
#==================================================================
~WELL INFORMATION
#MNEM.UNIT      DATA             DESCRIPTION
#---- ------ --------------   -----------------------------
STRT .M      1.0668          :START DEPTH     
STOP .M      1939.13760      :STOP DEPTH     
STEP .M       0.15240        :STEP        
NULL .          -999.25      :NULL VALUE
COMP .        Elmworth Energy Corporation              :COMPANY
WELL .        Kennetcook #2                            :WELL
FLD  .        Windsor Block                            :FIELD
LOC  .        Lat = 45* 12' 34.237" N                  :LOCATION
PROV .        Nova Scotia                              :PROVINCE
  UWI.        Long = 63* 45'24.460  W                  :UNIQUE WELL ID
LIC  .        P-129                                    :LICENSE NUMBER
CTRY .        CA                                       :COUNTRY (WWW code)
 DATE.        10-Oct-2007                              :LOG DATE {DD-MMM-YYYY}
SRVC .        Schlumberger                             :SERVICE COMPANY
LATI .DEG                                              :LATITUDE
LONG .DEG                                              :LONGITUDE
GDAT .                                                 :GeoDetic Datum
SECT .        45.20 Deg N                              :Section
RANG .        PD 176                                   :Range
TOWN .        63.75 Deg W                              :Township

Look at LOC and UWI. There are two problems:

  1. These items are in the wrong place. (Notice LATI and LONG are empty.)
  2. The items are malformed, with lots of extraneous characters.

We can fix this in two steps:

  1. Remap the header items to fix the first problem.
  2. Parse the items to fix the second one.

We'll define these in reverse because the remapping uses the transforming function.


In [28]:
import re

def transform_ll(text):
    """
    Parses malformed lat and lon so they load properly.
    """
    def callback(match):
        d = match.group(1).strip()
        m = match.group(2).strip()
        s = match.group(3).strip()
        c = match.group(4).strip()
        if c.lower() in ('w', 's') and d[0] != '-':
            d = '-' + d
        return ' '.join([d, m, s])
    pattern = re.compile(r""".+?([-0-9]+?).? ?([0-9]+?).? ?([\.0-9]+?).? +?([NESW])""", re.I)
    text = pattern.sub(callback, text)
    return welly.utils.dms2dd([float(i) for i in text.split()])

Make sure that works!


In [29]:
print(transform_ll("""Lat = 45* 12' 34.237" N"""))


45.20951027777778

In [30]:
remap = {
    'LATI': 'LOC',  # Use LOC for the parameter LATI.
    'LONG': 'UWI',  # Use UWI for the parameter LONG.
    'LOC':  None,   # Use nothing for the parameter SECT.
    'SECT': None,   # Use nothing for the parameter SECT.
    'RANG': None,   # Use nothing for the parameter RANG.
    'TOWN': None,   # Use nothing for the parameter TOWN.
}

funcs = {
    'LATI': transform_ll,  # Pass LATI through this function before loading.
    'LONG': transform_ll,  # Pass LONG through it too.
    'UWI': lambda x: "No UWI, fix this!"
}

In [31]:
w = Well.from_las('../data/P-129.LAS', remap=remap, funcs=funcs)

In [32]:
w.location.latitude, w.location.longitude


Out[32]:
(45.20951027777778, -62.243205555555555)

In [33]:
w.uwi


Out[33]:
'No UWI, fix this!'

Let's just hope the mess is the same mess in every well. (LOL, no-one's that lucky.)


© 2017 agilescientific.com and licensed CC-BY 4.0