x lines of Python

Reading and writing LAS files

This notebook goes with the Agile blog post of 23 October.

Set up a conda environment with:

conda create -n welly python=3.6 matplotlib=2.0 scipy pandas

You'll need welly in your environment:

conda install tqdm  # Should happen automatically but doesn't
pip install welly

This will also install the latest versions of striplog and lasio.



In [1]:

    
import welly



In [2]:

    
ls ../data/*.LAS









    



../data/P-129.LAS

1. Load the LAS file with `lasio`



In [3]:

    
import lasio

l = lasio.read('../data/P-129.LAS')  # Line 1.

That's it! But the object itself doesn't tell us much — it's really just a container:



In [4]:

    
l









    Out[4]:





<lasio.las.LASFile at 0x1809dd3b70>

2. Look at the WELL section of the header



In [5]:

    
l.header['Well']  # Line 2.









    Out[5]:





[HeaderItem(mnemonic=STRT, unit=M, value=1.0668, descr=START DEPTH, original_mnemonic=STRT),
 HeaderItem(mnemonic=STOP, unit=M, value=1939.1376, descr=STOP DEPTH, original_mnemonic=STOP),
 HeaderItem(mnemonic=STEP, unit=M, value=0.1524, descr=STEP, original_mnemonic=STEP),
 HeaderItem(mnemonic=NULL, unit=, value=-999.25, descr=NULL VALUE, original_mnemonic=NULL),
 HeaderItem(mnemonic=COMP, unit=, value=Elmworth Energy Corporation, descr=COMPANY, original_mnemonic=COMP),
 HeaderItem(mnemonic=WELL, unit=, value=Kennetcook #2, descr=WELL, original_mnemonic=WELL),
 HeaderItem(mnemonic=FLD, unit=, value=Windsor Block, descr=FIELD, original_mnemonic=FLD),
 HeaderItem(mnemonic=LOC, unit=, value=Lat = 45* 12' 34.237" N, descr=LOCATION, original_mnemonic=LOC),
 HeaderItem(mnemonic=PROV, unit=, value=Nova Scotia, descr=PROVINCE, original_mnemonic=PROV),
 HeaderItem(mnemonic=UWI, unit=, value=Long = 63* 45'24.460  W, descr=UNIQUE WELL ID, original_mnemonic=UWI),
 HeaderItem(mnemonic=LIC, unit=, value=P-129, descr=LICENSE NUMBER, original_mnemonic=LIC),
 HeaderItem(mnemonic=CTRY, unit=, value=CA, descr=COUNTRY (WWW code), original_mnemonic=CTRY),
 HeaderItem(mnemonic=DATE, unit=, value=10-Oct-2007, descr=LOG DATE {DD-MMM-YYYY}, original_mnemonic=DATE),
 HeaderItem(mnemonic=SRVC, unit=, value=Schlumberger, descr=SERVICE COMPANY, original_mnemonic=SRVC),
 HeaderItem(mnemonic=LATI, unit=DEG, value=, descr=LATITUDE, original_mnemonic=LATI),
 HeaderItem(mnemonic=LONG, unit=DEG, value=, descr=LONGITUDE, original_mnemonic=LONG),
 HeaderItem(mnemonic=GDAT, unit=, value=, descr=GeoDetic Datum, original_mnemonic=GDAT),
 HeaderItem(mnemonic=SECT, unit=, value=45.20 Deg N, descr=Section, original_mnemonic=SECT),
 HeaderItem(mnemonic=RANG, unit=, value=PD 176, descr=Range, original_mnemonic=RANG),
 HeaderItem(mnemonic=TOWN, unit=, value=63.75 Deg W, descr=Township, original_mnemonic=TOWN)]

You can go in and find the KB if you know what to look for:



In [6]:

    
l.header['Parameter']['EKB']









    Out[6]:





HeaderItem(mnemonic=EKB, unit=M, value=94.8, descr=Elevation of Kelly Bushing, original_mnemonic=EKB)

3. Look at the curve data

The curves are all present one big NumPy array:



In [7]:

    
l.data









    Out[7]:





array([[  1.06680000e+00,   2.44381547e+00,   4.39128494e+00, ...,
          2.39014983e+00,   4.66986504e+01,   1.20125000e+02],
       [  1.21920000e+00,   2.44381547e+00,   4.39128494e+00, ...,
          2.39014983e+00,   4.66986504e+01,   1.20125000e+02],
       [  1.37160000e+00,   2.44381547e+00,   4.39128494e+00, ...,
          2.39014983e+00,   4.66986504e+01,   1.20125000e+02],
       ..., 
       [  1.93883280e+03,   2.42026806e+00,              nan, ...,
                     nan,   9.22462235e+01,   7.30000000e+01],
       [  1.93898520e+03,   2.42026806e+00,              nan, ...,
                     nan,   9.22462235e+01,   7.39375000e+01],
       [  1.93913760e+03,   2.42026806e+00,              nan, ...,
                     nan,   9.22462235e+01,   7.42500000e+01]])

Or we can go after a single curve object:



In [8]:

    
l.curves.GR  # Line 3.









    Out[8]:





CurveItem(mnemonic=GR, unit=gAPI, value=, descr=Gamma-Ray, original_mnemonic=GR, data.shape=(12718,))

And there's a shortcut to its data:



In [9]:

    
l['GR']  # Line 4.









    Out[9]:





array([ 46.69865036,  46.69865036,  46.69865036, ...,  92.24622345,
        92.24622345,  92.24622345])

...so it's easy to make a plot against depth:



In [10]:

    
import matplotlib.pyplot as plt
%matplotlib inline

plt.figure(figsize=(15,3))
plt.plot(l['DEPT'], l['GR'])
plt.show()

4. Inspect the curves as a `pandas` dataframe



In [11]:

    
l.df().head()  # Line 5.









    Out[11]:







  
    
      
      CALI
      HCAL
      PEF
      DT
      DTS
      DPHI_SAN
      DPHI_LIM
      DPHI_DOL
      NPHI_SAN
      NPHI_LIM
      ...
      RLA1
      RLA2
      RXOZ
      RXO_HRLT
      RT_HRLT
      RM_HRLT
      DRHO
      RHOB
      GR
      SP
    
    
      DEPT
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
    
  
  
    
      1.0668
      2.443815
      4.391285
      3.5864
      NaN
      NaN
      0.15748
      0.19844
      0.2591
      0.4651
      0.33647
      ...
      0.0321
      0.02794
      0.05761
      0.02558
      0.02558
      0.05501
      0.194233
      2.39015
      46.69865
      120.125
    
    
      1.2192
      2.443815
      4.391285
      3.5864
      NaN
      NaN
      0.15748
      0.19844
      0.2591
      0.4651
      0.33647
      ...
      0.0321
      0.02794
      0.05761
      0.02558
      0.02558
      0.05501
      0.194233
      2.39015
      46.69865
      120.125
    
    
      1.3716
      2.443815
      4.391285
      3.5864
      NaN
      NaN
      0.15748
      0.19844
      0.2591
      0.4651
      0.33647
      ...
      0.0321
      0.02794
      0.05761
      0.02558
      0.02558
      0.05501
      0.194233
      2.39015
      46.69865
      120.125
    
    
      1.5240
      2.443815
      4.391285
      3.5864
      NaN
      NaN
      0.15748
      0.19844
      0.2591
      0.4651
      0.33647
      ...
      0.0321
      0.02794
      0.05761
      0.02558
      0.02558
      0.05501
      0.194233
      2.39015
      46.69865
      120.125
    
    
      1.6764
      2.443815
      4.391285
      3.5864
      NaN
      NaN
      0.15748
      0.19844
      0.2591
      0.4651
      0.33647
      ...
      0.0321
      0.02794
      0.05761
      0.02558
      0.02558
      0.05501
      0.194233
      2.39015
      46.69865
      120.125
    
  

5 rows × 24 columns

5. Load the LAS file with `welly`



In [12]:

    
from welly import Well

w = Well.from_las('../data/P-129.LAS')  # Line 6.

welly Wells know how to display some basics:



In [13]:

    
w









    Out[13]:




Kennetcook #2
Long = 63* 45'24.460  W
td 1935.0
crs CRS({})
location Lat = 45* 12' 34.237" N
country CA
province Nova Scotia
section 45.20 Deg N
range PD 176
township 63.75 Deg W
kb 94.8
gl 90.3
tdd 1935.0
tdl 1935.0
data CALI, DPHI_DOL, DPHI_LIM, DPHI_SAN, DRHO, DT, DTS, GR, HCAL, NPHI_DOL, NPHI_LIM, NPHI_SAN, PEF, RHOB, RLA1, RLA2, RLA3, RLA4, RLA5, RM_HRLT, RT_HRLT, RXOZ, RXO_HRLT, SP

And the Well object also has lasio's access to a pandas DataFrame:



In [14]:

    
w.df().head()









    Out[14]:







  
    
      
      CALI
      HCAL
      PEF
      DT
      DTS
      DPHI_SAN
      DPHI_LIM
      DPHI_DOL
      NPHI_SAN
      NPHI_LIM
      ...
      RLA1
      RLA2
      RXOZ
      RXO_HRLT
      RT_HRLT
      RM_HRLT
      DRHO
      RHOB
      GR
      SP
    
    
      DEPT
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
    
  
  
    
      1.0668
      2.443815
      4.391285
      3.5864
      NaN
      NaN
      0.15748
      0.19844
      0.2591
      0.4651
      0.33647
      ...
      0.0321
      0.02794
      0.05761
      0.02558
      0.02558
      0.05501
      0.194233
      2.39015
      46.69865
      120.125
    
    
      1.2192
      2.443815
      4.391285
      3.5864
      NaN
      NaN
      0.15748
      0.19844
      0.2591
      0.4651
      0.33647
      ...
      0.0321
      0.02794
      0.05761
      0.02558
      0.02558
      0.05501
      0.194233
      2.39015
      46.69865
      120.125
    
    
      1.3716
      2.443815
      4.391285
      3.5864
      NaN
      NaN
      0.15748
      0.19844
      0.2591
      0.4651
      0.33647
      ...
      0.0321
      0.02794
      0.05761
      0.02558
      0.02558
      0.05501
      0.194233
      2.39015
      46.69865
      120.125
    
    
      1.5240
      2.443815
      4.391285
      3.5864
      NaN
      NaN
      0.15748
      0.19844
      0.2591
      0.4651
      0.33647
      ...
      0.0321
      0.02794
      0.05761
      0.02558
      0.02558
      0.05501
      0.194233
      2.39015
      46.69865
      120.125
    
    
      1.6764
      2.443815
      4.391285
      3.5864
      NaN
      NaN
      0.15748
      0.19844
      0.2591
      0.4651
      0.33647
      ...
      0.0321
      0.02794
      0.05761
      0.02558
      0.02558
      0.05501
      0.194233
      2.39015
      46.69865
      120.125
    
  

5 rows × 24 columns

6. Look at `welly`'s Curve object

Like the Well, a Curve object can report a bit about itself:



In [15]:

    
gr = w.data['GR']  # Line 7.
gr









    Out[15]:




GR [gAPI]
1.0668 : 1939.2900 : 0.1524
run 1
null -999.25
service_company Schlumberger
date 10-Oct-2007
code 
description Gamma-Ray
Stats
samples (NaNs) 12718 (0)
_min mean ^max _3.89 78.986 ^267.94
Depth Value
1.0668 46.6987
1.2192 46.6987
1.3716 46.6987
⋮ ⋮
1938.8328 92.2462
1938.9852 92.2462
1939.1376 92.2462

One important thing about Curves is that each one knows its own depths — they are stored as a property called basis. (It's not actually stored, but computed on demand from the start depth, the sample interval (which must be constant for the whole curve) and the number of samples in the object.)



In [16]:

    
gr.basis









    Out[16]:





array([  1.06680000e+00,   1.21920000e+00,   1.37160000e+00, ...,
         1.93883280e+03,   1.93898520e+03,   1.93913760e+03])

7. Plot part of a curve

We'll grab the interval from 300 m to 1000 m and plot it.



In [17]:

    
gr.to_basis(start=300, stop=1000).plot()  # Line 8.

8. Smooth a curve

Curve objects are, fundamentally, NumPy arrays. But they have some extra tricks. We've already seen Curve.plot().

Using the Curve.smooth() method, we can easily smooth a curve, eg by 15 m (passing samples=True would smooth by 15 samples):



In [18]:

    
sm = gr.smooth(window_length=15, samples=False)  # Line 9.

sm.plot()

9. Export a set of curves as a matrix

You can get at all the data through the lasio l.data object:



In [19]:

    
print("Data shape: {}".format(w.las.data.shape))

w.las.data









    



Data shape: (12718, 25)






    Out[19]:





array([[  1.06680000e+00,   2.44381547e+00,   4.39128494e+00, ...,
          2.39014983e+00,   4.66986504e+01,   1.20125000e+02],
       [  1.21920000e+00,   2.44381547e+00,   4.39128494e+00, ...,
          2.39014983e+00,   4.66986504e+01,   1.20125000e+02],
       [  1.37160000e+00,   2.44381547e+00,   4.39128494e+00, ...,
          2.39014983e+00,   4.66986504e+01,   1.20125000e+02],
       ..., 
       [  1.93883280e+03,   2.42026806e+00,              nan, ...,
                     nan,   9.22462235e+01,   7.30000000e+01],
       [  1.93898520e+03,   2.42026806e+00,              nan, ...,
                     nan,   9.22462235e+01,   7.39375000e+01],
       [  1.93913760e+03,   2.42026806e+00,              nan, ...,
                     nan,   9.22462235e+01,   7.42500000e+01]])

But we might want to do some other things, such as specify which curves you want (optionally using aliases like GR1, GRC, NGC, etc for GR), resample the data, or specify a start and stop depth — welly can do all this stuff. This method is also wrapped by Project.data_as_matrix() which is nice because it ensures that all the wells are exported at the same sample interval.

Here are the curves in this well:



In [20]:

    
w.data.keys()









    Out[20]:





dict_keys(['CALI', 'HCAL', 'PEF', 'DT', 'DTS', 'DPHI_SAN', 'DPHI_LIM', 'DPHI_DOL', 'NPHI_SAN', 'NPHI_LIM', 'NPHI_DOL', 'RLA5', 'RLA3', 'RLA4', 'RLA1', 'RLA2', 'RXOZ', 'RXO_HRLT', 'RT_HRLT', 'RM_HRLT', 'DRHO', 'RHOB', 'GR', 'SP'])



In [21]:

    
keys=['CALI', 'DT', 'DTS', 'RHOB', 'SP']



In [22]:

    
w.plot(tracks=['TVD']+keys)



In [23]:

    
X, basis = w.data_as_matrix(keys=keys, start=275, stop=1850, step=0.5, return_basis=True)



In [24]:

    
w.data['CALI'].shape









    Out[24]:





(12718,)

So CALI had 12,718 points in it... since we downsampled to 0.5 m and removed the top and tail, we should have substantially fewer points:



In [25]:

    
X.shape









    Out[25]:





(3151, 5)



In [26]:

    
plt.figure(figsize=(15,3))
plt.plot(X.T[0])
plt.show()

10+. BONUS: fix the lat, lon

OK, we're definitely going to go over our budget on this one.

Did you notice that the location of the well did not get loaded properly?



In [27]:

    
w.location









    Out[27]:





Location({'td': 1935.0, 'crs': CRS({}), 'location': 'Lat = 45* 12\' 34.237" N', 'country': 'CA', 'province': 'Nova Scotia', 'section': '45.20 Deg N', 'range': 'PD 176', 'township': '63.75 Deg W', 'kb': 94.8, 'gl': 90.3, 'tdd': 1935.0, 'tdl': 1935.0, 'deviation': None, 'position': None})

Let's look at some of the header:

# LAS format log file from PETREL
# Project units are specified as depth units
#==================================================================
~Version information
VERS.   2.0:
WRAP.   YES:
#==================================================================
~WELL INFORMATION
#MNEM.UNIT      DATA             DESCRIPTION
#---- ------ --------------   -----------------------------
STRT .M      1.0668          :START DEPTH     
STOP .M      1939.13760      :STOP DEPTH     
STEP .M       0.15240        :STEP        
NULL .          -999.25      :NULL VALUE
COMP .        Elmworth Energy Corporation              :COMPANY
WELL .        Kennetcook #2                            :WELL
FLD  .        Windsor Block                            :FIELD
LOC  .        Lat = 45* 12' 34.237" N                  :LOCATION
PROV .        Nova Scotia                              :PROVINCE
  UWI.        Long = 63* 45'24.460  W                  :UNIQUE WELL ID
LIC  .        P-129                                    :LICENSE NUMBER
CTRY .        CA                                       :COUNTRY (WWW code)
 DATE.        10-Oct-2007                              :LOG DATE {DD-MMM-YYYY}
SRVC .        Schlumberger                             :SERVICE COMPANY
LATI .DEG                                              :LATITUDE
LONG .DEG                                              :LONGITUDE
GDAT .                                                 :GeoDetic Datum
SECT .        45.20 Deg N                              :Section
RANG .        PD 176                                   :Range
TOWN .        63.75 Deg W                              :Township

Look at LOC and UWI. There are two problems:

These items are in the wrong place. (Notice LATI and LONG are empty.)
The items are malformed, with lots of extraneous characters.

We can fix this in two steps:

Remap the header items to fix the first problem.
Parse the items to fix the second one.

We'll define these in reverse because the remapping uses the transforming function.



In [28]:

    
import re

def transform_ll(text):
    """
    Parses malformed lat and lon so they load properly.
    """
    def callback(match):
        d = match.group(1).strip()
        m = match.group(2).strip()
        s = match.group(3).strip()
        c = match.group(4).strip()
        if c.lower() in ('w', 's') and d[0] != '-':
            d = '-' + d
        return ' '.join([d, m, s])
    pattern = re.compile(r""".+?([-0-9]+?).? ?([0-9]+?).? ?([\.0-9]+?).? +?([NESW])""", re.I)
    text = pattern.sub(callback, text)
    return welly.utils.dms2dd([float(i) for i in text.split()])

Make sure that works!



In [29]:

    
print(transform_ll("""Lat = 45* 12' 34.237" N"""))









    



45.20951027777778



In [30]:

    
remap = {
    'LATI': 'LOC',  # Use LOC for the parameter LATI.
    'LONG': 'UWI',  # Use UWI for the parameter LONG.
    'LOC':  None,   # Use nothing for the parameter SECT.
    'SECT': None,   # Use nothing for the parameter SECT.
    'RANG': None,   # Use nothing for the parameter RANG.
    'TOWN': None,   # Use nothing for the parameter TOWN.
}

funcs = {
    'LATI': transform_ll,  # Pass LATI through this function before loading.
    'LONG': transform_ll,  # Pass LONG through it too.
    'UWI': lambda x: "No UWI, fix this!"
}



In [31]:

    
w = Well.from_las('../data/P-129.LAS', remap=remap, funcs=funcs)



In [32]:

    
w.location.latitude, w.location.longitude









    Out[32]:





(45.20951027777778, -62.243205555555555)



In [33]:

    
w.uwi









    Out[33]:





'No UWI, fix this!'

Let's just hope the mess is the same mess in every well. (LOL, no-one's that lucky.)

	CALI	HCAL	PEF	DT	DTS	DPHI_SAN	DPHI_LIM	DPHI_DOL	NPHI_SAN	NPHI_LIM	...	RLA1	RLA2	RXOZ	RXO_HRLT	RT_HRLT	RM_HRLT	DRHO	RHOB	GR	SP
DEPT
1.0668	2.443815	4.391285	3.5864	NaN	NaN	0.15748	0.19844	0.2591	0.4651	0.33647	...	0.0321	0.02794	0.05761	0.02558	0.02558	0.05501	0.194233	2.39015	46.69865	120.125
1.2192	2.443815	4.391285	3.5864	NaN	NaN	0.15748	0.19844	0.2591	0.4651	0.33647	...	0.0321	0.02794	0.05761	0.02558	0.02558	0.05501	0.194233	2.39015	46.69865	120.125
1.3716	2.443815	4.391285	3.5864	NaN	NaN	0.15748	0.19844	0.2591	0.4651	0.33647	...	0.0321	0.02794	0.05761	0.02558	0.02558	0.05501	0.194233	2.39015	46.69865	120.125
1.5240	2.443815	4.391285	3.5864	NaN	NaN	0.15748	0.19844	0.2591	0.4651	0.33647	...	0.0321	0.02794	0.05761	0.02558	0.02558	0.05501	0.194233	2.39015	46.69865	120.125
1.6764	2.443815	4.391285	3.5864	NaN	NaN	0.15748	0.19844	0.2591	0.4651	0.33647	...	0.0321	0.02794	0.05761	0.02558	0.02558	0.05501	0.194233	2.39015	46.69865	120.125

Kennetcook #2 Long = 63* 45'24.460 W
td	1935.0
crs	CRS({})
location	Lat = 45* 12' 34.237" N
country	CA
province	Nova Scotia
section	45.20 Deg N
range	PD 176
township	63.75 Deg W
kb	94.8
gl	90.3
tdd	1935.0
tdl	1935.0
data	CALI, DPHI_DOL, DPHI_LIM, DPHI_SAN, DRHO, DT, DTS, GR, HCAL, NPHI_DOL, NPHI_LIM, NPHI_SAN, PEF, RHOB, RLA1, RLA2, RLA3, RLA4, RLA5, RM_HRLT, RT_HRLT, RXOZ, RXO_HRLT, SP

GR [gAPI]
1.0668 : 1939.2900 : 0.1524
run	1
null	-999.25
service_company	Schlumberger
date	10-Oct-2007
code
description	Gamma-Ray
Stats
samples (NaNs)	12718 (0)
_min mean ^max	_3.89 78.986 ^267.94
Depth	Value
1.0668	46.6987
1.2192	46.6987
1.3716	46.6987
⋮	⋮
1938.8328	92.2462
1938.9852	92.2462
1939.1376	92.2462