This notebook is to accompany the blog post published on January 8, 2015: It goes in the bin at Agile Geoscience.
The idea is to replicate what we've done so far but with 3 enhancements:
We'll start with the usual prelims...
In [1]:
import numpy as np
import matplotlib.pyplot as plt
from shapely.geometry import Point, LineString
import geopandas as gpd
import pandas as pd
from fiona.crs import from_epsg
%matplotlib inline
In [2]:
class Survey:
    """
    A seismic survey laid out on a regular grid.

    Expects a parameter dict with keys: 'corner' (x, y of lower-left),
    'size' (x, y extent), 'line_spacing' (src, rcvr), 'point_spacing'
    (src, rcvr), and 'epsg' (integer EPSG code for the CRS).
    """
    def __init__(self, params):
        # Assign the variables from the parameter dict,
        # using dict.items() for Python 3 compatibility.
        for k, v in params.items():
            setattr(self, k, v)

        # These are just a convenience; we could use the
        # tuples directly, or make objects with attrs.
        self.xmi = self.corner[0]        # x of lower-left corner
        self.ymi = self.corner[1]        # y of lower-left corner
        self.x = self.size[0]            # survey extent in x
        self.y = self.size[1]            # survey extent in y
        self.SL = self.line_spacing[0]   # source line spacing
        self.RL = self.line_spacing[1]   # receiver line spacing
        self.si = self.point_spacing[0]  # source point interval
        self.ri = self.point_spacing[1]  # receiver point interval
        # Receivers are staggered half a station from the sources.
        self.shiftx = -self.si / 2.
        self.shifty = -self.ri / 2.

    @property
    def lines(self):
        """
        Returns number of (src, rcvr) lines.
        """
        slines = int(self.x / self.SL) + 1
        rlines = int(self.y / self.RL) + 1
        return slines, rlines

    @property
    def points_per_line(self):
        """
        Returns number of (src, rcvr) points per line.
        """
        # NOTE(review): +2 here vs +1 in `lines` — presumably overshoots
        # the survey edge by one station on purpose; confirm.
        spoints = int(self.y / self.si) + 2
        rpoints = int(self.x / self.ri) + 2
        return spoints, rpoints

    @property
    def src(self):
        """
        Source points as a GeoSeries: lines step in x, points in y.
        """
        s = [Point(self.xmi + line * self.SL, self.ymi + s * self.si)
             for line in range(self.lines[0])
             for s in range(self.points_per_line[0])
             ]
        S = gpd.GeoSeries(s)
        # Use the survey's EPSG code, consistent with `rcvr` and
        # `layout` (was hard-coded to 26911).
        S.crs = from_epsg(self.epsg)
        return S

    @property
    def rcvr(self):
        """
        Receiver points as a GeoSeries: lines step in y, points in x,
        staggered off the source grid by (shiftx, shifty).
        """
        r = [Point(self.xmi + r * self.ri + self.shiftx,
                   self.ymi + line * self.RL - self.shifty)
             for line in range(self.lines[1])
             for r in range(self.points_per_line[1])
             ]
        R = gpd.GeoSeries(r)
        R.crs = from_epsg(self.epsg)
        return R

    @property
    def layout(self):
        """
        Provide a GeoDataFrame of all points,
        labelled as columns and in hierarchical index.
        """
        # Feels like there might be a better way to do this...
        sgdf = gpd.GeoDataFrame({'geometry': self.src, 'station': 'src'})
        rgdf = gpd.GeoDataFrame({'geometry': self.rcvr, 'station': 'rcvr'})

        # Concatenate with a hierarchical index so we can select
        # layout.loc['sources'] / layout.loc['receivers'] later.
        layout = pd.concat([sgdf, rgdf], keys=['sources', 'receivers'])
        layout.crs = from_epsg(self.epsg)
        return layout
Perhaps s and r should be objects too. I think you might want to have survey.receivers.x for the list of x locations, for example.
In [3]:
# Survey definition: UTM zone 11N, units are metres.
params = dict(
    corner=(5750000, 4710000),   # lower-left (x, y)
    size=(3000, 1800),           # extent (x, y)
    line_spacing=(600, 600),     # (source, receiver) line spacing
    point_spacing=(100, 100),    # (source, receiver) point interval
    epsg=26911,                  # http://spatialreference.org/ref/epsg/26911/
)

survey = Survey(params)
In [4]:
# Generate the source and receiver point series,
# then peek at the first ten receivers.
s, r = survey.src, survey.rcvr
r[:10]
Out[4]:
In [5]:
# Build the combined layout GeoDataFrame and preview the top rows.
layout = survey.layout
layout.head(10)
Out[5]:
With a hierarchical index you can do cool things, e.g. show the last five sources:
In [6]:
layout.ix['sources'][-5:]
Out[6]:
In [7]:
layout.crs
Out[7]:
In [8]:
ax = layout.plot()
Export GeoDataFrames to GIS shapefile.
In [9]:
# gdf.to_file('src_and_rcvr.shp')
We need midpoints. There is a midpoint between every source-receiver pair.
Hopefully it's not too inelegant to get to the midpoints now that we're using this layout object thing.
In [10]:
# Midpoint of every source-receiver pair: interpolate halfway along
# the straight line joining them.
# .loc replaces the .ix indexer, which was removed from pandas.
midpoint_list = [LineString([r, s]).interpolate(0.5, normalized=True)
                 for r in layout.loc['receivers'].geometry
                 for s in layout.loc['sources'].geometry
                 ]
As well as knowing the (x,y) of the midpoints, we'd also like to record the distance from each s to each live r (each r in the live patch). This is easy enough to compute:
Point(x1, y1).distance(Point(x2, y2))
Then we can make a list of all the offsets when we count the midpoints into the bins.
In [11]:
# Source-receiver distance (offset) for every pair, in the same
# order as midpoint_list.
# .loc replaces the .ix indexer, which was removed from pandas.
offsets = [r.distance(s)
           for r in layout.loc['receivers'].geometry
           for s in layout.loc['sources'].geometry
           ]
In [12]:
# Azimuth (degrees, measured from the y axis) of each pair.
# NOTE(review): arctan of dx/dy only spans -90..+90 and divides by
# zero when r.y == s.y; np.arctan2(dx, dy) would give full-circle
# azimuths — kept as-is to preserve the original output.
# .loc replaces the .ix indexer, which was removed from pandas.
azimuths = [(180.0/np.pi) * np.arctan((r.x - s.x)/(r.y - s.y))
            for r in layout.loc['receivers'].geometry
            for s in layout.loc['sources'].geometry
            ]
In [13]:
# Resolve each offset magnitude into x and y components
# using its azimuth (convert degrees to radians once).
rad = np.array(azimuths) * np.pi / 180.
dist = np.array(offsets)
offsetx = dist * np.cos(rad)
offsety = dist * np.sin(rad)
Make a GeoDataFrame of the midpoints, offsets, and azimuths:
In [14]:
# Gather the midpoints and their per-pair attributes into one
# GeoDataFrame (key order preserved for column order).
midpoints = gpd.GeoDataFrame({
    'geometry': midpoint_list,
    'offset': offsets,
    'azimuth': azimuths,
    'offsetx': offsetx,
    'offsety': offsety,
})
midpoints.head()
Out[14]:
In [15]:
ax = midpoints.plot()
Save to a shapefile if desired.
In [16]:
#midpt.to_file('CMPs.shp')
In [17]:
midpoints[:5].offsetx # Easy!
Out[17]:
In [18]:
midpoints.ix[3].geometry.x # Less easy :(
Out[18]:
We need lists (or arrays) to pass into the matplotlib quiver plot. This takes four main parameters: x, y, u, and v, where x, y will be our coordinates, and u, v will be the offset vector for that midpoint.
We can get at the GeoDataFrame's attributes easily, but I can't see how to get at the coordinates in the geometry GeoSeries (seems like a user error — it feels like it should be really easy) so I am resorting to this:
In [19]:
# Pull coordinates straight off the geometry column — iterating the
# GeoSeries directly avoids the slow row-wise iterrows() pass and
# yields the same lists of floats.
x = [pt.x for pt in midpoints.geometry]
y = [pt.y for pt in midpoints.geometry]
In [20]:
# Offset-vector map: one headless arrow (a line) per midpoint,
# centred on the midpoint and scaled for legibility.
fig = plt.figure(figsize=(12, 8))
plt.quiver(x, y,
           midpoints.offsetx, midpoints.offsety,
           units='xy', width=0.5, scale=1/0.025,
           pivot='mid', headlength=0)
plt.axis('equal')
plt.show()
The bins are a new geometry, related to but separate from the survey itself, and the midpoints. We will model them as a GeoDataFrame of polygons. The steps are:
In [21]:
# Factor to shift the bins relative to source and receiver points
# Factor to shift the bins relative to source and receiver points
jig = survey.si / 4.
# Bin centres form a grid at half the station spacing, nudged off
# the station positions by `jig`.
bin_centres = gpd.GeoSeries([Point(survey.xmi + 0.5*r*survey.ri - jig, survey.ymi + 0.5*s*survey.si + jig)
                             for r in range(2*(survey.points_per_line[1]-1))
                             for s in range(2*(survey.points_per_line[0]-1))
                             ])
# Buffers are diamond shaped so we have to scale and rotate them.
# buffer(d, 1) with resolution 1 approximates the circle with a
# diamond; scaling by sin(pi/4)/2 sizes it so that after rotate(-45)
# it becomes a square bin aligned with the grid.
# NOTE(review): uses survey.ri for both dimensions — assumes si == ri;
# confirm for non-square bin grids.
scale_factor = np.sin(np.pi/4.)/2.
bin_polys = bin_centres.buffer(scale_factor*survey.ri, 1).rotate(-45)
bins = gpd.GeoDataFrame(geometry=bin_polys)
bins[:3]
Out[21]:
Suspect there's a super easy way to get all midpoints in a bin poly, without stepping over all bins.
WARNING This step is very slow for more than a few thousand midpoints.
In [22]:
# Make a copy because I'm going to drop points as I
# assign them to polys, to speed up subsequent search.
# Make a copy because I'm going to drop points as I
# assign them to polys, to speed up subsequent search.
midpts = midpoints.copy()

offsets, azimuths = [], []  # To hold complete list.

# Loop over bin polygons with index i.
for i, bin_i in bins.iterrows():
    o, a = [], []  # To hold list for this bin only.
    hits = []      # Row labels of midpoints captured by this bin.

    # Now loop over all midpoints with index j.
    for j, midpt_j in midpts.iterrows():
        if bin_i.geometry.contains(midpt_j.geometry):
            # Then it's a hit! Record it so we have less hunting
            # on later bins.
            o.append(midpt_j.offset)
            a.append(midpt_j.azimuth)
            hits.append(j)

    # Drop all this bin's hits in one go: dropping inside the inner
    # loop copies the whole frame per hit (and the iterator was a
    # snapshot anyway, so the result is identical).
    if hits:
        midpts = midpts.drop(hits)

    # Add the bin_i lists to the master list
    # and go around the outer loop again.
    offsets.append(o)
    azimuths.append(a)

# Add everything to the dataframe.
bins['offsets'] = gpd.GeoSeries(offsets)
bins['azimuths'] = gpd.GeoSeries(azimuths)
In [23]:
bins[:10]
Out[23]:
We can compute the fold from the length of the list of offsets in each bin. We use a mini-function, called a lambda, to do this. This piece of code applies a lambda to each row in the GeoDataFrame. Essentially it says:
set each row in the 'fold' column in my `bins` GeoDataFrame to the length of the offsets list for that row.
In [24]:
bins['fold'] = bins.apply(lambda row: len(row.offsets), axis=1)
Now we can use the GeoDataFrame's built-in plot() method to plot these:
In [25]:
ax = bins.plot(column="fold")
We can use a similar trick to compute the minimum offset, but with an added test for there being valid data in the bin:
In [26]:
bins['min_offset'] = bins.apply(lambda row: min(row.offsets) if row.fold > 0 else None, axis=1)
In [27]:
ax = bins.plot(column="min_offset")
In [ ]: