Pyflex is a Python port of the FLEXWIN algorithm for automatically selecting windows for seismic tomography. For the most part it mimicks the calculations of the original FLEXWIN package; minor differences and their reasoning are detailed later.
To give credit where credit is due, the original FLEXWIN program can be downloaded here, the corresponding publication is
Maggi, A., Tape, C., Chen, M., Chao, D., & Tromp, J. (2009). An automated time-window selection algorithm for seismic tomography. Geophysical Journal International, 178(1), 257–281 doi:10.1111/j.1365-246X.2009.04099.x
If you use Pyflex, please also cite the latest released version: http://dx.doi.org/10.5281/zenodo.31607
The source code for Pyflex lives on Github: https://github.com/krischer/pyflex. If you encounter any problems with it please open an issue or submit a pull request.
Pyflex utilizes ObsPy (and some of its dependencies) for the processing and data handling. As a first step, please follow the installation instructions of ObsPy for your given platform (we recommend the installation with Anaconda as it will most likely result in the least amount of problems). Pyflex should work with Python versions 2.7, 3.3, and 3.4 (mainly depends on the used ObsPy version). To install it, best use pip:
$ pip install pyflex
If you want the latest development version, or want to work on the code, you will have to install with the help of git.
$ git clone https://github.com/krischer/pyflex.git
$ cd pyflex
$ pip install -v -e .
To assure the installation is valid and everything works as expected, run the tests with
$ python -m pyflex.tests
The first step is to import ObsPy and Pyflex.
In [ ]:
%pylab inline
import obspy
import pyflex
Pyflex expects both observed and synthetic data to already be fully processed. An easy way to accomplish this is to utilize ObsPy. This example reproduces what the original FLEXWIN package does when it is told to also perform the filtering.
In [ ]:
obs_data = obspy.read("../src/pyflex/tests/data/1995.122.05.32.16.0000.II.ABKT.00.LHZ.D.SAC")
synth_data = obspy.read("../src/pyflex/tests/data/ABKT.II.LHZ.semd.sac")
obs_data.detrend("linear")
obs_data.taper(max_percentage=0.05, type="hann")
obs_data.filter("bandpass", freqmin=1.0 / 150.0, freqmax=1.0 / 50.0,
corners=4, zerophase=True)
synth_data.detrend("linear")
synth_data.taper(max_percentage=0.05, type="hann")
synth_data.filter("bandpass", freqmin=1.0 / 150.0, freqmax=1.0 / 50.0,
corners=4, zerophase=True)
In [ ]:
config = pyflex.Config(
min_period=50.0, max_period=150.0,
stalta_waterlevel=0.08, tshift_acceptance_level=15.0,
dlna_acceptance_level=1.0, cc_acceptance_level=0.80,
c_0=0.7, c_1=4.0, c_2=0.0, c_3a=1.0, c_3b=2.0, c_4a=3.0, c_4b=10.0)
In [ ]:
windows = pyflex.select_windows(obs_data, synth_data, config, plot=True)
In [ ]:
import pprint
pprint.pprint(windows[:3])
Each window contains a number of properties that can be used to calculate absolute and relative times for the specific window. left and right specify the boundary indices of a window, absolute_starttime and absolute_endtime are the absolute times of a window's bounds as ObsPy UTCDateTime objects, and relative_starttime and relative_endtime are the bounds of a window in seconds since the first sample of the data arrays (not necessarily identical to the event origin time!).
In [ ]:
win = windows[4]
print("Indices: %s - %s" % (win.left, win.right))
print("Absolute times: %s - %s" % (win.absolute_starttime, win.absolute_endtime))
print("Relative times in seconds: %s - %s" % (win.relative_starttime,
win.relative_endtime))
A window furthermore contains a list of phases theoretically arriving within the window. Take care that the times here are in seconds since the event origin.
In [ ]:
win.phase_arrivals
While Pyflex can also operate without any event and station information, it needs some information to work at full efficiency. In case the waveform traces originate from SAC files, they might contain the necessary information. Pyflex is able to extract that information but this should really be seen as a last resort.
The easiest way to pass the necessary information to Pyflex is to use its very bare bones event and station objects.
In [ ]:
event = pyflex.Event(latitude=-3.77, longitude=-77.07, depth_in_m=112800,
origin_time=obspy.UTCDateTime(1995, 5, 2, 6, 6, 13))
station = pyflex.Station(latitude=37.93, longitude=58.12)
windows = pyflex.select_windows(obs_data, synth_data, config,
event=event, station=station)
In [ ]:
windows = pyflex.select_windows(obs_data, synth_data, config,
windows_filename="windows.json")
The resulting JSON file will have a list of Window objects under the "windows" key.
In [ ]:
ws2 = pyflex.WindowSelector(obs_data, synth_data, config)
print("Windows before loading: %i" % len(ws2.windows))
ws2.load("windows.json")
print("Windows after loading: %i" % len(ws2.windows))
Pyflex largely follows the original FLEXWIN algorithm. The major differences are outlined here.
Pyflex will always find the leftmost index.min_surface_wave_velocity parameter which can be used to easily discard windows for late arriving surface/coda waves.Instead of using the overlap resolution strategy outlined in section 2.5 Stage E in the FLEXWIN paper, Pyflex utilizes weighted interval scheduling which is a classical IT problem. The weighted interval scheduling algorithm finds the best possible subset of non-overlapping windows by maximizing the cumulative weight of all chosen windows. By default the weight of each window is its length in terms of minimum period of the data times the maximum cross correlation coefficient. The weighting function can be overwritten with the help of the Config object. This results in similar windows to the original FLEXWIN algorithm but is easier to reason about.
Pyflex furthermore offers the option to merge all good candidate windows. This is useful for some misfit measurements.
In [ ]:
config = pyflex.Config(
min_period=50.0, max_period=150.0,
stalta_waterlevel=0.08, tshift_acceptance_level=15.0,
dlna_acceptance_level=1.0, cc_acceptance_level=0.80,
c_0=0.7, c_1=4.0, c_2=0.0, c_3a=1.0, c_3b=2.0, c_4a=3.0, c_4b=10.0,
resolution_strategy="merge")
windows = pyflex.select_windows(obs_data, synth_data, config, plot=True)
By default, Pyflex is fairly quiet and will only raise exceptions and warnings in case they occur. Pyflex utilizes Python's logging facilities so if you want more information you can hook into them. This approach is very flexible as it allows you to install custom logging handlers and channels.
In [ ]:
import logging
logger = logging.getLogger("pyflex")
logger.setLevel(logging.DEBUG)
windows = pyflex.select_windows(obs_data, synth_data, config)