Reproducible LMA research with the IPython notebook and brawl4d

This notebook demonstrates how to download and display data from the NSF-sponsored Deep Convective Clouds and Chemistry (DC3) campaign.

To download data, go to the DC3 data archive and choose one of the LMA datasets. This example will assume use of the LMA VHF source and flash data on June 4, 2012 from 2050-2100 UTC from West Texas, as retrieved using the DC3 data download form.

Note: a full hour of data is 960 MB, so caveat downloader. Try just the 2050-2100 UTC interval.

The code below is basic boilerplate: it selects the matplotlib backend and imports the brawl4d startup functions.

Note: the lasso tool has some issues under pylab. The matplotlib qt4 backend works, but it may take a couple of tries to launch the browser.


In [1]:
%matplotlib qt4
import matplotlib
#matplotlib.use('Qt4Agg')
#import matplotlib.pyplot as plt
from brawl4d.brawl4d import B4D_startup, redraw


/Users/Salinas/anaconda/lib/python2.7/site-packages/matplotlib/__init__.py:1155: UserWarning:  This call to matplotlib.use() has no effect
because the backend has already been chosen;
matplotlib.use() must be called *before* pylab, matplotlib.pyplot,
or matplotlib.backends is imported for the first time.

  warnings.warn(_use_error_msg)

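In the cell below, note that basedate must match the date of the dataset being loaded. The file used later in this notebook, LYLOUT_130606_033000_0600, is from June 6, 2013, so basedate is set to that date.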

If you are not using data from the WTLMA, then you'll also need to pass ctr_lon=value and ctr_lat=value to B4D_startup.


In [3]:
from datetime import datetime
panels = B4D_startup(basedate=datetime(2013,6,6), ctr_lat=33.5, ctr_lon=-101.5)


/Users/Salinas/anaconda/lib/python2.7/site-packages/matplotlib/font_manager.py:1236: UserWarning: findfont: Font family ['Helvetica'] not found. Falling back to Bitstream Vera Sans
  (prop.get_family(), self.defaultFamily[fontext]))

Below, set a valid path to lma_file. The IPython notebook will try to tab-complete paths.


In [4]:
from brawl4d.LMA.controller import LMAController
lma_file = '/Users/Salinas/code(Canopy)/LMA/LYLOUT_130606_033000_0600.dat.flash.h5'
lma_ctrl = LMAController()
d, post_filter_brancher, scatter_ctrl, charge_lasso = lma_ctrl.load_hdf5_to_panels(panels, lma_file)


found flash data

Zoom in on a few cells of interest. The smaller, western and northern cells here are anomalously electrified, while the larger cluster is normally electrified.


In [5]:
panels.panels['tz'].axis((3*3600 + 30*60, 3*3600 + 30*60+5, 1, 15))
panels.panels['xy'].axis((-100, 0, 0, 90))


Out[5]:
(-100, 0, 0, 90)
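
The time limits passed to the 'tz' panel above appear to be seconds counted from the basedate (3*3600 + 30*60 is 03:30:00 UTC). As a minimal sketch under that assumption, a small helper can turn a wall-clock time into those seconds. The function name seconds_since_basedate is hypothetical and is not part of brawl4d.

```python
from datetime import datetime

def seconds_since_basedate(when, basedate):
    """Return the seconds elapsed from basedate to when (naive UTC datetimes)."""
    return (when - basedate).total_seconds()

# Assumption: the time axis counts seconds from the basedate given to B4D_startup.
basedate = datetime(2013, 6, 6)
t_start = seconds_since_basedate(datetime(2013, 6, 6, 3, 30, 0), basedate)
t_end = t_start + 5  # a five-second window, as in the cell above

# (tmin, tmax, zmin, zmax), in the same order used for the 'tz' panel
tz_limits = (t_start, t_end, 1, 15)
print(tz_limits)
```

The resulting tuple could then be passed to panels.panels['tz'].axis(tz_limits) in place of the hand-computed arithmetic.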

The following cell brings together all of the Brawl4d functions, which are combined in a single .py file. This allows all tools to be activated simultaneously and provides a centralized container for all active widgets, for convenience.

LMA Tools Contained:

Number of Stations: Specify accordingly with data file (min=1; max=11)

Max Chi2: Values for chi2 obtained from the data file (min=0.0; max=1.0)

Charge Selection: Selects Negative (-1), Neutral (0), or Positive (1) charge for charge selection and analysis in the browser. The Draw button activates the lasso tool, enabling charge selection; the Draw button must be clicked again after each selection.

Color By: Allows the display of LMA data by chi2, time, or charge; selecting one will redraw the plot in the browser.

Animation Time: Animates the LMA data in the browser to support the charge polarity determination made with Charge Selection. Use the slider to set the total animation duration (min=1 s, max=30 s), then click Animate to play the animation.
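
To make the first two tools concrete, here is a rough sketch of the kind of filtering that a station-count and chi2 threshold imply. It assumes the station setting acts as a minimum and the chi2 setting as a maximum. The array is synthetic, the field names 'stations' and 'chi2' are assumptions for illustration, and filter_events is a hypothetical helper rather than a brawl4d function.

```python
import numpy as np

# Synthetic stand-in for an LMA event table. The field names 'stations' and
# 'chi2' are assumptions for illustration, not read from a real data file.
events = np.array(
    [(6, 0.5), (7, 1.2), (5, 0.3), (8, 0.9), (9, 4.0)],
    dtype=[('stations', 'i4'), ('chi2', 'f8')],
)

def filter_events(events, min_stations, max_chi2):
    """Keep events located by enough stations and with a small enough chi2."""
    good = (events['stations'] >= min_stations) & (events['chi2'] <= max_chi2)
    return events[good]

kept = filter_events(events, min_stations=6, max_chi2=1.0)
print(kept)
```

Only events that pass both thresholds survive, which is why tightening either setting thins out the points shown in the browser.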


In [6]:
from brawl4d.LMA.widgets import LMAwidgetController
from IPython.display import display
from brawl4d.LMA.controller import LMAController

lma_tools = LMAwidgetController(panels, lma_ctrl, scatter_ctrl, charge_lasso, d)
display(lma_tools.tools_popup)

Charge analysis

It's possible to use a lasso to classify charge regions inferred from LMA data. Set the polarity and run the code below to start the lasso. On the plot, left click to draw the lasso, and then right click to close the lasso and assign the charge.

If you're using an HDF5-format LMA data file, the analyzed charge is automatically written to the HDF5 file. The results of the operation can be queried by looking for the points that have had their charge set to the value defined above.


In [21]:
import numpy as np

# Select the points whose charge has been set to positive.
chg = d.data['charge']
wh = np.where(chg > 0)
print(d.data[wh]['time'])


[ 12611.22600217  12611.3120399   12611.30892055 ...,  13193.44423766
  13193.44017365  13193.45137127]

Color by...

In addition to coloring the scatter plots by time, it's possible to use other values in the LMA data array.


In [11]:
# A reference to the current data in the view is cached by the charge lasso.
current_data = charge_lasso.cache_segment.cache[-1]
# Manually set the color limits on the flash_id variable
scatter_ctrl.default_color_bounds.flash_id =(current_data['flash_id'].min(), current_data['flash_id'].max())
# Color by flash ID.
scatter_ctrl.color_field = 'flash_id'

redraw()

Flash statistics

If the LMA controller found flash data, then it's possible to get a live update of flashes in the current view. current_events_flashes is an analysis pipeline branchpoint, which sends events and flashes to any other analysis pipeline segment registered with current_events_flashes.targets.add(target). Behind the scenes, it's hooked up to a segment that receives the events and flashes and prints the average flash area of all flashes that have more than a threshold number of points.

Change the view a few times and you'll see updated flash stats below.


In [11]:
current_events_flashes = lma_ctrl.flash_stats_for_dataset(d, scatter_ctrl.branchpoint)
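
The branchpoint-and-target pattern described above can be illustrated with plain Python generators. This is only a sketch of the idea, not stormdrain's implementation: Branchpoint, average_area_receiver, and the 'n_points' and 'area' keys are hypothetical names, and the coroutine decorator here is a generic priming helper written for this example.

```python
import functools

def coroutine(func):
    """Prime a generator so it is ready to receive values via send()."""
    @functools.wraps(func)
    def start(*args, **kwargs):
        gen = func(*args, **kwargs)
        next(gen)  # advance to the first yield
        return gen
    return start

class Branchpoint(object):
    """Hypothetical stand-in: forwards each message to every target."""
    def __init__(self):
        self.targets = set()

    def send(self, message):
        for target in self.targets:
            target.send(message)

@coroutine
def average_area_receiver(results, min_points=10):
    """Receive (events, flashes) and record the mean area of large flashes."""
    while True:
        events, flashes = (yield)
        big = [f for f in flashes if f['n_points'] > min_points]
        if big:
            results.append(sum(f['area'] for f in big) / float(len(big)))
        else:
            results.append(float('nan'))

results = []
branch = Branchpoint()
branch.targets.add(average_area_receiver(results))

# Synthetic flashes: two exceed the 10-point threshold, one does not.
flashes = [{'n_points': 25, 'area': 40.0},
           {'n_points': 12, 'area': 20.0},
           {'n_points': 3, 'area': 5.0}]
branch.send(([], flashes))
print(results)
```

Each call to send plays the role of a view change: every registered target receives the current events and flashes and updates its own statistic.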

Flash volume for current sources


In [8]:
from math import factorial  # scipy.misc.factorial is gone from current SciPy

import numpy as np
from scipy.spatial import Delaunay

from stormdrain.pipeline import coroutine
class LMAEventStats(object):
    
    def __init__(self, GeoSys):
        """ GeoSys is an instance of
            stormdrain.support.coords.systems.GeographicSystem
        """
        self.GeoSys = GeoSys
    
    def ECEF_coords(self, lon, lat, alt):
        x,y,z = self.GeoSys.toECEF(lon, lat, alt)
        return x,y,z
    
    
    def _hull_volume(self):
        tri = Delaunay(self.xyzt[:,0:3])
        # tri.simplices replaces the deprecated tri.vertices attribute
        vertices = tri.points[tri.simplices]
        
        # This is the volume formula in 
        # https://github.com/scipy/scipy/blob/master/scipy/spatial/tests/test_qhull.py#L106
        # Except the formula needs to be divided by ndim! to get the volume, cf., 
        # http://en.wikipedia.org/wiki/Simplex#Geometric_properties
        # Credit Pauli Virtanen, Oct 14, 2012, scipy-user list
        q = vertices[:,:-1,:] - vertices[:,-1,None,:]
        simplex_volumes = (1.0 / factorial(q.shape[-1])) * np.fromiter(
                (np.linalg.det(q[k,:,:]) for k in range(tri.nsimplex)) , dtype=float)
        self.tri = tri
        
        # The simplex volumes can have negative values since they are oriented
        # (think surface normal direction for a triangle).
        self.volume=np.sum(np.abs(simplex_volumes))
        
    
    @coroutine
    def events_flashes_receiver(self):
        while True:
            evs, fls = (yield)
            x,y,z = self.ECEF_coords(evs['lon'], evs['lat'], evs['alt'])
            t = evs['time']
            self.xyzt = np.vstack((x,y,z,t)).T
            self._hull_volume()
            print("Volume of hull of points in current view is {0:5.1f}".format(
                        self.volume / 1.0e9)) # (1000 m)^3
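
As a small worked check of the simplex volume formula used in _hull_volume (determinant of the edge vectors divided by ndim!), the sketch below evaluates it for a single unit tetrahedron. The helper name simplex_volume is hypothetical. It also shows why the absolute value is needed: swapping two vertices flips the sign.

```python
import math
import numpy as np

def simplex_volume(vertices):
    """Signed volume of an n-simplex given as an (n+1, n) array of vertices."""
    vertices = np.asarray(vertices, dtype=float)
    q = vertices[:-1] - vertices[-1]  # edge vectors from the last vertex
    ndim = q.shape[-1]
    return np.linalg.det(q) / math.factorial(ndim)

# Unit tetrahedron: three unit axes plus the origin.
tetra = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (0, 0, 0)]
# Same tetrahedron with two vertices swapped, which reverses its orientation.
flipped = [tetra[1], tetra[0], tetra[2], tetra[3]]

v = simplex_volume(tetra)
v_flipped = simplex_volume(flipped)
print(v, v_flipped)
```

The two results have equal magnitude (one sixth) and opposite sign, so summing absolute values over all simplices gives the hull volume regardless of how each simplex happens to be oriented.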

In [8]:
stats = LMAEventStats(panels.cs.geoProj)
stat_maker = stats.events_flashes_receiver()


---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-8-55d009cce0e5> in <module>()
----> 1 stats = LMAEventStats(panels.cs.geoProj)
      2 stat_maker = stats.events_flashes_receiver()

NameError: name 'LMAEventStats' is not defined
0 of 0 flashes have > 10 points. Their average area =   nan km^2

In [9]:
current_events_flashes.targets.add(stat_maker)


---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-9-d23ded327a4a> in <module>()
----> 1 current_events_flashes.targets.add(stat_maker)

NameError: name 'stat_maker' is not defined

In [10]:
print(current_events_flashes.targets)


set([<generator object flash_stat_printer at 0x10d7a50a0>])
0 of 0 flashes have > 10 points. Their average area =   nan km^2

What if we take the points on the volumetric hull and decompose them with PCA? Would the envelope have a major axis that is more closely aligned with the plate area?
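
One way to explore that question is sketched below, using a synthetic point cloud rather than LMA data. It keeps only the convex-hull vertices and then takes the eigenvectors of their covariance matrix as the principal axes. The function name hull_principal_axes is hypothetical, and the convex hull stands in for whatever hull definition the final analysis would use.

```python
import numpy as np
from scipy.spatial import ConvexHull

def hull_principal_axes(points):
    """PCA of the convex-hull vertices of an (N, 3) point cloud.

    Returns (variances, axes) sorted from largest to smallest variance;
    axes[:, i] is the unit vector that goes with variances[i].
    """
    points = np.asarray(points, dtype=float)
    hull_pts = points[ConvexHull(points).vertices]
    centered = hull_pts - hull_pts.mean(axis=0)
    cov = np.dot(centered.T, centered) / len(hull_pts)
    variances, axes = np.linalg.eigh(cov)  # eigh returns ascending order
    order = np.argsort(variances)[::-1]
    return variances[order], axes[:, order]

# Synthetic 10 x 2 x 1 box, plus one interior point that the hull discards.
corners = [(x, y, z) for x in (0, 10) for y in (0, 2) for z in (0, 1)]
points = np.array(corners + [(5, 1, 0.5)], dtype=float)

variances, axes = hull_principal_axes(points)
major_axis = axes[:, 0]
print(variances, major_axis)
```

For this box the major axis lies along the long (x) side, up to sign. With real sources, the same axes could be compared against the orientation of the flash's horizontal extent.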

Further analysis

If your analysis is hard to explain, maybe it would help to include an equation with LaTeX: $$y=x^2$$

