See environment_setup.README (below) for instructions about the use of the DC3_plots_NALMA script. It is a version of the script used to process the DC3 dataset as in Barth et al. (2015, BAMS) and Bruning and Thomas (2015, JGR).

The flash sorting infrastructure is modular. This script uses the DBSCAN algorithm as implemented in the scikit-learn machine-learning library. To manage the $N^2$ scaling of the underlying DBSCAN implementation, data are clustered in pairs of chunks, each thresh_duration seconds long.
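The chunk pairing can be sketched as follows. This is a minimal illustration under stated assumptions, not the lmatools implementation: the helper name `paired_chunks` and the sample times are hypothetical. Sources are binned into windows of thresh_duration seconds, and each window is grouped with its successor, so a flash shorter than thresh_duration always falls wholly inside at least one pair.

```python
def paired_chunks(times, thresh_duration):
    """Group source indices into consecutive pairs of time chunks.

    Hypothetical helper: chunk k covers thresh_duration seconds, and each
    returned group holds the sources of chunk k followed by chunk k+1.
    """
    t0 = min(times)
    chunks = {}
    for i, t in enumerate(times):
        k = int((t - t0) // thresh_duration)
        chunks.setdefault(k, []).append(i)
    last = max(chunks)
    if last == 0:
        # Everything fits in a single chunk; nothing to pair.
        return [chunks[0]]
    return [chunks.get(k, []) + chunks.get(k + 1, []) for k in range(last)]

# Source times in seconds (made-up values).
source_times = [0.0, 0.5, 2.9, 3.1, 5.5, 6.2, 8.9]
pairs = paired_chunks(source_times, thresh_duration=3.0)
print(pairs)
```

Each group would then be passed to the clustering step separately, which keeps the pairwise-distance cost bounded by the number of sources in two chunks rather than in the whole file.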

The script is configurable in a few places.

  • base_sort_dir sets the path where
  • center_ID chooses a network center. The centers are defined in the centers dictionary. The ID is used later when constructing output filenames, too.
  • The params dictionary configures the flash sorting algorithm. Of particular importance are the following.
    • stations: sets the (min, max) number of stations that must participate in each solution for it to count. Max should be larger than the number of stations. Min should be six or seven, depending on the number of stations.
    • chi2: sets the (min, max) chi-squared value. The minimum should be zero, while a good maximum to start with is 1.0.
    • distance: maximum distance between a source and its closest neighbor before a new flash is started
    • thresh_critical_time: maximum temporal separation between a source and its closest neighbor before a new flash is started
    • thresh_duration: All flashes should last less than or equal to this number of seconds. All flashes of duration < thresh_duration are guaranteed to remain clustered. An occasional lucky flash of duration = 2 * thresh_duration is possible.
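As a sketch of how these settings fit together, the dictionary below uses the key names from the list above with illustrative values. The distance and thresh_critical_time values match those that appear in the output directory name used later in this notebook (dist-3000.0, thresh-0.15); the station limits, the thresh_duration value, the units, and the helper `passes_quality` are assumptions made for the example, as is the choice of inclusive bounds.

```python
# Illustrative values only; key names follow the list above.
params = {
    'stations': (6, 13),           # (min, max) stations per solution
    'chi2': (0, 1.0),              # (min, max) chi-squared
    'distance': 3000.0,            # assumed meters
    'thresh_critical_time': 0.15,  # assumed seconds
    'thresh_duration': 3.0,        # assumed seconds
}

def passes_quality(n_stations, chi2, params):
    """Return True if a source meets the station-count and chi2 limits."""
    min_stn, max_stn = params['stations']
    min_chi2, max_chi2 = params['chi2']
    return (min_stn <= n_stations <= max_stn) and (min_chi2 <= chi2 <= max_chi2)

# (number of stations, chi-squared) for three made-up sources.
sources = [(7, 0.4), (5, 0.2), (8, 2.5)]
kept = [s for s in sources if passes_quality(s[0], s[1], params)]
print(kept)
```

Only the first source survives: the second has too few stations and the third exceeds the chi-squared maximum.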

The script is broken into three sections.

  • Run the flash sorting, which creates HDF5 data files with VHF source data, their flash IDs, and a matching flash data table.
  • Grab the flash-sorted files and create CF-compliant NetCDF grids
  • Grab the grids and create PDF images of each grid

The grid spacing, boundaries, and frame intervals are configured at the beginning of the gridding section of the script. This script creates regularly-spaced lat/lon grids, with the center grid cell size calculated to match the specified dx_km and dy_km. It is also possible to grid directly in a map projection of choice by changing proj_name, as well as x_name and y_name in the call to make_plot. For instance, a geostationary projection can be obtained with proj='geos' as described in the documentation for the proj4 coordinate system library.
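The relationship between dx_km, dy_km and the lat/lon spacing at the grid center can be approximated as below. This is a rough sketch using a spherical Earth, not the calculation the script itself performs; the function name `degree_spacing` and the center latitude are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean spherical radius; an approximation

def degree_spacing(center_lat, dx_km, dy_km):
    """Return (dlon, dlat) in degrees matching dx_km, dy_km at center_lat."""
    km_per_deg_lat = math.pi * EARTH_RADIUS_KM / 180.0
    # Meridians converge toward the poles, so a degree of longitude shrinks.
    km_per_deg_lon = km_per_deg_lat * math.cos(math.radians(center_lat))
    return dx_km / km_per_deg_lon, dy_km / km_per_deg_lat

# Hypothetical center latitude, roughly that of northern Alabama.
dlon, dlat = degree_spacing(center_lat=34.7, dx_km=1.0, dy_km=1.0)
print(round(dlon, 5), round(dlat, 5))
```

Because the longitude spacing depends on latitude, cells away from the center of a regular lat/lon grid are not exactly dx_km wide, which is one reason to grid in a map projection instead.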

The PDF images are created as small-multiple plots, with the number of columns given by n_cols at the beginning of the plotting section.
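For a given n_cols, the layout of a small-multiple page follows from simple arithmetic. The sketch below is illustrative only; `panel_positions` is a hypothetical helper, not a function from the script.

```python
import math

def panel_positions(n_frames, n_cols):
    """Return the row count and the (row, col) slot of each frame."""
    n_rows = math.ceil(n_frames / n_cols)
    slots = [(i // n_cols, i % n_cols) for i in range(n_frames)]
    return n_rows, slots

# Ten frames laid out four to a row.
n_rows, slots = panel_positions(n_frames=10, n_cols=4)
print(n_rows, slots[-1])
```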

An example of reading and working with the resulting data files is found in the "Reading the flash-sorted files.ipynb" notebook.

As described below, additional scripts perform follow-on analysis.

  • Assigning NLDN strokes to the best-matching flash
  • Using a storm cell or storm region polygon to subset some flashes from the data files.
    • Creating time series plots of moments of the flash size distribution
    • Creating ASCII files of flash size and rate statistics

The IOP bounding box file included here is a rectangular lat/lon box, but the underlying code works with arbitrary polygons. Adapting the existing code to polygons is mostly a matter of reading in the polygon vertices and passing them in place of the rectangle's vertices.
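To illustrate why a rectangle is just a special case, here is a generic ray-casting point-in-polygon test. It is a stand-alone sketch, not the routine used by the analysis scripts; the function name and all coordinates are made up for the example.

```python
def point_in_polygon(lon, lat, vertices):
    """Ray-casting test; vertices is a list of (lon, lat) tuples."""
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that line.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside

# A rectangular box and a triangle, both given as vertex lists.
box = [(-87.5, 34.0), (-86.5, 34.0), (-86.5, 35.0), (-87.5, 35.0)]
triangle = [(-87.5, 34.0), (-86.5, 34.0), (-87.0, 35.0)]
flashes = [(-87.0, 34.5), (-86.6, 34.9), (-88.0, 34.5)]

in_box = [point_in_polygon(x, y, box) for x, y in flashes]
in_tri = [point_in_polygon(x, y, triangle) for x, y in flashes]
print(in_box, in_tri)
```

The same function handles both shapes; only the vertex list changes.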


In [3]:
%%bash

cat /data/GLM-wkshp/flashsort/environment_setup.README


8 September 2015 
Eric Bruning eric.bruning@ttu.edu

This document provides details about running a flash-sorting analysis, including
producing the flash time series statistics. These scripts are largely those used
for processing the DC3 dataset at TTU, and are suitable for (re)processing a large number of cases.

Python setup
------------

This analysis is run using the Anaconda Python distribution, with the primary
needs being numpy, scipy, matplotlib, pupynere, and pyproj.

After installing anaconda to your home directory, go to your home directory,
without anaconda already in $PATH

./anaconda/bin/conda create -n LMA --clone root

cd anaconda/envs/LMA/
source ./anaconda/bin/activate LMA

This sets up a Python environment with just the necessary pieces for this LMA analysis.

then pip install git+http://github.com/deeplycloudy/lmatools
git+http://github.com/deeplycloudy/stormdrain

If you have to pause and return later then simply cd home and

source ./anaconda/bin/activate LMA

Running the analysis
--------------------

Finally, change to the directory where you have the analysis scripts. It should
contain 'figures-length' and 'results' directories into which output will be placed.

First sort flashes to HDF5 files and create 1 km, 1 min grids:

python DC3_plots_NALMA.py /data/GLM-wkshp/20090410/LMA_files/*.dat.gz

You will need to edit the path "base_sort_dir" near the top of the script.

To add in NLDN data, run, for each H5 file,

python NLDN_matching.py ../20090410/NLDN_files/Nstroke20090410_daily_v1_lit.raw  results/h5_files/2009/Apr/10/LYLOUT_090410_160000_3600.dat.flash.h5

Run the following, and capture the output to output.txt to get the flash size
time series stats, along with other run info 

You will need to edit the "path_to_sort_results" to point to the 'results' directory.

python DC3-IOP-hires-stats.py IOPsupercell18-AL-20090410-boundingbox.txt > figures-length/IOPsupercell18-output.txt

The energy-stats.py script is used by DC3-IOP-stats.py; you shouldn't run it
directly.

harvest_flash_timeseries.py will produce a CSV file of flash timeseries data from
the captured output of the main script. 

python harvest_flash_timeseries.py figures-length/IOPsupercell18-output.txt 

make_movies.py can be used with the ffmpeg utility to stitch together the grid images into movies.

In [12]:
# Links to representative PDFs.
from IPython.display import display, HTML, Image
class PDF(object):
    def __init__(self, filename):
        self.filename = filename

    def _repr_pdf_(self):
        # Return the raw PDF bytes so the notebook can embed the file.
        with open(self.filename, 'rb') as f:
            return f.read()

In [16]:
base_path = '/data/GLM-wkshp/flashsort/figures-length/IOPsupercell18-AL-20090410-boundingbox-thresh-0.15_dist-3000.0_pts-10/'

Case-specific time series data

Locations of all flashes contributing to the analyses

Lets you confirm that only one cell was tracked.


In [18]:
Image(base_path+'flashes.png')


Out[18]:

Time-height plot, separated by IC and CG.

There are accompanying ASCII data files with the raw data. Uses the technique in Bruning and Thomas (2015, JGR).


In [10]:
PDF(base_path+'D-1.7_b-0.25_length-profiles_CG.pdf')


Out[10]:
<__main__.PDF at 0x106d2ed90>

In [11]:
PDF(base_path+'D-1.7_b-0.25_length-profiles_IC.pdf')


Out[11]:
<__main__.PDF at 0x106d3da90>

Time series of flash moments

There are accompanying ASCII data files with the raw data. Uses the technique in Bruning and Thomas (2015, JGR).


In [24]:
PDF(base_path+'moment-energy-timeseries.pdf')


Out[24]:
<__main__.PDF at 0x1057a6190>

In [25]:
import pandas as pd
pd.read_csv(base_path+'../IOPsupercell18-output.flash_stats.csv')


Out[25]:
start_isoformat end_isoformat number mean variance skewness kurtosis energy energy/number
0 2009-04-10T18:00:00 2009-04-10T18:01:00 19 5.778776 19.602975 0.847279 -0.977476 1006.947327 52.997228
1 2009-04-10T18:01:00 2009-04-10T18:02:00 21 3.732302 4.780171 1.495719 1.565106 392.915283 18.710252
2 2009-04-10T18:02:00 2009-04-10T18:03:00 26 4.070445 5.901269 2.002315 3.638201 584.214661 22.469795
3 2009-04-10T18:03:00 2009-04-10T18:04:00 25 3.737550 8.718759 2.769132 8.407045 567.200989 22.688040
4 2009-04-10T18:04:00 2009-04-10T18:05:00 22 4.383382 12.730396 2.383367 5.815393 702.777527 31.944433
5 2009-04-10T18:05:00 2009-04-10T18:06:00 19 4.886418 8.262498 1.022592 -0.128251 610.651917 32.139575
6 2009-04-10T18:06:00 2009-04-10T18:07:00 19 5.011994 12.603062 2.047743 5.020269 716.739807 37.723148
7 2009-04-10T18:07:00 2009-04-10T18:08:00 15 4.261480 2.880228 0.124338 -0.928883 315.606598 21.040440
8 2009-04-10T18:08:00 2009-04-10T18:09:00 28 4.725330 11.574020 1.693249 2.865010 949.277344 33.902762
9 2009-04-10T18:09:00 2009-04-10T18:10:00 26 4.236659 6.982160 1.024067 -0.114266 648.217529 24.931443
10 2009-04-10T18:10:00 2009-04-10T18:11:00 28 4.038956 13.608267 2.507203 6.034024 837.800171 29.921435
11 2009-04-10T18:11:00 2009-04-10T18:12:00 28 4.131995 12.547647 3.010819 10.314295 829.388916 29.621033
12 2009-04-10T18:12:00 2009-04-10T18:13:00 21 5.146380 17.641989 2.240353 5.478564 926.671631 44.127221
13 2009-04-10T18:13:00 2009-04-10T18:14:00 36 4.041950 15.148875 3.017169 9.522197 1133.504395 31.486233
14 2009-04-10T18:14:00 2009-04-10T18:15:00 33 3.932558 8.712317 1.726340 2.597758 797.851868 24.177329
15 2009-04-10T18:15:00 2009-04-10T18:16:00 34 5.013666 22.268609 2.104787 3.598519 1611.785522 47.405457
16 2009-04-10T18:16:00 2009-04-10T18:17:00 32 3.851355 10.630044 2.326626 6.063691 814.815369 25.462980
17 2009-04-10T18:17:00 2009-04-10T18:18:00 39 4.602452 13.925999 1.845624 3.011223 1369.233887 35.108561
18 2009-04-10T18:18:00 2009-04-10T18:19:00 44 3.846760 12.046536 3.815344 17.847013 1181.140503 26.844102
19 2009-04-10T18:19:00 2009-04-10T18:20:00 51 4.277573 12.603373 1.928203 3.054272 1575.951050 30.901001
20 2009-04-10T18:20:00 2009-04-10T18:21:00 46 4.690286 10.250978 1.513517 1.909096 1483.489014 32.249761
21 2009-04-10T18:21:00 2009-04-10T18:22:00 42 4.568916 15.905318 2.291886 5.866569 1544.773071 36.780311
22 2009-04-10T18:22:00 2009-04-10T18:23:00 45 3.634261 11.907280 2.656233 7.636967 1130.180908 25.115131
23 2009-04-10T18:23:00 2009-04-10T18:24:00 42 3.642177 8.435968 2.180650 4.931750 911.459595 21.701419
24 2009-04-10T18:24:00 2009-04-10T18:25:00 31 5.012799 18.930034 1.640178 1.677502 1365.803711 44.058184
25 2009-04-10T18:25:00 2009-04-10T18:26:00 36 4.290002 15.950208 2.044512 3.932517 1236.755615 34.354323
26 2009-04-10T18:26:00 2009-04-10T18:27:00 50 3.774712 11.242299 2.132787 4.651355 1274.537354 25.490747
27 2009-04-10T18:27:00 2009-04-10T18:28:00 51 4.105699 18.869615 2.748570 9.173155 1822.045166 35.726376
28 2009-04-10T18:28:00 2009-04-10T18:29:00 49 4.077870 17.021636 2.395364 5.255782 1648.882324 33.650660
29 2009-04-10T18:29:00 2009-04-10T18:30:00 48 3.756010 9.326992 1.933377 3.911453 1124.860840 23.434601
30 2009-04-10T18:30:00 2009-04-10T18:31:00 48 3.929396 13.741356 2.894733 9.864712 1400.712402 29.181508
31 2009-04-10T18:31:00 2009-04-10T18:32:00 61 3.675300 8.843729 2.602790 8.222217 1363.445068 22.351558
32 2009-04-10T18:32:00 2009-04-10T18:33:00 50 3.120657 5.313263 3.471133 14.914728 752.588074 15.051761
33 2009-04-10T18:33:00 2009-04-10T18:34:00 51 4.077593 14.677958 2.047971 3.502394 1596.540894 31.304723
34 2009-04-10T18:34:00 2009-04-10T18:35:00 40 4.640823 12.651113 1.481433 1.421187 1367.534180 34.188354
35 2009-04-10T18:35:00 2009-04-10T18:36:00 51 4.886946 12.803767 1.300132 1.407060 1870.986206 36.686004
36 2009-04-10T18:36:00 2009-04-10T18:37:00 61 5.009959 13.559101 1.175996 0.507141 2358.186035 38.658787
37 2009-04-10T18:37:00 2009-04-10T18:38:00 48 5.058263 15.166728 1.353694 1.741285 1956.132324 40.752757
38 2009-04-10T18:38:00 2009-04-10T18:39:00 58 4.979229 30.683631 2.756521 9.678248 3217.628174 55.476348
39 2009-04-10T18:39:00 2009-04-10T18:40:00 52 5.654303 16.479805 0.972142 0.343621 2519.449463 48.450951
40 2009-04-10T18:40:00 2009-04-10T18:41:00 56 5.952006 11.879012 1.066857 0.784598 2649.101807 47.305389
41 2009-04-10T18:41:00 2009-04-10T18:42:00 46 5.881254 15.678550 1.249828 1.986071 2312.314209 50.267700
42 2009-04-10T18:42:00 2009-04-10T18:43:00 54 5.352245 10.945761 1.531569 2.468816 2137.983643 39.592290
43 2009-04-10T18:43:00 2009-04-10T18:44:00 66 5.260994 23.117295 2.652185 9.557940 3352.493408 50.795355
44 2009-04-10T18:44:00 2009-04-10T18:45:00 57 5.323646 15.138295 1.194806 0.790870 2478.331543 43.479501
45 2009-04-10T18:45:00 2009-04-10T18:46:00 67 5.316706 12.744240 1.345636 2.210861 2747.777344 41.011602
46 2009-04-10T18:46:00 2009-04-10T18:47:00 69 5.565363 14.887528 1.371537 2.305345 3164.395020 45.860797
47 2009-04-10T18:47:00 2009-04-10T18:48:00 61 5.583078 21.691541 1.143122 0.425785 3224.600586 52.862305
48 2009-04-10T18:48:00 2009-04-10T18:49:00 68 5.952306 30.067031 2.971625 13.383453 4453.794434 65.496977
49 2009-04-10T18:49:00 2009-04-10T18:50:00 64 5.911775 17.574892 1.024505 0.383359 3361.534180 52.523972
50 2009-04-10T18:50:00 2009-04-10T18:51:00 59 5.773176 21.951206 2.179734 7.810302 3261.565430 55.280770
51 2009-04-10T18:51:00 2009-04-10T18:52:00 63 5.351227 14.970364 1.239599 1.564534 2747.177490 43.605992
52 2009-04-10T18:52:00 2009-04-10T18:53:00 70 5.484482 20.095076 1.272950 1.684794 3512.223145 50.174616
53 2009-04-10T18:53:00 2009-04-10T18:54:00 78 4.678094 14.966130 1.537602 2.260870 2874.353760 36.850689
54 2009-04-10T18:54:00 2009-04-10T18:55:00 107 4.003260 14.776357 2.111797 4.827927 3295.861816 30.802447
55 2009-04-10T18:55:00 2009-04-10T18:56:00 119 4.048839 12.981428 2.262253 6.949541 3495.568115 29.374522
56 2009-04-10T18:56:00 2009-04-10T18:57:00 126 3.819293 13.559522 2.355536 7.032421 3546.461426 28.146519
57 2009-04-10T18:57:00 2009-04-10T18:58:00 146 3.520225 10.412113 2.413398 7.635967 3329.398438 22.804099
58 2009-04-10T18:58:00 2009-04-10T18:59:00 163 3.094247 9.591151 2.621128 8.811767 3123.978516 19.165512
59 2009-04-10T18:59:00 2009-04-10T19:00:00 156 3.273624 9.075291 2.918384 13.095815 3087.537109 19.791905
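The energy/number column appears to be the energy column divided by the flash count. A quick check using the first three rows of the table above, re-typed here as comma-separated text (the comma-delimited layout is an assumption based on the default pd.read_csv call):

```python
import csv
import io

# First three rows of the table above, re-typed as CSV text.
sample = """start_isoformat,end_isoformat,number,mean,variance,skewness,kurtosis,energy,energy/number
2009-04-10T18:00:00,2009-04-10T18:01:00,19,5.778776,19.602975,0.847279,-0.977476,1006.947327,52.997228
2009-04-10T18:01:00,2009-04-10T18:02:00,21,3.732302,4.780171,1.495719,1.565106,392.915283,18.710252
2009-04-10T18:02:00,2009-04-10T18:03:00,26,4.070445,5.901269,2.002315,3.638201,584.214661,22.469795
"""

rows = list(csv.DictReader(io.StringIO(sample)))

# Difference between the recomputed ratio and the stored column.
diffs = [
    abs(float(r['energy']) / int(r['number']) - float(r['energy/number']))
    for r in rows
]
print(max(diffs))
```

The differences are at the level of the printed precision, which is consistent with energy/number being the mean energy per flash in each one-minute interval.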

Plots of each minute in the gridded data

LMA source density, flash extent density, flash initiation density, and average flash area.


In [23]:
Image(base_path+'/grids_lma_source/lma_source_20090410_185300.png')


Out[23]:

In [20]:
Image(base_path+'/grids_flash_extent/flash_extent_20090410_185300.png')


Out[20]:

In [21]:
Image(base_path+'/grids_flash_initiation/flash_initiation_20090410_185300.png')


Out[21]:

In [22]:
Image(base_path+'/grids_flash_footprint/flash_footprint_20090410_185300.png')


Out[22]:

Each of the grid type folders contains a CSV file with statistics of the pixels making up the image.


In [29]:
pd.read_csv(base_path+'/grids_flash_extent/flash_extent_20090410.csv')


Out[29]:
time (ISO) max count per grid box sum of all grid boxes 5th percentile 50th percentile 95th percentile
0 2009-04-10T18:00:00 5 641 1 1 3.0
1 2009-04-10T18:01:00 5 322 1 1 3.0
2 2009-04-10T18:02:00 7 513 1 1 4.0
3 2009-04-10T18:03:00 7 457 1 1 4.0
4 2009-04-10T18:04:00 5 580 1 1 3.0
5 2009-04-10T18:05:00 5 460 1 1 3.0
6 2009-04-10T18:06:00 5 430 1 1 3.0
7 2009-04-10T18:07:00 5 302 1 1 3.0
8 2009-04-10T18:08:00 7 631 1 1 4.0
9 2009-04-10T18:09:00 8 488 1 1 4.0
10 2009-04-10T18:10:00 9 616 1 1 5.0
11 2009-04-10T18:11:00 8 660 1 1 4.0
12 2009-04-10T18:12:00 7 643 1 1 3.0
13 2009-04-10T18:13:00 8 815 1 1 5.0
14 2009-04-10T18:14:00 8 637 1 1 5.0
15 2009-04-10T18:15:00 9 1071 1 1 4.7
16 2009-04-10T18:16:00 9 641 1 1 5.0
17 2009-04-10T18:17:00 8 1045 1 1 5.0
18 2009-04-10T18:18:00 10 938 1 1 6.0
19 2009-04-10T18:19:00 12 1223 1 1 6.0
20 2009-04-10T18:20:00 10 1083 1 1 6.0
21 2009-04-10T18:21:00 9 1230 1 1 5.0
22 2009-04-10T18:22:00 10 924 1 1 5.0
23 2009-04-10T18:23:00 10 799 1 1 7.0
24 2009-04-10T18:24:00 7 946 1 1 4.0
25 2009-04-10T18:25:00 9 953 1 1 4.7
26 2009-04-10T18:26:00 12 1004 1 1 8.0
27 2009-04-10T18:27:00 19 1161 1 1 8.0
28 2009-04-10T18:28:00 12 1102 1 1 5.0
29 2009-04-10T18:29:00 16 948 1 2 9.0
... ... ... ... ... ... ...
90 2009-04-10 19:30:00 0 0 0 0 0.0
91 2009-04-10 19:31:00 0 0 0 0 0.0
92 2009-04-10 19:32:00 0 0 0 0 0.0
93 2009-04-10 19:33:00 0 0 0 0 0.0
94 2009-04-10 19:34:00 0 0 0 0 0.0
95 2009-04-10 19:35:00 0 0 0 0 0.0
96 2009-04-10 19:36:00 0 0 0 0 0.0
97 2009-04-10 19:37:00 0 0 0 0 0.0
98 2009-04-10 19:38:00 0 0 0 0 0.0
99 2009-04-10 19:39:00 0 0 0 0 0.0
100 2009-04-10 19:40:00 0 0 0 0 0.0
101 2009-04-10 19:41:00 0 0 0 0 0.0
102 2009-04-10 19:42:00 0 0 0 0 0.0
103 2009-04-10 19:43:00 0 0 0 0 0.0
104 2009-04-10 19:44:00 0 0 0 0 0.0
105 2009-04-10 19:45:00 0 0 0 0 0.0
106 2009-04-10 19:46:00 0 0 0 0 0.0
107 2009-04-10 19:47:00 0 0 0 0 0.0
108 2009-04-10 19:48:00 0 0 0 0 0.0
109 2009-04-10 19:49:00 0 0 0 0 0.0
110 2009-04-10 19:50:00 0 0 0 0 0.0
111 2009-04-10 19:51:00 0 0 0 0 0.0
112 2009-04-10 19:52:00 0 0 0 0 0.0
113 2009-04-10 19:53:00 0 0 0 0 0.0
114 2009-04-10 19:54:00 0 0 0 0 0.0
115 2009-04-10 19:55:00 0 0 0 0 0.0
116 2009-04-10 19:56:00 0 0 0 0 0.0
117 2009-04-10 19:57:00 0 0 0 0 0.0
118 2009-04-10 19:58:00 0 0 0 0 0.0
119 2009-04-10 19:59:00 0 0 0 0 0.0

120 rows × 6 columns

Flash energy spectra

Plots of the flash energy spectra as defined in Bruning and MacGorman (2013, JAS). A 5/3 power law reference line is plotted.
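A power-law reference line of this kind can be generated as below. This is a generic sketch, not the plotting code behind the PDF: the function name, the anchor point, and the sample widths are hypothetical, and the sign of the exponent depends on whether the abscissa is a length scale or a wavenumber, so it is left as a parameter.

```python
def power_law_reference(x, x0, y0, exponent=5.0 / 3.0):
    """Value at x of a power law passing through (x0, y0)."""
    return y0 * (x / x0) ** exponent

# Made-up flash widths in km, with the line anchored at (1 km, 1.0).
widths_km = [1.0, 2.0, 4.0, 8.0]
reference = [power_law_reference(w, x0=1.0, y0=1.0) for w in widths_km]
print(reference)
```

On log-log axes these points fall on a straight line whose slope equals the exponent, which makes it easy to compare against the slope of the observed spectrum.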


In [32]:
PDF(base_path+'LYLOUT_090410_180000_3600-energy.pdf')


Out[32]:
<__main__.PDF at 0x109b4a150>
