Step 1: Notebook Setup

The cell below contains a number of helper functions used throughout this walkthrough. They are mainly wrappers around existing matplotlib functionality and are provided for the sake of simplicity in the steps to come.

Take a moment to read the description of each method so you understand what it does. You will use these "helper methods" throughout the rest of this notebook.

If you are familiar with matplotlib, feel free to alter the functions as you please.

TODOs

  1. Click in the cell below and run the cell.

In [ ]:
# TODO: Make sure you run this cell before continuing!

%matplotlib inline
import matplotlib.pyplot as plt

def show_plot(x_data, y_data, x_label, y_label):
    """
    Display a simple line plot.
    
    :param x_data: Numpy array containing data for the X axis
    :param y_data: Numpy array containing data for the Y axis
    :param x_label: Label applied to X axis
    :param y_label: Label applied to Y axis
    """
    plt.figure(figsize=(10,5), dpi=100)
    plt.plot(x_data, y_data, 'b-', marker='|', markersize=2.0, mfc='b')
    # Pass the on/off flag positionally: the `b` keyword was renamed in newer matplotlib releases
    plt.grid(True, which='major', color='k', linestyle='-')
    plt.xlabel(x_label)
    plt.ylabel(y_label)
    plt.show()
    
def plot_box(bbox):
    """
    Display a green bounding box on an image of the Blue Marble.
    
    :param bbox: Shapely Polygon that defines the bounding box to display
    """
    # Local imports: these are only needed by this helper
    from matplotlib.patches import Polygon
    from mpl_toolkits.basemap import Basemap

    min_lon, min_lat, max_lon, max_lat = bbox.bounds

    # Draw the Blue Marble background image on a default world map
    world_map = Basemap()
    world_map.bluemarble(scale=0.5)

    # A transparent fill with a green outline marks the bounding box
    corners = [(min_lon, min_lat), (min_lon, max_lat), (max_lon, max_lat), (max_lon, min_lat)]
    poly = Polygon(corners, facecolor=(0, 0, 0, 0.0), edgecolor='green', linewidth=2)
    plt.gca().add_patch(poly)
    plt.gcf().set_size_inches(10, 15)

    plt.show()
    
def show_plot_two_series(x_data_a, x_data_b, y_data_a, y_data_b, x_label, y_label_a, y_label_b, series_a_label, series_b_label):
    """
    Display a line plot of two series
    
    :param x_data_a: Numpy array containing data for the Series A X axis
    :param x_data_b: Numpy array containing data for the Series B X axis
    :param y_data_a: Numpy array containing data for the Series A Y axis
    :param y_data_b: Numpy array containing data for the Series B Y axis
    :param x_label: Label applied to X axis
    :param y_label_a: Label applied to Y axis for Series A
    :param y_label_b: Label applied to Y axis for Series B
    :param series_a_label: Name of Series A
    :param series_b_label: Name of Series B
    """
    fig, ax1 = plt.subplots(figsize=(10,5), dpi=100)
    series_a, = ax1.plot(x_data_a, y_data_a, 'b-', marker='|', markersize=2.0, mfc='b', label=series_a_label)
    ax1.set_ylabel(y_label_a, color='b')
    ax1.tick_params('y', colors='b')
    ax1.set_ylim(min(0, *y_data_a), max(y_data_a)+.1*max(y_data_a))
    ax1.set_xlabel(x_label)
    
    ax2 = ax1.twinx()
    series_b, = ax2.plot(x_data_b, y_data_b, 'r-', marker='|', markersize=2.0, mfc='r', label=series_b_label)
    ax2.set_ylabel(y_label_b, color='r')
    ax2.set_ylim(min(0, *y_data_b), max(y_data_b)+.1*max(y_data_b))
    ax2.tick_params('y', colors='r')
    
    # Pass the on/off flag positionally: the `b` keyword was renamed in newer matplotlib releases
    plt.grid(True, which='major', color='k', linestyle='-')
    plt.legend(handles=(series_a, series_b), bbox_to_anchor=(1.1, 1), loc=2, borderaxespad=0.)
    plt.show()

Step 2: List available Datasets

Now we can interact with NEXUS using the nexuscli python module. The nexuscli module has a number of useful methods that allow you to easily interact with the NEXUS webservice API. One of those methods is nexuscli.dataset_list which returns a list of Datasets in the system along with their start and end times.

Before the client can be used, it must be told where the NEXUS webservice is running. The nexuscli.set_target(url) method is used to target NEXUS. An instance of NEXUS is already running for you and is available at http://nexus-webapp:8083.

TODOs

  1. Import the nexuscli python module.
  2. Call nexuscli.dataset_list() and print the results

In [ ]:
# TODO: Import the nexuscli python module.


# Target the nexus webapp server
nexuscli.set_target("http://nexus-webapp:8083")

# TODO: Call nexuscli.dataset_list() and print the results

Step 3: Run a Time Series

Now that we can interact with NEXUS using the nexuscli python module, we would like to run a time series. To do this, we will use the nexuscli.time_series method. The signature for this method is described below:

nexuscli.time_series(datasets, bounding_box, start_datetime, end_datetime, spark=False)

Send a request to NEXUS to calculate a time series.
datasets Sequence (max length 2) of the name of the dataset(s)
bounding_box Bounding box for area of interest as a shapely.geometry.polygon.Polygon
start_datetime Start time as a datetime.datetime
end_datetime End time as a datetime.datetime
spark Optionally use spark. Default: False

return List of nexuscli.nexuscli.TimeSeries namedtuples
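
Because the call returns a list, even a single-dataset request has to be unpacked by indexing. The sketch below uses a stand-in namedtuple rather than a live NEXUS call. Only the `time` field is mentioned elsewhere in this notebook, so `dataset_name` and `mean` are hypothetical names chosen for illustration; inspect the real result (for example with `print(result[0]._fields)`) before relying on them.

```python
from collections import namedtuple
from datetime import datetime

# Stand-in for nexuscli.nexuscli.TimeSeries. Every field name other than
# `time` is an assumption made for this sketch.
FakeTimeSeries = namedtuple("FakeTimeSeries", ["dataset_name", "time", "mean"])

# A hand-built result shaped like "a list of namedtuples", one per dataset.
result = [
    FakeTimeSeries(
        dataset_name="AVHRR_OI_L4_GHRSST_NCEI",
        time=[datetime(2013, 1, 1), datetime(2013, 1, 2)],
        mean=[10.0, 10.5],  # toy values, not real data
    )
]

series = result[0]     # the first (and here only) dataset requested
print(series._fields)  # lists the field names that are available
print(len(series.time), len(series.mean))
```

With two datasets in the request, the same pattern would apply to `result[1]`.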

As you can see, there are a number of options available. Let's try investigating The Blob in the Pacific Ocean. The Blob is an abnormal warming of the sea surface temperature (SST) that was first observed in 2013.

Generate a time series for the AVHRR_OI_L4_GHRSST_NCEI SST dataset for the time period 2013-01-01 through 2014-03-01 and a bounding box -150, 40, -120, 55 (west, south, east, north).

TODOs

  1. Create the bounding box using shapely's box method
  2. Plot the bounding box using the plot_box helper method
  3. Generate the Time Series by calling the time_series method in the nexuscli module
    • Hint: datetime is already imported for you. You can create a datetime using the method datetime(int: year, int: month, int: day)
    • Hint: pass spark=True to the time_series function to speed up the computation
  4. Plot the result using the show_plot helper method
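
The corner order matters: the values above are given as west, south, east, north, which matches the `(min_lon, min_lat, max_lon, max_lat)` order that `plot_box` unpacks from `bbox.bounds`. Here is a minimal sketch in plain Python that sanity-checks the ordering and builds the two datetimes. It does not use shapely; `Bounds` and `check_bounds` are hypothetical helpers introduced only for this illustration.

```python
from collections import namedtuple
from datetime import datetime

# Hypothetical stand-in for the 4-tuple a shapely polygon exposes as .bounds
Bounds = namedtuple("Bounds", ["min_lon", "min_lat", "max_lon", "max_lat"])

def check_bounds(b):
    """Return True when the box lies on the globe and is ordered west < east, south < north."""
    lon_ok = -180 <= b.min_lon <= 180 and -180 <= b.max_lon <= 180
    lat_ok = -90 <= b.min_lat <= 90 and -90 <= b.max_lat <= 90
    return lon_ok and lat_ok and b.min_lon < b.max_lon and b.min_lat < b.max_lat

blob = Bounds(-150, 40, -120, 55)   # west, south, east, north
start_dt = datetime(2013, 1, 1)     # datetime(year, month, day)
end_dt = datetime(2014, 3, 1)

print(check_bounds(blob))                       # True: correctly ordered
print(check_bounds(Bounds(-120, 40, -150, 55))) # False: east and west swapped
print((end_dt - start_dt).days)                 # length of the requested period in days
```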

In [ ]:
import time
import nexuscli
from datetime import datetime

from shapely.geometry import box

# TODO: Create a bounding box using the box method imported above

# TODO: Plot the bounding box using the helper method plot_box

In [ ]:
# Do not modify this line ##
start = time.perf_counter()#
############################


# TODO: Call the time_series method for the AVHRR_OI_L4_GHRSST_NCEI dataset using 
# your bounding box and time period 2013-01-01 through 2014-03-01


# Enter your code above this line
print("Time Series took {} seconds to generate".format(time.perf_counter() - start))

In [ ]:
# TODO: Plot the result using the `show_plot` helper method

Step 3a: Run for a Longer Time Period

Now that you have successfully generated a time series for approximately one year of data, try generating a longer time series by increasing the end date to 2016-12-31. This will take a little longer to execute, since there is more data to analyze, but should finish in under a minute.

The significant increase in sea surface temperature due to the blob should be visible as an upward trend between 2013 and 2015 in this longer time series.

TODOs

  1. Generate a longer time series from 2013-01-01 to 2016-12-31. Make sure you pass spark=True to the time_series function to speed up the analysis
  2. Plot the result using the show_plot helper method

Advanced (Optional)

  1. For an extra challenge, try plotting the trend line.
    • Hint: the numpy and scipy packages are installed and can be used by importing them: import numpy or import scipy
    • Hint: you will need to convert the TimeSeries.time array to numbers in order to generate a polynomial fit line. matplotlib has a built-in function capable of doing this, matplotlib.dates.date2num, and its inverse, matplotlib.dates.num2date
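
To see why the dates have to become numbers first, here is a minimal sketch of a straight-line trend fit. It deliberately swaps in `date.toordinal()` and a hand-written least-squares fit in place of `matplotlib.dates.date2num` and numpy, so it runs with the standard library alone. The `linear_fit` helper and the toy values are assumptions made for illustration, not real SST data.

```python
from datetime import date, timedelta

def linear_fit(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Toy series: one value per day, rising by 0.01 per day (not real data).
days = [date(2013, 1, 1) + timedelta(days=i) for i in range(10)]
values = [12.0 + 0.01 * i for i in range(10)]

x = [d.toordinal() for d in days]             # dates -> plain day numbers
slope, intercept = linear_fit(x, values)
trend = [slope * xi + intercept for xi in x]  # fitted value for each date

print(round(slope, 4))  # change per day
```

The `trend` list lines up with `days`, so it can be drawn over the original series on the same axes.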

In [ ]:
import time
import nexuscli
from datetime import datetime

from shapely.geometry import box

bbox = box(-150, 40, -120, 55)
plot_box(bbox)

# Do not modify this line ##
start = time.perf_counter()#
############################

# TODO: Call the time_series method for the AVHRR_OI_L4_GHRSST_NCEI dataset using 
# your bounding box and time period 2013-01-01 through 2016-12-31
# Make sure you pass spark=True to the time_series function to speed up the analysis


# Enter your code above this line
print("Time Series took {} seconds to generate".format(time.perf_counter() - start))

In [ ]:
# TODO: Plot the result using the `show_plot` helper method

Step 4: Run two Time Series and plot them side-by-side

The time_series method can be used on up to two datasets at one time for comparison. Let's take a look at another region and see how to generate two time series and plot them side by side.

Hurricane Katrina passed to the southwest of Florida on Aug 27, 2005. The ocean response in a 1 x 1 degree region was captured by a number of satellites. The initial ocean response was an immediate cooling of the surface waters by 2 degrees Celsius that lingered for several days. The SST drop is correlated with both wind and precipitation data.

Reference: Xiaoming Liu, Menghua Wang, and Wei Shi (2009). "A study of a Hurricane Katrina–induced phytoplankton bloom using satellite observations and model simulations." Journal of Geophysical Research, Vol. 114, C03023, doi:10.1029/2008JC004934. http://shoni2.princeton.edu/ftp/lyo/journals/Ocean/phybiogeochem/Liu-etal-KatrinaChlBloom-JGR2009.pdf

Plot the time series for the AVHRR_OI_L4_GHRSST_NCEI SST dataset and the TRMM_3B42_daily Precipitation dataset for the region -84.5, 23.5, -83.5, 24.5 and time frame of 2005-08-24 through 2005-09-10. Plot the result using the show_plot_two_series helper method and see if you can recognize the correlation between the spike in precipitation and the decrease in temperature.
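
If you would like a number to go with the visual impression, one option is a Pearson correlation coefficient between the two series. The sketch below is a plain-Python illustration with made-up toy values, not Katrina observations, and the `pearson` helper is hypothetical. A single coefficient also ignores any time lag between rainfall and cooling, so treat it as a rough check only.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Toy values only: precipitation rises while temperature falls in lockstep.
precip = [0.0, 1.0, 2.0, 3.0]
sst = [30.0, 29.5, 29.0, 28.5]

r = pearson(precip, sst)
print(round(r, 3))  # close to -1 means "one goes up as the other goes down"
```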

TODOs

  1. Create a bounding box for the region in the Gulf of Mexico that Hurricane Katrina passed through (-84.5, 23.5, -83.5, 24.5)
  2. Plot the bounding box using the helper method plot_box
  3. Generate the Time Series by calling the time_series method in the nexuscli module
  4. Plot the result using the show_plot_two_series helper method

In [ ]:
import time
import nexuscli
from datetime import datetime

from shapely.geometry import box

# TODO: Create a bounding box using the box method imported above


# TODO: Plot the bounding box using the helper method plot_box

In [ ]:
# Do not modify this line ##
start = time.perf_counter()#
############################

# TODO: Call the time_series method for the AVHRR_OI_L4_GHRSST_NCEI dataset and the `TRMM_3B42_daily` dataset
# using your bounding box and time period 2005-08-24 through 2005-09-10




# Enter your code above this line
print("Time Series took {} seconds to generate".format(time.perf_counter() - start))

In [ ]:
# TODO: Plot the result using the `show_plot_two_series` helper method

Step 5: Run a Daily Difference Average (Anomaly) calculation

Let's return to The Blob region, but this time we're going to use a different calculation: Daily Difference Average (also known as an anomaly plot).

The Daily Difference Average algorithm compares a dataset against a climatological mean and produces a time series of the difference from that mean. Given The Blob region, we should expect to see a positive difference from the mean temperature in that region (indicating higher temperatures than normal) between 2013 and 2014.
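
The idea of "difference from a climatological mean" can be sketched in a few lines. This is not how NEXUS implements the algorithm, whose details are not described in this notebook. It is a toy version that assumes the climatology is the average of all reference values sharing the same calendar day, and the helper names and numbers are invented for illustration.

```python
from collections import defaultdict
from datetime import date

def climatology_by_day(dates, values):
    """Average the values that share the same (month, day) across years."""
    buckets = defaultdict(list)
    for d, v in zip(dates, values):
        buckets[(d.month, d.day)].append(v)
    return {key: sum(vs) / len(vs) for key, vs in buckets.items()}

def daily_difference(dates, values, climatology):
    """Subtract the climatological mean for each date's (month, day)."""
    return [v - climatology[(d.month, d.day)] for d, v in zip(dates, values)]

# Toy reference period (not real SST data): two years, two calendar days each.
ref_dates = [date(2001, 1, 1), date(2001, 1, 2), date(2002, 1, 1), date(2002, 1, 2)]
ref_values = [10.0, 11.0, 12.0, 13.0]
clim = climatology_by_day(ref_dates, ref_values)

# Toy observations to compare against the reference period.
obs_dates = [date(2013, 1, 1), date(2013, 1, 2)]
obs_values = [12.5, 12.0]

anomaly = daily_difference(obs_dates, obs_values, clim)
print(anomaly)  # [1.5, 0.0]: warmer than normal on Jan 1, exactly normal on Jan 2
```

A positive value means the observation is above the long-term average for that day, which is the signal you should look for in The Blob region.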

This time, using the nexuscli module, call the daily_difference_average method. The signature for that method is reprinted below:

nexuscli.daily_difference_average(dataset, bounding_box, start_datetime, end_datetime)

Generate an anomaly Time series for a given dataset, bounding box, and timeframe.
dataset Name of the dataset as a String
bounding_box Bounding box for area of interest as a shapely.geometry.polygon.Polygon
start_datetime Start time as a datetime.datetime
end_datetime End time as a datetime.datetime

return List of nexuscli.nexuscli.TimeSeries namedtuples

Generate an anomaly time series using the AVHRR_OI_L4_GHRSST_NCEI SST dataset for the time period 2013-01-01 through 2016-12-31 and a bounding box -150, 40, -120, 55 (west, south, east, north).

TODOs

  1. Generate the Anomaly Time Series by calling the daily_difference_average method in the nexuscli module
  2. Plot the result using the show_plot helper method

Advanced (Optional)

  1. Generate an Anomaly Time Series for the El Niño 3.4 region (bounding box -170, -5, -120, 5) from 2010 to 2015.

In [ ]:
import time
import nexuscli
from datetime import datetime

from shapely.geometry import box

bbox = box(-150, 40, -120, 55)
plot_box(bbox)

# Do not modify this line ##
start = time.perf_counter()#
############################


# TODO: Call the daily_difference_average method for the AVHRR_OI_L4_GHRSST_NCEI dataset using 
# your bounding box and time period 2013-01-01 through 2016-12-31. Be sure to pass spark=True as a parameter
# to speed up processing.






# Enter your code above this line
print("Daily Difference Average took {} seconds to generate".format(time.perf_counter() - start))

In [ ]:
# TODO: Plot the result using the `show_plot` helper method

Congratulations!

You have finished this workbook.

If others are still working, please feel free to modify the examples and play with the client module or go back and complete the "Advanced" challenges if you skipped them. Further technical information about NEXUS can be found in the GitHub repository.

If you would like to save this notebook for reference later, click on File -> Download as... and choose your preferred format.