Create NERC/National figures

This notebook is where all national/NERC figures (not maps) are created. It contains code to import generation and CO₂ intensity results and uses some functions from scr to create figures.

Instructions

The first part of this notebook imports and combines data. Subsequent sections generate different types of plots. Summary statistics of CO₂ intensity are available at the bottom of this notebook.


In [1]:
%matplotlib inline
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
import matplotlib.lines as mlines
from matplotlib.ticker import StrMethodFormatter
import pandas as pd
import seaborn as sns
import numpy as np
import os
from os.path import join
from statsmodels.tsa.tsatools import detrend
# import statsmodels.formula.api as sm
import itertools
import sys
import calendar
import datetime
# sns.set()
idx = pd.IndexSlice

try:
    from pathlib import Path
except:
    pass

Date string for filenames

This will be inserted into all filenames (reading and writing)


In [2]:
file_date = '2018-03-06'

In [3]:
# Load the "autoreload" extension
%load_ext autoreload

# always reload modules marked with "%aimport"
%autoreload 1

In [4]:
# add the 'src' directory as one where we can import modules
src_dir = join(os.getcwd(), os.pardir, 'src')
sys.path.append(src_dir)

In [5]:
%aimport Analysis.index
from Analysis.index import facility_co2, adjust_epa_emissions, group_fuel_cats
from Analysis.index import reduce_emission_factors, add_datetime
%aimport util.utils
from util.utils import rename_cols

%aimport Plots.plot
from Plots.plot import region_facet_grid, rolling_corr_plot, monthly_fuel_gen

Load CO₂ intensity and percent gen data

CO₂ intensity (aka index)


In [6]:
cwd = os.getcwd()
path = join(cwd, '..', 'Data storage', 'National data',
            'National index {}.csv'.format(file_date))
national_index = pd.read_csv(path, parse_dates=['datetime'])
national_index['nerc'] = 'USA'
rename_cols(national_index)

In [7]:
national_index.head()


Out[7]:
year month generation (mwh) elec fuel co2 (kg) final co2 (kg) index (g/kwh) datetime quarter index (lb/mwh) change since 2005 nerc
0 2001 1 3.324839e+08 2.110627e+11 2.145929e+11 645.423413 2001-01-01 1 1422.900456 0.077267 USA
1 2001 2 2.829402e+08 1.762658e+11 1.799442e+11 635.979511 2001-02-01 1 1402.080431 0.061504 USA
2 2001 3 3.007065e+08 1.860660e+11 1.887242e+11 627.602684 2001-03-01 1 1383.612877 0.047523 USA
3 2001 4 2.780781e+08 1.717565e+11 1.744977e+11 627.513263 2001-04-01 2 1383.415739 0.047373 USA
4 2001 5 3.004857e+08 1.859817e+11 1.888085e+11 628.344387 2001-05-01 2 1385.248036 0.048761 USA

In [8]:
path = join(cwd, '..', 'Data storage',
            'final NERC data',
            'NERC gen emissions and index {}.csv'.format(file_date))
nerc_index = pd.read_csv(path, parse_dates=['datetime'])
nerc_index.rename(columns={'index': 'index (g/kwh)'}, inplace=True)

In [9]:
nerc_index.head()


Out[9]:
year nerc month final co2 (kg) generation (mwh) index (g/kwh) datetime
0 2001 ASCC 1 2.780216e+08 510124.027 545.007802 2001-01-01
1 2001 ASCC 2 2.694399e+08 457691.568 588.693180 2001-02-01
2 2001 ASCC 3 2.944447e+08 493876.659 596.190762 2001-03-01
3 2001 ASCC 4 2.357533e+08 417279.658 564.976768 2001-04-01
4 2001 ASCC 5 2.306594e+08 409477.108 563.302275 2001-05-01

Percent generation


In [10]:
path = join(cwd, '..', 'Data storage',
            'National data', 'National percent gen {}.csv'.format(file_date))
national_per_gen = pd.read_csv(path)
add_datetime(national_per_gen)
national_per_gen['nerc'] = 'USA'
national_per_gen.rename(columns={'generation (mwh)': '% generation'},
                        inplace=True)

In [11]:
national_per_gen.head()


Out[11]:
year month fuel category elec fuel co2 (kg) % generation datetime nerc
0 2001 1 Coal 0.0 0.533220 2001-01-01 USA
1 2001 2 Coal 0.0 0.529212 2001-02-01 USA
2 2001 3 Coal 0.0 0.516347 2001-03-01 USA
3 2001 4 Coal 0.0 0.505867 2001-04-01 USA
4 2001 5 Coal 0.0 0.504493 2001-05-01 USA

In [12]:
path = join(cwd, '..', 'Data storage',
            'final NERC data', 'NERC percent gen {}.csv'.format(file_date))
nerc_per_gen = pd.read_csv(path, parse_dates=['datetime'])

In [13]:
nerc_per_gen.head()


Out[13]:
fuel category datetime nerc % generation
0 Coal 2001-01-01 ASCC 0.051935
1 Coal 2001-02-01 ASCC 0.068307
2 Coal 2001-03-01 ASCC 0.057626
3 Coal 2001-04-01 ASCC 0.049022
4 Coal 2001-05-01 ASCC 0.058848

Absolute generation


In [14]:
path = join(cwd, '..', 'Data storage',
            'National data', 'National generation {}.csv'.format(file_date))
national_gen = pd.read_csv(path)
add_datetime(national_gen)
national_gen['nerc'] = 'USA'

In [15]:
national_gen.tail()


Out[15]:
fuel category year month generation (mwh) datetime nerc
1627 Wind 2017 8 13064719.50 2017-08-01 USA
1628 Wind 2017 9 17264878.55 2017-09-01 USA
1629 Wind 2017 10 24814868.01 2017-10-01 USA
1630 Wind 2017 11 23315592.47 2017-11-01 USA
1631 Wind 2017 12 22757913.41 2017-12-01 USA

In [16]:
index_cols = ['nerc', 'generation (mwh)', 'datetime']
nerc_gen = pd.merge(nerc_per_gen, nerc_index[index_cols],
                    on=['nerc', 'datetime'])
nerc_gen.rename(columns={'generation (mwh)' : 'total generation (mwh)'},
                inplace=True)
nerc_gen['generation (mwh)'] = (nerc_gen['% generation']
                                * nerc_gen['total generation (mwh)'])
nerc_gen.set_index(['nerc', 'fuel category', 'datetime'], inplace=True)
nerc_gen.head()


Out[16]:
% generation total generation (mwh) generation (mwh)
nerc fuel category datetime
ASCC Coal 2001-01-01 0.051935 510124.027 26493.302
Hydro 2001-01-01 0.203755 510124.027 103940.302
Natural Gas 2001-01-01 0.615942 510124.027 314206.726
Other 2001-01-01 0.128199 510124.027 65397.179
Wind 2001-01-01 0.000170 510124.027 86.518

Add generation by different fuel sources to get total generation by NERC


In [17]:
total = nerc_gen.groupby(['nerc', 'datetime'])['% generation',
                                               'generation (mwh)'].sum()
total['fuel category'] = 'Total'
total.set_index('fuel category', append=True, inplace=True)
total = total.reorder_levels(['nerc', 'fuel category', 'datetime'])
total.head()


Out[17]:
% generation generation (mwh)
nerc fuel category datetime
ASCC Total 2001-01-01 1.0 510124.027
2001-02-01 1.0 457691.568
2001-03-01 1.0 493876.659
2001-04-01 1.0 417279.658
2001-05-01 1.0 409477.108

In [18]:
nerc_gen = pd.concat([nerc_gen, total])
nerc_gen.sort_index(inplace=True)

In [19]:
nerc_gen.head()


Out[19]:
% generation generation (mwh) total generation (mwh)
nerc fuel category datetime
ASCC Coal 2001-01-01 0.051935 26493.302 510124.027
2001-02-01 0.068307 31263.704 457691.568
2001-03-01 0.057626 28459.894 493876.659
2001-04-01 0.049022 20455.924 417279.658
2001-05-01 0.058848 24096.734 409477.108

Load capacity by month, fuel, nerc


In [20]:
path = join(cwd, '..', 'Data storage', 'Derived data',
            'Plant Capacity', 'monthly capacity by fuel.csv')

capacity = pd.read_csv(path, index_col=['nerc', 'fuel category', 'datetime'],
                       parse_dates=['datetime'])
capacity.drop(['year', 'month'], axis=1, inplace=True)

In [21]:
capacity.head()


Out[21]:
active capacity possible gen
nerc fuel category datetime
ASCC Coal 2001-01-01 54.1 40250.4
2001-02-01 54.1 36355.2
2001-03-01 54.1 40250.4
2001-04-01 54.1 38952.0
2001-05-01 54.1 40250.4

Load split of natural gas capacity by prime mover type


In [22]:
path = join(cwd, '..', 'Data storage', 'Derived data',
            'Plant Capacity', 'monthly natural gas split.csv')

ng_capacity_type = pd.read_csv(path)
ng_capacity_type.head()


Out[22]:
nerc year month ngcc turbine other total ngcc fraction turbine fraction other fraction datetime
0 ASCC 2001 1 214.4 592.4 3.0 809.8 0.264757 0.731539 0.003705 2001-01-01
1 ASCC 2001 2 214.4 592.4 3.0 809.8 0.264757 0.731539 0.003705 2001-02-01
2 ASCC 2001 3 214.4 592.4 3.0 809.8 0.264757 0.731539 0.003705 2001-03-01
3 ASCC 2001 4 214.4 592.4 3.0 809.8 0.264757 0.731539 0.003705 2001-04-01
4 ASCC 2001 5 214.4 592.4 3.0 809.8 0.264757 0.731539 0.003705 2001-05-01

In [23]:
ng_type_avg = (ng_capacity_type.reset_index()
                               .groupby(['nerc', 'year'])
                               .mean())
ng_type_avg.head()


Out[23]:
index month ngcc turbine other total ngcc fraction turbine fraction other fraction
nerc year
ASCC 2001 5.5 6.5 214.4 594.900 3.0 812.300 0.263944 0.732362 0.003693
2002 17.5 6.5 252.3 597.400 3.0 852.700 0.295884 0.700598 0.003518
2003 29.5 6.5 252.3 584.900 3.0 840.200 0.300286 0.696144 0.003571
2004 41.5 6.5 252.3 590.025 3.0 845.325 0.298497 0.697954 0.003549
2005 53.5 6.5 252.3 579.700 3.0 835.000 0.302156 0.694251 0.003593

Combine NERC and national results

CO₂ intensity


In [24]:
index = pd.concat([national_index, nerc_index])

In [25]:
index.head()


Out[25]:
change since 2005 datetime elec fuel co2 (kg) final co2 (kg) generation (mwh) index (g/kwh) index (lb/mwh) month nerc quarter year
0 0.077267 2001-01-01 2.110627e+11 2.145929e+11 3.324839e+08 645.423413 1422.900456 1 USA 1.0 2001
1 0.061504 2001-02-01 1.762658e+11 1.799442e+11 2.829402e+08 635.979511 1402.080431 2 USA 1.0 2001
2 0.047523 2001-03-01 1.860660e+11 1.887242e+11 3.007065e+08 627.602684 1383.612877 3 USA 1.0 2001
3 0.047373 2001-04-01 1.717565e+11 1.744977e+11 2.780781e+08 627.513263 1383.415739 4 USA 2.0 2001
4 0.048761 2001-05-01 1.859817e+11 1.888085e+11 3.004857e+08 628.344387 1385.248036 5 USA 2.0 2001

Percent generation


In [26]:
per_gen = pd.concat([national_per_gen, nerc_per_gen])

per_gen.year = per_gen['datetime'].dt.year
per_gen.month = per_gen['datetime'].dt.month

In [27]:
per_gen.head()


Out[27]:
% generation datetime elec fuel co2 (kg) fuel category month nerc year
0 0.533220 2001-01-01 0.0 Coal 1 USA 2001
1 0.529212 2001-02-01 0.0 Coal 2 USA 2001
2 0.516347 2001-03-01 0.0 Coal 3 USA 2001
3 0.505867 2001-04-01 0.0 Coal 4 USA 2001
4 0.504493 2001-05-01 0.0 Coal 5 USA 2001

Total generation


In [28]:
national_gen.head()


Out[28]:
fuel category year month generation (mwh) datetime nerc
0 Coal 2001 1 177287111.0 2001-01-01 USA
1 Coal 2001 2 149735483.0 2001-02-01 USA
2 Coal 2001 3 155269010.0 2001-03-01 USA
3 Coal 2001 4 140670652.0 2001-04-01 USA
4 Coal 2001 5 151592915.0 2001-05-01 USA

In [29]:
cols = ['nerc', 'fuel category', 'datetime']
gen = pd.concat([national_gen.set_index(cols), nerc_gen])
gen.sort_index(inplace=True)
gen.drop(['% generation', 'total generation (mwh)'], axis=1,
         inplace=True)

In [30]:
gen.head()


Out[30]:
generation (mwh) month year
nerc fuel category datetime
ASCC Coal 2001-01-01 26493.302 NaN NaN
2001-02-01 31263.704 NaN NaN
2001-03-01 28459.894 NaN NaN
2001-04-01 20455.924 NaN NaN
2001-05-01 24096.734 NaN NaN

Figures


In [34]:
sns.set_style('white', {'axes.linewidth': 1.5,
                        'axes.grid': True})
sns.set_context('paper', font_scale=1.0)

Color palette for fuel categories

Use the Tableau 10 color palette


In [32]:
sns.palplot(sns.color_palette('tab10', n_colors=10))



In [33]:
fuel_order = [5, -1, 1, 0, 2, -2, 4, -3]
fuel_palette = [sns.color_palette('tab10', n_colors=10)[x]
                for x in fuel_order]
sns.palplot(fuel_palette)


Total generation


In [35]:
order = ['USA', 'SPP', 'MRO', 'RFC', 'SERC', 'TRE', 'FRCC', 'WECC', 'NPCC']
temp = (gen.reset_index()
           .groupby(['nerc', 'datetime'])
           .sum()
           .reset_index())
temp['Month'] = temp['datetime'].dt.month
temp['Year'] = temp['datetime'].dt.year
temp['generation (mwh)'] /= 1E6

# temp.rename(columns={'month': 'Month',
#                      'year': 'Year'}, inplace=True)
with sns.plotting_context('paper', font_scale=1):

    g = sns.factorplot(x='Month', y='generation (mwh)', hue='Year', 
                       data=temp, col='nerc', col_wrap=3, col_order=order, 
                       palette='viridis_r', scale=0.5, size=2, sharey=False)
    
    axes = g.axes.flatten()
    for ax, title in zip(axes, order):
        ax.set_title(title)
        ax.set_ylim(0, None)
        
        if title in ['USA', 'RFC', 'FRCC']:
            ax.set_ylabel('Million MWh')
            
path = join(cwd, '..', 'Figures', 'Monthly generation {}.pdf'.format(file_date))
plt.savefig(path, bbox_inches='tight')


Index figure


In [56]:
order = ['USA', 'SPP', 'MRO', 'RFC', 'SERC', 'TRE', 'FRCC', 'WECC', 'NPCC']
temp = index.reset_index()
temp.rename(columns={'month': 'Month',
                     'year': 'Year'}, inplace=True)
with sns.plotting_context('paper', font_scale=1):

    g = sns.factorplot(x='Month', y='index (g/kwh)', hue='Year', 
                       data=temp, col='nerc', col_wrap=3, col_order=order, 
                       palette='viridis_r', scale=0.5, size=2)

    axes = g.axes.flatten()
    for ax, title in zip(axes, order):
        ax.set_title(title)
        ax.set_ylim(0, 1050)
        if title in ['USA', 'RFC', 'FRCC']:
            ax.set_ylabel('g $\mathregular{CO_2 \ kWh^{-1}}$')
            
path = join(cwd, '..', 'Figures',
            'Monthly index (12-months) {}.pdf'.format(file_date))
# plt.savefig(path, bbox_inches='tight')


Alternate plot of CO₂ intensity


In [47]:
order = ['USA', 'SPP', 'MRO', 'RFC', 'SERC', 'TRE', 'FRCC', 'WECC', 'NPCC']
FG_kwargs = dict(hue='nerc',
                 col='nerc',
                 col_wrap=3,
                 aspect=1.2,
                 hue_order=order,
                 palette='tab10')

region_facet_grid(df=index.reset_index(), plot_function=plt.plot, x_axis='datetime',
                  add_legend=False, y_axis='index (g/kwh)', col_order=order, 
                  suptitle='', FG_kwargs=FG_kwargs, x_label='Year', 
                  y_label='g CO$_2$/kWh', context='talk', font_scale=1)
path = join(cwd, '..', 'Figures', 'SI', 'Monthly index by region.pdf')
plt.savefig(path, bbox_inches='tight')



In [29]:
index.head()


Out[29]:
change since 2005 datetime final co2 (kg) generation (mwh) index (g/kwh) index (lb/mwh) month nerc quarter year
0 0.076326 2001-01-01 2.149920e+11 332493160.0 646.604591 1425.504481 1 USA 1.0 2001
1 0.060812 2001-02-01 1.803140e+11 282940198.0 637.284923 1404.958340 2 USA 1.0 2001
2 0.046964 2001-03-01 1.891340e+11 300706544.0 628.965732 1386.617853 3 USA 1.0 2001
3 0.046860 2001-04-01 1.748850e+11 278078871.0 628.902838 1386.479197 4 USA 2.0 2001
4 0.048337 2001-05-01 1.892470e+11 300491621.0 629.790069 1388.435187 5 USA 2.0 2001

Plot of detrended intensity, using a linear detrend

Not a good method - trend isn't linear


In [48]:
order = ['USA', 'SPP', 'MRO', 'RFC', 'SERC', 'TRE', 'FRCC', 'WECC', 'NPCC']
FG_kwargs = dict(hue='nerc',
                 col='nerc',
                 col_wrap=3,
                 aspect=1.2,
                 hue_order=order,
                 palette='tab10')

detrend_index = index.copy()
detrend_index.set_index(['nerc', 'datetime'], inplace=True)
detrend_index.sort_index(inplace=True)
for nerc in order:
    detrend_index.loc[idx[nerc, :], 'index (g/kwh)'] = detrend(detrend_index.loc[idx[nerc, :], 'index (g/kwh)'])


region_facet_grid(df=detrend_index.reset_index(), plot_function=plt.plot, x_axis='datetime',
                  add_legend=True, y_axis='index (g/kwh)', col_order=order, 
                  suptitle='', FG_kwargs=FG_kwargs, x_label='Year',
                  context='talk', font_scale=1)


Plot of detrended intensity using a first difference detrend

Better than linear, but I'm not sure it's appropriate


In [49]:
def diff_detrend(series, n=1):
    d = series - series.shift(n)
    return d

In [50]:
order = ['USA', 'SPP', 'MRO', 'RFC', 'SERC', 'TRE', 'FRCC', 'WECC', 'NPCC']
FG_kwargs = dict(hue='nerc',
                 col='nerc',
                 col_wrap=3,
                 aspect=1.2,
                 hue_order=order,
                 palette='tab10')

detrend_index = index.copy()
detrend_index.set_index(['nerc', 'datetime'], inplace=True)
detrend_index.sort_index(inplace=True)
for nerc in order:
    detrend_index.loc[idx[nerc, :], 'index (g/kwh)'] = diff_detrend(detrend_index.loc[idx[nerc, :], 'index (g/kwh)'])


region_facet_grid(df=detrend_index.reset_index(), plot_function=plt.plot, x_axis='datetime',
                  add_legend=True, y_axis='index (g/kwh)', col_order=order, 
                  suptitle='', FG_kwargs=FG_kwargs, x_label='Year',
                  context='talk', font_scale=1)


Detrend by subtracting the rolling 1-year mean

Started by using seasonal_decompose from statsmodels. It was apparent that the trend component is just the rolling average. I'm leaving the extra code here as documentation of this fact.


In [37]:
from statsmodels.tsa.seasonal import seasonal_decompose


/Users/Home/anaconda/envs/py36/lib/python3.6/site-packages/statsmodels/compat/pandas.py:56: FutureWarning: The pandas.core.datetools module is deprecated and will be removed in a future version. Please use the pandas.tseries module instead.
  from pandas.core import datetools

In [53]:
wecc = index.loc[index['nerc'] == 'WECC', ['index (g/kwh)', 'datetime']]
wecc.set_index('datetime', inplace=True)
wecc.head()


Out[53]:
index (g/kwh)
datetime
2001-01-01 524.331566
2001-02-01 538.398535
2001-03-01 507.408432
2001-04-01 518.650607
2001-05-01 517.016964

In [54]:
sd = seasonal_decompose(wecc)

In [55]:
sd.plot()


Out[55]:

These three things are all the same:

  1. Subtracting just the trend from the WECC data
  2. Adding the seasonal and resid components
  3. Subtracting the annual rolling mean from the WECC data

In [56]:
fig, ax = plt.subplots(nrows=3, squeeze=True)
(wecc - sd.trend).plot(ax=ax[0], legend=False)
(sd.seasonal + sd.resid).plot(ax=ax[1], legend=False)
(wecc - wecc.rolling(12, center=True).mean()).plot(ax=ax[2], legend=False)
plt.tight_layout()



In [57]:
order = ['USA', 'SPP', 'MRO', 'RFC', 'SERC', 'TRE', 'FRCC', 'WECC', 'NPCC']
FG_kwargs = dict(hue='nerc',
                 col='nerc',
                 col_wrap=3,
                 aspect=1.2,
                 hue_order=order,
                 palette='tab10')

temp = index.copy()
temp.set_index('datetime', inplace=True)

df_list = []
for nerc in order:
    # Subtract the rolling mean from the raw data in each region
    sd = (temp.loc[temp['nerc'] == nerc, 'index (g/kwh)']
          - (temp.loc[temp['nerc'] == nerc, 'index (g/kwh)']
             .rolling(12, center=True)
             .mean()))
    # Have to make the series into a df to add the 'nerc' column
    sd = pd.DataFrame(sd)
    sd['nerc'] = nerc
    df_list.append(sd)
    
detrend_index = pd.concat(df_list)

region_facet_grid(df=detrend_index.reset_index(), plot_function=plt.plot, x_axis='datetime',
                  add_legend=False, y_axis='index (g/kwh)', col_order=order, 
                  suptitle='', FG_kwargs=FG_kwargs, x_label='Year',
                  context='talk', font_scale=1)


Create a combined figure for the SI

Include the raw monthly data, the rolling mean trend line, and the detrended data


In [38]:
temp = index.copy()
temp.set_index('datetime', inplace=True)

df_list = []
for nerc in order:
    # Subtract the rolling mean from the raw data in each region
    sd = (temp.loc[temp['nerc'] == nerc, 'index (g/kwh)']
          - (temp.loc[temp['nerc'] == nerc, 'index (g/kwh)']
             .rolling(12, center=True)
             .mean()))
    # Have to make the series into a df to add the 'nerc' column
    sd = pd.DataFrame(sd)
    sd['nerc'] = nerc
    df_list.append(sd)
    
detrend_index = pd.concat(df_list)

In [39]:
index.head()


Out[39]:
change since 2005 datetime final co2 (kg) generation (mwh) index (g/kwh) index (lb/mwh) month nerc quarter year
0 0.077477 2001-01-01 2.142842e+11 3.324839e+08 644.494871 1420.853394 1 USA 1.0 2001
1 0.061603 2001-02-01 1.796670e+11 2.829402e+08 635.000010 1399.921021 2 USA 1.0 2001
2 0.047543 2001-03-01 1.884196e+11 3.007065e+08 626.589712 1381.379680 3 USA 1.0 2001
3 0.047411 2001-04-01 1.742190e+11 2.780781e+08 626.510901 1381.205933 4 USA 2.0 2001
4 0.048812 2001-05-01 1.885094e+11 3.004857e+08 627.348927 1383.053445 5 USA 2.0 2001

In [40]:
detrend_index.head()


Out[40]:
index (g/kwh) nerc
datetime
2001-01-01 NaN USA
2001-02-01 NaN USA
2001-03-01 NaN USA
2001-04-01 NaN USA
2001-05-01 NaN USA

In [41]:
combined = index.copy()
combined.set_index(['nerc', 'datetime'], inplace=True)
df = detrend_index.reset_index(drop=False).set_index(['nerc', 'datetime'])
df.columns = ['detrend index']

In [42]:
combined = pd.concat([index.set_index(['nerc', 'datetime']), df])

In [43]:
combined.head()


Out[43]:
change since 2005 detrend index final co2 (kg) generation (mwh) index (g/kwh) index (lb/mwh) month quarter year
nerc datetime
USA 2001-01-01 0.077477 NaN 2.142842e+11 3.324839e+08 644.494871 1420.853394 1.0 1.0 2001.0
2001-02-01 0.061603 NaN 1.796670e+11 2.829402e+08 635.000010 1399.921021 2.0 1.0 2001.0
2001-03-01 0.047543 NaN 1.884196e+11 3.007065e+08 626.589712 1381.379680 3.0 1.0 2001.0
2001-04-01 0.047411 NaN 1.742190e+11 2.780781e+08 626.510901 1381.205933 4.0 2.0 2001.0
2001-05-01 0.048812 NaN 1.885094e+11 3.004857e+08 627.348927 1383.053445 5.0 2.0 2001.0

In [45]:
def plot_trend(x, y, **kws):
    'Plot function for the FacetGrid to plot all three data series'
    trend = y.rolling(12, center=True).mean()
    plt.plot(x, y, **kws)
    plt.plot(x, trend, lw=2, c='0.4')
    detrended = y - trend
    plt.plot(x, detrended, **kws)

order = ['USA', 'SPP', 'MRO', 'RFC', 'SERC', 'TRE', 'FRCC', 'WECC', 'NPCC']
FG_kwargs = dict(hue='nerc',
                 col='nerc',
                 col_wrap=3,
                 aspect=1.2,
                 hue_order=order,
                 palette='tab10',
                 size=2)

region_facet_grid(df=index.reset_index(), plot_function=plot_trend, x_axis='datetime',
                  add_legend=False, y_axis='index (g/kwh)', col_order=order, 
                  suptitle='', FG_kwargs=FG_kwargs, x_label='Year', 
                  y_label='g CO$_2$/kWh', context='paper', font_scale=1)
path = join(cwd, '..', 'Figures', 'SI',
            'Monthly index trend and detrended by region {}.pdf'.format(file_date))
plt.savefig(path, bbox_inches='tight')


Percent gen figures


In [66]:
gen.head()


Out[66]:
generation (mwh) month year
nerc fuel category datetime
ASCC Coal 2001-01-01 26493.302 NaN NaN
2001-02-01 31263.704 NaN NaN
2001-03-01 28459.894 NaN NaN
2001-04-01 20455.924 NaN NaN
2001-05-01 24096.734 NaN NaN

In [105]:
order = ['USA', 'SPP', 'MRO', 'RFC', 'SERC', 'TRE', 'FRCC', 'WECC', 'NPCC']
fuel_order = ['Coal', 'Natural Gas', 'Nuclear', 'Hydro', 'Wind', 'Solar',
              'Other', 'Other Renewables']
temp = per_gen.copy()
temp.rename(columns={'fuel category': 'Fuel Category'}, inplace=True)

FG_kwargs = dict(hue='Fuel Category',
                 col='nerc',
                 col_wrap=3,
                 aspect=1.2,
                 hue_order=fuel_order,
                 palette=fuel_palette,
                 size=2)

g = region_facet_grid(df=temp, plot_function=plt.plot, x_axis='datetime',
                  add_legend=True, y_axis='% generation', col_order=order, 
                  suptitle='', FG_kwargs=FG_kwargs, x_label='Year',
                  y_label='Fraction Generation',
                  context='paper', font_scale=1)

path = join(cwd, '..', 'Figures', 'Generation by fuel_full {}.pdf'.format(file_date))
plt.savefig(path, bbox_inches='tight')


Capacity factor of each fuel by month


In [145]:
capacity['Capacity Factor'] = (gen.loc[:, 'generation (mwh)'] 
                               / capacity.loc[:, 'possible gen'])

In [146]:
capacity.tail()


Out[146]:
active capacity possible gen Capacity Factor
nerc fuel category datetime
WECC Wind 2017-08-01 19611.4 14590881.6 0.183756
2017-09-01 19611.4 14120208.0 0.196416
2017-10-01 19611.4 14590881.6 0.265022
2017-11-01 19611.4 14120208.0 0.246457
2017-12-01 19611.4 14590881.6 0.214962

In [113]:
# Monthly coal CF over the last 2 years
capacity.loc[idx['USA', 'Coal', :], :].tail(n=24)


Out[113]:
active capacity possible gen Capacity Factor year
nerc fuel category datetime
USA Coal 2016-01-01 274281.7 204065584.8 0.555995 2016
2016-02-01 274281.7 190900063.2 0.485620 2016
2016-03-01 273544.7 203517256.8 0.354626 2016
2016-04-01 269181.2 193810464.0 0.372079 2016
2016-05-01 268467.6 199739894.4 0.409004 2016
2016-06-01 268206.1 193108392.0 0.600877 2016
2016-07-01 268183.3 199528375.2 0.683193 2016
2016-08-01 268028.3 199413055.2 0.680170 2016
2016-09-01 267428.3 192548376.0 0.592775 2016
2016-10-01 267328.3 198892255.2 0.498732 2016
2016-11-01 267328.3 192476376.0 0.451694 2016
2016-12-01 266528.3 198297055.2 0.598833 2016
2017-01-01 266521.5 198291996.0 0.582469 2017
2017-02-01 266312.5 178962000.0 0.485172 2017
2017-03-01 265867.4 197805345.6 0.451868 2017
2017-04-01 264637.4 190538928.0 0.427658 2017
2017-05-01 264637.4 196890225.6 0.471495 2017
2017-06-01 262925.2 189306144.0 0.568860 2017
2017-07-01 262925.2 195616348.8 0.653860 2017
2017-08-01 262925.2 195616348.8 0.612034 2017
2017-09-01 262741.2 189173664.0 0.519913 2017
2017-10-01 262164.2 195050164.8 0.461599 2017
2017-11-01 262013.2 188649504.0 0.482905 2017
2017-12-01 260265.2 193637308.8 0.550121 2017

Annual capacity factors

Requested by reviewer as part of final revisions.


In [147]:
# Check annual capacity factors

# need to add a year column for groupby
capacity['year'] = capacity.index.get_level_values('datetime').year

gen['year'] = gen.index.get_level_values('datetime').year


annual_cf = (gen.groupby(['nerc', 'fuel category', 'year'])['generation (mwh)'].sum()
                .div(capacity.groupby(['nerc', 'fuel category', 'year'])['possible gen'].sum()))

In [148]:
# Annual coal CF since 2014
annual_cf.unstack().loc[idx[:, 'Coal'], 2014:]


Out[148]:
year 2014 2015 2016 2017
nerc fuel category
ASCC Coal 0.613988 0.734145 0.478458 0.149795
FRCC Coal 0.590910 0.475253 0.455963 0.440205
HICC Coal 0.958386 0.809775 0.947037 0.872478
MRO Coal 0.667566 0.629204 0.592030 0.616069
NPCC Coal 0.263495 0.176457 0.134279 0.093923
RFC Coal 0.583916 0.525260 0.495332 0.482226
SERC Coal 0.594136 0.508166 0.505878 0.486285
SPP Coal 0.654900 0.577541 0.527465 0.516552
TRE Coal 0.710524 0.564938 0.582903 0.665722
USA Coal 0.619880 0.549530 0.523666 0.522770
WECC Coal 0.735250 0.692965 0.622483 0.638176

Monthly CF plots


In [111]:
order = ['USA', 'SPP', 'MRO', 'RFC', 'SERC', 'TRE', 'FRCC', 'WECC', 'NPCC']
temp = capacity.loc[idx[:, 'Coal', :], :].reset_index()
temp['Month'] = temp['datetime'].dt.month
temp['Year'] = temp['datetime'].dt.year

with sns.plotting_context('paper', font_scale=1):

    g = sns.factorplot(x='Month', y='Capacity Factor', hue='Year', 
                       data=temp, col='nerc', col_wrap=3, col_order=order, 
                       palette='viridis_r', scale=0.5, size=2)

    axes = g.axes.flatten()
    for ax, title in zip(axes, order):
        ax.set_title(title)
#         ax.set_ylim(0, 1050)
        if title in ['USA', 'RFC', 'FRCC']:
            ax.set_ylabel('Capacity Factor\n(Coal)')
            
path = join(cwd, '..', 'Figures', 'SI', 'Monthly Coal CF {}.pdf'.format(file_date))
plt.savefig(path, bbox_inches='tight')



In [112]:
order = ['USA', 'SPP', 'MRO', 'RFC', 'SERC', 'TRE', 'FRCC', 'WECC', 'NPCC']
temp = capacity.loc[idx[:, 'Natural Gas', :], :].reset_index()
temp['Month'] = temp['datetime'].dt.month
temp['Year'] = temp['datetime'].dt.year

with sns.plotting_context('paper', font_scale=1):

    g = sns.factorplot(x='Month', y='Capacity Factor', hue='Year', 
                       data=temp, col='nerc', col_wrap=3, col_order=order, 
                       palette='viridis_r', scale=0.5, size=2)

    axes = g.axes.flatten()
    for ax, title in zip(axes, order):
        ax.set_title(title)
#         ax.set_ylim(0, 1050)
        if title in ['USA', 'RFC', 'FRCC']:
            ax.set_ylabel('Capacity Factor\n(Natural Gas)')
            
path = join(cwd, '..', 'Figures', 'SI', 'Monthly Natural Gas CF {}.pdf'.format(file_date))
plt.savefig(path, bbox_inches='tight')


Plot of natural gas capacity type by NERC


In [67]:
order = ['MRO', 'RFC', 'SERC', 'FRCC', 'SPP', 'WECC', 'NPCC', 'TRE']

# Melt the wide dataframe into tidy format for plotting
temp = (ng_type_avg.reset_index()
                   .melt(id_vars=['nerc', 'year'],
                         value_vars=['ngcc fraction', 'turbine fraction', 'other fraction'],
                         var_name='type', value_name='fraction capacity'))

# Use the new tableau colorblind10 palette
with plt.style.context('tableau-colorblind10'):
    g = sns.factorplot(x='year', y='fraction capacity', data=temp, hue='nerc',
                       row='type', hue_order=order,
                       scale=1, aspect=1.4, ci=0).set_xticklabels(rotation=35)
    g.set_axis_labels(y_var='Fraction Capacity')
    
    #rename each of the subplot titles
    axs = g.axes.ravel()
    for ax, title in zip(axs, ['NGCC', 'Turbine', 'Other']):
        ax.set_title(title)
    
path = join(cwd, '..', 'Figures', 'SI', 'Natural gas capacity by type.pdf')
plt.savefig(path, bbox_inches='tight')


Monthly generation by each fuel type


In [44]:
def monthly_fuel_gen(gen_df, fuel, folder, file_type='pdf', dpi=350, save=False):
    """
    Make a FacetGrid plot of monthly generation for a single fuel category
    
    inputs:
        gen_df (dataframe): monthly generation for all fuels
        fuel (string): name of the fuel category to plot
        folder (path): folder where the plot should be saved
        file_type (string): file format (e.g. pdf, png, etc)
        dpi (int): dots per inch resolution for saved file (if not pdf)
        save (bool): if True, save the file
            
    """
    order = ['USA', 'SPP', 'MRO', 'RFC', 'SERC', 'TRE', 'FRCC', 'WECC', 'NPCC']
    temp = gen.loc[idx[:, fuel, :], :].reset_index()
    temp['Month'] = temp['datetime'].dt.month
    temp['Year'] = temp['datetime'].dt.year
    temp['million mwh'] = temp['generation (mwh)'] / 1e6

    with sns.plotting_context('paper', font_scale=1):

        g = sns.factorplot(x='Month', y='million mwh', hue='Year', sharey=False,
                           data=temp, col='nerc', col_wrap=3, col_order=order, 
                           palette='viridis_r', scale=0.5, size=2)

        axes = g.axes.flatten()
        for ax, title in zip(axes, order):
            ax.set_title(title)
            ax.set_ylim(0, None)
    #         ax.set_ylim(0, 1050)
            if title in ['USA', 'RFC', 'FRCC']:
                ax.set_ylabel('Million MWh\n({})'.format(fuel))

    path = join(folder, 'Monthly {} gen {}.{}'.format(fuel, file_date, file_type))
    if save:
        plt.savefig(path, bbox_inches='tight', dpi=dpi)

In [46]:
folder_path = join(cwd, '..', 'Figures', 'SI')
monthly_fuel_gen(gen, fuel='Coal', folder=folder_path, file_type='pdf', save=True)


/Users/Home/Documents/GitHub/Index-variability/Notebooks/../Figures/SI/Monthly Coal gen 2018-03-06.pdf

In [62]:
gen.head()


Out[62]:
generation (mwh) month year
nerc fuel category datetime
ASCC Coal 2001-01-01 26493.302 NaN 2001
2001-02-01 31263.704 NaN 2001
2001-03-01 28459.894 NaN 2001
2001-04-01 20455.924 NaN 2001
2001-05-01 24096.734 NaN 2001

In [67]:
folder_path = join(cwd, '..', 'Figures', 'SI')
monthly_fuel_gen(gen, fuel='Natural Gas', folder=folder_path, file_date=file_date,
                 file_type='pdf', save=False)



In [45]:
folder_path = join(cwd, '..', 'Figures', 'SI')
monthly_fuel_gen(gen, fuel='Natural Gas', folder=folder_path, file_type='pdf', save=True)


/Users/Home/Documents/GitHub/Index-variability/Notebooks/../Figures/SI/Monthly Natural Gas gen 2018-03-06.pdf

In [47]:
folder_path = join(cwd, '..', 'Figures', 'SI')
monthly_fuel_gen(gen, fuel='Nuclear', folder=folder_path, file_type='pdf', save=True)


/Users/Home/Documents/GitHub/Index-variability/Notebooks/../Figures/SI/Monthly Nuclear gen 2018-03-06.pdf

In [48]:
folder_path = join(cwd, '..', 'Figures', 'SI')
monthly_fuel_gen(gen, fuel='Wind', folder=folder_path, file_type='pdf', save=True)


/Users/Home/Documents/GitHub/Index-variability/Notebooks/../Figures/SI/Monthly Wind gen 2018-03-06.pdf

In [49]:
folder_path = join(cwd, '..', 'Figures', 'SI')
monthly_fuel_gen(gen, fuel='Solar', folder=folder_path, file_type='pdf', save=True)


/Users/Home/Documents/GitHub/Index-variability/Notebooks/../Figures/SI/Monthly Solar gen 2018-03-06.pdf

In [50]:
folder_path = join(cwd, '..', 'Figures', 'SI')
monthly_fuel_gen(gen, fuel='Other', folder=folder_path, file_type='pdf', save=True)


/Users/Home/Documents/GitHub/Index-variability/Notebooks/../Figures/SI/Monthly Other gen 2018-03-06.pdf

In [51]:
folder_path = join(cwd, '..', 'Figures', 'SI')
monthly_fuel_gen(gen, fuel='Other Renewables', folder=folder_path, file_type='pdf', save=True)


/Users/Home/Documents/GitHub/Index-variability/Notebooks/../Figures/SI/Monthly Other Renewables gen 2018-03-06.pdf

In [108]:
order = ['USA', 'SPP', 'MRO', 'RFC', 'SERC', 'TRE', 'FRCC', 'WECC', 'NPCC']
years = range(2001, 2017, 3)
temp = gen.loc[idx[:, 'Coal', :], :].reset_index()
temp['Month'] = temp['datetime'].dt.month
temp['Year'] = temp['datetime'].dt.year
temp['coal generation'] = temp['generation (mwh)'] / 1e6
temp = temp.loc[temp['Year'].isin(years), :]

total_gen = (gen.reset_index()
           .groupby(['nerc', 'datetime'])
           .sum()
           .reset_index())
total_gen.rename(columns={'generation (mwh)': 'total generation'}, inplace=True)
total_gen['total generation'] /= 1e6

temp = temp.merge(total_gen, on=['nerc', 'datetime'])

def scatter_coal_total(x, y, *args, **kwargs):
    plt.plot(x, y, 'o', markersize=6, markeredgewidth=1, markerfacecolor='None', **kwargs)
    line_kws = dict(linewidth=1.5)
    sns.regplot(x, y, x_ci=None, lowess=True, n_boot=1, line_kws=line_kws,
                scatter=False, **kwargs)

with sns.plotting_context('paper', font_scale=1):

    g = sns.FacetGrid(data=temp, hue='Year', sharey=False, sharex=False,
                      col='nerc', col_wrap=3, col_order=order, 
                       palette='viridis_r', size=2)
    g.map(scatter_coal_total, 'total generation', 'coal generation')
    g.add_legend()

    axes = g.axes.flatten()
    for ax, title in zip(axes, order):
        ax.set_title(title)
#         ax.set_ylim(0, None)
#         ax.set_ylim(0, 1050)
#         if title in ['USA', 'RFC', 'FRCC']:
#             ax.set_ylabel('Million MWh Coal')
            
path = join(cwd, '..', 'Figures', 'Total vs coal gen {}.pdf'.format(file_date))
# plt.savefig(path, bbox_inches='tight')



In [112]:
g._legend_data['2001'].get_linewidth()


Out[112]:
1.5

Figure of wind, nat gas, and coal

Show how generation from these three resources has changed over time in SPP, TRE, and MRO.


In [159]:
per_gen.loc[(per_gen['fuel category'].isin(['Wind', 'Coal', 'Natural Gas'])) & 
                             (per_gen.nerc.isin(order)) &
                             (per_gen.year >= 2005)].head()


Out[159]:
% generation datetime elec fuel co2 (kg) fuel category month nerc year
3639 0.748146 2005-01-01 NaN Coal 1 MRO 2005
3640 0.754733 2005-02-01 NaN Coal 2 MRO 2005
3641 0.753462 2005-03-01 NaN Coal 3 MRO 2005
3642 0.751070 2005-04-01 NaN Coal 4 MRO 2005
3643 0.748846 2005-05-01 NaN Coal 5 MRO 2005

In [160]:
index.head()


Out[160]:
change since 2005 datetime elec fuel co2 (kg) final co2 (kg) generation (mwh) index (g/kwh) index (lb/mwh) month nerc quarter year
0 0.077267 2001-01-01 2.110627e+11 2.145929e+11 3.324839e+08 645.423413 1422.900456 1 USA 1.0 2001
1 0.061504 2001-02-01 1.762658e+11 1.799442e+11 2.829402e+08 635.979511 1402.080431 2 USA 1.0 2001
2 0.047523 2001-03-01 1.860660e+11 1.887242e+11 3.007065e+08 627.602684 1383.612877 3 USA 1.0 2001
3 0.047373 2001-04-01 1.717565e+11 1.744977e+11 2.780781e+08 627.513263 1383.415739 4 USA 2.0 2001
4 0.048761 2001-05-01 1.859817e+11 1.888085e+11 3.004857e+08 628.344387 1385.248036 5 USA 2.0 2001

In [161]:
capacity.head()


Out[161]:
active capacity possible gen Capacity Factor year
nerc fuel category datetime
ASCC Coal 2001-01-01 54.1 40250.4 0.658212 2001
2001-02-01 54.1 36355.2 0.859951 2001
2001-03-01 54.1 40250.4 0.707071 2001
2001-04-01 54.1 38952.0 0.525157 2001
2001-05-01 54.1 40250.4 0.598671 2001

In [75]:
order = ['SPP', 'TRE', 'MRO']
fuels = ['Coal', 'Natural Gas', 'Nuclear', 'Hydro', 'Wind', 'Solar']
per_gen.year = per_gen.year.astype(int)
temp_df = per_gen.loc[(per_gen['fuel category'].isin(['Wind', 'Coal', 'Natural Gas'])) & 
                             (per_gen.nerc.isin(order)) &
                             (per_gen.year >= 2005)].copy()

def early_late(x):
    'Add classifications to each month based on the year'
    if x <= 2008:
        return '2005-2008'
    elif x <= 2012:
        return '2009-2012'
    elif x <= 2016:
        return '2013-2017'
    else:
        return ValueError

temp_df.loc[:, 'timeframe'] = temp_df.loc[:, 'year'].map(early_late)
merge_cols = ['nerc', 'month', 'year']
temp_df = pd.merge(temp_df, index.reset_index()[merge_cols + ['generation (mwh)']],
                   on=merge_cols)
temp_df['generation'] = (temp_df.loc[:, '% generation']
                         * temp_df.loc[:, 'generation (mwh)'] / 1e6)

# temp_df = temp_df.merge(capacity.reset_index(), how='left',
#                         on=['nerc', 'fuel category', 'datetime'])

if 'year_x' in temp_df.columns:
    temp_df.rename(columns={'year_x': 'year'}, inplace=True)

In [76]:
temp_df.head()


Out[76]:
% generation datetime elec fuel co2 (kg) fuel category month nerc year timeframe generation (mwh) generation
0 0.748146 2005-01-01 NaN Coal 1 MRO 2005 2005-2008 1.820807e+07 13.622289
1 0.029781 2005-01-01 NaN Natural Gas 1 MRO 2005 2005-2008 1.820807e+07 0.542249
2 0.012254 2005-01-01 NaN Wind 1 MRO 2005 2005-2008 1.820807e+07 0.223117
3 0.754733 2005-02-01 NaN Coal 2 MRO 2005 2005-2008 1.515471e+07 11.437763
4 0.033670 2005-02-01 NaN Natural Gas 2 MRO 2005 2005-2008 1.515471e+07 0.510253

In [45]:
# a is the list of colors
# Matching fuel colors to the generation plot above
a = fuel_palette[0:2]
a.append(fuel_palette[4])

seaborn tsplot issues

When running the full code below (tsplot within a FacetGrid) I'm getting an error. The data works for tsplot


In [77]:
sns.tsplot(temp_df.loc[temp_df.nerc=='MRO'], time='month', unit='year', value='generation', 
                    condition='fuel category', err_style='unit_traces')


/Users/Home/anaconda/envs/psci/lib/python3.6/site-packages/seaborn/timeseries.py:183: UserWarning: The tsplot function is deprecated and will be removed or replaced (in a substantially altered version) in a future release.
  warnings.warn(msg, UserWarning)
Out[77]:
<matplotlib.axes._subplots.AxesSubplot at 0x1a1d66fe80>

In [117]:
from matplotlib.lines import Line2D
order = ['SPP', 'TRE', 'MRO']
col_order = ['2005-2008', '2009-2012', '2013-2017']

with sns.plotting_context(font_scale=1):
    g = sns.FacetGrid(temp_df, col='timeframe', row='nerc',
                      col_order=col_order, row_order=order,# aspect=1.2,
                      size=2)
#     g.map_dataframe(sns.tsplot, time='month', unit='year', value='generation', 
#                     condition='fuel category', err_style='unit_traces',
#                     color=a)
    
    
    titles = ['{}, {}'.format(a, b) for (a, b) in 
              list(itertools.product(order, col_order))]
    axes = g.axes.flatten()
    
    # This is hacky, but I'm having trouble gettign tsplot to work with FacetGrid
    # As a workaround, manually pass data from g.facet_data() - which returns a
    # tuple - to tsplot for every axis of the figure. Then manually set the axis
    # labels and create the legend.
    for ax, title, data in zip(axes, titles, g.facet_data()):
        sns.tsplot(data[-1], time='month', unit='year', value='generation', 
                    condition='fuel category', err_style='unit_traces',
                    color=a, ax=ax, legend=False)
        ax.set_title(title)
        ax.set_xticks([1, 4, 7, 10])
        
        ax.set_xlabel('')
        ax.set_ylabel('')
        
        if '2005' in title:
            ax.set_ylabel('Million MWh')
            
        if 'MRO' in title:
            ax.set_xlabel('Month')
    
    # Need to use tight_layout to get the facet spacing correct
    g.fig.tight_layout()
    
    # Create legend lines
    legend_lines = {fuel: Line2D([0], [0], color=c, lw=1.5, label=fuel)
                    for c, fuel in zip(a, ['Coal', 'Natural Gas', 'Wind'])}
    g.add_legend(legend_data=legend_lines, title='Fuel Category')
            
path = join(cwd, '..', 'Figures',
            '_Wind-Coal-NG variability tsplot {}.pdf'.format(file_date))
plt.savefig(path, bbox_inches='tight')


/Users/Home/anaconda/envs/psci/lib/python3.6/site-packages/seaborn/timeseries.py:183: UserWarning: The tsplot function is deprecated and will be removed or replaced (in a substantially altered version) in a future release.
  warnings.warn(msg, UserWarning)

In [98]:
g._legend_data


Out[98]:
{}

In [66]:
order = ['SPP', 'TRE', 'MRO']
col_order = ['2005-2008', '2009-2012', '2013-2017']

with sns.plotting_context(font_scale=1):
    g = sns.FacetGrid(temp_df, col='timeframe', row='nerc',
                      col_order=col_order, row_order=order, aspect=1.2,
                      size=2)
    g.map_dataframe(sns.tsplot, time='month', unit='year', value='generation', 
                    condition='fuel category', err_style='unit_traces',
                    color=a).add_legend(title='Fuel Category')
    g.set_axis_labels('Month', 'Million MWh')
    titles = ['{}, {}'.format(a, b) for (a, b) in 
              list(itertools.product(order, col_order))]
    axes = g.axes.flatten()
    for ax, title in zip(axes, titles):
        ax.set_title(title)
        ax.set_xticks([1, 4, 7, 10])

path = join(cwd, '..', 'Figures',
            'Wind-Coal-NG variability tsplot {}.pdf'.format(file_date))
plt.savefig(path, bbox_inches='tight')


/Users/Home/anaconda/envs/py36/lib/python3.6/site-packages/seaborn/timeseries.py:183: UserWarning: The tsplot function is deprecated and will be removed or replaced (in a substantially altered version) in a future release.
  warnings.warn(msg, UserWarning)

In [126]:
def rolling_norm_std_plot(temp_df, window=24):
    rolling_var_list = []
    for nerc in ['TRE', 'SPP', 'MRO']:
        for fuel in ['Coal', 'Natural Gas', 'Wind']:
            _df = temp_df.loc[(temp_df['nerc'] == nerc) &
                              (temp_df['fuel category'] == fuel),
                             ['nerc', 'fuel category', 'datetime', 'generation']]

            # I'm using std to mean variability
            _df['variability'] = (_df['generation']
                                  .rolling(window, center=True)
                                  .std())
            _df['variance'] = (_df['generation']
                                  .rolling(window, center=True)
                                  .var())
            _df['mean'] = (_df['generation']
                                  .rolling(window, center=True)
                                  .mean())

            _df['norm variability'] = _df['variability'] / _df['mean']

            _df['dispersion'] = _df['variance'] / _df['mean']

            rolling_var_list.append(_df)

    rolling_var_df = pd.concat(rolling_var_list)
    
    g = sns.FacetGrid(rolling_var_df, hue='fuel category', col='nerc', 
                  col_order=['SPP', 'TRE', 'MRO'], palette=a, size=2)
    g.map(plt.plot, 'datetime', 'norm variability').add_legend(title='Fuel Category')
    g.set_xticklabels(rotation=25)

    g.set_axis_labels('Year', 'Normalized Variability')
    axes = g.axes.flatten()
    for ax, nerc in zip(axes, ['SPP', 'TRE', 'MRO']):
        ax.set_title(nerc)

In [128]:
rolling_norm_std_plot(temp_df, window=24)
path = join(cwd, '..', 'Figures',
            'Wind-Coal-NG norm rolling variability {}.pdf'.format(file_date))
plt.savefig(path, bbox_inches='tight')



In [57]:
rolling_norm_std_plot(temp_df, window=12)



In [58]:
rolling_norm_std_plot(temp_df, window=36)


Correlation figures

Creating a wide dataframe and then melting it to a tidy one is extra work. Might want to come back and create a tidy df from the start.


In [48]:
# Pairs of adjacent NERC regions
region_pairs = [
#     ('WECC', 'MRO'),
    ('WECC', 'SPP'),
    ('WECC', 'TRE'),
    ('TRE', 'SPP'),
    ('TRE', 'SERC'),
    ('SPP', 'MRO'),
    ('SPP', 'SERC'),
    ('SERC', 'FRCC'),
    ('SERC', 'MRO'),
    ('RFC', 'SERC'),
    ('RFC', 'NPCC'),
    ('RFC', 'MRO'),
    ('MRO', 'NPCC'),
    ('MRO', 'WECC')
]

In [49]:
index.reset_index(inplace=True)
index.set_index(['nerc', 'datetime'], inplace=True)
index.sort_index(inplace=True)

In [51]:
rolling_corr_plot(index, region_pairs=region_pairs, window=48,
                       detrend_series=True, seasonal=True, fill_alpha=0.4)
path = join(cwd, '..', 'Figures',
            '__Rolling correlation - seasonal detrend 48m centered {}.pdf'.format(file_date))
plt.savefig(path, bbox_inches='tight')



In [54]:
rolling_corr_plot(index, region_pairs=region_pairs, window=48,
                       detrend_series=True, seasonal=True, fill_alpha=0.4)
path = join(cwd, '..', 'Figures',
            '_Rolling correlation - seasonal detrend 48m centered {}.pdf'.format(file_date))
plt.savefig(path, bbox_inches='tight')


Additional rolling windows for comparison and the SI


In [85]:
rolling_corr_plot(index, region_pairs=region_pairs, window=12,
                  detrend_series=True, seasonal=True,
                  sup_title='12-Month Window')
path = join(cwd, '..', 'Figures', 'SI',
            'Rolling correlation - seasonal detrend 12m centered.pdf')
plt.savefig(path, bbox_inches='tight')



In [86]:
rolling_corr_plot(index, region_pairs=region_pairs, window=24,
                  detrend_series=True, seasonal=True,
                  sup_title='24-Month Window')
path = join(cwd, '..', 'Figures', 'SI',
            'Rolling correlation - seasonal detrend 24m centered.pdf')
plt.savefig(path, bbox_inches='tight')



In [87]:
rolling_corr_plot(index, region_pairs=region_pairs, window=36,
                  detrend_series=True, seasonal=True, 
                  sup_title='36-Month Window')
path = join(cwd, '..', 'Figures', 'SI',
            'Rolling correlation - seasonal detrend 36m centered.pdf')
plt.savefig(path, bbox_inches='tight')



In [88]:
rolling_corr_plot(index, region_pairs=region_pairs, window=72,
                  detrend_series=True, seasonal=True,
                  sup_title='72-Month Window')
path = join(cwd, '..', 'Figures', 'SI',
            'Rolling correlation - seasonal detrend 72m centered.pdf')
plt.savefig(path, bbox_inches='tight')


Changes in CO₂ intensity for each NERC region


In [129]:
index.head()


Out[129]:
index change since 2005 elec fuel co2 (kg) final co2 (kg) generation (mwh) index (g/kwh) index (lb/mwh) month quarter year
nerc datetime
ASCC 2001-01-01 0 NaN NaN 2.780216e+08 510124.027 545.007802 NaN 1 NaN 2001
2001-02-01 1 NaN NaN 2.694399e+08 457691.568 588.693180 NaN 2 NaN 2001
2001-03-01 2 NaN NaN 2.944447e+08 493876.659 596.190762 NaN 3 NaN 2001
2001-04-01 3 NaN NaN 2.357533e+08 417279.658 564.976768 NaN 4 NaN 2001
2001-05-01 4 NaN NaN 2.306594e+08 409477.108 563.302275 NaN 5 NaN 2001

In [130]:
annual_index = (index#.drop('index', axis=1)
                     #.reset_index()
                     .groupby(['year', 'nerc'])
                     .sum())
annual_index['index (g/kwh)'] = (annual_index['final co2 (kg)']
                                 / annual_index['generation (mwh)'])
annual_index.sort_index(inplace=True)

In [131]:
annual_index.head()


Out[131]:
index change since 2005 elec fuel co2 (kg) final co2 (kg) generation (mwh) index (g/kwh) index (lb/mwh) month quarter
year nerc
2001 ASCC 66 0.0 0.0 3.070633e+09 5.527373e+06 555.532109 0.0 78 0.0
FRCC 210 0.0 0.0 1.179943e+11 1.821597e+08 647.751776 0.0 78 0.0
HICC 354 0.0 0.0 8.294079e+09 1.043156e+07 795.094610 0.0 78 0.0
MRO 498 0.0 0.0 1.622004e+11 1.898481e+08 854.369109 0.0 78 0.0
NPCC 642 0.0 0.0 1.069848e+11 2.580338e+08 414.615688 0.0 78 0.0

In [132]:
annual_index.loc[idx[:, 'USA'], :]


Out[132]:
index change since 2005 elec fuel co2 (kg) final co2 (kg) generation (mwh) index (g/kwh) index (lb/mwh) month quarter
year nerc
2001 USA 66 0.606058 2.324500e+12 2.354064e+12 3.736520e+09 630.015064 16650.621982 78 30.0
2002 USA 210 0.314713 2.338800e+12 2.374177e+12 3.858190e+09 615.360155 16265.800138 78 30.0
2003 USA 354 0.376908 2.363502e+12 2.401018e+12 3.883111e+09 618.323248 16347.949716 78 30.0
2004 USA 498 0.023699 2.371580e+12 2.384307e+12 3.970519e+09 600.502724 15881.416116 78 30.0
2005 USA 642 -0.006651 2.433250e+12 2.429592e+12 4.055197e+09 599.130379 15841.328792 78 30.0
2006 USA 786 -0.278066 2.382162e+12 2.380147e+12 4.064621e+09 585.576580 15482.833029 78 30.0
2007 USA 930 -0.224737 2.446017e+12 2.444883e+12 4.156669e+09 588.183198 15553.271662 78 30.0
2008 USA 1074 -0.375223 2.392687e+12 2.391431e+12 4.119315e+09 580.540770 15354.503153 78 30.0
2009 USA 1218 -0.981000 2.178342e+12 2.175400e+12 3.950295e+09 550.693083 14554.367662 78 30.0
2010 USA 1362 -0.842260 2.297302e+12 2.302771e+12 4.124852e+09 558.267590 14737.621193 78 30.0
2011 USA 1506 -1.236481 2.198723e+12 2.210254e+12 4.099904e+09 539.098894 14216.917323 78 30.0
2012 USA 1650 -1.749856 2.071110e+12 2.078066e+12 4.047592e+09 513.407988 13538.828654 78 30.0
2013 USA 1794 -1.695420 2.090106e+12 2.095628e+12 4.065839e+09 515.423252 13610.730107 78 30.0
2014 USA 1938 -1.812569 2.085528e+12 2.092147e+12 4.104659e+09 509.700555 13455.994941 78 30.0
2015 USA 2082 -2.412261 1.946828e+12 1.967419e+12 4.091520e+09 480.852898 12663.896467 78 30.0
2016 USA 2226 -3.002104 1.831294e+12 1.854639e+12 4.095209e+09 452.880147 11884.806641 78 30.0
2017 USA 2370 -3.253985 1.750835e+12 1.769787e+12 4.034598e+09 438.652664 11552.111724 78 30.0

In [151]:
regions = [
    'TRE',
    'SERC',
    'WECC',
    'USA',
    'RFC',
    'FRCC',
    'NPCC',
    'SPP',
    'MRO'
]
index2001 = annual_index.loc[2001, 'index (g/kwh)'][regions]
index2017 = annual_index.loc[2017, 'index (g/kwh)'][regions]

table_df = pd.DataFrame(data={2001:index2001, 2017:index2017})
table_df['Reduction'] = table_df[2001] - table_df[2017]
table_df['Percent Reduction'] = table_df['Reduction'] / table_df[2001]
table_df.sort_values('Reduction', inplace=True)

Export for use in the NERC map


In [153]:
path = join(cwd, '..', 'Data storage', 'final NERC data',
            'Summary table {}.csv'.format(file_date))
table_df.to_csv(path)

Annual CO₂ intensity in 2001 and 2017, and change in intensity


In [152]:
table_df['Annual Reduction'] = table_df['Reduction'] / (2017 - 2001)
table_df


Out[152]:
2001 2017 Reduction Percent Reduction Annual Reduction
nerc
TRE 609.537529 439.289665 170.247864 0.279307 10.640492
WECC 520.591372 346.313227 174.278145 0.334770 10.892384
USA 630.015064 438.652664 191.362400 0.303743 11.960150
SERC 633.312208 425.263073 208.049135 0.328510 13.003071
RFC 675.979943 461.569398 214.410546 0.317185 13.400659
FRCC 647.751776 410.050361 237.701415 0.366964 14.856338
NPCC 414.615688 172.799260 241.816428 0.583230 15.113527
MRO 854.369109 545.642751 308.726358 0.361350 19.295397
SPP 854.155315 491.243974 362.911341 0.424877 22.681959

In [ ]: