Supply curve figures of coal and petroleum resources

Sources I used for parts of this, and other things that might be helpful:

I had trouble finding a default color set with 11 colors for the petroleum data. Eventually discovered Palettable, and am using Paired_11 here. It is not my first choice of a color plot, but is one of the few defaults with that many colors. I generally recommend using one of the default seaborn color palettes such as muted or deep. Check out the seaborn tutorial for a basic primer colors.


In [2]:
%matplotlib inline
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.patches as patches
import matplotlib.path as path
from palettable.colorbrewer.qualitative import Paired_11

Load coal data

Data is from:

McGlade, C & Ekins, P. The geographical distribution of fossil fuels unused when limiting global warming to 2 °C. Nature 517, 187–190. (2015) doi:10.1038/nature14016

Coal data from Figure 1c.


In [4]:
fn = 'nature14016-f1.xlsx'
sn = 'Coal data'

In [5]:
coal_df = pd.read_excel(fn, sn)

Fortuntely the Cost values are already sorted in ascending order. Cost will be on the y-axis, and cumulative recoverable resources will be on the x-axis.


In [6]:
coal_df.head()


Out[6]:
Resource Quantity (ZJ) Cost (2010$/GJ)
0 Lignite reserves 0.022 0.829
1 Lignite reserves 0.275 0.934
2 Lignite reserves 0.021 1.145
3 Lignite reserves 0.114 1.255
4 Lignite reserves 0.000 1.356

In [7]:
coal_df.tail()


Out[7]:
Resource Quantity (ZJ) Cost (2010$/GJ)
69 Lignite resources 0.081 9.814
70 Lignite resources 1.075 9.980
71 Lignite resources 0.057 10.442
72 Hard coal resources 0.162 11.651
73 Lignite resources 0.020 14.497

In [20]:
names = coal_df['Resource'].values
amount = coal_df['Quantity (ZJ)'].values
cost = coal_df['Cost (2010$/GJ)'].values

Create a set of names to use for assigning colors and creating the legend

I'm not being picky about the order of colors.


In [21]:
name_set = set(names)
name_set


Out[21]:
{u'Hard coal reserves',
 u'Hard coal resources',
 u'Lignite reserves',
 u'Lignite resources'}

In [22]:
color_dict = {}

for i, area in enumerate(name_set): 
    color_dict[area] = i #Assigning index position as value to resource name keys

In [23]:
color_dict


Out[23]:
{u'Hard coal reserves': 3,
 u'Hard coal resources': 0,
 u'Lignite reserves': 2,
 u'Lignite resources': 1}

In [24]:
sns.color_palette('deep', n_colors=4, desat=.8)


Out[24]:
[(0.33725490196078434, 0.45647058823529396, 0.6509803921568628),
 (0.36588235294117644, 0.6262745098039216, 0.4254901960784314),
 (0.7223529411764706, 0.35215686274509816, 0.36470588235294116),
 (0.5192156862745098, 0.47215686274509805, 0.6729411764705882)]

In [25]:
sns.palplot(sns.color_palette('deep', n_colors=4, desat=.8))


Define a function that returns the integer color choice based on the region name

Use the function color_match to create a Series with rgb colors that will be used for each box in the figure. Do this using the map operation, which applies a function to each element in a Pandas Series.


In [26]:
def color_match(name):
    return sns.color_palette('deep', n_colors=4, desat=.8)[color_dict[name]]

color has rgb values for each resource


In [28]:
color = coal_df['Resource'].map(color_match)

In [29]:
color.head()


Out[29]:
0    (0.722352941176, 0.352156862745, 0.364705882353)
1    (0.722352941176, 0.352156862745, 0.364705882353)
2    (0.722352941176, 0.352156862745, 0.364705882353)
3    (0.722352941176, 0.352156862745, 0.364705882353)
4    (0.722352941176, 0.352156862745, 0.364705882353)
Name: Resource, dtype: object

Define the edges of the patch objects that will be drawn on the plot


In [30]:
# get the corners of the rectangles for the histogram
left = np.cumsum(np.insert(amount, 0, 0))
right = np.cumsum(np.append(amount, .01))
bottom = np.zeros(len(left))
top = np.append(cost, 0)

Make the figure (coal)


In [37]:
sns.set_style('whitegrid')
fig, ax = plt.subplots(figsize=(10,5))

# we need a (numrects x numsides x 2) numpy array for the path helper
# function to build a compound path
for i, name in enumerate(names):
    XY = np.array([[left[i:i+1], left[i:i+1], right[i:i+1], right[i:i+1]], 
                      [bottom[i:i+1], top[i:i+1], top[i:i+1], bottom[i:i+1]]]).T

    # get the Path object
    barpath = path.Path.make_compound_path_from_polys(XY)

    # make a patch out of it (a patch is the shape drawn on the plot)
    patch = patches.PathPatch(barpath, facecolor=color[i], ec='0.2')
    ax.add_patch(patch)
    
#Create patch elements for a custom legend
#The legend function expects multiple patch elements as a list
patch = [patches.Patch(color=sns.color_palette('deep', 4, 0.8)[color_dict[i]], label=i)
         for i in color_dict]

# Axis labels/limits, remove horizontal gridlines, etc
plt.ylabel('Cost (2010$/GJ)', size=14)
plt.xlabel('Quantity (ZJ)', size=14)
ax.set_xlim(left[0], right[-1])
ax.set_ylim(bottom.min(), 12)
ax.yaxis.grid(False)
ax.xaxis.grid(False)

#remove top and right spines (box lines around figure)
sns.despine() 

#Add the custom legend 
plt.legend(handles=patch, loc=2, fontsize=12)
plt.savefig('Example Supply Curve (coal).png')


Load oil data

Data is from:

McGlade, C & Ekins, P. The geographical distribution of fossil fuels unused when limiting global warming to 2 °C. Nature 517, 187–190. (2015) doi:10.1038/nature14016

I'm using data from Figure 1a.


In [11]:
fn = 'nature14016-f1.xlsx'
sn = 'Oil data'

In [12]:
df = pd.read_excel(fn, sn)

Fortuntely the Cost values are already sorted in ascending order. Cost will be on the y-axis, and cumulative recoverable resources will be on the x-axis.


In [13]:
df.head()


Out[13]:
Resource Quantity (Gb) Cost (2010$/bbl)
0 Conventional 2P reserves in production or sche... 219.191 11.730
1 NGL 46.493 11.731
2 Conventional 2P reserves in production or sche... 4.599 12.602
3 NGL 1.345 12.603
4 Conventional 2P reserves in production or sche... 131.514 14.304

In [14]:
df.tail()


Out[14]:
Resource Quantity (Gb) Cost (2010$/bbl)
322 In situ kerogen oil 23.538 98.038
323 Undiscovered 0.173 99.030
324 NGL 0.040 99.031
325 Arctic 13.175 108.030
326 NGL 6.896 108.031

Create arrays of values with easy to type names


In [15]:
names = df['Resource'].values
amount = df['Quantity (Gb)'].values
cost = df['Cost (2010$/bbl)'].values

Create a set of names to use for assigning colors and creating the legend

I'm not being picky about the order of colors.


In [16]:
name_set = set(names)
name_set


Out[16]:
{u'Arctic',
 u'Conventional 2P reserves in production or scheduled',
 u'Extra-heavy',
 u'In situ kerogen oil',
 u'In situ natural bitumen',
 u'LTO',
 u'Mined kerogen oil',
 u'Mined natural bitumen',
 u'NGL',
 u'Reserve Growth',
 u'Undiscovered'}

In [17]:
color_dict = {}

for i, area in enumerate(name_set): 
    color_dict[area] = i #Assigning index position as value to resource name keys

In [18]:
color_dict


Out[18]:
{u'Arctic': 3,
 u'Conventional 2P reserves in production or scheduled': 7,
 u'Extra-heavy': 4,
 u'In situ kerogen oil': 0,
 u'In situ natural bitumen': 5,
 u'LTO': 9,
 u'Mined kerogen oil': 8,
 u'Mined natural bitumen': 10,
 u'NGL': 2,
 u'Reserve Growth': 6,
 u'Undiscovered': 1}

In [53]:
sns.palplot(Paired_11.mpl_colors)


Define a function that returns the integer color choice based on the region name

Use the function color_match to create a Series with rgb colors that will be used for each box in the figure. Do this using the map operation, which applies a function to each element in a Pandas Series.


In [119]:
def color_match(name):
    return Paired_11.mpl_colors[color_dict[name]]

In [102]:
def color_match(name):
    return sns.husl_palette(n_colors=11, h=0.1, s=0.9, l=0.6)[color_dict[name]]

In [143]:
color_match('NGL')


Out[143]:
(0.6980392156862745, 0.8745098039215686, 0.5411764705882353)

In [120]:
color = df['Resource'].map(color_match)

Define the edges of the patch objects that will be drawn on the plot


In [87]:
# get the corners of the rectangles for the histogram
left = np.cumsum(np.insert(amount, 0, 0))
right = np.cumsum(np.append(amount, .01))
bottom = np.zeros(len(left))
top = np.append(cost, 0)

Make the figure


In [145]:
sns.set_style('whitegrid')
fig, ax = plt.subplots(figsize=(10,5))

# we need a (numrects x numsides x 2) numpy array for the path helper
# function to build a compound path
for i, name in enumerate(names):
    XY = np.array([[left[i:i+1], left[i:i+1], right[i:i+1], right[i:i+1]], 
                      [bottom[i:i+1], top[i:i+1], top[i:i+1], bottom[i:i+1]]]).T

    # get the Path object
    barpath = path.Path.make_compound_path_from_polys(XY)

    # make a patch out of it (a patch is the shape drawn on the plot)
    patch = patches.PathPatch(barpath, facecolor=color[i], ec='0.8')
    ax.add_patch(patch)
    
#Create patch elements for a custom legend
#The legend function expects multiple patch elements as a list
patch = []
for i in color_dict:
    patch.append(patches.Patch(color=Paired_11.mpl_colors[color_dict[i]], 
                                label=i))

# Axis labels/limits, remove horizontal gridlines, etc
plt.ylabel('Cost (2010$/bbl)', size=14)
plt.xlabel('Quantity (Gb)', size=14)
ax.set_xlim(left[0], right[-2])
ax.set_ylim(bottom.min(), 120)
ax.yaxis.grid(False)
ax.xaxis.grid(False)

#remove top and right spines (box lines around figure)
sns.despine() 

#Add the custom legend 
plt.legend(handles=patch, loc=2, fontsize=12,
          ncol=2)
plt.savefig('Example Supply Curve.png')



In [ ]: