Supply curve figures of coal and petroleum resources

Sources I used for parts of this, and other things that might be helpful:

Patches to make a histogram
Text properties and layout
Add text with arrow line pointing somewhere (annotate) (you can probably get rid of the arrow head for just a line)

I had trouble finding a default color set with 11 colors for the petroleum data. Eventually discovered Palettable, and am using Paired_11 here. It is not my first choice of a color plot, but is one of the few defaults with that many colors. I generally recommend using one of the default seaborn color palettes such as muted or deep. Check out the seaborn tutorial for a basic primer colors.



In [2]:

    
%matplotlib inline
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.patches as patches
import matplotlib.path as path
from palettable.colorbrewer.qualitative import Paired_11

Load coal data

Data is from:

McGlade, C & Ekins, P. The geographical distribution of fossil fuels unused when limiting global warming to 2 °C. Nature 517, 187–190. (2015) doi:10.1038/nature14016

Coal data from Figure 1c.



In [4]:

    
fn = 'nature14016-f1.xlsx'
sn = 'Coal data'



In [5]:

    
coal_df = pd.read_excel(fn, sn)

Fortuntely the Cost values are already sorted in ascending order. Cost will be on the y-axis, and cumulative recoverable resources will be on the x-axis.



In [6]:

    
coal_df.head()









    Out[6]:






  
    
      
      Resource
      Quantity (ZJ)
      Cost (2010$/GJ)
    
  
  
    
      0
      Lignite reserves
      0.022
      0.829
    
    
      1
      Lignite reserves
      0.275
      0.934
    
    
      2
      Lignite reserves
      0.021
      1.145
    
    
      3
      Lignite reserves
      0.114
      1.255
    
    
      4
      Lignite reserves
      0.000
      1.356



In [7]:

    
coal_df.tail()









    Out[7]:






  
    
      
      Resource
      Quantity (ZJ)
      Cost (2010$/GJ)
    
  
  
    
      69
      Lignite resources
      0.081
      9.814
    
    
      70
      Lignite resources
      1.075
      9.980
    
    
      71
      Lignite resources
      0.057
      10.442
    
    
      72
      Hard coal resources
      0.162
      11.651
    
    
      73
      Lignite resources
      0.020
      14.497



In [20]:

    
names = coal_df['Resource'].values
amount = coal_df['Quantity (ZJ)'].values
cost = coal_df['Cost (2010$/GJ)'].values

Create a set of names to use for assigning colors and creating the legend

I'm not being picky about the order of colors.



In [21]:

    
name_set = set(names)
name_set









    Out[21]:





{u'Hard coal reserves',
 u'Hard coal resources',
 u'Lignite reserves',
 u'Lignite resources'}



In [22]:

    
color_dict = {}

for i, area in enumerate(name_set): 
    color_dict[area] = i #Assigning index position as value to resource name keys



In [23]:

    
color_dict









    Out[23]:





{u'Hard coal reserves': 3,
 u'Hard coal resources': 0,
 u'Lignite reserves': 2,
 u'Lignite resources': 1}



In [24]:

    
sns.color_palette('deep', n_colors=4, desat=.8)









    Out[24]:





[(0.33725490196078434, 0.45647058823529396, 0.6509803921568628),
 (0.36588235294117644, 0.6262745098039216, 0.4254901960784314),
 (0.7223529411764706, 0.35215686274509816, 0.36470588235294116),
 (0.5192156862745098, 0.47215686274509805, 0.6729411764705882)]



In [25]:

    
sns.palplot(sns.color_palette('deep', n_colors=4, desat=.8))

Define a function that returns the integer color choice based on the region name

Use the function color_match to create a Series with rgb colors that will be used for each box in the figure. Do this using the map operation, which applies a function to each element in a Pandas Series.



In [26]:

    
def color_match(name):
    return sns.color_palette('deep', n_colors=4, desat=.8)[color_dict[name]]

color has rgb values for each resource



In [28]:

    
color = coal_df['Resource'].map(color_match)



In [29]:

    
color.head()









    Out[29]:





0    (0.722352941176, 0.352156862745, 0.364705882353)
1    (0.722352941176, 0.352156862745, 0.364705882353)
2    (0.722352941176, 0.352156862745, 0.364705882353)
3    (0.722352941176, 0.352156862745, 0.364705882353)
4    (0.722352941176, 0.352156862745, 0.364705882353)
Name: Resource, dtype: object

Define the edges of the patch objects that will be drawn on the plot



In [30]:

    
# get the corners of the rectangles for the histogram
left = np.cumsum(np.insert(amount, 0, 0))
right = np.cumsum(np.append(amount, .01))
bottom = np.zeros(len(left))
top = np.append(cost, 0)

Make the figure (coal)



In [37]:

    
sns.set_style('whitegrid')
fig, ax = plt.subplots(figsize=(10,5))

# we need a (numrects x numsides x 2) numpy array for the path helper
# function to build a compound path
for i, name in enumerate(names):
    XY = np.array([[left[i:i+1], left[i:i+1], right[i:i+1], right[i:i+1]], 
                      [bottom[i:i+1], top[i:i+1], top[i:i+1], bottom[i:i+1]]]).T

    # get the Path object
    barpath = path.Path.make_compound_path_from_polys(XY)

    # make a patch out of it (a patch is the shape drawn on the plot)
    patch = patches.PathPatch(barpath, facecolor=color[i], ec='0.2')
    ax.add_patch(patch)
    
#Create patch elements for a custom legend
#The legend function expects multiple patch elements as a list
patch = [patches.Patch(color=sns.color_palette('deep', 4, 0.8)[color_dict[i]], label=i)
         for i in color_dict]

# Axis labels/limits, remove horizontal gridlines, etc
plt.ylabel('Cost (2010$/GJ)', size=14)
plt.xlabel('Quantity (ZJ)', size=14)
ax.set_xlim(left[0], right[-1])
ax.set_ylim(bottom.min(), 12)
ax.yaxis.grid(False)
ax.xaxis.grid(False)

#remove top and right spines (box lines around figure)
sns.despine() 

#Add the custom legend 
plt.legend(handles=patch, loc=2, fontsize=12)
plt.savefig('Example Supply Curve (coal).png')

Load oil data

Data is from:

McGlade, C & Ekins, P. The geographical distribution of fossil fuels unused when limiting global warming to 2 °C. Nature 517, 187–190. (2015) doi:10.1038/nature14016

I'm using data from Figure 1a.



In [11]:

    
fn = 'nature14016-f1.xlsx'
sn = 'Oil data'



In [12]:

    
df = pd.read_excel(fn, sn)

Fortuntely the Cost values are already sorted in ascending order. Cost will be on the y-axis, and cumulative recoverable resources will be on the x-axis.



In [13]:

    
df.head()









    Out[13]:






  
    
      
      Resource
      Quantity (Gb)
      Cost (2010$/bbl)
    
  
  
    
      0
      Conventional 2P reserves in production or sche...
      219.191
      11.730
    
    
      1
      NGL
      46.493
      11.731
    
    
      2
      Conventional 2P reserves in production or sche...
      4.599
      12.602
    
    
      3
      NGL
      1.345
      12.603
    
    
      4
      Conventional 2P reserves in production or sche...
      131.514
      14.304



In [14]:

    
df.tail()









    Out[14]:






  
    
      
      Resource
      Quantity (Gb)
      Cost (2010$/bbl)
    
  
  
    
      322
      In situ kerogen oil
      23.538
      98.038
    
    
      323
      Undiscovered
      0.173
      99.030
    
    
      324
      NGL
      0.040
      99.031
    
    
      325
      Arctic
      13.175
      108.030
    
    
      326
      NGL
      6.896
      108.031

Create arrays of values with easy to type names



In [15]:

    
names = df['Resource'].values
amount = df['Quantity (Gb)'].values
cost = df['Cost (2010$/bbl)'].values

Create a set of names to use for assigning colors and creating the legend

I'm not being picky about the order of colors.



In [16]:

    
name_set = set(names)
name_set









    Out[16]:





{u'Arctic',
 u'Conventional 2P reserves in production or scheduled',
 u'Extra-heavy',
 u'In situ kerogen oil',
 u'In situ natural bitumen',
 u'LTO',
 u'Mined kerogen oil',
 u'Mined natural bitumen',
 u'NGL',
 u'Reserve Growth',
 u'Undiscovered'}



In [17]:

    
color_dict = {}

for i, area in enumerate(name_set): 
    color_dict[area] = i #Assigning index position as value to resource name keys



In [18]:

    
color_dict









    Out[18]:





{u'Arctic': 3,
 u'Conventional 2P reserves in production or scheduled': 7,
 u'Extra-heavy': 4,
 u'In situ kerogen oil': 0,
 u'In situ natural bitumen': 5,
 u'LTO': 9,
 u'Mined kerogen oil': 8,
 u'Mined natural bitumen': 10,
 u'NGL': 2,
 u'Reserve Growth': 6,
 u'Undiscovered': 1}



In [53]:

    
sns.palplot(Paired_11.mpl_colors)

Define a function that returns the integer color choice based on the region name

Use the function color_match to create a Series with rgb colors that will be used for each box in the figure. Do this using the map operation, which applies a function to each element in a Pandas Series.



In [119]:

    
def color_match(name):
    return Paired_11.mpl_colors[color_dict[name]]



In [102]:

    
def color_match(name):
    return sns.husl_palette(n_colors=11, h=0.1, s=0.9, l=0.6)[color_dict[name]]



In [143]:

    
color_match('NGL')









    Out[143]:





(0.6980392156862745, 0.8745098039215686, 0.5411764705882353)



In [120]:

    
color = df['Resource'].map(color_match)

Define the edges of the patch objects that will be drawn on the plot



In [87]:

    
# get the corners of the rectangles for the histogram
left = np.cumsum(np.insert(amount, 0, 0))
right = np.cumsum(np.append(amount, .01))
bottom = np.zeros(len(left))
top = np.append(cost, 0)

Make the figure



In [145]:

    
sns.set_style('whitegrid')
fig, ax = plt.subplots(figsize=(10,5))

# we need a (numrects x numsides x 2) numpy array for the path helper
# function to build a compound path
for i, name in enumerate(names):
    XY = np.array([[left[i:i+1], left[i:i+1], right[i:i+1], right[i:i+1]], 
                      [bottom[i:i+1], top[i:i+1], top[i:i+1], bottom[i:i+1]]]).T

    # get the Path object
    barpath = path.Path.make_compound_path_from_polys(XY)

    # make a patch out of it (a patch is the shape drawn on the plot)
    patch = patches.PathPatch(barpath, facecolor=color[i], ec='0.8')
    ax.add_patch(patch)
    
#Create patch elements for a custom legend
#The legend function expects multiple patch elements as a list
patch = []
for i in color_dict:
    patch.append(patches.Patch(color=Paired_11.mpl_colors[color_dict[i]], 
                                label=i))

# Axis labels/limits, remove horizontal gridlines, etc
plt.ylabel('Cost (2010$/bbl)', size=14)
plt.xlabel('Quantity (Gb)', size=14)
ax.set_xlim(left[0], right[-2])
ax.set_ylim(bottom.min(), 120)
ax.yaxis.grid(False)
ax.xaxis.grid(False)

#remove top and right spines (box lines around figure)
sns.despine() 

#Add the custom legend 
plt.legend(handles=patch, loc=2, fontsize=12,
          ncol=2)
plt.savefig('Example Supply Curve.png')



In [ ]:

	Resource	Quantity (ZJ)	Cost (2010$/GJ)
0	Lignite reserves	0.022	0.829
1	Lignite reserves	0.275	0.934
2	Lignite reserves	0.021	1.145
3	Lignite reserves	0.114	1.255
4	Lignite reserves	0.000	1.356

	Resource	Quantity (ZJ)	Cost (2010$/GJ)
69	Lignite resources	0.081	9.814
70	Lignite resources	1.075	9.980
71	Lignite resources	0.057	10.442
72	Hard coal resources	0.162	11.651
73	Lignite resources	0.020	14.497

	Resource	Quantity (Gb)	Cost (2010$/bbl)
0	Conventional 2P reserves in production or sche...	219.191	11.730
1	NGL	46.493	11.731
2	Conventional 2P reserves in production or sche...	4.599	12.602
3	NGL	1.345	12.603
4	Conventional 2P reserves in production or sche...	131.514	14.304

	Resource	Quantity (Gb)	Cost (2010$/bbl)
322	In situ kerogen oil	23.538	98.038
323	Undiscovered	0.173	99.030
324	NGL	0.040	99.031
325	Arctic	13.175	108.030
326	NGL	6.896	108.031