The purpose of this notebook is to provide the data and the plotting / analysis scripts that were used in the paper: Sylvester, Z., Pirmez, C., Cantelli, A., & Jobe, Z. R. (2013). Global (latitudinal) variation in submarine channel sinuosity: COMMENT. Geology, 41(5), e287–e287. doi:10.1130/G33548C.1. If you have comments or questions, please send them to zoltan_dot_sylvester_at_gmail_dot_com.
The original article (that is discussed in our comment) and the reply to our comment are as follows:
Peakall, J., Kane, I. A., Masson, D. G., Keevil, G., McCaffrey, W., & Corney, R. (2011). Global (latitudinal) variation in submarine channel sinuosity. Geology, 40(1), 11–14. doi:10.1130/G32295.1.
Peakall, J., Wells, M. G., Cossu, R., Kane, I. A., Masson, D. G., Keevil, G. M., et al. (2013). Global (latitudinal) variation in submarine channel sinuosity: REPLY. Geology, 41(5), e288–e288. doi:10.1130/G34319Y.1.
In [1]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from scipy import stats
%matplotlib inline
pd.set_option('display.mpl_style', 'default') # nicer plots than the default matplotlib style
%config InlineBackend.figure_format = 'svg' # set backend to SVG
In [2]:
fileurl = 'https://dl.dropboxusercontent.com/u/25694950/channel_sinuosities.csv'
data = pd.read_csv(fileurl)
The majority of the data comes from the book by Clark and Pickering (1996), which also forms the basis of the analysis by Peakall et al. (2011); graphs in the book were digitized using Graphclick. Data for the Amazon Channel comes from Pirmez and Flood (1995) (provided by Carlos Pirmez); and Popescu et al. (2001) (centerline digitized from map in paper and half-wavelength sinuosities calculated from centerline). See full references in paper.
Let's have a look at the first 20 lines of the data table:
In [3]:
data[:20]
Out[3]:
A bar graph of data point counts for each system shows that the Amazon Channel has by far the largest number of values:
In [4]:
plt.figure(figsize=(7,5))
data['Name'].value_counts().plot(kind='bar');
Next we plot sinuosity against valley slope and sinuosity against latitude. Data points are colored as a function of they come from systems with larger rivers or smaller rivers. Larger sinuosities tend to be restricted to lower latitudes, as suggested by Peakall et al. (2011); but this is not necessarily caused by stronger Coriolis forces at higher latitudes. One possibility is that no significant sediment volumes reach deeper waters at higher latitudes due to the lack of large rivers that would provide the source.
In [5]:
plt.figure(figsize=(7,5))
groups = data.groupby('Large_river')
labels = ('no large river','large river')
colors = (plt.rcParams['axes.color_cycle'][4],plt.rcParams['axes.color_cycle'][0])
for name, group in groups:
plt.plot(group.Slope, group.Sinuosity, marker='o', linestyle='', ms=6, label=labels[name], color=colors[name])
plt.legend(loc='best', fontsize=10)
plt.xlabel('valley slope', fontsize=12)
plt.ylabel('sinuosity', fontsize=12)
plt.axis([0.0,0.03,1.0,3.0]);
In [7]:
plt.figure(figsize=(7,5))
groups = data.groupby('Large_river')
plt.labels = ('no large river','large river')
plt.colors = (plt.rcParams['axes.color_cycle'][4],plt.rcParams['axes.color_cycle'][0])
for name, group in groups:
plt.plot(group.Latitude, group.Sinuosity, marker='o', linestyle='', ms=6, label=labels[name], color=colors[name])
plt.legend(loc='best', fontsize=10)
plt.xlabel('latitude', fontsize=12)
plt.ylabel('sinuosity', fontsize=12)
plt.axis([0.0,70,1.0,3.0]);
This function is used to do a bootstrap on the sinuosity data and look at the correlations with linear regression:
In [8]:
def bootstrap_regression(x,y,nit):
slope, intercept, r_value, p_value, std_err = stats.linregress(x, y)
r = np.zeros(nit)
n = len(x)
for i in range(1,nit):
x1 = x[np.random.randint(n, size=n)]
y1 = y[np.random.randint(n, size=n)]
slope, intercept, r_value2, p_value2, std_err = stats.linregress(x1,y1)
r[i] = r_value2
if r_value>0:
p_value_boot = len(r[r >= r_value])/float(len(r))
else:
p_value_boot = len(r[r <= r_value])/float(len(r))
return r_value, p_value, p_value_boot
In [10]:
r_values = np.zeros([3,1])
p_values = np.zeros([3,1])
p_values_boot = np.zeros([3,1])
r_values[0], p_values[0], p_values_boot[0] = bootstrap_regression(data.Latitude,data.Sinuosity,1000)
# for variable pairs that involve slope, we can only use those data points that have slope measurements:
r_values[1], p_values[1], p_values_boot[1] = bootstrap_regression(data.Slope[np.isnan(data.Slope)==0],data.Sinuosity[np.isnan(data.Slope)==0],1000)
r_values[2], p_values[2], p_values_boot[2] = bootstrap_regression(data.Latitude[np.isnan(data.Slope)==0],data.Slope[np.isnan(data.Slope)==0],1000)
Create a dataframe that contains the result of the regression:
In [11]:
df = pd.DataFrame(np.hstack(([[292],[268],[268]],r_values,r_values**2,
p_values,p_values_boot)),columns=['n','Pearson R','R squared','p-value',
'p-value bootstrap'],index=('latitude and sinuosity','slope and sinuosity','latitude and slope'))
In [12]:
df
Out[12]:
In [ ]: