In [1]:
%matplotlib inline
import pandas as pd
import seaborn as sn
import numpy as np
import matplotlib.pyplot as plt
plt.style.use('ggplot')
Our current sea-salt correction methodology for Mg, Ca and SO4 assumes (i) that all chloride is marine, and (ii) that no fractionation takes place between evaporation and deposition. These coarse assumptions work OK in many regions, but lead to negative values for "corrected" Mg* and SO4* in some of the Irish lakes. See the e-mail from Julian received 20.06.2019 at 16.22 for further details.
Apparently some labs tend to significantly overestimate chloride and Julian has suggested using Na as a tracer instead (subject to the caveats in Julian's e-mail). This notebook compares results obtained using correction methods based on Cl versus Na.
As a starting point, I've extracted all the ICPW data for the Irish lakes from 1990 to the present for Ca, Mg and SO4 (the three parameters we want to correct), plus Cl and Na.
In [2]:
# Read data
xl_path = r'../../../Thematic_Trends_Report_2019/ireland_high_chloride.xlsx'
df = pd.read_excel(xl_path, sheet_name='DATA')
# Suspect that values of *exactly* zero are errors
df[df==0] = np.nan
df.head()
Out[2]:
In [3]:
data = {'par': ['SO4', 'Mg', 'Ca', 'Na', 'Cl'],
'mol_mass': [96.06, 24.31, 40.08, 22.99, 35.45],
'valency': [2, 2, 2, 1, 1],
'sw_mgpl': [2700, 1290, 400, 10800, 19374]}
sw_df = pd.DataFrame(data)
sw_df['sw_ueqpl'] = 1000 * sw_df['valency'] * sw_df['sw_mgpl'] / sw_df['mol_mass']
sw_df
Out[3]:
As expected, the $\mu eq/l$ concentrations of Na and Cl in sea water are roughly the same, but the ratio is not exactly 1:1. For the sea-salt correction, we need the ratio of SO4, Ca and Mg to Na and to Cl.
In [4]:
corr_facs = {}
for par in ['SO4', 'Ca', 'Mg']:
sw_par = sw_df.query('par == @par')['sw_ueqpl'].values[0]
sw_cl = sw_df.query('par == "Cl"')['sw_ueqpl'].values[0]
sw_na = sw_df.query('par == "Na"')['sw_ueqpl'].values[0]
ratio_cl = sw_par / sw_cl
ratio_na = sw_par / sw_na
corr_facs['%s2cl' % par.lower()] = ratio_cl
corr_facs['%s2na' % par.lower()] = ratio_na
print(f'{par:3}:Cl {ratio_cl:.3f}')
print(f'{par:3}:Na {ratio_na:.3f}')
print('')
The ratios above for chloride are virtually the same as documented in our current workflow (here), so I assume the new ratios for Na should also be compatible. Note that the ratio of SO4:Na is about 20% higher than SO4:Cl, which might actually exacerbate the problem of negative values.
In [5]:
fig, axes = plt.subplots(nrows=5, ncols=4, figsize=(20,20))
axes = axes.flatten()
for idx, stn in enumerate(df['station_code'].unique()):
df_stn = df.query('station_code == @stn')
axes[idx].plot(df_stn['ECl'], df_stn['ENa'], 'ro', label='Raw data')
axes[idx].plot(df_stn['ECl'], df_stn['ECl'], 'k-', label='1:1 line')
axes[idx].set_title(stn)
axes[idx].set_xlabel('ECl (ueq/l)')
axes[idx].set_ylabel('ENa (ueq/l)')
axes[idx].legend(loc='best')
plt.tight_layout()
Based on these plots, I'd say concentrations of Na are also pretty high in these lakes (although in most cases Cl is even higher). I suspect that using Na as a marine "tracer" instead of Cl will still lead to negative values.
Using our original methodology, the corrected series for SO4* has the most negative values. The code below compares boxplots of SO4* calculated using Cl (red boxes) versus Na (blue boxes) for each site.
In [6]:
# Par of interest
par = 'ESO4'
df['%s*_Na' % par] = df[par] - (corr_facs['%s2na' % par[1:].lower()] * df['ENa'])
df['%s*_Cl' % par] = df[par] - (corr_facs['%s2cl' % par[1:].lower()] * df['ECl'])
df2 = df[['station_code', '%s*_Cl' % par, '%s*_Na' % par]]
df2 = df2.melt(id_vars=['station_code'], var_name='corr_method')
df2 = df2.dropna(how='any').reset_index(drop=True)
g = sn.catplot(x='corr_method',
y='value',
data=df2,
col='station_code',
col_wrap=4,
kind='box',
sharex=False,
sharey=False,
)
g.map(plt.axhline, y=0, lw=2, ls='--', c='k', alpha=0.4)
g.set(ylabel='%s* (ueq/l)' % par)
plt.tight_layout()
Overall, the results using Na are not much different from Cl: using Na marginally improves the situation at some locations, but actually makes it slightly worse at others. In general, it seems that Na and Cl are both high at these stations, so either the labs have a tendency to overestimate both parameters, or else the lakes really are just surprisingly "salty" compared to other parameters.
Some of these results seem quite extreme: at station IE12 (Galway, Bofin, Mid Lake), for example, more than half of ESO4* values are negative, while at IE17 (Galway, Fadda, Mid Lake) more than 75% of the "corrected" values are less than zero (regardless of whether Na or Cl is used as a tracer). I wonder if there's anything unusual about these two lakes in particular that could give a clue as to what's happening elsewhere?