In [1]:

    
%matplotlib inline
import pandas as pd
import seaborn as sn
import numpy as np
import matplotlib.pyplot as plt

plt.style.use('ggplot')

Alternative sea-salt correction methodology for Ireland

Our current sea-salt correction methodology for Mg, Ca and SO4 assumes (i) that all chloride is marine, and (ii) that no fractionation takes place between evaporation and deposition. These coarse assumptions work OK in many regions, but lead to negative values for "corrected" Mg* and SO4* in some of the Irish lakes. See the e-mail from Julian received 20.06.2019 at 16.22 for further details.

Apparently some labs tend to significantly overestimate chloride and Julian has suggested using Na as a tracer instead (subject to the caveats in Julian's e-mail). This notebook compares results obtained using correction methods based on Cl versus Na.

As a starting point, I've extracted all the ICPW data for the Irish lakes from 1990 to the present for Ca, Mg and SO4 (the three parameters we want to correct), plus Cl and Na.

1. Raw dataset



In [2]:

    
# Read data
xl_path = r'../../../Thematic_Trends_Report_2019/ireland_high_chloride.xlsx'
df = pd.read_excel(xl_path, sheet_name='DATA')

# Suspect that values of *exactly* zero are errors
df[df==0] = np.nan

df.head()









    Out[2]:







  
    
      
      station_id
      station_code
      station_name
      date
      Ca
      Mg
      SO4
      Cl
      Na
      ECa
      EMg
      ESO4
      ECl
      ENa
      ECa*
      EMg*
      ESO4*
      ENa*
    
  
  
    
      0
      23552
      IE01
      Wicklow, Glendalough, Lake Upper, Mid Lake
      1990-03-08
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
    
    
      1
      23552
      IE01
      Wicklow, Glendalough, Lake Upper, Mid Lake
      1991-10-05
      1.61
      0.96
      NaN
      NaN
      5.83
      80.339
      78.973
      NaN
      NaN
      253.591
      NaN
      NaN
      NaN
      NaN
    
    
      2
      23552
      IE01
      Wicklow, Glendalough, Lake Upper, Mid Lake
      1992-06-11
      1.67
      0.68
      4.13
      6.7
      6.38
      83.333
      55.939
      85.986
      188.983
      277.514
      76.341
      18.899
      66.521
      115.178
    
    
      3
      23552
      IE01
      Wicklow, Glendalough, Lake Upper, Mid Lake
      1992-10-28
      1.5
      0.71
      3.46
      6.59
      4.33
      74.850
      58.407
      72.037
      185.880
      188.344
      67.973
      21.975
      52.891
      28.674
    
    
      4
      23552
      IE01
      Wicklow, Glendalough, Lake Upper, Mid Lake
      1993-03-04
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN

2. Reference values for sea water

The numbers below are taken from the World Data Centre for Precipitation Chemistry (WDCPC; PDF here), except for Ca, which I've taken from here.



In [3]:

    
data = {'par':      ['SO4', 'Mg', 'Ca', 'Na', 'Cl'],
        'mol_mass': [96.06, 24.31, 40.08, 22.99, 35.45],
        'valency':  [2, 2, 2, 1, 1],
        'sw_mgpl':  [2700, 1290, 400, 10800, 19374]}

sw_df = pd.DataFrame(data)

sw_df['sw_ueqpl'] = 1000 * sw_df['valency'] * sw_df['sw_mgpl'] / sw_df['mol_mass']

sw_df









    Out[3]:







  
    
      
      par
      mol_mass
      valency
      sw_mgpl
      sw_ueqpl
    
  
  
    
      0
      SO4
      96.06
      2
      2700
      56214.865709
    
    
      1
      Mg
      24.31
      2
      1290
      106129.164953
    
    
      2
      Ca
      40.08
      2
      400
      19960.079840
    
    
      3
      Na
      22.99
      1
      10800
      469769.464985
    
    
      4
      Cl
      35.45
      1
      19374
      546516.220028

As expected, the $\mu eq/l$ concentrations of Na and Cl in sea water are roughly the same, but the ratio is not exactly 1:1. For the sea-salt correction, we need the ratio of SO4, Ca and Mg to Na and to Cl.



In [4]:

    
corr_facs = {}
for par in ['SO4', 'Ca', 'Mg']:
    sw_par = sw_df.query('par == @par')['sw_ueqpl'].values[0]
    sw_cl = sw_df.query('par == "Cl"')['sw_ueqpl'].values[0]
    sw_na = sw_df.query('par == "Na"')['sw_ueqpl'].values[0]
    
    ratio_cl = sw_par / sw_cl
    ratio_na = sw_par / sw_na
    
    corr_facs['%s2cl' % par.lower()] = ratio_cl
    corr_facs['%s2na' % par.lower()] = ratio_na
    
    print(f'{par:3}:Cl   {ratio_cl:.3f}')
    print(f'{par:3}:Na   {ratio_na:.3f}')
    print('')









    



SO4:Cl   0.103
SO4:Na   0.120

Ca :Cl   0.037
Ca :Na   0.042

Mg :Cl   0.194
Mg :Na   0.226

The ratios above for chloride are virtually the same as documented in our current workflow (here), so I assume the new ratios for Na should also be compatible. Note that the ratio of SO4:Na is about 20% higher than SO4:Cl, which might actually exacerbate the problem of negative values.

2. Compare Cl to Na in Irish lakes



In [5]:

    
fig, axes = plt.subplots(nrows=5, ncols=4, figsize=(20,20))
axes = axes.flatten()

for idx, stn in enumerate(df['station_code'].unique()):
    df_stn = df.query('station_code == @stn')
    axes[idx].plot(df_stn['ECl'], df_stn['ENa'], 'ro', label='Raw data')
    axes[idx].plot(df_stn['ECl'], df_stn['ECl'], 'k-', label='1:1 line')
    
    axes[idx].set_title(stn)
    axes[idx].set_xlabel('ECl (ueq/l)')
    axes[idx].set_ylabel('ENa (ueq/l)')
    axes[idx].legend(loc='best')
    
plt.tight_layout()

Based on these plots, I'd say concentrations of Na are also pretty high in these lakes (although in most cases Cl is even higher). I suspect that using Na as a marine "tracer" instead of Cl will still lead to negative values.

4. Sea-salt correction for SO4

Using our original methodology, the corrected series for SO4* has the most negative values. The code below compares boxplots of SO4* calculated using Cl (red boxes) versus Na (blue boxes) for each site.



In [6]:

    
# Par of interest
par = 'ESO4'

df['%s*_Na' % par] = df[par] - (corr_facs['%s2na' % par[1:].lower()] * df['ENa'])
df['%s*_Cl' % par] = df[par] - (corr_facs['%s2cl' % par[1:].lower()] * df['ECl'])

df2 = df[['station_code', '%s*_Cl' % par, '%s*_Na' % par]]

df2 = df2.melt(id_vars=['station_code'], var_name='corr_method')
df2 = df2.dropna(how='any').reset_index(drop=True)

g = sn.catplot(x='corr_method',
               y='value',
               data=df2,
               col='station_code',
               col_wrap=4,
               kind='box',
               sharex=False,
               sharey=False,
              )

g.map(plt.axhline, y=0, lw=2, ls='--', c='k', alpha=0.4)
g.set(ylabel='%s* (ueq/l)' % par)
plt.tight_layout()

Overall, the results using Na are not much different from Cl: using Na marginally improves the situation at some locations, but actually makes it slightly worse at others. In general, it seems that Na and Cl are both high at these stations, so either the labs have a tendency to overestimate both parameters, or else the lakes really are just surprisingly "salty" compared to other parameters.

Some of these results seem quite extreme: at station IE12 (Galway, Bofin, Mid Lake), for example, more than half of ESO4* values are negative, while at IE17 (Galway, Fadda, Mid Lake) more than 75% of the "corrected" values are less than zero (regardless of whether Na or Cl is used as a tracer). I wonder if there's anything unusual about these two lakes in particular that could give a clue as to what's happening elsewhere?

	station_id	station_code	station_name	date	Ca	Mg	SO4	Cl	Na	ECa	EMg	ESO4	ECl	ENa	ECa*	EMg*	ESO4*	ENa*
0	23552	IE01	Wicklow, Glendalough, Lake Upper, Mid Lake	1990-03-08	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
1	23552	IE01	Wicklow, Glendalough, Lake Upper, Mid Lake	1991-10-05	1.61	0.96	NaN	NaN	5.83	80.339	78.973	NaN	NaN	253.591	NaN	NaN	NaN	NaN
2	23552	IE01	Wicklow, Glendalough, Lake Upper, Mid Lake	1992-06-11	1.67	0.68	4.13	6.7	6.38	83.333	55.939	85.986	188.983	277.514	76.341	18.899	66.521	115.178
3	23552	IE01	Wicklow, Glendalough, Lake Upper, Mid Lake	1992-10-28	1.5	0.71	3.46	6.59	4.33	74.850	58.407	72.037	185.880	188.344	67.973	21.975	52.891	28.674
4	23552	IE01	Wicklow, Glendalough, Lake Upper, Mid Lake	1993-03-04	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN

	par	mol_mass	valency	sw_mgpl	sw_ueqpl
0	SO4	96.06	2	2700	56214.865709
1	Mg	24.31	2	1290	106129.164953
2	Ca	40.08	2	400	19960.079840
3	Na	22.99	1	10800	469769.464985
4	Cl	35.45	1	19374	546516.220028