In [1]:
%matplotlib inline
import matplotlib.pyplot as plt, seaborn as sn, mpld3
import pandas as pd, imp, glob, os, numpy as np
from sqlalchemy import create_engine
sn.set_context('notebook')

TOC trends October 2016 (part 2)

This notebook continues the work described here, where my latest trends code was modified and tested. My aim here is to use the code to generate trends results for three key periods of interest (see e-mail from Heleen 19/10/2016 at 10:24):

  • 1990-2012
  • 1990-2004
  • 1998-2012

as well as creating a fourth set of results using all of the data available for each site. This fourth set will hopefully help to identify any further obvious data issues (strange values etc.).

Update 27/10/2016: A few additional modifications have been made to the trends code, so that it now includes the number of non-missing values in the first and last 5 years. See e-mail from Heleen received 25/10/2016 at 15:56 for details. The code now performs the following additional calculations:

  • If start and/or end years are specified, the output includes the columns n_start and n_end, which specify the number of non-null values within 5 years of the start and end years, respectively.

  • If start and/or end years are not specified, the n_start and n_end columns record the number of non-null values within 5 years of the start and end of the data series.
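To make the n_start and n_end logic concrete, here is a minimal sketch of the counting step for a single annual series. The function name count_start_end and the toy data are hypothetical, and the exact window convention (here the first 5 and last 5 calendar years, inclusive of the start and end years) is an assumption rather than a copy of what toc_trends_analysis.py does.

```python
import numpy as np
import pandas as pd


def count_start_end(series, st_yr=None, end_yr=None, window=5):
    """Count non-null values in the first and last `window` years.

    `series` is an annual series indexed by integer year. If st_yr or
    end_yr is None, the first or last year with data is used instead.
    ASSUMPTION: the windows are [st_yr, st_yr + window) and
    (end_yr - window, end_yr]; the real code may differ.
    """
    valid = series.dropna()
    if st_yr is None:
        st_yr = valid.index.min()
    if end_yr is None:
        end_yr = valid.index.max()

    years = valid.index
    n_start = int(((years >= st_yr) & (years < st_yr + window)).sum())
    n_end = int(((years > end_yr - window) & (years <= end_yr)).sum())
    return n_start, n_end


# Toy annual series for 1990-2012 with gaps at both ends
toy = pd.Series(np.arange(23, dtype=float), index=range(1990, 2013))
toy.loc[[1990, 1991, 2012]] = np.nan

fixed_period = count_start_end(toy, 1990, 2012)  # period specified
whole_series = count_start_end(toy)              # period taken from the data
```

With the period fixed at 1990-2012, the missing values in 1990, 1991 and 2012 reduce the counts; with no period specified, the windows are anchored on the first and last years that actually have data.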

These changes do not affect the plotting code, so I haven't re-generated the plots, but I have replaced the various output spreadsheets.

1. Import functions and specify user input


In [2]:
# Import custom functions
# Connect to db
resa2_basic_path = (r'C:\Data\James_Work\Staff\Heleen_d_W\ICP_Waters\Upload_Template'
                    r'\useful_resa2_code.py')

resa2_basic = imp.load_source('useful_resa2_code', resa2_basic_path)

engine, conn = resa2_basic.connect_to_resa2()

# Import code for trends analysis
resa2_trends_path = (r'C:\Data\James_Work\Staff\Heleen_d_W\ICP_Waters\TOC_Trends_Analysis_2015'
                     r'\Python\icpw\toc_trends_analysis.py')

resa2_trends = imp.load_source('toc_trends_analysis', resa2_trends_path)

In [3]:
# User input
# Specify projects of interest
proj_list = ['ICPW_TOCTRENDS_2015_CA_ATL',
             'ICPW_TOCTRENDS_2015_CA_DO',
             'ICPW_TOCTRENDS_2015_CA_ICPW',
             'ICPW_TOCTRENDS_2015_CA_NF',
             'ICPW_TOCTRENDS_2015_CA_QU',
             'ICPW_TOCTRENDS_2015_CZ',
             'ICPW_TOCTRENDS_2015_Cz2',
             'ICPW_TOCTRENDS_2015_FI',
             'ICPW_TOCTRENDS_2015_NO',
             'ICPW_TOCTRENDS_2015_SE',
             'ICPW_TOCTRENDS_2015_UK',
             'ICPW_TOCTRENDS_2015_US_LTM']

# Specify results folder
res_fold = (r'C:\Data\James_Work\Staff\Heleen_d_W\ICP_Waters\TOC_Trends_Analysis_2015'
            r'\Results')

2. 1990 to 2012


In [4]:
# Run analysis

# Specify period of interest
st_yr, end_yr = 1990, 2012

# Build output paths
plot_fold = os.path.join(res_fold, 'trends_plots_%s-%s' % (st_yr, end_yr))
res_csv = os.path.join(res_fold, 'res_%s-%s.csv' % (st_yr, end_yr))
dup_csv = os.path.join(res_fold, 'dup_%s-%s.csv' % (st_yr, end_yr))
nd_csv = os.path.join(res_fold, 'nd_%s-%s.csv' % (st_yr, end_yr))

# Run analysis 
res_df, dup_df, nd_df = resa2_trends.run_trend_analysis(proj_list, engine,
                                                        st_yr=st_yr, end_yr=end_yr,
                                                        plot=False, fold=False)

# Delete mk_std_dev col as not relevant here
del res_df['mk_std_dev']

# Write output
res_df.to_csv(res_csv, index=False)
dup_df.to_csv(dup_csv, index=False)
nd_df.to_csv(nd_csv, index=False)


Extracting data from RESA2...
    The database contains duplicate values for some station-date-parameter combinations.
    Only the most recent values will be used, but you should check the repeated values are not errors.
    The duplicated entries are returned in a separate dataframe.

    Some stations have no relevant data in the period specified. Their IDs are returned in a separate dataframe.

    Done.

Converting units and applying sea-salt correction...
    Done.

Calculating statistics...
    Data series for Al at site 101 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 102 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 103 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 104 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 119 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 128 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 147 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 158 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 173 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 185 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 193 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 23456 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 23457 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 23458 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 23460 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for TOC at site 23468 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for EH at site 23468 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for ESO4 at site 23468 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for ECl at site 23468 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for ENO3 at site 23468 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for ESO4X at site 23468 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for ESO4_ECl at site 23468 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for ECa_EMg at site 23468 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for ECaX_EMgX at site 23468 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for ANC at site 23468 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 23545 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 23546 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 36547 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 36560 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 36565 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for EH at site 36733 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for EH at site 36739 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for EH at site 36750 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for EH at site 36753 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for EH at site 36793 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for EH at site 36797 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for TOC at site 37835 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Done.

Finished.
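The warnings above relate to the significance testing. The output column names (mk_stat, norm_mk_stat, mk_p_val, sen_slp) suggest a Mann-Kendall test and a Sen's slope estimate, but the implementation in toc_trends_analysis.py is not shown in this notebook, so the following is only a textbook sketch under stated assumptions: no missing values, equally spaced (annual) time steps, and the normal approximation without a tie correction. The function names are hypothetical.

```python
import math
from itertools import combinations


def mann_kendall(values):
    """Return (S, Z, two-sided p) for a gap-free series.

    Uses the normal approximation with no tie correction,
    which is a simplification of the full test.
    """
    n = len(values)
    # S = number of increasing pairs minus number of decreasing pairs
    s = sum((b > a) - (b < a) for a, b in combinations(values, 2))
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    # Two-sided p-value from the standard normal distribution
    p = math.erfc(abs(z) / math.sqrt(2))
    return s, z, p


def sen_slope(values):
    """Median of all pairwise slopes, assuming unit time steps
    and at least two values."""
    slopes = sorted((values[j] - values[i]) / float(j - i)
                    for i, j in combinations(range(len(values)), 2))
    mid = len(slopes) // 2
    if len(slopes) % 2:
        return slopes[mid]
    return 0.5 * (slopes[mid - 1] + slopes[mid])


# A perfectly monotonic series of 10 values vs. one of only 4 values
s10, z10, p10 = mann_kendall(list(range(1, 11)))
s4, z4, p4 = mann_kendall([1, 2, 3, 4])
slope10 = sen_slope(list(range(1, 11)))
```

Under this approximation, even a perfectly increasing series of 4 values is not significant at the 0.05 level, whereas the 10-value series is. This illustrates why significance estimates for very short series should be treated with caution.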

3. 1990 to 2004


In [5]:
# Run analysis

# Specify period of interest
st_yr, end_yr = 1990, 2004

# Build output paths
plot_fold = os.path.join(res_fold, 'trends_plots_%s-%s' % (st_yr, end_yr))
res_csv = os.path.join(res_fold, 'res_%s-%s.csv' % (st_yr, end_yr))
dup_csv = os.path.join(res_fold, 'dup_%s-%s.csv' % (st_yr, end_yr))
nd_csv = os.path.join(res_fold, 'nd_%s-%s.csv' % (st_yr, end_yr))

# Run analysis 
res_df, dup_df, nd_df = resa2_trends.run_trend_analysis(proj_list, engine,
                                                        st_yr=st_yr, end_yr=end_yr,
                                                        plot=False, fold=False)

# Delete mk_std_dev col as not relevant here
del res_df['mk_std_dev']

# Write output
res_df.to_csv(res_csv, index=False)
dup_df.to_csv(dup_csv, index=False)
nd_df.to_csv(nd_csv, index=False)


Extracting data from RESA2...
    The database contains duplicate values for some station-date-parameter combinations.
    Only the most recent values will be used, but you should check the repeated values are not errors.
    The duplicated entries are returned in a separate dataframe.

    Some stations have no relevant data in the period specified. Their IDs are returned in a separate dataframe.

    Done.

Converting units and applying sea-salt correction...
    Done.

Calculating statistics...
    Data series for Al at site 23456 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 23457 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 23458 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 23459 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 23460 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for TOC at site 23468 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for EH at site 23468 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for ESO4 at site 23468 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for ECl at site 23468 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for ENO3 at site 23468 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for ESO4X at site 23468 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for ESO4_ECl at site 23468 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for ECa_EMg at site 23468 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for ECaX_EMgX at site 23468 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for ANC at site 23468 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 23542 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 23545 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 23546 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 23547 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 23548 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 23717 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for TOC at site 23717 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for EH at site 23717 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for ESO4 at site 23717 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for ECl at site 23717 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for ENO3 at site 23717 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for ESO4X at site 23717 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for ESO4_ECl at site 23717 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for ECa_EMg at site 23717 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for ECaX_EMgX at site 23717 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for ANC at site 23717 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 36455 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 36547 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 36550 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 36551 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 36555 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 36556 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 36560 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 36565 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for EH at site 36575 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for EH at site 36578 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for EH at site 36584 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for EH at site 36588 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for EH at site 36592 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for EH at site 36636 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for EH at site 36660 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for EH at site 36670 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for EH at site 36675 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for EH at site 36680 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for EH at site 36690 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for EH at site 36711 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for EH at site 36723 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for EH at site 36731 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for EH at site 36788 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for EH at site 36797 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for EH at site 36813 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for EH at site 36825 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for EH at site 36826 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for ENO3 at site 36853 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for ANC at site 36853 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for TOC at site 36859 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for EH at site 36859 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for ESO4 at site 36859 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for ECl at site 36859 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for ENO3 at site 36859 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for ESO4X at site 36859 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for ESO4_ECl at site 36859 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for ECa_EMg at site 36859 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for ECaX_EMgX at site 36859 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for ANC at site 36859 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for TOC at site 37044 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Done.

Finished.

4. 1998 to 2012


In [6]:
# Run analysis

# Specify period of interest
st_yr, end_yr = 1998, 2012

# Build output paths
plot_fold = os.path.join(res_fold, 'trends_plots_%s-%s' % (st_yr, end_yr))
res_csv = os.path.join(res_fold, 'res_%s-%s.csv' % (st_yr, end_yr))
dup_csv = os.path.join(res_fold, 'dup_%s-%s.csv' % (st_yr, end_yr))
nd_csv = os.path.join(res_fold, 'nd_%s-%s.csv' % (st_yr, end_yr))

# Run analysis 
res_df, dup_df, nd_df = resa2_trends.run_trend_analysis(proj_list, engine,
                                                        st_yr=st_yr, end_yr=end_yr,
                                                        plot=False, fold=False)

# Delete mk_std_dev col as not relevant here
del res_df['mk_std_dev']

# Write output
res_df.to_csv(res_csv, index=False)
dup_df.to_csv(dup_csv, index=False)
nd_df.to_csv(nd_csv, index=False)


Extracting data from RESA2...
    The database contains duplicate values for some station-date-parameter combinations.
    Only the most recent values will be used, but you should check the repeated values are not errors.
    The duplicated entries are returned in a separate dataframe.

    Some stations have no relevant data in the period specified. Their IDs are returned in a separate dataframe.

    Done.

Converting units and applying sea-salt correction...
    Done.

Calculating statistics...
    Data series for Al at site 101 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 102 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 103 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 104 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 119 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 128 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 147 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 158 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 173 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 185 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 193 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 23456 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 23457 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 23458 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 23459 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 23460 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for TOC at site 23466 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 23545 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 23546 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 23547 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 23548 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 36455 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 36547 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 36558 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 36560 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 36565 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 36566 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for EH at site 36733 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for EH at site 36739 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for EH at site 36750 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for EH at site 36753 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for EH at site 36793 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for EH at site 36797 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for TOC at site 37835 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Done.

Finished.

5. All data


In [7]:
# Run analysis

# Specify period of interest
st_yr, end_yr = None, None

# Build output paths
plot_fold = os.path.join(res_fold, 'trends_plots_all_years')
res_csv = os.path.join(res_fold, 'res_all_years.csv')
dup_csv = os.path.join(res_fold, 'dup_all_years.csv')
nd_csv = os.path.join(res_fold, 'nd_all_years.csv')

# Run analysis 
res_df, dup_df, nd_df = resa2_trends.run_trend_analysis(proj_list, engine,
                                                        st_yr=st_yr, end_yr=end_yr,
                                                        plot=False, fold=False)

# Delete mk_std_dev col as not relevant here
del res_df['mk_std_dev']

# Write output
res_df.to_csv(res_csv, index=False)
dup_df.to_csv(dup_csv, index=False)
nd_df.to_csv(nd_csv, index=False)


Extracting data from RESA2...
    The database contains duplicate values for some station-date-parameter combinations.
    Only the most recent values will be used, but you should check the repeated values are not errors.
    The duplicated entries are returned in a separate dataframe.

    Some stations have no relevant data in the period specified. Their IDs are returned in a separate dataframe.

    Done.

Converting units and applying sea-salt correction...
    Done.

Calculating statistics...
    Data series for Al at site 101 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 102 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 103 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 104 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 107 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 109 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 112 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 115 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 118 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 119 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 120 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 121 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 122 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 123 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 128 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 132 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 134 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 135 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 144 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 146 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 147 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 150 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 156 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 158 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 161 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 162 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 163 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 166 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 168 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 170 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 173 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 176 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 179 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 180 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 181 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 182 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 183 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 185 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 192 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 193 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 196 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 12081 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for EH at site 23468 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 23546 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 36547 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 36560 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for EH at site 36733 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for EH at site 36739 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for EH at site 36750 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for EH at site 36753 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for EH at site 36793 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for EH at site 36797 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Data series for Al at site 37063 has fewer than 10 non-null values. Significance estimates may be unreliable.
    Done.

Finished.
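Sections 2 to 5 build their output paths in exactly the same way, differing only in the period tag. As a minimal sketch, a small helper could generate the paths for any period. The name build_output_paths is hypothetical; the file-name patterns are the ones used in the cells above.

```python
import os


def build_output_paths(res_fold, st_yr=None, end_yr=None):
    """Build the plot folder and CSV paths for one analysis period.

    The period is treated as "all years" only when both st_yr and
    end_yr are None (an assumption made for this sketch).
    """
    if st_yr is None and end_yr is None:
        tag = 'all_years'
    else:
        tag = '%s-%s' % (st_yr, end_yr)

    return {'plot_fold': os.path.join(res_fold, 'trends_plots_%s' % tag),
            'res_csv': os.path.join(res_fold, 'res_%s.csv' % tag),
            'dup_csv': os.path.join(res_fold, 'dup_%s.csv' % tag),
            'nd_csv': os.path.join(res_fold, 'nd_%s.csv' % tag)}


# The four periods analysed in this notebook
periods = [(1990, 2012), (1990, 2004), (1998, 2012), (None, None)]
all_paths = [build_output_paths('Results', st, end) for st, end in periods]
```

The analysis call and the to_csv steps could then be placed in a single loop over periods, which would avoid repeating the same cell four times.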

6. Basic checking

6.1. Boxplots

As a very basic check, let's create boxplots showing the long-term mean for each parameter at each site (i.e. each datapoint is the mean of all the annual means for a particular parameter at a single site). This should help identify any really extreme values that need further checking and cleaning. Note the following:

  • All values are in $\mu$eq/l, except for Al and TOC, which have units of $\mu$g/l and mg C/l, respectively.

  • The "whiskers" on the boxplots extend from the minimum to the maximum values in each dataset (i.e. they show the full data range, not a percentile interval or a multiple of the IQR).
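To illustrate the difference between full-range whiskers and the more common 1.5 x IQR convention, here is a small sketch using made-up numbers. The function whisker_limits is hypothetical and only approximates what a plotting library does internally.

```python
import numpy as np


def whisker_limits(values, whis=1.5):
    """Return the (lower, upper) whisker ends for a 1D dataset.

    With whis=np.inf the whiskers span the full data range.
    Otherwise they end at the most extreme data points lying
    within `whis` times the IQR of the quartiles.
    """
    values = np.asarray(values, dtype=float)
    if np.isinf(whis):
        return values.min(), values.max()

    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    lo_fence = q1 - whis * iqr
    hi_fence = q3 + whis * iqr
    lo = values[values >= lo_fence].min()
    hi = values[values <= hi_fence].max()
    return lo, hi


# Made-up site means with one extreme value
data = [1, 2, 3, 4, 5, 100]
full_range = whisker_limits(data, whis=np.inf)
tukey = whisker_limits(data, whis=1.5)
```

With full-range whiskers the extreme value of 100 sets the upper whisker, so suspicious sites show up as very long whiskers rather than as separate outlier points.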


In [8]:
# Set up plot
fig = plt.figure(figsize=(20,10))
sn.set(style="ticks", palette="muted", 
       color_codes=True, font_scale=2)

# Horizontal boxplots
ax = sn.boxplot(x="mean", y="par_id", data=res_df,
                whis=np.inf, color="c")

# Add "raw" data points for each observation, with some "jitter"
# to make them visible
sn.stripplot(x="mean", y="par_id", data=res_df, jitter=True, 
             size=3, color=".3", linewidth=0)

# Remove the top and right axis spines and trim the others to the outermost ticks
sn.despine(trim=True)


C:\Data\WinPython-64bit-2.7.10.3\python-2.7.10.amd64\lib\site-packages\seaborn\categorical.py:336: DeprecationWarning: pandas.core.common.is_categorical_dtype is deprecated. import from the public API: pandas.api.types.is_categorical_dtype instead
  elif is_categorical(y):
C:\Data\WinPython-64bit-2.7.10.3\python-2.7.10.amd64\lib\site-packages\seaborn\categorical.py:336: DeprecationWarning: pandas.core.common.is_categorical_dtype is deprecated. import from the public API: pandas.api.types.is_categorical_dtype instead
  elif is_categorical(y):
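
To complement the visual check, the sites behind the most extreme points can also be listed numerically. The sketch below is not part of the trends code. It assumes a data frame with station_id, par_id and mean columns (as in res_df) and flags long-term means lying more than k times the interquartile range beyond the quartiles for that parameter. The function name flag_extremes, the threshold k=3 and the demo values are all illustrative assumptions.

```python
import pandas as pd

def flag_extremes(res_df, k=3.0):
    """Return rows whose long-term mean lies more than k * IQR
    outside the quartiles calculated for the same parameter.
    """
    grp = res_df.groupby('par_id')['mean']

    # Per-parameter quartiles, mapped back onto each row
    q1 = res_df['par_id'].map(grp.quantile(0.25))
    q3 = res_df['par_id'].map(grp.quantile(0.75))
    iqr = q3 - q1

    is_extreme = ((res_df['mean'] < q1 - k * iqr) |
                  (res_df['mean'] > q3 + k * iqr))

    return res_df[is_extreme]

# Made-up values for illustration only (not real RESA2 data)
demo_df = pd.DataFrame({
    'station_id': [1, 2, 3, 4, 5, 6] * 2,
    'par_id': ['TOC'] * 6 + ['Al'] * 6,
    'mean': [5.0, 6.0, 7.0, 6.0, 5.0, 500.0,
             10.0, 12.0, 11.0, 13.0, 12.0, 11.0],
})

extreme_df = flag_extremes(demo_df)
print(extreme_df)
```

Only the TOC value of 500 at (hypothetical) station 6 is flagged here. With real data, the threshold would need tuning for each parameter, since several of these variables are strongly skewed.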

6.2. Map visualisation

As a further check, I'd like to build an updated map visualisation incorporating all of the results produced above. This requires some merging of the results files created above. I've also manually exported the basic station properties for all the sites associated with the 13 RESA2 projects chosen for this analysis. This file can be found here:

C:\Data\James_Work\Staff\Heleen_d_W\ICP_Waters\TOC_Trends_Analysis_2015\Results\trends_sites_oct_2016.xlsx

Note the information in the readme sheet, which explains that there are 431 sites with data in the selected projects. This is fewer than in the previous analysis and perhaps fewer than Heleen was expecting (?). Do we need to review the list of projects under consideration?


In [9]:
# Read results files and concatenate
# Container for data
df_list = []

# Loop over periods
for per in ['1990-2012', '1990-2004', '1998-2012', 'all_years']:
    res_path = os.path.join(res_fold, 'res_%s.csv' % per)
    df = pd.read_csv(res_path)
    
    # Change 'period' col to 'data_period' and add 'analysis_period'
    df['data_period'] = df['period']
    del df['period']
    
    df['analysis_period'] = per
    
    df_list.append(df)
    
# Concat
df = pd.concat(df_list, axis=0)

# Read station data
stn_path = os.path.join(res_fold, 'trends_sites_oct_2016.xlsx')
stn_df = pd.read_excel(stn_path, sheetname='data')

# Join
df = pd.merge(df, stn_df, how='left', on='station_id')

# Re-order columns
df = df[['station_id', 'station_name', 'station_code', 'nfc_code','country', 
         'lat', 'lon', 'analysis_period', 'data_period', 'par_id',
         'non_missing', 'n_start', 'n_end', 'mean', 'median', 'std_dev', 
         'mk_stat', 'norm_mk_stat', 'mk_p_val', 'trend', 'sen_slp']]

df.head()


Out[9]:
station_id station_name station_code nfc_code country lat lon analysis_period data_period par_id ... n_start n_end mean median std_dev mk_stat norm_mk_stat mk_p_val trend sen_slp
0 100 Breidlivatnet 623-603 NaN Norway 59.977669 10.152037 1990-2012 1990-2012 Al ... 0 0 276.433655 276.433655 NaN NaN NaN NaN NaN NaN
1 100 Breidlivatnet 623-603 NaN Norway 59.977669 10.152037 1990-2012 1990-2012 TOC ... 5 5 6.146667 6.000000 1.184066 111.0 3.326214 8.803429e-04 increasing 0.116569
2 100 Breidlivatnet 623-603 NaN Norway 59.977669 10.152037 1990-2012 1990-2012 EH ... 5 5 10.500100 9.772372 3.596191 -119.0 -3.568121 3.595511e-04 decreasing -0.380255
3 100 Breidlivatnet 623-603 NaN Norway 59.977669 10.152037 1990-2012 1990-2012 ESO4 ... 5 5 35.188492 29.166667 15.103219 -173.0 -5.200989 1.982305e-07 decreasing -2.051373
4 100 Breidlivatnet 623-603 NaN Norway 59.977669 10.152037 1990-2012 1990-2012 ECl ... 5 5 18.272109 17.142857 3.824173 -107.0 -3.221999 1.272998e-03 decreasing -0.351074

5 rows × 21 columns

I have uploaded all the trends plots to our web-hosting platform in the following folder:

http://77.104.141.195/~icpwater/wp-content/trends_plots

In order to display these on my map, I need to build a column containing direct links to each of these files.

I also need to add a column defining colours for the three trend types.

Finally, I'm going to drop rows where non_missing = 0 or 1, as this implies there's not enough data to calculate any summary statistics.


In [10]:
def assign_colour(row):
    if row['trend'] == 'increasing':
        return 'small_red'
    elif row['trend'] == 'decreasing':
        return 'small_green'
    else:
        return 'small_yellow'

def build_path(row):
    base = r'http://77.104.141.195/~icpwater/wp-content/trends_plots/'
    
    # Get row properties
    an_per = row['analysis_period']
    stn = row['station_id']
    par = row['par_id']
    da_per = row['data_period']
    
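    # Make path. NB: join with '/' rather than os.path.join, which
    # would insert backslashes into the URL on Windows
    full_path = '/'.join([base.rstrip('/'),
                          'trends_plots_%s' % an_per,
                          '%s_%s_%s.png' % (stn, par, da_per)])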
    
    return full_path   
        
        
# Add symbol column
df['symbol'] = df.apply(assign_colour, axis=1)

# Build path to plots
df['link'] = df.apply(build_path, axis=1)

# Filter results
df = df.query('(non_missing != 0) and (non_missing != 1)')

# Save
out_path = os.path.join(res_fold, 'data_vis_all.csv')
df.to_csv(out_path, index=False, encoding='utf-8')

df.head()


Out[10]:
station_id station_name station_code nfc_code country lat lon analysis_period data_period par_id ... mean median std_dev mk_stat norm_mk_stat mk_p_val trend sen_slp symbol link
1 100 Breidlivatnet 623-603 NaN Norway 59.977669 10.152037 1990-2012 1990-2012 TOC ... 6.146667 6.000000 1.184066 111.0 3.326214 8.803429e-04 increasing 0.116569 small_red http://77.104.141.195/~icpwater/wp-content/tre...
2 100 Breidlivatnet 623-603 NaN Norway 59.977669 10.152037 1990-2012 1990-2012 EH ... 10.500100 9.772372 3.596191 -119.0 -3.568121 3.595511e-04 decreasing -0.380255 small_green http://77.104.141.195/~icpwater/wp-content/tre...
3 100 Breidlivatnet 623-603 NaN Norway 59.977669 10.152037 1990-2012 1990-2012 ESO4 ... 35.188492 29.166667 15.103219 -173.0 -5.200989 1.982305e-07 decreasing -2.051373 small_green http://77.104.141.195/~icpwater/wp-content/tre...
4 100 Breidlivatnet 623-603 NaN Norway 59.977669 10.152037 1990-2012 1990-2012 ECl ... 18.272109 17.142857 3.824173 -107.0 -3.221999 1.272998e-03 decreasing -0.351074 small_green http://77.104.141.195/~icpwater/wp-content/tre...
5 100 Breidlivatnet 623-603 NaN Norway 59.977669 10.152037 1990-2012 1990-2012 ENO3 ... 2.180272 2.000000 1.152995 74.0 2.206388 2.735684e-02 increasing 0.078759 small_red http://77.104.141.195/~icpwater/wp-content/tre...

5 rows × 23 columns
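
For larger tables, the row-wise apply calls above can be replaced by column-wise operations. The following is a minimal sketch of an equivalent, assuming the same column names as in df. The function name add_symbol_and_link and the demo rows are illustrative and not part of the original workflow. The URL is assembled with plain '/' separators, so the result does not depend on the operating system.

```python
import pandas as pd

BASE_URL = 'http://77.104.141.195/~icpwater/wp-content/trends_plots'

def add_symbol_and_link(df):
    """Add 'symbol' and 'link' columns without row-wise apply, then
    drop rows where non_missing is 0 or 1.
    """
    out = df.copy()

    # Anything that is not 'increasing' or 'decreasing' becomes yellow
    colours = {'increasing': 'small_red', 'decreasing': 'small_green'}
    out['symbol'] = out['trend'].map(colours).fillna('small_yellow')

    # Build the plot URL by concatenating string columns
    out['link'] = (BASE_URL + '/trends_plots_' + out['analysis_period'] +
                   '/' + out['station_id'].astype(str) + '_' +
                   out['par_id'] + '_' + out['data_period'] + '.png')

    return out[~out['non_missing'].isin([0, 1])]

# Illustrative rows only; 'no trend' stands in for any other trend value
demo_df = pd.DataFrame({
    'station_id': [100, 100, 100, 100],
    'par_id': ['TOC', 'EH', 'ECl', 'Al'],
    'analysis_period': ['1990-2012'] * 4,
    'data_period': ['1990-2012'] * 4,
    'non_missing': [21, 21, 21, 1],
    'trend': ['increasing', 'decreasing', 'no trend', None],
})

vis_df = add_symbol_and_link(demo_df)
print(vis_df[['par_id', 'symbol', 'link']])
```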

The workflow for creating the map is as follows:

  1. Run the trends code and the notebook cells above to create a CSV file that will form the basis of the Google Fusion Table (data_vis_all.csv).

  2. Open a blank Excel workbook and choose Data > From text to import the CSV. Be sure to set the encoding to 65001: Unicode (utf-8) and set the column data types explicitly, otherwise Excel will truncate some of the NFC site codes and the special characters in the station names won't reproduce properly. (NB: there are still some problems with special characters in the station names, because in many cases RESA2 is storing names that are already corrupted. I don't have time to fix this now - it's another database issue to add to the list. The workflow described here should faithfully reproduce whatever's in the database, which is the best I can do at present). Check the file looks OK (data_vis_all.xlsx).

  3. Upload all the plots to SiteGround in the wp-content/trends_plots folder. This is done using the FileZilla FTP client.

  4. Create a new Fusion Table and import the data. Make sure to check the box to make the table downloadable, and make sure it's in a public folder on Google Drive. Next, check the column types are correct. In particular, lat and lon need to be set to define the locations and the link column needs to be set to Text > Link. Then switch to map view, click Change feature styles and style the map based on the entries in the symbol column. Turn on the Terrain option in map view if you want to.

  5. Click Change info window and modify how the pop-up information box is displayed. Entering something like this in the Custom tab is a good start:

    <center><h2>{par_id} at {station_name}, {country}</h2></center>
    <center><h2>{analysis_period}</h2></center>
    <center><table>
      <tr>
        <td><b>ICPW ID:</b></td>
        <td>{station_id}</td>
      </tr>
      <tr>
        <td><b>ICPW code:</b></td>
        <td>{station_code}</td>
      </tr>
      <tr>
        <td><b>NFC code:</b></td>
        <td>{nfc_code}</td>
      </tr>
      <tr>
        <td><b>Data period:</b></td>
        <td>{data_period}</td>
      </tr>
      <tr>
        <td><b>Number of years with data:</b></td>
        <td>{non_missing}</td>
      </tr>
      <tr>
        <td><b>Mean:</b></td>
        <td>{mean}</td>
      </tr>
      <tr>
        <td><b>Median:</b></td>
        <td>{median}</td>
      </tr>
      <tr>
        <td><b>Standard deviation:</b></td>
        <td>{std_dev}</td>
      </tr>
      <tr>
        <td><b>Normalised Mann-Kendall statistic:</b></td>
        <td>{norm_mk_stat}</td>
      </tr>
      <tr>
        <td><b>Mann-Kendall p-value:</b></td>
        <td>{mk_p_val}</td>
      </tr>
      <tr>
        <td><b>Trend:</b></td>
        <td>{trend}</td>
      </tr>
      <tr>
        <td><b>Theil-Sen slope:</b></td>
        <td>{sen_slp}</td>
      </tr>
    </table></center>
    
    <center><img src={link} height="250"></center>

  6. Follow the instructions in this Word document:

    C:\Data\James_Work\Staff\Heleen_d_W\ICP_Waters\TOC_Trends_Analysis_2015\Python\Fusion tables tips.docx

    which describes adding the table to the Fusion Tables Layer Wizard and then modifying the subsequent JavaScript to add e.g. filter boxes, legends etc. Save the resulting code as an .html file and upload it to a suitable public location at SiteGround.

  7. Link to your finished map by embedding the public path to your HTML file as an iframe in your webpage.

As of 21/10/2016, the finished page is here.
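
Before importing the CSV into Excel (step 2), it may be worth confirming that the file itself preserves the site codes and special characters, so that any remaining problems can be attributed to either the import settings or the database. The sketch below assumes Python 3 and uses an in-memory CSV. The function name csv_roundtrip_mismatches and the NFC code '007' are hypothetical. The station names are taken from the results tables in this notebook.

```python
import io
import pandas as pd

def csv_roundtrip_mismatches(df, text_cols):
    """Write df to UTF-8 CSV bytes, read it back with every column as
    text, and return the names of columns in text_cols whose values
    changed along the way.
    """
    csv_bytes = df.to_csv(index=False).encode('utf-8')
    back = pd.read_csv(io.BytesIO(csv_bytes), encoding='utf-8',
                       dtype=str, keep_default_na=False)

    changed = []
    for col in text_cols:
        original = df[col].fillna('').astype(str)
        if not (original.values == back[col].values).all():
            changed.append(col)

    return changed

demo_df = pd.DataFrame({
    'station_name': ['Vasikkajärvi', 'Vitsjön'],
    'nfc_code': ['007', None],   # hypothetical code with leading zeros
})

# Reading everything as text preserves both columns
changed = csv_roundtrip_mismatches(demo_df, ['station_name', 'nfc_code'])
print(changed)

# Letting the reader guess the types turns '007' into a number,
# which is similar to what happens in Excel without explicit types
csv_bytes = demo_df.to_csv(index=False).encode('utf-8')
naive_df = pd.read_csv(io.BytesIO(csv_bytes), encoding='utf-8')
print(naive_df['nfc_code'].iloc[0])
```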

7. Data restructuring

Heleen would like the output in a particular format - see e-mail received 19/10/2016 at 10:24 for details. The code below reads the results files and restructures them.


In [11]:
# Read results files and concatenate
# Container for data
df_list = []

# Loop over periods
for per in ['1990-2012', '1990-2004', '1998-2012', 'all_years']:
    res_path = os.path.join(res_fold, 'res_%s.csv' % per)
    df = pd.read_csv(res_path)
    
    # Change 'period' col to 'data_period' and add 'analysis_period'
    df['data_period'] = df['period']
    del df['period']
    
    df['analysis_period'] = per
    
    df_list.append(df)
    
# Concat
df = pd.concat(df_list, axis=0)

# Read station data
stn_path = os.path.join(res_fold, 'trends_sites_oct_2016.xlsx')
stn_df = pd.read_excel(stn_path, sheetname='data')

# Join
df = pd.merge(df, stn_df, how='left', on='station_id')

# Read projects table
sql = ('SELECT project_id, project_name '
       'FROM resa2.projects '
       'WHERE project_name in %s' % str(tuple(proj_list)))

proj_df = pd.read_sql_query(sql, engine)

# Get associated stations
sql = ('SELECT station_id, project_id '
       'FROM resa2.projects_stations '
       'WHERE project_id in %s' % str(tuple(proj_df['project_id'].values)))

proj_stn_df = pd.read_sql_query(sql, engine)

# Join proj details
proj_df = pd.merge(proj_stn_df, proj_df, how='left', on ='project_id')

# Join to results
df = pd.merge(df, proj_df, how='left', on='station_id')

# Re-order columns
df = df[['project_id', 'project_name', 'country', 'station_id', 
         'station_code', 'station_name', 'nfc_code', 'type',
         'lat', 'lon', 'analysis_period', 'data_period', 'par_id',
         'non_missing', 'n_start', 'n_end', 'mean', 'median', 
         'std_dev', 'mk_stat', 'norm_mk_stat', 'mk_p_val', 'trend',
         'sen_slp']]

df.head()


Out[11]:
project_id project_name country station_id station_code station_name nfc_code type lat lon ... n_start n_end mean median std_dev mk_stat norm_mk_stat mk_p_val trend sen_slp
0 4012 ICPW_TOCTRENDS_2015_NO Norway 100 623-603 Breidlivatnet NaN L 59.977669 10.152037 ... 0 0 276.433655 276.433655 NaN NaN NaN NaN NaN NaN
1 4012 ICPW_TOCTRENDS_2015_NO Norway 100 623-603 Breidlivatnet NaN L 59.977669 10.152037 ... 5 5 6.146667 6.000000 1.184066 111.0 3.326214 8.803429e-04 increasing 0.116569
2 4012 ICPW_TOCTRENDS_2015_NO Norway 100 623-603 Breidlivatnet NaN L 59.977669 10.152037 ... 5 5 10.500100 9.772372 3.596191 -119.0 -3.568121 3.595511e-04 decreasing -0.380255
3 4012 ICPW_TOCTRENDS_2015_NO Norway 100 623-603 Breidlivatnet NaN L 59.977669 10.152037 ... 5 5 35.188492 29.166667 15.103219 -173.0 -5.200989 1.982305e-07 decreasing -2.051373
4 4012 ICPW_TOCTRENDS_2015_NO Norway 100 623-603 Breidlivatnet NaN L 59.977669 10.152037 ... 5 5 18.272109 17.142857 3.824173 -107.0 -3.221999 1.272998e-03 decreasing -0.351074

5 rows × 24 columns
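
One caveat on the two queries above: building the IN list with str(tuple(...)) works for several values, but a one-element list gives a trailing comma (e.g. "('x',)"), which is not valid SQL. A more robust pattern is to pass the values as query parameters. The sketch below illustrates the idea using an in-memory SQLite database purely so that it is self-contained. The placeholder syntax ('?' here) depends on the database driver, so this would need adapting for RESA2. The helper name select_where_in is hypothetical.

```python
import sqlite3
import pandas as pd

def select_where_in(conn, table, column, values):
    """Select rows where column matches any of values, using one
    placeholder per value. Table and column names are assumed to be
    trusted (i.e. not user input).
    """
    values = list(values)
    if not values:
        raise ValueError('values must not be empty')

    placeholders = ', '.join(['?'] * len(values))
    sql = 'SELECT * FROM %s WHERE %s IN (%s)' % (table, column,
                                                 placeholders)

    return pd.read_sql_query(sql, conn, params=values)

# Toy table for illustration
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE projects '
             '(project_id INTEGER, project_name TEXT)')
conn.executemany('INSERT INTO projects VALUES (?, ?)',
                 [(4012, 'ICPW_TOCTRENDS_2015_NO'),
                  (3810, 'ICPW_TOCTRENDS_2015_FI')])

# A single-element list is handled without special treatment
one_df = select_where_in(conn, 'projects', 'project_name',
                         ['ICPW_TOCTRENDS_2015_NO'])
print(one_df)
```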

Now add an "include" column based on the criteria in Heleen's e-mail (received 24/10/2016 at 11:23) and save the result.

Updated 27/10/2016: The refined criteria are actually in the e-mail from Heleen received 25/10/2016 at 15:56.


In [12]:
def include(row):
    # Minimum number of non-missing values required in each analysis
    # period. Rows for 'all_years' are never flagged for inclusion
    min_non_missing = {'1990-2012': 15,
                       '1990-2004': 10,
                       '1998-2012': 10}
    
    per = row['analysis_period']
    
    if ((per in min_non_missing) and
        (row['n_start'] >= 2) and
        (row['n_end'] >= 2) and
        (row['non_missing'] >= min_non_missing[per])):
        return 'yes'
    else:
        return 'no'

df['include'] = df.apply(include, axis=1)

# Save output
out_path = os.path.join(res_fold, 'toc_trends_long_format.csv')
df.to_csv(out_path, index=False, encoding='utf-8')

df.head()


Out[12]:
project_id project_name country station_id station_code station_name nfc_code type lat lon ... n_end mean median std_dev mk_stat norm_mk_stat mk_p_val trend sen_slp include
0 4012 ICPW_TOCTRENDS_2015_NO Norway 100 623-603 Breidlivatnet NaN L 59.977669 10.152037 ... 0 276.433655 276.433655 NaN NaN NaN NaN NaN NaN no
1 4012 ICPW_TOCTRENDS_2015_NO Norway 100 623-603 Breidlivatnet NaN L 59.977669 10.152037 ... 5 6.146667 6.000000 1.184066 111.0 3.326214 8.803429e-04 increasing 0.116569 yes
2 4012 ICPW_TOCTRENDS_2015_NO Norway 100 623-603 Breidlivatnet NaN L 59.977669 10.152037 ... 5 10.500100 9.772372 3.596191 -119.0 -3.568121 3.595511e-04 decreasing -0.380255 yes
3 4012 ICPW_TOCTRENDS_2015_NO Norway 100 623-603 Breidlivatnet NaN L 59.977669 10.152037 ... 5 35.188492 29.166667 15.103219 -173.0 -5.200989 1.982305e-07 decreasing -2.051373 yes
4 4012 ICPW_TOCTRENDS_2015_NO Norway 100 623-603 Breidlivatnet NaN L 59.977669 10.152037 ... 5 18.272109 17.142857 3.824173 -107.0 -3.221999 1.272998e-03 decreasing -0.351074 yes

5 rows × 25 columns
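
A quick tally of how many series pass the criteria in each period is a useful sanity check before sending the file on. The sketch below assumes only the analysis_period and include columns created above. The function name tally_include and the demo rows are illustrative.

```python
import pandas as pd

def tally_include(df):
    """Count 'yes' and 'no' values of the include column for each
    analysis period.
    """
    counts = pd.crosstab(df['analysis_period'], df['include'])

    # Make sure both columns are present, even if one never occurs
    return counts.reindex(columns=['yes', 'no'], fill_value=0)

# Illustrative rows only
demo_df = pd.DataFrame({
    'analysis_period': ['1990-2012', '1990-2012', '1990-2004',
                        'all_years'],
    'include': ['yes', 'no', 'yes', 'no'],
})

counts = tally_include(demo_df)
print(counts)
```

Rows for all_years should always appear with zero in the yes column, because the criteria are only defined for the three fixed periods.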

Heleen also wants a version in "wide" format, where each row includes all the data for a single station. I'm going to remove the data_period column, because column headings like Al_1990-2012_1991-2010 are confusing.


In [13]:
del df['data_period']

# Melt to "long" format
melt_df = pd.melt(df, 
                  id_vars=['project_id', 'project_name', 'country',
                           'station_id', 'station_code', 'station_name',
                           'nfc_code', 'type', 'lat', 'lon', 'analysis_period',
                           'par_id', 'include'],
                  var_name='stat')

# Get only values where include='yes'
melt_df = melt_df.query('include == "yes"')
del melt_df['include']

melt_df.head()


Out[13]:
project_id project_name country station_id station_code station_name nfc_code type lat lon analysis_period par_id stat value
1 4012 ICPW_TOCTRENDS_2015_NO Norway 100 623-603 Breidlivatnet NaN L 59.977669 10.152037 1990-2012 TOC non_missing 21
2 4012 ICPW_TOCTRENDS_2015_NO Norway 100 623-603 Breidlivatnet NaN L 59.977669 10.152037 1990-2012 EH non_missing 21
3 4012 ICPW_TOCTRENDS_2015_NO Norway 100 623-603 Breidlivatnet NaN L 59.977669 10.152037 1990-2012 ESO4 non_missing 21
4 4012 ICPW_TOCTRENDS_2015_NO Norway 100 623-603 Breidlivatnet NaN L 59.977669 10.152037 1990-2012 ECl non_missing 21
5 4012 ICPW_TOCTRENDS_2015_NO Norway 100 623-603 Breidlivatnet NaN L 59.977669 10.152037 1990-2012 ENO3 non_missing 21

In [14]:
# Build multi-index on everything except "value"
melt_df.set_index(['project_id', 'project_name', 'country',
                   'station_id', 'station_code', 'station_name',
                   'nfc_code', 'type', 'lat', 'lon', 'par_id', 
                   'analysis_period', 'stat'], inplace=True)

melt_df.head()


Out[14]:
value
project_id project_name country station_id station_code station_name nfc_code type lat lon par_id analysis_period stat
4012 ICPW_TOCTRENDS_2015_NO Norway 100 623-603 Breidlivatnet NaN L 59.977669 10.152037 TOC 1990-2012 non_missing 21
EH 1990-2012 non_missing 21
ESO4 1990-2012 non_missing 21
ECl 1990-2012 non_missing 21
ENO3 1990-2012 non_missing 21

In [15]:
# Unstack levels of interest to columns
wide_df = melt_df.unstack(level=['par_id', 'analysis_period', 'stat'])

# Drop unwanted "value" level in index
wide_df.columns = wide_df.columns.droplevel(0)

# Replace multi-index with separate components concatenated with '_'
wide_df.columns = ["_".join(item) for item in wide_df.columns]

# Reset multiindex on rows
wide_df = wide_df.reset_index()

# Save output
out_path = os.path.join(res_fold, 'toc_trends_wide_format.csv')
wide_df.to_csv(out_path, index=False, encoding='utf-8')

wide_df.head()


Out[15]:
project_id project_name country station_id station_code station_name nfc_code type lat lon ... EH_1998-2012_sen_slp ESO4_1998-2012_sen_slp ECl_1998-2012_sen_slp ENO3_1998-2012_sen_slp ESO4X_1998-2012_sen_slp ESO4_ECl_1998-2012_sen_slp ECa_EMg_1998-2012_sen_slp ECaX_EMgX_1998-2012_sen_slp ANC_1998-2012_sen_slp Al_1998-2012_sen_slp
0 3810 ICPW_TOCTRENDS_2015_FI Finland 23542 FI01 Hirvilampi NaN L 60.7054 27.9181 ... 0 -3.95833 -0.714286 -0.012987 -3.86979 -4.62662 -1.83333 -1.65503 1.39563 0
1 3810 ICPW_TOCTRENDS_2015_FI Finland 23545 FI05 Suopalampi NaN L 67.0655 26.0946 ... -0.00227885 -0.520833 -0.12987 -0.00793651 -0.460108 -0.877976 0 0 0.866033 None
2 3810 ICPW_TOCTRENDS_2015_FI Finland 23546 FI06 Vasikkajärvi NaN L 67.1170 26.0844 ... 0.0234871 -0.892857 -0.103896 -0.0204082 -0.871837 -1.04167 -0.5 -0.454545 0.266308 None
3 3810 ICPW_TOCTRENDS_2015_FI Finland 23547 FI07 Vitsjön NaN L 59.9634 23.3158 ... 0.0048232 -4.64744 -0.15873 0.0663265 -4.56683 -4.97768 -4.09091 -3.91762 0.446737 None
4 3810 ICPW_TOCTRENDS_2015_FI Finland 23548 FI08 Kakkisenlampi NaN L 63.6565 29.9466 ... -0.0327441 -0.9375 0 -0.0867347 -0.934898 -0.9375 -0.606061 -0.625 0.154762 None

5 rows × 373 columns
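
To see how the column names above are formed, the same melt, set_index and unstack steps can be run on a toy table. This is a minimal sketch with made-up values and only two statistics. The column names in long_df mirror those used above, but the numbers are illustrative.

```python
import pandas as pd

# Toy "long" table: one row per station/parameter/period
long_df = pd.DataFrame({
    'station_id': [1, 1, 2, 2],
    'par_id': ['TOC', 'EH', 'TOC', 'EH'],
    'analysis_period': ['1990-2012'] * 4,
    'mean': [6.1, 10.5, 4.2, 8.0],
    'sen_slp': [0.12, -0.38, 0.05, -0.10],
})

# Melt so that each statistic gets its own row
melt_df = pd.melt(long_df,
                  id_vars=['station_id', 'par_id', 'analysis_period'],
                  var_name='stat')

# Index on everything except "value", then move the levels that
# should become column headings from the rows to the columns
melt_df = melt_df.set_index(['station_id', 'par_id',
                             'analysis_period', 'stat'])
wide_df = melt_df.unstack(level=['par_id', 'analysis_period', 'stat'])

# Drop the "value" level and flatten the remaining levels with '_'
wide_df.columns = wide_df.columns.droplevel(0)
wide_df.columns = ['_'.join(item) for item in wide_df.columns]
wide_df = wide_df.reset_index()

print(wide_df)
```

Each station ends up on a single row, with one column per combination of parameter, analysis period and statistic (e.g. TOC_1990-2012_sen_slp).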

The two files created above have been read into Excel and saved as toc_trends_long_format.xlsx and toc_trends_wide_format.xlsx, respectively.