2 Question: Is water availability and water use related to GDP per capita?

3 Our plan

4 To do: Develop investigation themes for exploration

5 Wrangle

5.1 Things to consider doing

5.2 Load the data

5.3 Research the variables

5.4 Describe the panel

5.5 Ways to look at this data

5.6 Slicing

5.6.1 For a given time slice

5.6.2 For a given country

5.6.3 By variable

5.6.4 Time series for given country and variable

5.7 By region

5.8 Exercises

6 To do: Update investigation themes



In [1]:

    
%matplotlib inline
%config InlineBackend.figure_format='retina'
import matplotlib as mpl
from matplotlib import pyplot as plt
import seaborn as sns
import numpy as np
import pandas as pd
import os, sys
import warnings
warnings.filterwarnings('ignore')
sns.set_context("poster", font_scale=1.3)

The dataset

We will be using the Food and Agriculture Organization (FAO) of the United Nation's AQUASTAT dataset.

From FAO:

FAO's three main goals are:

The eradication of hunger, food insecurity and malnutrition
The elimination of poverty and the driving forward of economic and social progress for all
The sustainable management and utilization of natural resources, including land, water, air, climate and genetic resources for the benefit of present and future generations.

To support these goals, Article 1 of its constitution requires FAO to "collect, analyse, interpret and disseminate information related to nutrition, food and agriculture". Thus AQUASTAT started, with the aim to contribute to FAO's goals through the collection, analysis and dissemination of information related to water resources, water uses and agricultural water management, with an emphasis on countries in Africa, Asia, Latin America, and the Caribbean.

FAO offers data, metadata, reports, country profiles, river basin profiles, regional analyses, maps, tables, spatial data, guidelines, and other tools on:

Water resources: internal, transboundary, total
Water uses: by sector, by source, wastewater
Irrigation: location, area, typology, technology, crops
Dams: location, height, capacity, surface area
Water-related institutions, policies and legistation

http://www.fao.org/nr/water/aquastat/data/query/index.html

Our plan

Exploratory data analysis consists of the following major tasks, which we present linearly here because each task doesn't make much sense to do without the ones prior to it. However, in reality, you are going to constantly jump around from step to step. You may want to do all the steps for a subset of the variables first. Or often, an observation will bring up a question you want to investigate and you'll branch off and explore to answer that question before returning down the main path of exhaustive EDA.

Form hypotheses/develop investigation themes to explore
Wrangle data
Assess quality of data
Profile data
Explore each individual variable in the dataset
Assess the relationship between each variable and the target
Assess interactions between variables
Explore data across many dimensions

Throughout the entire analysis you want to:

Capture a list of hypotheses and questions that come up for further exploration.
Record things to watch out for/ be aware of in future analyses.
Show intermediate results to colleagues to get a fresh perspective, feedback, domain knowledge. Don't do EDA in a bubble! Get feedback throughout especially from people removed from the problem and/or with relevant domain knowledge.
Position visuals and results together. EDA relies on your natural pattern recognition abilities so maximize what you'll find by putting visualizations and results in close proximity.

To do: Develop investigation themes for exploration

Write down hypotheses, things you need to find out to answer the question.

Wrangle

Things to consider doing

Make your data tidy

Each variable forms a column
Each observation forms a row
Each type of observational unit forms a table

Transform data
Sometimes you will need to transform your data to be able to extract information from it. This step will usually occur after some of the other steps of EDA unless domain knowledge can inform these choices beforehand. Transforms include:

Log: when data is highly skewed (versus normally distributed like a bell curve), sometimes it has a log-normal distribution and taking the log of each data point will normalize it.
Binning of continuous variables: Binning continuous variables and then analyzing the groups of observations created can allow for easier pattern identification. Especially with non-linear relationships.
Simplifying of categories: you really don't want more than 8-10 categories within a single data field. Try to aggregate to higher-level categories when it makes sense.

Load the data



In [2]:

    
data = pd.read_csv('../../data/aquastat/aquastat.csv.gzip', compression='gzip')



In [3]:

    
data.head()









    Out[3]:






  
    
      
      country
      region
      variable
      variable_full
      time_period
      year_measured
      value
    
  
  
    
      0
      Afghanistan
      World | Asia
      total_area
      Total area of the country (1000 ha)
      1958-1962
      1962.0
      65286.0
    
    
      1
      Afghanistan
      World | Asia
      total_area
      Total area of the country (1000 ha)
      1963-1967
      1967.0
      65286.0
    
    
      2
      Afghanistan
      World | Asia
      total_area
      Total area of the country (1000 ha)
      1968-1972
      1972.0
      65286.0
    
    
      3
      Afghanistan
      World | Asia
      total_area
      Total area of the country (1000 ha)
      1973-1977
      1977.0
      65286.0
    
    
      4
      Afghanistan
      World | Asia
      total_area
      Total area of the country (1000 ha)
      1978-1982
      1982.0
      65286.0



In [4]:

    
data.info()









    



<class 'pandas.core.frame.DataFrame'>
RangeIndex: 143280 entries, 0 to 143279
Data columns (total 7 columns):
country          143280 non-null object
region           143280 non-null object
variable         143280 non-null object
variable_full    143280 non-null object
time_period      143280 non-null object
year_measured    96411 non-null float64
value            96411 non-null float64
dtypes: float64(2), object(5)
memory usage: 7.7+ MB

Research the variables



In [5]:

    
data[['variable','variable_full']].drop_duplicates()









    Out[5]:






  
    
      
      variable
      variable_full
    
  
  
    
      0
      total_area
      Total area of the country (1000 ha)
    
    
      576
      arable_land
      Arable land area (1000 ha)
    
    
      1152
      permanent_crop_area
      Permanent crops area (1000 ha)
    
    
      1728
      cultivated_area
      Cultivated area (arable land + permanent crops...
    
    
      2304
      percent_cultivated
      % of total country area cultivated (%)
    
    
      2880
      total_pop
      Total population (1000 inhab)
    
    
      3456
      rural_pop
      Rural population (1000 inhab)
    
    
      4032
      urban_pop
      Urban population (1000 inhab)
    
    
      4608
      gdp
      Gross Domestic Product (GDP) (current US$)
    
    
      5184
      gdp_per_capita
      GDP per capita (current US$/inhab)
    
    
      5760
      agg_to_gdp
      Agriculture, value added to GDP (%)
    
    
      6336
      human_dev_index
      Human Development Index (HDI) [highest = 1] (-)
    
    
      6912
      gender_inequal_index
      Gender Inequality Index (GII) [equality = 0; i...
    
    
      7488
      percent_undernourished
      Prevalence of undernourishment (3-year average...
    
    
      8064
      number_undernourished
      Number of people undernourished (3-year averag...
    
    
      8640
      avg_annual_rain_depth
      Long-term average annual precipitation in dept...
    
    
      9216
      avg_annual_rain_vol
      Long-term average annual precipitation in volu...
    
    
      9792
      national_rainfall_index
      National Rainfall Index (NRI) (mm/year)
    
    
      10368
      surface_water_produced
      Surface water produced internally (10^9 m3/year)
    
    
      10944
      groundwater_produced
      Groundwater produced internally (10^9 m3/year)
    
    
      11520
      surface_groundwater_overlap
      Overlap between surface water and groundwater ...
    
    
      12096
      irwr
      Total internal renewable water resources (IRWR...
    
    
      12672
      irwr_per_capita
      Total internal renewable water resources per c...
    
    
      13248
      surface_entering
      Surface water: entering the country (total) (1...
    
    
      13824
      surface_inflow_submit_no_treaty
      Surface water: inflow not submitted to treatie...
    
    
      14400
      surface_inflow_submit_treaty
      Surface water: inflow submitted to treaties (1...
    
    
      14976
      surface_inflow_secure_treaty
      Surface water: inflow secured through treaties...
    
    
      15552
      total_flow_border_rivers
      Surface water: total flow of border rivers (10...
    
    
      16128
      accounted_flow_border_rivers
      Surface water: accounted flow of border rivers...
    
    
      16704
      accounted_flow
      Surface water: accounted inflow (10^9 m3/year)
    
    
      17280
      surface_to_other_countries
      Surface water: leaving the country to other co...
    
    
      17856
      surface_outflow_submit_no_treaty
      Surface water: outflow to other countries not ...
    
    
      18432
      surface_outflow_submit_treaty
      Surface water: outflow to other countries subm...
    
    
      19008
      surface_outflow_secure_treaty
      Surface water: outflow to other countries secu...
    
    
      19584
      surface_total_external_renewable
      Surface water: total external renewable (10^9 ...
    
    
      20160
      groundwater_entering
      Groundwater: entering the country (total) (10^...
    
    
      20736
      groundwater_accounted_inflow
      Groundwater: accounted inflow (10^9 m3/year)
    
    
      21312
      groundwater_to_other_countries
      Groundwater: leaving the country to other coun...
    
    
      21888
      groundwater_accounted_outflow
      Groundwater: accounted outflow to other countr...
    
    
      22464
      water_total_external_renewable
      Water resources: total external renewable (10^...
    
    
      23040
      total_renewable_surface
      Total renewable surface water (10^9 m3/year)
    
    
      23616
      total_renewable_groundwater
      Total renewable groundwater (10^9 m3/year)
    
    
      24192
      overlap_surface_groundwater
      Overlap: between surface water and groundwater...
    
    
      24768
      total_renewable
      Total renewable water resources (10^9 m3/year)
    
    
      25344
      dependency_ratio
      Dependency ratio (%)
    
    
      25920
      total_renewable_per_capita
      Total renewable water resources per capita (m3...
    
    
      26496
      exploitable_regular_renewable_surface
      Exploitable: regular renewable surface water (...
    
    
      27072
      exploitable_irregular_renewable_surface
      Exploitable: irregular renewable surface water...
    
    
      27648
      exploitable_total_renewable_surface
      Exploitable: total renewable surface water (10...
    
    
      28224
      exploitable_regular_renewable_groundwater
      Exploitable: regular renewable groundwater (10...
    
    
      28800
      exploitable_total
      Total exploitable water resources (10^9 m3/year)
    
    
      29376
      interannual_variability
      Interannual variability (WRI) (-)
    
    
      29952
      seasonal_variability
      Seasonal variability (WRI) (-)
    
    
      30528
      total_dam_capacity
      Total dam capacity (km3)
    
    
      31104
      dam_capacity_per_capita
      Dam capacity per capita (m3/inhab)
    
    
      31680
      irrigation_potential
      Irrigation potential (1000 ha)
    
    
      32256
      flood_occurence
      Flood occurrence (WRI) (-)
    
    
      32832
      total_pop_access_drinking
      Total population with access to safe drinking-...
    
    
      33408
      rural_pop_access_drinking
      Rural population with access to safe drinking-...
    
    
      33984
      urban_pop_access_drinking
      Urban population with access to safe drinking-...

Describe the panel

199 unique countries involved



In [6]:

    
data.country.nunique()









    Out[6]:





199



In [7]:

    
countries = data.country.unique()

For 12 time periods



In [8]:

    
data.time_period.nunique()









    Out[8]:





12

Each 5 years in length since 1958



In [9]:

    
time_periods = data.time_period.unique()
print(time_periods)









    



['1958-1962' '1963-1967' '1968-1972' '1973-1977' '1978-1982' '1983-1987'
 '1988-1992' '1993-1997' '1998-2002' '2003-2007' '2008-2012' '2013-2017']



In [10]:

    
mid_periods = range(1960,2017,5)

Dataset is unbalanced because there is not data for every country at every time period (more on missing data in the next notebook).



In [11]:

    
data[data.variable=='total_area'].value.isnull().sum()









    Out[11]:





220

Ways to look at this data

We can look at this data set in a number of ways:

Cross-section: all countries during a single time period
Time-series: a single country over time
Panel data: all countries over time (as data is given)
Geospatial: all countries in geographical relation to each other

Slicing

For a given time slice



In [12]:

    
def time_slice(df, time_period):

    # Only take data for time period of interest
    df = df[df.time_period==time_period] 

    # Pivot table 
    df = df.pivot(index='country', columns='variable', values='value')
    
    df.columns.name = time_period
    
    return df



In [13]:

    
time_slice(data, time_periods[0]).head()









    Out[13]:






  
    
      1958-1962
      accounted_flow
      accounted_flow_border_rivers
      agg_to_gdp
      arable_land
      avg_annual_rain_depth
      avg_annual_rain_vol
      cultivated_area
      dam_capacity_per_capita
      dependency_ratio
      exploitable_irregular_renewable_surface
      ...
      total_flow_border_rivers
      total_pop
      total_pop_access_drinking
      total_renewable
      total_renewable_groundwater
      total_renewable_per_capita
      total_renewable_surface
      urban_pop
      urban_pop_access_drinking
      water_total_external_renewable
    
    
      country
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
    
  
  
    
      Afghanistan
      19.00
      9.0
      NaN
      7700.0
      327.0
      213.5000
      7760.0
      128.40
      28.7200
      NaN
      ...
      33.4
      9344.00
      NaN
      65.3300
      10.650
      6992.0
      55.68
      804.90
      NaN
      18.18
    
    
      Albania
      3.30
      0.0
      NaN
      436.0
      1485.0
      42.6900
      487.0
      NaN
      10.9300
      NaN
      ...
      0.0
      1738.00
      NaN
      30.2000
      6.200
      17376.0
      26.35
      533.20
      NaN
      3.30
    
    
      Algeria
      0.39
      0.0
      NaN
      6300.0
      89.0
      212.0000
      6900.0
      89.99
      3.5990
      5.0
      ...
      0.0
      11690.00
      NaN
      11.6700
      1.517
      998.3
      10.15
      3934.00
      NaN
      0.42
    
    
      Andorra
      NaN
      NaN
      NaN
      1.0
      NaN
      0.4724
      1.0
      NaN
      NaN
      NaN
      ...
      NaN
      15.38
      NaN
      0.3156
      NaN
      20520.0
      NaN
      9.76
      NaN
      NaN
    
    
      Angola
      0.40
      0.0
      NaN
      2700.0
      1010.0
      1259.0000
      3200.0
      25.96
      0.2695
      NaN
      ...
      0.0
      5466.00
      NaN
      148.4000
      58.000
      27150.0
      145.40
      577.00
      NaN
      0.40
    
  

5 rows × 60 columns

For a given country



In [14]:

    
def country_slice(df, country):
    
    # Only take data for country of interest
    df = df[df.country==country] 

    # Pivot table 
    df = df.pivot(index='variable', columns='time_period', values='value')
    
    df.index.name = country
    return df



In [15]:

    
country_slice(data, countries[40]).head()









    Out[15]:






  
    
      time_period
      1958-1962
      1963-1967
      1968-1972
      1973-1977
      1978-1982
      1983-1987
      1988-1992
      1993-1997
      1998-2002
      2003-2007
      2008-2012
      2013-2017
    
    
      Thailand
      
      
      
      
      
      
      
      
      
      
      
      
    
  
  
    
      accounted_flow
      214.1
      214.10
      214.10
      214.10
      214.10
      214.10
      214.1
      214.100
      214.100
      214.100
      214.10
      214.1
    
    
      accounted_flow_border_rivers
      214.1
      214.10
      214.10
      214.10
      214.10
      214.10
      214.1
      214.100
      214.100
      214.100
      214.10
      214.1
    
    
      agg_to_gdp
      34.0
      29.24
      25.34
      24.78
      18.55
      15.73
      12.3
      9.067
      8.696
      9.351
      11.57
      10.5
    
    
      arable_land
      10600.0
      11600.00
      13150.00
      15773.00
      17199.00
      17930.00
      17238.0
      16242.000
      15389.000
      15200.000
      16560.00
      16810.0
    
    
      avg_annual_rain_depth
      1622.0
      1622.00
      1622.00
      1622.00
      1622.00
      1622.00
      1622.0
      1622.000
      1622.000
      1622.000
      1622.00
      1622.0

By variable



In [16]:

    
def variable_slice(df, variable):
    
    # Only data for that variable
    df = df[df.variable==variable]
    
    # Get variable for each country over the time periods 
    df = df.pivot(index='country', columns='time_period', values='value')
    return df



In [17]:

    
variable_slice(data, 'total_pop').head()









    Out[17]:






  
    
      time_period
      1958-1962
      1963-1967
      1968-1972
      1973-1977
      1978-1982
      1983-1987
      1988-1992
      1993-1997
      1998-2002
      2003-2007
      2008-2012
      2013-2017
    
    
      country
      
      
      
      
      
      
      
      
      
      
      
      
    
  
  
    
      Afghanistan
      9344.00
      10369.00
      11717.00
      13056.00
      12667.00
      11338.00
      13746.0
      18034.00
      21487.00
      25878.00
      29727.00
      32527.00
    
    
      Albania
      1738.00
      1999.00
      2254.00
      2518.00
      2788.00
      3121.00
      3241.0
      3092.00
      3123.00
      3011.00
      2881.00
      2897.00
    
    
      Algeria
      11690.00
      13354.00
      15377.00
      17690.00
      20576.00
      23918.00
      27181.0
      29888.00
      31990.00
      34262.00
      37439.00
      39667.00
    
    
      Andorra
      15.38
      20.75
      26.89
      32.77
      39.11
      48.46
      58.9
      64.15
      71.05
      84.88
      79.32
      70.47
    
    
      Angola
      5466.00
      5963.00
      6588.00
      7501.00
      8808.00
      10286.00
      11849.0
      13802.00
      16110.00
      19184.00
      22686.00
      25022.00

Time series for given country and variable



In [18]:

    
def time_series(df, country, variable):
    
    # Only take data for country/variable combo 
    series = df[(df.country==country) & (df.variable==variable)]
    
    # Drop years with no data 
    series = series.dropna()[['year_measured', 'value']]
    
    # Change years to int and set as index 
    series.year_measured = series.year_measured.astype(int)
    series.set_index('year_measured', inplace=True)
    series.columns = [variable]
    return series



In [19]:

    
time_series(data, 'Belarus', 'total_pop')









    Out[19]:






  
    
      
      total_pop
    
    
      year_measured
      
    
  
  
    
      1992
      10235.0
    
    
      1997
      10091.0
    
    
      2002
      9826.0
    
    
      2007
      9556.0
    
    
      2012
      9491.0
    
    
      2015
      9496.0

By region

We may want to look at subsets of the data for certain assessments. Region is an intuitive way to subdivide the data.



In [20]:

    
data.region.unique()









    Out[20]:





array(['World | Asia',
       'Americas | Central America and Caribbean | Central America',
       'Americas | Central America and Caribbean | Greater Antilles',
       'Americas | Central America and Caribbean | Lesser Antilles and Bahamas',
       'Americas | Northern America | Northern America',
       'Americas | Northern America | Mexico',
       'Americas | Southern America | Guyana',
       'Americas | Southern America | Andean',
       'Americas | Southern America | Brazil',
       'Americas | Southern America | Southern America', 'World | Africa',
       'World | Europe', 'World | Oceania'], dtype=object)

Reducing the number of regions will help for pattern assessment.

Create a dictionary to look up new, more simple region (Asia, North America, South America, Africa, Europe, Oceania)



In [21]:

    
simple_regions ={
    'World | Asia':'Asia',
    'Americas | Central America and Caribbean | Central America': 'North America',
    'Americas | Central America and Caribbean | Greater Antilles': 'North America',
    'Americas | Central America and Caribbean | Lesser Antilles and Bahamas': 'North America',
    'Americas | Northern America | Northern America': 'North America',
    'Americas | Northern America | Mexico': 'North America',
    'Americas | Southern America | Guyana':'South America',
    'Americas | Southern America | Andean':'South America',
    'Americas | Southern America | Brazil':'South America',
    'Americas | Southern America | Southern America':'South America', 
    'World | Africa':'Africa',
    'World | Europe':'Europe', 
    'World | Oceania':'Oceania'
}



In [22]:

    
data.region = data.region.apply(lambda x: simple_regions[x])



In [23]:

    
print(data.region.unique())









    



['Asia' 'North America' 'South America' 'Africa' 'Europe' 'Oceania']

Function for extracting a single region:



In [24]:

    
def subregion(data, region):
    return data[data.region==region]

Note: The functions created in this notebook and the others can also be found in scripts/aqua_helper.py so that they can be reused in following notebooks without redefinition.

Exercises

Create a dataframe containing each variable for every country for the time period of 1963-1967.
Create a dataframe containing the total renewable surface water for each country over each time period.
Create a dataframe containing the total population of each country in Asia over each time period.

To do: Update investigation themes

	country	region	variable	variable_full	time_period	year_measured	value
0	Afghanistan	World \| Asia	total_area	Total area of the country (1000 ha)	1958-1962	1962.0	65286.0
1	Afghanistan	World \| Asia	total_area	Total area of the country (1000 ha)	1963-1967	1967.0	65286.0
2	Afghanistan	World \| Asia	total_area	Total area of the country (1000 ha)	1968-1972	1972.0	65286.0
3	Afghanistan	World \| Asia	total_area	Total area of the country (1000 ha)	1973-1977	1977.0	65286.0
4	Afghanistan	World \| Asia	total_area	Total area of the country (1000 ha)	1978-1982	1982.0	65286.0

	variable	variable_full
0	total_area	Total area of the country (1000 ha)
576	arable_land	Arable land area (1000 ha)
1152	permanent_crop_area	Permanent crops area (1000 ha)
1728	cultivated_area	Cultivated area (arable land + permanent crops...
2304	percent_cultivated	% of total country area cultivated (%)
2880	total_pop	Total population (1000 inhab)
3456	rural_pop	Rural population (1000 inhab)
4032	urban_pop	Urban population (1000 inhab)
4608	gdp	Gross Domestic Product (GDP) (current US$)
5184	gdp_per_capita	GDP per capita (current US$/inhab)
5760	agg_to_gdp	Agriculture, value added to GDP (%)
6336	human_dev_index	Human Development Index (HDI) [highest = 1] (-)
6912	gender_inequal_index	Gender Inequality Index (GII) [equality = 0; i...
7488	percent_undernourished	Prevalence of undernourishment (3-year average...
8064	number_undernourished	Number of people undernourished (3-year averag...
8640	avg_annual_rain_depth	Long-term average annual precipitation in dept...
9216	avg_annual_rain_vol	Long-term average annual precipitation in volu...
9792	national_rainfall_index	National Rainfall Index (NRI) (mm/year)
10368	surface_water_produced	Surface water produced internally (10^9 m3/year)
10944	groundwater_produced	Groundwater produced internally (10^9 m3/year)
11520	surface_groundwater_overlap	Overlap between surface water and groundwater ...
12096	irwr	Total internal renewable water resources (IRWR...
12672	irwr_per_capita	Total internal renewable water resources per c...
13248	surface_entering	Surface water: entering the country (total) (1...
13824	surface_inflow_submit_no_treaty	Surface water: inflow not submitted to treatie...
14400	surface_inflow_submit_treaty	Surface water: inflow submitted to treaties (1...
14976	surface_inflow_secure_treaty	Surface water: inflow secured through treaties...
15552	total_flow_border_rivers	Surface water: total flow of border rivers (10...
16128	accounted_flow_border_rivers	Surface water: accounted flow of border rivers...
16704	accounted_flow	Surface water: accounted inflow (10^9 m3/year)
17280	surface_to_other_countries	Surface water: leaving the country to other co...
17856	surface_outflow_submit_no_treaty	Surface water: outflow to other countries not ...
18432	surface_outflow_submit_treaty	Surface water: outflow to other countries subm...
19008	surface_outflow_secure_treaty	Surface water: outflow to other countries secu...
19584	surface_total_external_renewable	Surface water: total external renewable (10^9 ...
20160	groundwater_entering	Groundwater: entering the country (total) (10^...
20736	groundwater_accounted_inflow	Groundwater: accounted inflow (10^9 m3/year)
21312	groundwater_to_other_countries	Groundwater: leaving the country to other coun...
21888	groundwater_accounted_outflow	Groundwater: accounted outflow to other countr...
22464	water_total_external_renewable	Water resources: total external renewable (10^...
23040	total_renewable_surface	Total renewable surface water (10^9 m3/year)
23616	total_renewable_groundwater	Total renewable groundwater (10^9 m3/year)
24192	overlap_surface_groundwater	Overlap: between surface water and groundwater...
24768	total_renewable	Total renewable water resources (10^9 m3/year)
25344	dependency_ratio	Dependency ratio (%)
25920	total_renewable_per_capita	Total renewable water resources per capita (m3...
26496	exploitable_regular_renewable_surface	Exploitable: regular renewable surface water (...
27072	exploitable_irregular_renewable_surface	Exploitable: irregular renewable surface water...
27648	exploitable_total_renewable_surface	Exploitable: total renewable surface water (10...
28224	exploitable_regular_renewable_groundwater	Exploitable: regular renewable groundwater (10...
28800	exploitable_total	Total exploitable water resources (10^9 m3/year)
29376	interannual_variability	Interannual variability (WRI) (-)
29952	seasonal_variability	Seasonal variability (WRI) (-)
30528	total_dam_capacity	Total dam capacity (km3)
31104	dam_capacity_per_capita	Dam capacity per capita (m3/inhab)
31680	irrigation_potential	Irrigation potential (1000 ha)
32256	flood_occurence	Flood occurrence (WRI) (-)
32832	total_pop_access_drinking	Total population with access to safe drinking-...
33408	rural_pop_access_drinking	Rural population with access to safe drinking-...
33984	urban_pop_access_drinking	Urban population with access to safe drinking-...

1958-1962	accounted_flow	accounted_flow_border_rivers	agg_to_gdp	arable_land	avg_annual_rain_depth	avg_annual_rain_vol	cultivated_area	dam_capacity_per_capita	dependency_ratio	exploitable_irregular_renewable_surface	...	total_flow_border_rivers	total_pop	total_pop_access_drinking	total_renewable	total_renewable_groundwater	total_renewable_per_capita	total_renewable_surface	urban_pop	urban_pop_access_drinking	water_total_external_renewable
country
Afghanistan	19.00	9.0	NaN	7700.0	327.0	213.5000	7760.0	128.40	28.7200	NaN	...	33.4	9344.00	NaN	65.3300	10.650	6992.0	55.68	804.90	NaN	18.18
Albania	3.30	0.0	NaN	436.0	1485.0	42.6900	487.0	NaN	10.9300	NaN	...	0.0	1738.00	NaN	30.2000	6.200	17376.0	26.35	533.20	NaN	3.30
Algeria	0.39	0.0	NaN	6300.0	89.0	212.0000	6900.0	89.99	3.5990	5.0	...	0.0	11690.00	NaN	11.6700	1.517	998.3	10.15	3934.00	NaN	0.42
Andorra	NaN	NaN	NaN	1.0	NaN	0.4724	1.0	NaN	NaN	NaN	...	NaN	15.38	NaN	0.3156	NaN	20520.0	NaN	9.76	NaN	NaN
Angola	0.40	0.0	NaN	2700.0	1010.0	1259.0000	3200.0	25.96	0.2695	NaN	...	0.0	5466.00	NaN	148.4000	58.000	27150.0	145.40	577.00	NaN	0.40

time_period	1958-1962	1963-1967	1968-1972	1973-1977	1978-1982	1983-1987	1988-1992	1993-1997	1998-2002	2003-2007	2008-2012	2013-2017
Thailand
accounted_flow	214.1	214.10	214.10	214.10	214.10	214.10	214.1	214.100	214.100	214.100	214.10	214.1
accounted_flow_border_rivers	214.1	214.10	214.10	214.10	214.10	214.10	214.1	214.100	214.100	214.100	214.10	214.1
agg_to_gdp	34.0	29.24	25.34	24.78	18.55	15.73	12.3	9.067	8.696	9.351	11.57	10.5
arable_land	10600.0	11600.00	13150.00	15773.00	17199.00	17930.00	17238.0	16242.000	15389.000	15200.000	16560.00	16810.0
avg_annual_rain_depth	1622.0	1622.00	1622.00	1622.00	1622.00	1622.00	1622.0	1622.000	1622.000	1622.000	1622.00	1622.0

	total_pop
year_measured
1992	10235.0
1997	10091.0
2002	9826.0
2007	9556.0
2012	9491.0
2015	9496.0

Table of Contents