Self Employment Data 2015

from OECD


In [2]:
countries = ['AUS', 'AUT', 'BEL', 'CAN', 'CZE', 'FIN', 'DEU', 'GRC', 'HUN', 'ISL', 'IRL', 'ITA', 'JPN', 
             'KOR', 'MEX', 'NLD', 'NZL', 'NOR', 'POL', 'PRT', 'SVK', 'ESP', 'SWE', 'CHE', 'TUR', 'GBR', 
             'USA', 'CHL', 'COL', 'EST', 'ISR', 'RUS', 'SVN', 'EU28', 'EA19', 'LVA']

male_selfemployment_rates = [12.13246, 15.39631, 18.74896, 9.18314, 20.97991, 18.87097, 
                             13.46109, 39.34802, 13.3356, 16.83681, 25.35344, 29.27118, 
                             12.06516, 27.53898, 31.6945, 19.81751, 17.68489, 9.13669, 
                             24.15699, 22.95656, 19.00245, 21.16428, 13.93171, 8.73181, 
                             30.73483, 19.11255, 7.48383, 25.92752, 52.27145, 12.05042, 
                             15.8517, 8.10048, 19.02411, 19.59021, 19.1384, 14.75558]

female_selfemployment_rates = [8.18631, 10.38607, 11.07756, 8.0069, 12.78461, 
                               9.42761, 7.75637, 29.56566, 8.00408, 7.6802, 8.2774, 18.33204, 
                               9.7313, 23.56431, 32.81488, 13.36444, 11.50045, 4.57464, 
                               17.63891, 13.92678, 10.32846, 12.82925, 6.22453, 9.28793, 
                               38.32216, 10.21743, 5.2896, 25.24502, 49.98448, 6.624, 
                               9.0243, 6.26909, 13.46641, 11.99529, 11.34129, 8.88987]

countries_by_continent = {'AUS':'AUS', 'AUT':'EUR', 'BEL':'EUR', 'CAN':'AM', 
                          'CZE':'EUR', 'FIN':'EUR', 'DEU':'EUR', 'GRC':'EUR', 
                          'HUN':'EUR', 'ISL':'EUR', 'IRL':'EUR', 'ITA':'EUR', 
                          'JPN':'AS',  'KOR':'AS',  'MEX':'AM',  'NLD':'EUR', 
                          'NZL':'AUS', 'NOR':'EUR', 'POL':'EUR', 'PRT':'EUR', 
                          'SVK':'EUR', 'ESP':'EUR', 'SWE':'EUR', 'CHE':'EUR', 
                          'TUR':'EUR', 'GBR':'EUR', 'USA':'AM' , 'CHL':'AM', 
                          'COL':'AM' , 'EST':'EUR', 'ISR':'AS',  'RUS':'EUR', 
                          'SVN':'EUR', 'EU28':'EUR','EA19':'AS', 'LVA':'EUR'}

Solutions with Basic Python

Basic Calculations and Statistics

Exercise 1

Calculate the overall selfemployment_rate for each country:
selfemployment_rate := (male_selfemployment_rates + female_selfemployment_rates) / 2

(This assumes that the number of women is roughly equal to the number of men.)

  1. Use a for-loop
  2. Use a list-comprehension and maybe zip

In [7]:
# TODO


Out[7]:
[10.159385,
 12.89119,
 14.913260000000001,
 8.59502,
 16.882260000000002,
 14.14929,
 10.608730000000001,
 34.45684,
 10.66984,
 12.258505,
 16.81542,
 23.80161,
 10.89823,
 25.551645,
 32.254690000000004,
 16.590975,
 14.59267,
 6.855665,
 20.89795,
 18.441670000000002,
 14.665455,
 16.996765,
 10.07812,
 9.00987,
 34.528495,
 14.66499,
 6.386715000000001,
 25.58627,
 51.127965,
 9.33721,
 12.437999999999999,
 7.184785,
 16.245260000000002,
 15.79275,
 15.239845,
 11.822725]
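
One possible way to approach Exercise 1 is sketched below. To keep it short, it uses only the first three countries (AUS, AUT, BEL) from the data above; the names `male`, `female`, `rates_loop` and `rates_zip` are chosen for illustration and are not part of the notebook.

```python
# Three-country subset of the data above (AUS, AUT, BEL).
male = [12.13246, 15.39631, 18.74896]
female = [8.18631, 10.38607, 11.07756]

# Variant 1: a for-loop over the indices.
rates_loop = []
for i in range(len(male)):
    rates_loop.append((male[i] + female[i]) / 2)

# Variant 2: a list comprehension that pairs the values with zip.
rates_zip = [(m + f) / 2 for m, f in zip(male, female)]

print(rates_loop)
print(rates_zip)
```

Both variants produce the same list. The zip version avoids index handling, which is usually easier to read.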

Exercise 2

Calculate the maximum, minimum, sum, mean and standard deviation of all selfemployment_rates.


In [8]:
# TODO max


Out[8]:
51.127965

In [9]:
# TODO min


Out[9]:
6.386715000000001

In [10]:
# TODO sum


Out[10]:
603.3900649999999

In [12]:
# TODO mean


Out[12]:
16.760835138888886

In [13]:
# TODO standard deviation


Out[13]:
9.21531227766451
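
A minimal sketch for Exercise 2 with plain Python follows. It uses four rounded values from the result of Exercise 1 as a stand-in for the full list. The notebook does not state whether the standard deviation is the population or the sample version; the sketch assumes the population formula (division by n), so treat that choice as an assumption.

```python
import math

# Four rates from Exercise 1 (AUS, AUT, BEL, CAN), rounded for readability.
rates = [10.159385, 12.89119, 14.91326, 8.59502]

highest = max(rates)
lowest = min(rates)
total = sum(rates)
mean = total / len(rates)

# Population variance: mean squared deviation from the mean (assumed formula).
variance = sum((r - mean) ** 2 for r in rates) / len(rates)
std = math.sqrt(variance)

print(highest, lowest, total, mean, std)
```

For the sample standard deviation, divide by `len(rates) - 1` instead of `len(rates)`.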

Exercise 3

Find the country with the highest selfemployment_rate.


In [15]:
# TODO


Out[15]:
'COL'
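
One way to solve Exercise 3, sketched on a three-country subset: pair each rate with its country code and let `max` compare the pairs. The variable names are illustrative only.

```python
# Subset of the data: country codes and their overall rates from Exercise 1.
countries = ['AUS', 'AUT', 'BEL']
rates = [10.159385, 12.89119, 14.91326]

# Tuples compare element by element, so putting the rate first
# makes max() pick the pair with the highest rate.
best_rate, best_country = max(zip(rates, countries))

# Alternative: look up the position of the maximum.
best_country_by_index = countries[rates.index(max(rates))]

print(best_country, best_rate)
```

If two countries shared the same top rate, the tuple variant would return the one whose code sorts last, while the index variant would return the first one in the list.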

Exercise 4

Find the sum of all selfemployment_rates that lie between 10 and 15.

  1. with loop
  2. using filter

In [18]:
# TODO


Out[18]:
174.81038999999996
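
Both variants of Exercise 4 can be sketched as follows, again on a small subset of the rates. The exercise does not say whether the bounds 10 and 15 are included; the sketch assumes they are.

```python
# Five rates from Exercise 1, rounded; the last two fall outside 10..15.
rates = [10.159385, 12.89119, 14.91326, 8.59502, 16.88226]

# Variant 1: loop with an explicit condition.
total_loop = 0.0
for r in rates:
    if 10 <= r <= 15:  # inclusive bounds are an assumption
        total_loop += r

# Variant 2: filter keeps only the matching rates, sum adds them up.
total_filter = sum(filter(lambda r: 10 <= r <= 15, rates))

print(total_loop, total_filter)
```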

Aggregations

Exercise 5

Calculate the mean of the selfemployment_rates per continent.
Consider using zip and collections.defaultdict.


In [17]:
# TODO (but not before we start with pandas...or if you are very fast ;-) )


Out[17]:
{'AM': 24.790132, 'AS': 16.03193, 'AUS': 12.3760275, 'EUR': 15.6223852}
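
A possible sketch for Exercise 5 with `zip` and `collections.defaultdict`, using five countries and the matching entries of `countries_by_continent`. The names `continent_of`, `rates_by_continent` and `mean_by_continent` are made up for this example.

```python
from collections import defaultdict

# Subset of the data: codes, overall rates (rounded) and continent mapping.
countries = ['AUS', 'AUT', 'BEL', 'CAN', 'NZL']
rates = [10.159385, 12.89119, 14.91326, 8.59502, 14.59267]
continent_of = {'AUS': 'AUS', 'AUT': 'EUR', 'BEL': 'EUR',
                'CAN': 'AM', 'NZL': 'AUS'}

# Group the rates by continent; missing keys start as an empty list.
rates_by_continent = defaultdict(list)
for country, rate in zip(countries, rates):
    rates_by_continent[continent_of[country]].append(rate)

# Average each group.
mean_by_continent = {continent: sum(values) / len(values)
                     for continent, values in rates_by_continent.items()}

print(mean_by_continent)
```

Because AUS and NZL are the only countries mapped to 'AUS' in the full data as well, the 'AUS' mean of this subset agrees with the value in the output above.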