Self Employment Data 2015

from OECD


In [1]:
countries = ['AUS', 'AUT', 'BEL', 'CAN', 'CZE', 'FIN', 'DEU', 'GRC', 'HUN', 'ISL', 'IRL', 'ITA', 'JPN', 
             'KOR', 'MEX', 'NLD', 'NZL', 'NOR', 'POL', 'PRT', 'SVK', 'ESP', 'SWE', 'CHE', 'TUR', 'GBR', 
             'USA', 'CHL', 'COL', 'EST', 'ISR', 'RUS', 'SVN', 'EU28', 'EA19', 'LVA']

male_selfemployment_rates = [12.13246, 15.39631, 18.74896, 9.18314, 20.97991, 18.87097, 
                             13.46109, 39.34802, 13.3356, 16.83681, 25.35344, 29.27118, 
                             12.06516, 27.53898, 31.6945, 19.81751, 17.68489, 9.13669, 
                             24.15699, 22.95656, 19.00245, 21.16428, 13.93171, 8.73181, 
                             30.73483, 19.11255, 7.48383, 25.92752, 52.27145, 12.05042, 
                             15.8517, 8.10048, 19.02411, 19.59021, 19.1384, 14.75558]

female_selfemployment_rates = [8.18631, 10.38607, 11.07756, 8.0069, 12.78461, 
                               9.42761, 7.75637, 29.56566, 8.00408, 7.6802, 8.2774, 18.33204, 
                               9.7313, 23.56431, 32.81488, 13.36444, 11.50045, 4.57464, 
                               17.63891, 13.92678, 10.32846, 12.82925, 6.22453, 9.28793, 
                               38.32216, 10.21743, 5.2896, 25.24502, 49.98448, 6.624, 
                               9.0243, 6.26909, 13.46641, 11.99529, 11.34129, 8.88987]

In [2]:
import numpy as np

np_male_selfemployment_rates = np.array(male_selfemployment_rates)
np_male_selfemployment_rates


Out[2]:
array([ 12.13246,  15.39631,  18.74896,   9.18314,  20.97991,  18.87097,
        13.46109,  39.34802,  13.3356 ,  16.83681,  25.35344,  29.27118,
        12.06516,  27.53898,  31.6945 ,  19.81751,  17.68489,   9.13669,
        24.15699,  22.95656,  19.00245,  21.16428,  13.93171,   8.73181,
        30.73483,  19.11255,   7.48383,  25.92752,  52.27145,  12.05042,
        15.8517 ,   8.10048,  19.02411,  19.59021,  19.1384 ,  14.75558])

In [3]:
np_female_selfemployment_rates = np.array(female_selfemployment_rates)
np_female_selfemployment_rates


Out[3]:
array([  8.18631,  10.38607,  11.07756,   8.0069 ,  12.78461,   9.42761,
         7.75637,  29.56566,   8.00408,   7.6802 ,   8.2774 ,  18.33204,
         9.7313 ,  23.56431,  32.81488,  13.36444,  11.50045,   4.57464,
        17.63891,  13.92678,  10.32846,  12.82925,   6.22453,   9.28793,
        38.32216,  10.21743,   5.2896 ,  25.24502,  49.98448,   6.624  ,
         9.0243 ,   6.26909,  13.46641,  11.99529,  11.34129,   8.88987])

Solutions with numpy

Basic Calculations and Statistics

Exercise 1

Calculate the overall selfemployment_rate for each country:
selfemployment_rate := (male_selfemployment_rates + female_selfemployment_rates) / 2

(This assumes that the number of women is roughly equal to the number of men.)


In [5]:
# TODO


Out[5]:
array([ 10.159385,  12.89119 ,  14.91326 ,   8.59502 ,  16.88226 ,
        14.14929 ,  10.60873 ,  34.45684 ,  10.66984 ,  12.258505,
        16.81542 ,  23.80161 ,  10.89823 ,  25.551645,  32.25469 ,
        16.590975,  14.59267 ,   6.855665,  20.89795 ,  18.44167 ,
        14.665455,  16.996765,  10.07812 ,   9.00987 ,  34.528495,
        14.66499 ,   6.386715,  25.58627 ,  51.127965,   9.33721 ,
        12.438   ,   7.184785,  16.24526 ,  15.79275 ,  15.239845,
        11.822725])
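
One possible way to approach this exercise is numpy's element-wise arithmetic: adding two arrays of equal length and dividing by a scalar works without an explicit loop. The sketch below is not the notebook's official solution. To stay short it uses only the first four countries from the data above (AUS, AUT, BEL, CAN), and the variable names `male` and `female` are chosen here for illustration.

```python
import numpy as np

# Illustrative subset: the first four values (AUS, AUT, BEL, CAN) from the lists above.
male = np.array([12.13246, 15.39631, 18.74896, 9.18314])
female = np.array([8.18631, 10.38607, 11.07756, 8.0069])

# Element-wise sum of both arrays, then element-wise division by 2.
selfemployment_rate = (male + female) / 2
print(selfemployment_rate)
```

The four results correspond to the first four entries of Out[5]. Applied to the full arrays np_male_selfemployment_rates and np_female_selfemployment_rates, the same expression yields one value per country.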

Exercise 2

Calculate the maximum, minimum, sum, mean and standard deviation of all selfemployment_rates.


In [7]:
# TODO max


Out[7]:
51.127965000000003

In [8]:
# TODO min


Out[8]:
6.3867150000000006

In [9]:
# TODO sum


Out[9]:
603.39006499999994

In [10]:
# TODO mean


Out[10]:
16.760835138888886

In [11]:
# TODO standard deviation


Out[11]:
9.2153122776645127
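
The five statistics map onto array methods that numpy provides directly. A minimal sketch, again assuming only the first four rates from Out[5] rather than all 36 (the name `rates` is chosen here for illustration):

```python
import numpy as np

# Illustrative subset: the first four overall rates from Out[5].
rates = np.array([10.159385, 12.89119, 14.91326, 8.59502])

rate_max = rates.max()    # largest value
rate_min = rates.min()    # smallest value
rate_sum = rates.sum()    # sum of all values
rate_mean = rates.mean()  # arithmetic mean
rate_std = rates.std()    # standard deviation; numpy's default is ddof=0 (population)

print(rate_max, rate_min, rate_sum, rate_mean, rate_std)
```

Note that `std()` computes the population standard deviation by default; passing `ddof=1` gives the sample standard deviation instead.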

Exercise 3

Find the country with the highest selfemployment_rate.


In [13]:
# TODO


Out[13]:
'COL'
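
One way to do this is `np.argmax`, which returns the position of the largest value; that position can then be used to index the list of country codes. The sketch below assumes the same four-country subset as an illustration, so its result differs from the full-data answer shown in Out[13].

```python
import numpy as np

# Illustrative subset: the first four country codes and their overall rates.
countries = ['AUS', 'AUT', 'BEL', 'CAN']
rates = np.array([10.159385, 12.89119, 14.91326, 8.59502])

# argmax gives the index of the maximum; cast to int to index the plain list.
top_index = int(np.argmax(rates))
top_country = countries[top_index]
print(top_country)
```

This relies on `countries` and the rates array being in the same order, as they are in the data cell at the top.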

Exercise 4

Find the sum of all selfemployment_rates that are between 10 and 15.


In [14]:
# TODO


Out[14]:
174.81038999999996
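
Filtering like this can be sketched with a boolean mask: two comparisons combined with `&` select the matching elements, which are then summed. The example uses the four-country subset as an illustration, and treating both bounds as inclusive is an assumption, since the exercise does not say whether 10 and 15 themselves count.

```python
import numpy as np

# Illustrative subset: the first four overall rates from Out[5].
rates = np.array([10.159385, 12.89119, 14.91326, 8.59502])

# Boolean mask: True where 10 <= rate <= 15 (inclusive bounds are an assumption).
in_range = (rates >= 10) & (rates <= 15)

# Indexing with the mask keeps only the selected values.
range_sum = rates[in_range].sum()
print(range_sum)
```

The parentheses around each comparison are required, because `&` binds more tightly than `>=` and `<=` in Python.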

Aggregations


In [8]:
# Not easier with numpy than with plain Python, so it is not repeated here.