I decided to explore worldwide data on a happiness indicator and found a dataset on the subject from 2016. The first task is to check and clean up the data.
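A minimal sketch of that first check, assuming the data is loaded with pandas. The real file name is not given here, so the snippet reads a small inline sample copied from the first rows of the 2016 table (four columns only, for brevity). With the actual file, `pd.read_csv` would take its path instead.

```python
import io

import pandas as pd

# Inline stand-in for the real CSV (an assumption: the actual path is not known).
sample_csv = io.StringIO(
    "Country,Region,Happiness Rank,Happiness Score\n"
    "Denmark,Western Europe,1,7.526\n"
    "Switzerland,Western Europe,2,7.509\n"
    "Iceland,Western Europe,3,7.501\n"
)
df = pd.read_csv(sample_csv)

print(df.head())    # first rows, to eyeball the columns
print(df.dtypes)    # confirm numeric columns were parsed as numbers

# Basic cleanliness checks: missing values per column and fully duplicated rows.
missing_per_column = df.isnull().sum()
duplicate_rows = int(df.duplicated().sum())
print(missing_per_column)
print("duplicate rows:", duplicate_rows)
```

If either check reports a non-zero count, that is the point to decide whether to drop or fill the affected rows.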
Out[185]:
| | Country | Region | Happiness Rank | Happiness Score | Lower Confidence Interval | Upper Confidence Interval | Economy (GDP per Capita) | Family | Health (Life Expectancy) | Freedom | Trust (Government Corruption) | Generosity | Dystopia Residual |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Denmark | Western Europe | 1 | 7.526 | 7.460 | 7.592 | 1.44178 | 1.16374 | 0.79504 | 0.57941 | 0.44453 | 0.36171 | 2.73939 |
| 1 | Switzerland | Western Europe | 2 | 7.509 | 7.428 | 7.590 | 1.52733 | 1.14524 | 0.86303 | 0.58557 | 0.41203 | 0.28083 | 2.69463 |
| 2 | Iceland | Western Europe | 3 | 7.501 | 7.333 | 7.669 | 1.42666 | 1.18326 | 0.86733 | 0.56624 | 0.14975 | 0.47678 | 2.83137 |
| 3 | Norway | Western Europe | 4 | 7.498 | 7.421 | 7.575 | 1.57744 | 1.12690 | 0.79579 | 0.59609 | 0.35776 | 0.37895 | 2.66465 |
| 4 | Finland | Western Europe | 5 | 7.413 | 7.351 | 7.475 | 1.40598 | 1.13464 | 0.81091 | 0.57104 | 0.41004 | 0.25492 | 2.82596 |
Since the column names contain spaces, I rename them to shorter names that are easier to reference later.
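One way to do the renaming is with `DataFrame.rename` and an explicit mapping. The sketch below uses the old and new column names from the two tables. The one-row DataFrame (Denmark's row) is only a stand-in for the full dataset.

```python
import pandas as pd

# Mapping from the original column names to the short names used from here on.
new_names = {
    "Country": "country",
    "Region": "region",
    "Happiness Rank": "happiness_rank",
    "Happiness Score": "happiness_score",
    "Lower Confidence Interval": "lower",
    "Upper Confidence Interval": "upper",
    "Economy (GDP per Capita)": "gdp_capita",
    "Family": "family",
    "Health (Life Expectancy)": "health_life_exp",
    "Freedom": "freedom",
    "Trust (Government Corruption)": "gov_trust",
    "Generosity": "generosity",
    "Dystopia Residual": "dystopia_res",
}

# Stand-in for the real data: Denmark's row, with the original column names.
denmark = [
    "Denmark", "Western Europe", 1, 7.526, 7.460, 7.592,
    1.44178, 1.16374, 0.79504, 0.57941, 0.44453, 0.36171, 2.73939,
]
df = pd.DataFrame([denmark], columns=list(new_names))

# rename returns a new DataFrame; columns missing from the mapping stay unchanged.
df = df.rename(columns=new_names)
print(df.columns.tolist())
```

An explicit mapping is more verbose than assigning a list to `df.columns`, but it does not depend on column order.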
Out[187]:
| | country | region | happiness_rank | happiness_score | lower | upper | gdp_capita | family | health_life_exp | freedom | gov_trust | generosity | dystopia_res |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Denmark | Western Europe | 1 | 7.526 | 7.460 | 7.592 | 1.44178 | 1.16374 | 0.79504 | 0.57941 | 0.44453 | 0.36171 | 2.73939 |
| 1 | Switzerland | Western Europe | 2 | 7.509 | 7.428 | 7.590 | 1.52733 | 1.14524 | 0.86303 | 0.58557 | 0.41203 | 0.28083 | 2.69463 |
| 2 | Iceland | Western Europe | 3 | 7.501 | 7.333 | 7.669 | 1.42666 | 1.18326 | 0.86733 | 0.56624 | 0.14975 | 0.47678 | 2.83137 |
| 3 | Norway | Western Europe | 4 | 7.498 | 7.421 | 7.575 | 1.57744 | 1.12690 | 0.79579 | 0.59609 | 0.35776 | 0.37895 | 2.66465 |
| 4 | Finland | Western Europe | 5 | 7.413 | 7.351 | 7.475 | 1.40598 | 1.13464 | 0.81091 | 0.57104 | 0.41004 | 0.25492 | 2.82596 |
Let's make the first visualization: the average happiness_score by region.
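The per-region averages can be computed with a `groupby`. This is a sketch rather than the exact code behind the chart. It uses only the five rows shown earlier, which are all in Western Europe, so the result has a single group. On the full data it yields one row per region.

```python
import pandas as pd

# The five rows shown earlier (all Western Europe), with the renamed columns.
df = pd.DataFrame({
    "country": ["Denmark", "Switzerland", "Iceland", "Norway", "Finland"],
    "region": ["Western Europe"] * 5,
    "happiness_score": [7.526, 7.509, 7.501, 7.498, 7.413],
})

# as_index=False keeps `region` as a regular column instead of the index.
region_means = (
    df.groupby("region", as_index=False)["happiness_score"]
      .mean()
      .sort_values("region")
)
print(region_means)
```

With matplotlib installed, `region_means.plot.barh(x="region", y="happiness_score")` is one way to turn the result into a bar chart.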
Out[188]:
| | region | happiness_score |
|---|---|---|
| 0 | Australia and New Zealand | 7.323500 |
| 1 | Central and Eastern Europe | 5.370690 |
| 2 | Eastern Asia | 5.624167 |
| 3 | Latin America and Caribbean | 6.101750 |
| 4 | Middle East and Northern Africa | 5.386053 |
As you can see, developed countries are, on average, happier than developing countries.
Next, it would be interesting to see how much each aspect of life contributes to the score on average.
As the chart shows, people's happiness is driven on average by wealth, health, family, and freedom. It is worth noting that wealth is the most important contributor, while government trust is the least important.
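A simple way to get those average contributions is to take the column means of the factor columns and sort them. The sketch below assumes the renamed columns and uses only the five rows shown earlier. Its ranking therefore describes those five countries and need not match the full dataset.

```python
import pandas as pd

factor_cols = [
    "gdp_capita", "family", "health_life_exp",
    "freedom", "gov_trust", "generosity",
]

# Factor values for Denmark, Switzerland, Iceland, Norway, and Finland.
rows = [
    [1.44178, 1.16374, 0.79504, 0.57941, 0.44453, 0.36171],
    [1.52733, 1.14524, 0.86303, 0.58557, 0.41203, 0.28083],
    [1.42666, 1.18326, 0.86733, 0.56624, 0.14975, 0.47678],
    [1.57744, 1.12690, 0.79579, 0.59609, 0.35776, 0.37895],
    [1.40598, 1.13464, 0.81091, 0.57104, 0.41004, 0.25492],
]
df = pd.DataFrame(rows, columns=factor_cols)

# Mean contribution of each factor, largest first.
avg_contribution = df[factor_cols].mean().sort_values(ascending=False)
print(avg_contribution)
```

`dystopia_res` is left out here on the assumption that only the six named factors are being compared.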
Next, I show the countries whose happiness scores are closest to their dystopia residual. Dystopia is a hypothetical country, the least happy place on earth, that has the worst score for each contributing factor.
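One plausible reading of "closest" is the smallest difference between `happiness_score` and `dystopia_res`. The sketch below follows that assumption. The column `gap_to_dystopia` is a name introduced here for illustration, and the data is again limited to the five rows shown earlier.

```python
import pandas as pd

df = pd.DataFrame({
    "country": ["Denmark", "Switzerland", "Iceland", "Norway", "Finland"],
    "happiness_score": [7.526, 7.509, 7.501, 7.498, 7.413],
    "dystopia_res": [2.73939, 2.69463, 2.83137, 2.66465, 2.82596],
})

# Gap between a country's score and its dystopia residual (hypothetical column name).
df["gap_to_dystopia"] = df["happiness_score"] - df["dystopia_res"]

# Smallest gap first: the countries whose score is closest to the residual.
closest = df.sort_values("gap_to_dystopia").head(3)
print(closest[["country", "happiness_score", "dystopia_res", "gap_to_dystopia"]])
```

On the full dataset, raising the argument to `head` gives a longer list of countries.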
As the table shows, the countries closest to this dystopian baseline are in Sub-Saharan Africa. Sadly, that makes sense: according to this particular study, these countries are the most deprived of the factors that contribute to happiness.
In the next graph, I explore the distribution of the happiness score to check whether it is normal. As you can see immediately below, it is not exactly normal. For more evidence, I made a probability plot, which agrees with the density plot.
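The numbers behind both plots can be computed with SciPy. This is a sketch under assumptions, not the original plotting code. It uses the five scores shown earlier, which is far too few to judge normality, so it only illustrates the calls. `stats.probplot` returns the points of the probability plot together with a fitted line. `stats.gaussian_kde` gives the density estimate.

```python
import numpy as np
from scipy import stats

# Five scores from the table; the full happiness_score column would go here.
scores = np.array([7.526, 7.509, 7.501, 7.498, 7.413])

# Probability plot data: theoretical normal quantiles against the sorted sample,
# plus the least-squares line and its correlation coefficient r.
(theoretical_q, ordered_scores), (slope, intercept, r) = stats.probplot(
    scores, dist="norm"
)

# Kernel density estimate evaluated on a grid (the curve of a density plot).
kde = stats.gaussian_kde(scores)
grid = np.linspace(scores.min(), scores.max(), 50)
density = kde(grid)

print(f"probability-plot correlation r = {r:.3f}")
```

An `r` close to 1 means the points lie near a straight line, which is consistent with normality. Passing `plot=plt` to `probplot` (with `matplotlib.pyplot` imported as `plt`) draws the plot directly.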