In [ ]:
import requests
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
%matplotlib inline
Objectives:
DataFrame
DataFrame
to a csv file
Download a file called corssCountryIncomePerCapita.csv by visiting http://www.briancjenkins.com/data/international/ and following the link for: "GDP per capita (constant US 2005 PPP $, levels)"
In [ ]:
# Use the requests module to download cross country GDP per capita
url = ''
filename=''
r = requests.get(url,verify=True)
with open(filename,'wb') as newFile:
    newFile.write(r.content)
In [ ]:
# Import the cross-country GDP data into a DataFrame called incomeDf with index_col=0
# Print the first five rows of incomeDf
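A minimal sketch of the import step, using a tiny in-memory CSV in place of the downloaded file. The column names and values below are made up for illustration, and `toy_csv` and `toy_df` are hypothetical names; only `pd.read_csv` with `index_col=0` and `head()` correspond to what the prompt asks for.

```python
import io
import pandas as pd

# Hypothetical stand-in for the downloaded file: the first column holds the year
toy_csv = io.StringIO(
    "year,Country A,Country B\n"
    "1960,1000.0,2000.0\n"
    "1961,1100.0,2100.0\n"
    "1962,1210.0,2200.0\n"
)

# index_col=0 turns the first column (the years) into the row index
toy_df = pd.read_csv(toy_csv, index_col=0)

# head() returns at most the first five rows
print(toy_df.head())
```

With the real file, the same call would take the filename used in the download cell instead of `toy_csv`.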
In [ ]:
# Print the columns of incomeDf
In [ ]:
# Print the number of countries represented in incomeDf
In [ ]:
# Print the index of incomeDf
In [ ]:
# Print the number of years of data in incomeDf
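Assuming the data are laid out with one column per country and one row per year (as the prompts above suggest), both counts can be read off the lengths of the columns and the index. The sketch below uses a made-up `toy_df`, not the real data.

```python
import pandas as pd

# Made-up data: columns are countries, the index holds years
toy_df = pd.DataFrame(
    {"Country A": [1000.0, 1100.0, 1210.0], "Country B": [2000.0, 2100.0, 2200.0]},
    index=[1960, 1961, 1962],
)

n_countries = len(toy_df.columns)  # one column per country
n_years = len(toy_df.index)        # one row per year

print(n_countries, n_years)
```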
In [ ]:
# Print the first five rows of the 'United States - USA' column of incomeDf
In [ ]:
# Print the last five rows of the 'United States - USA' column of incomeDf
In [ ]:
# Create a plot of income per capita from 1960 to 2011 for the US
In [ ]:
# Create a plot of income per capita from 1960 to 2011 for another country in the dataset
In [ ]:
# Create a new variable called income60 equal to the 1960 row from incomeDf
# Print the index of income60
In [ ]:
# Print the average world income per capita in 1960
# Print the standard deviation in world income per capita in 1960
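One way to approach the row selection and summary statistics is sketched below on made-up data. It assumes the years were read in as integers; if they were read as strings, the label would be `'1960'` instead. Note that pandas' `std()` computes the sample standard deviation (`ddof=1`) by default.

```python
import pandas as pd

# Made-up data: columns are countries, the index holds years
toy_df = pd.DataFrame(
    {"Country A": [1000.0, 1100.0], "Country B": [2000.0, 2100.0]},
    index=[1960, 1961],
)

# .loc selects a row by its index label; the result is a Series
# whose index is the original column labels (the country names)
toy_row = toy_df.loc[1960]
print(toy_row.index)

# Mean and sample standard deviation across countries for that year
toy_mean = toy_row.mean()
toy_std = toy_row.std()
print(toy_mean, toy_std)
```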
In [ ]:
# Print the names of the five countries with the highest five incomes per capita in 1960
In [ ]:
# Print the names of the five countries with the lowest five incomes per capita in 1960
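A possible approach to the ranking prompts is to sort the Series and read the names off its index. The sketch uses a made-up four-country Series and takes the top and bottom two rather than five; `Series.nlargest` and `Series.nsmallest` would be an alternative.

```python
import pandas as pd

# Made-up incomes for one year, indexed by country name
toy_income = pd.Series(
    {"Country A": 500.0, "Country B": 3000.0, "Country C": 1200.0, "Country D": 800.0}
)

# Sort descending and keep the index labels of the first two entries
top_two = toy_income.sort_values(ascending=False).head(2).index.tolist()

# Sort ascending (the default) for the lowest incomes
bottom_two = toy_income.sort_values().head(2).index.tolist()

print(top_two)
print(bottom_two)
```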
In [ ]:
# Create a new variable called income11 equal to the 2011 row from incomeDf
# Print the average world income per capita in 2011
# Print the standard deviation in world income per capita in 2011
In [ ]:
# Print the names of the five countries with the highest five incomes per capita in 2011
In [ ]:
# Print the names of the five countries with the lowest five incomes per capita in 2011
In [ ]:
# Create a DataFrame called growthDf with columns 'income 1960' and 'income 2011' equal to income per capita
# in 1960 and 2011 and an index equal to the index of income60
In [ ]:
# Create a new column equal to the difference between 'income 2011' and 'income 1960' for each country
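The two steps above (building a DataFrame from two Series, then adding a difference column) might look like the following on made-up data. The names `toy_start`, `toy_end`, and `toy_growth_df` and the column labels are illustrative only.

```python
import pandas as pd

# Made-up start-of-period and end-of-period incomes, indexed by country
toy_start = pd.Series({"Country A": 1000.0, "Country B": 2000.0})
toy_end = pd.Series({"Country A": 4000.0, "Country B": 3000.0})

# Each dictionary key becomes a column; rows are aligned on the given index
toy_growth_df = pd.DataFrame(
    {"income start": toy_start, "income end": toy_end},
    index=toy_start.index,
)

# Column arithmetic is element-wise, so this gives one difference per country
toy_growth_df["difference"] = toy_growth_df["income end"] - toy_growth_df["income start"]

print(toy_growth_df)
```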
Let $y_t$ denote income per capita for some country in some year $t$ and let $g$ denote the average annual growth rate of income per capita between years 0 and $T$. $g$ is defined by:
\begin{align}
y_T & = (1+g)^T y_0
\end{align}
which implies:
\begin{align}
g & = \left(\frac{y_T}{y_0}\right)^{1/T} - 1
\end{align}
Note that since our data run from 1960 to 2011, $T = 51$, which is also equal to len(incomeDf.index)-1.
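As a quick numerical check of the formula for $g$, the sketch below computes the average annual growth rate for made-up income levels and confirms that compounding at that rate for $T$ years recovers the end value. The function name `average_annual_growth` and the numbers are illustrative assumptions, not part of the dataset.

```python
def average_annual_growth(y_start, y_end, n_years):
    """Return g such that y_end = (1 + g) ** n_years * y_start."""
    return (y_end / y_start) ** (1 / n_years) - 1


# Made-up example: income doubles between 1960 and 2011
y_0 = 1000.0
y_T = 2000.0
T = 2011 - 1960  # 51 annual steps

g = average_annual_growth(y_0, y_T, T)
print(g)

# Compounding y_0 at rate g for T years recovers y_T up to floating-point rounding
print((1 + g) ** T * y_0)
```

The same expression works element-wise when the start and end values are pandas columns instead of scalars.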
In [ ]:
# Create a new column equal to the average annual growth rate for each country between 1960 and 2011
In [ ]:
# Print the first five rows of growthDf
In [ ]:
# Print the names of the five countries with the highest average annual growth rates
In [ ]:
# Print the names of the five countries with the lowest average annual growth rates
In [ ]:
# Print the average annual growth rate of income per capita from 1960 to 2011
# Print the standard deviation of the annual growth rate of income per capita from 1960 to 2011
In [ ]:
# Construct a scatter plot:
# Use the plt.scatter function
# income per capita in 1960 on the horizontal axis and average annual growth rate on the vertical axis
# Set the opacity of the points to something like 0.25 - 0.35
# Label the plot clearly with axis labels and a title
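The plotting steps listed above could be sketched as follows. The data here are randomly generated placeholders, not the real incomes or growth rates, and the label text is only a suggestion.

```python
import matplotlib.pyplot as plt
import numpy as np

# Randomly generated placeholder data (not the real dataset)
rng = np.random.default_rng(0)
toy_income = rng.uniform(500, 15000, size=50)
toy_growth = rng.normal(0.02, 0.015, size=50)

# alpha sets the opacity of the points so overlapping markers stay visible
points = plt.scatter(toy_income, toy_growth, alpha=0.3)

plt.xlabel("Income per capita in 1960")
plt.ylabel("Average annual growth rate")
plt.title("Initial income and subsequent growth (placeholder data)")
```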
In [ ]:
# Export the growthDf DataFrame to a csv file called 'growth_data.csv'
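A minimal sketch of the export step on a made-up DataFrame. It writes to a temporary directory and reads the file back to confirm the round trip; `toy_growth_df` and the file name `toy_growth_data.csv` are illustrative, and the exercise itself asks for 'growth_data.csv'.

```python
import os
import tempfile

import pandas as pd

# Made-up DataFrame indexed by country name
toy_growth_df = pd.DataFrame(
    {"income start": [1000.0, 2000.0], "income end": [4000.0, 3000.0]},
    index=["Country A", "Country B"],
)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "toy_growth_data.csv")

    # to_csv writes the index as the first column by default
    toy_growth_df.to_csv(path)

    # Reading with index_col=0 restores that first column as the index
    round_trip = pd.read_csv(path, index_col=0)

print(round_trip)
```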