In [74]:
# import packages
import pandas as pd # data management
import matplotlib.pyplot as plt # graphics
import datetime as dt # check today's date
import sys # check Python version
import numpy as np
# IPython command, puts plots in notebook
%matplotlib inline
print('Today is', dt.date.today())
print('Python version:\n', sys.version, sep='')
Designed by Eugene Fama and Kenneth French, the Fama-French factor model is a widely used tool in finance that applies statistical techniques to explain stock returns. In this project, we analyze stock returns and risks by calculating the betas of different industries over the past seven-year period.
Source: Fama-French Website
We collected our data with pandas-datareader, which provides a direct feed to Kenneth French's online data library, where numerous equity market datasets are available. Among them, we used the "30 Industry Portfolios" dataset to compare stock returns and risks across industries.
Links to data:
We first imported the 30 Industry Portfolios dataset. It comes in several variants: value- or equal-weighted, monthly or annual, etc. A detailed breakdown and description are shown below.
Of these variants, we extract the value-weighted monthly returns (dataframe 0 in 30 Industry Portfolios) since 2010 and store them in a dataframe ind.
We also imported the Fama-French 3-factor data for the same time frame, which contain Mkt-RF (the equity risk premium), SMB, HML, and RF (the risk-free rate).
Because we need the market return rather than the equity risk premium, we add a "Mkt" column to the dataframe by summing "Mkt-RF" and "RF". The resulting factor data (Mkt-RF, SMB, HML, RF, and Mkt) are stored in dataframe mkt1.
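As a minimal sketch of the market-return step, the toy frame below mimics the "Mkt-RF" and "RF" columns and rebuilds the market return as their sum. The name toy_factors and all numbers are hypothetical placeholders, not values from the French data library.

```python
import pandas as pd

# Hypothetical monthly values in percent; stand-ins for the real factor data
toy_factors = pd.DataFrame(
    {"Mkt-RF": [1.50, -0.75, 2.25], "RF": [0.25, 0.25, 0.50]},
    index=pd.period_range("2010-01", periods=3, freq="M"),
)

# Market return = equity risk premium + risk-free rate
toy_factors["Mkt"] = toy_factors["Mkt-RF"] + toy_factors["RF"]
print(toy_factors)
```

The same one-line sum is what gets applied to the real monthly factor table.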
In [75]:
# importing 30 industry portfolio data set
import pandas_datareader.data as web
ff=web.DataReader("30_Industry_Portfolios", "famafrench")
In [76]:
print(ff['DESCR'])
In [77]:
# extracting value-weighted return only
ff[0]
Out[77]:
In [78]:
ind=ff[0]
In [79]:
ind.shape
Out[79]:
In [80]:
# importing mkt data from 3 factors model
mkt=web.DataReader("F-F_Research_Data_Factors", "famafrench")
In [81]:
mkt
Out[81]:
In [82]:
print(mkt['DESCR'])
In [83]:
# keeping the monthly table (key 0) and dropping the annual one
mkt1=mkt[0]
In [84]:
mkt1
Out[84]:
In [85]:
mkt1['Mkt']=mkt1['Mkt-RF']+mkt1['RF']
Next, we add the market return column ("Mkt" in dataframe mkt1) to the ind dataframe.
We then calculate key statistics, including means and standard deviations, for each industry and for the market. Since these statistics feed into our beta analysis, we store them in a new dataframe ind_stat.
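The sketch below illustrates these two steps on made-up data: a market series is attached to an industry table (pandas aligns it on the shared monthly index), and describe() produces the summary table. The names toy_ind, toy_mkt, and toy_stat and all values are hypothetical.

```python
import pandas as pd

# Hypothetical monthly returns in percent for two industries
idx = pd.period_range("2010-01", periods=4, freq="M")
toy_ind = pd.DataFrame(
    {"Food": [1.0, 2.0, 3.0, 4.0], "Coal": [-2.0, 0.0, 2.0, 4.0]},
    index=idx,
)
toy_mkt = pd.Series([0.5, 1.5, 2.5, 3.5], index=idx, name="Mkt")

# Column assignment aligns the series with the frame on the index
toy_ind["Mkt"] = toy_mkt

# describe() returns count, mean, std, min, quartiles, and max per column
toy_stat = toy_ind.describe()
print(toy_stat.loc[["mean", "std"]])
```

Note that describe() puts the statistics in rows and the industries in columns, which is why the table is reshaped before a per-industry column can be added.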
In [86]:
# Adding mkt data to 30 industry data set
ind['Mkt']=mkt1['Mkt']
In [87]:
ind.tail()
Out[87]:
In [88]:
# calculating historical average return and standard deviation
ind_stat=ind.describe()
ind_stat
Out[88]:
To attach a beta to each industry, we transposed ind_stat so that each industry is a row and stored the result as ind_stat_inv. By definition, industry betas are calculated as:
Beta = (covariance between market and an industry) / (variance of market)
Once we had the industry betas, we added them to ind_stat_inv as a new column "Beta" and sorted the table by beta in descending order.
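The definition above can be checked on simulated data. The sketch below is an illustration only: the function name beta_from_returns, the random seed, and the simulated returns are all assumptions, not part of the analysis. It computes beta from the sample covariance matrix and compares it with the slope of an ordinary least-squares line, which should agree up to floating-point error.

```python
import numpy as np

def beta_from_returns(asset, market):
    """Beta = cov(asset, market) / var(market), using sample moments."""
    cov = np.cov(asset, market)  # 2x2 covariance matrix
    return cov[0, 1] / cov[1, 1]

# Simulated monthly returns (hypothetical; not Fama-French data)
rng = np.random.default_rng(0)
market = rng.normal(1.0, 4.0, size=120)
asset = 0.5 + 1.3 * market + rng.normal(0.0, 2.0, size=120)

beta = beta_from_returns(asset, market)
slope = np.polyfit(market, asset, 1)[0]  # OLS slope of asset on market
print(beta, slope)
```

By the same formula, the market's beta against itself is exactly 1, so a "Mkt" row in the beta table serves as a quick sanity check.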
In [89]:
# transpose (not a matrix inverse) so that each industry becomes a row
ind_stat_inv = ind_stat.T
ind_stat_inv
Out[89]:
In [90]:
# beta calculation
def calc_beta(n):
    np_array = ind.values
    m = np_array[:, 30]  # market returns: column 30 ('Mkt', the last column of ind)
    s = np_array[:, n]  # returns of the industry in column n
    covariance = np.cov(s, m)  # 2x2 covariance matrix of industry and market
    beta = covariance[0, 1] / covariance[1, 1]  # cov(industry, market) / var(market)
    return beta
In [91]:
numlist=range(0,31,1)
In [92]:
beta=[calc_beta(i) for i in numlist]
beta
Out[92]:
In [93]:
# Adding beta result
ind_stat_inv['Beta']=beta
ind_stat_inv
Out[93]:
In [94]:
# sort by beta from high to low; 'sort' refers to the same sorted table
ind_stat_inv = ind_stat_inv.sort_values(by='Beta', ascending=False)
sort = ind_stat_inv
sort
Out[94]:
In [95]:
#Transpose industry returns table to make heatmap
ind_heatmap = ind.T
ind_heatmap.tail()
Out[95]:
In [96]:
#heatmap of monthly returns since 2010
import seaborn as sns
sns.set()
fig, ax = plt.subplots(figsize=(20,8))
sns.heatmap(ind_heatmap, annot=False, linewidths=.5)
ax.set_title("Monthly Returns by Industry (Since 2010)")
Out[96]:
In [97]:
#Sort a beta-only table to create beta bar chart
beta_table = sort[['Beta']]
beta_table.head()
Out[97]:
In [98]:
#Bar chart of betas sorted from high to low
plt.style.use('seaborn-pastel')
ax = beta_table.plot(kind='bar', colormap = "Pastel2")
ax.set_title("Betas Across Industries")
Out[98]:
In [99]:
#Creating a dataframe just to see the most extreme values from the beta bar chart
industry_set = ind[['Coal ','Util ','Mkt']]
industry_set = industry_set.rename(columns={'Coal ':'Coal','Util ':'Utilities','Mkt':'Market'})
industry_set.tail()
Out[99]:
In [100]:
#Line plot of the returns of Coal, Utilities, and the general market
plt.style.use('seaborn-pastel')
ax = industry_set.plot(linestyle='-', colormap = "Accent", figsize = (16,5))
ax.set_title("Monthly Returns Since 2010")
Out[100]:
In [101]:
#Calculating a new dataframe to look at returns in excess of the market
industry_diff = industry_set.copy()  # copy so that industry_set itself is not modified
industry_diff['Coal Excess Returns'] = industry_set['Coal'] - industry_set['Market']
industry_diff['Utilities Excess Returns'] = industry_set['Utilities'] - industry_set['Market']
# drop the three raw return columns, keeping only the excess returns
industry_diff = industry_diff.drop(columns=industry_diff.columns[[0,1,2]])
industry_diff.tail()
Out[101]:
In [102]:
#Line plot of the excess returns
plt.style.use('seaborn-pastel')
ax = industry_diff.plot(linestyle='-', colormap = "Accent", figsize = (16,5))
ax.set_title("Market Excess Returns")
Out[102]: