Install pandas-datareader Pandas datareader to allow you to read stock data from web.
In [4]:
from pandas_datareader import data, wb
import pandas as pd
import numpy as np
import datetime
import seaborn as sns
%matplotlib inline
In [5]:
start_date = pd.to_datetime('2006-01-01')
end_date = pd.to_datetime('2016-01-01')
In [6]:
#Bank of America
BAC = data.DataReader('BAC', 'google', start_date, end_date)
In [7]:
#CitiGroup
C = data.DataReader('C', 'google', start_date, end_date)
C.head()
Out[7]:
In [8]:
#Goldman Sachs
GS = data.DataReader('GS', 'google', start_date, end_date)
GS.head()
Out[8]:
In [9]:
#JPMorgan Chase
JPM = data.DataReader('JPM', 'google', start_date, end_date)
JPM.head()
Out[9]:
In [10]:
#Morgan Stanley
MS = data.DataReader('MS', 'google', start_date, end_date)
MS.head()
Out[10]:
In [11]:
#Wells Fargo
WFC = data.DataReader('WFC', 'google', start_date, end_date)
WFC.head()
Out[11]:
Create a list of the ticker symbols in alphabetical order
In [12]:
tickers = list(['BAC', 'C', 'GS', 'JPM', 'MS', 'WFC'])
tickers
Out[12]:
Use pd.concat to concatenate the bank dataframes together to a single data frame called bank_stocks. Set the keys argument equal to the tickers list. Also pay attention to what axis you concatenate on.
In [13]:
bank_stocks = pd.concat([BAC, C, GS, JPM, MS, WFC], axis=1, keys=tickers)
bank_stocks.head()
Out[13]:
Set the column name levels (this is filled out for you):
In [14]:
bank_stocks.columns.names = ['Bank Ticker','Stock Info']
Check the head of the bank_stocks dataframe.
In [15]:
bank_stocks.head()
Out[15]:
Reference: Multi-Level Indexing and Using .xs.
What is the max Close price for each bank's stock throughout the time period?
In [16]:
bank_stocks.xs('Close', level=1, axis=1).max()
Out[16]:
Create a new empty DataFrame called returns. This dataframe will contain the returns for each bank's stock. returns are typically defined by:
$$r_t = \frac{p_t - p_{t-1}}{p_{t-1}} = \frac{p_t}{p_{t-1}} - 1$$
In [17]:
returns = pd.DataFrame()
In [18]:
bank_df = bank_stocks.xs('Close', level=1, axis=1)
bank_df.head()
Out[18]:
We can use pandas pct_change() method on the Close column to create a column representing this return value. Create a for loop that goes and for each Bank Stock Ticker creates this returns column and set's it as a column in the returns DataFrame.
In [19]:
for t in tickers:
returns[t+' Return'] = bank_stocks[t]['Close'].pct_change()
returns.head()
Out[19]:
Create a pairplot using seaborn of the returns dataframe
In [20]:
#returns_drop_na = returns.dropna()
sns.pairplot(returns[1:])
Out[20]:
Using this returns DataFrame, figure out on what dates each bank stock had the best and worst single day returns. You should notice that 4 of the banks share the same day for the worst drop, did anything significant happen that day?
In [21]:
# Worst Drop (4 of them on Inauguration day)
returns.idxmin()
Out[21]:
You should have noticed that Citigroup's largest drop and biggest gain were very close to one another, did anything significant happen in that time frame?
In [22]:
# Citigroup stock split in 2011-05
returns.idxmax()
Out[22]:
Take a look at the standard deviation of the returns, which stock would you classify as the riskiest over the entire time period? Which would you classify as the riskiest for the year 2015?
In [23]:
returns.std()
Out[23]:
In [24]:
returns['2015'].std()
Out[24]:
Create a distplot using seaborn of the 2015 returns for Morgan Stanley
In [25]:
sns.distplot(returns['2015']['MS Return'], bins=100, color='green')
Out[25]:
Create a distplot using seaborn of the 2008 returns for CitiGroup
In [26]:
sns.distplot(returns['C Return']['2008'], color='red')
Out[26]:
In [27]:
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_style('whitegrid')
%matplotlib inline
import plotly
import cufflinks as cf
cf.go_offline()
Create a line plot showing Close price for each bank for the entire index of time
In [28]:
close_stock_prices = bank_stocks.xs('Close', axis=1, level='Stock Info')
close_stock_prices.iplot(kind='line')
In [29]:
bank_stocks.xs('Close', axis=1, level='Stock Info').plot()
Out[29]:
In [30]:
BAC_30_2008_days = bank_stocks['2008']['BAC']['Close'].rolling(30).mean()
close_price_2008_BAC = bank_stocks['2008']['BAC']['Close']
BAC_30_2008_days.plot(figsize=(10,6), label='30 Days Average')
close_price_2008_BAC.plot(label='Close Price')
plt.legend()
Out[30]:
Create a heatmap of the correlation between the stocks Close Price
In [31]:
close_price = bank_stocks.xs('Close', axis=1, level='Stock Info')
sns.heatmap(close_price.corr(), annot=True)
Out[31]:
Seaborn's clustermap to cluster the correlations together
In [32]:
sns.clustermap(close_price.corr(), annot=True)
Out[32]:
In [33]:
bank_stocks.xs(key='Close',axis=1,level='Stock Info').corr().iplot(kind='heatmap', colorscale='rdylbu')
Create a candle plot of Bank of America's stock from Jan 1st 2015 to Jan 1st 2016
In [34]:
start_date = pd.to_datetime('2015-01-01')
end_date = pd.to_datetime('2016-01-01')
bank_stocks['BAC'][['Open', 'High', 'Low', 'Close']]['2015-01-01':'2016-01-01'].iplot(kind='candle')
In [ ]:
Create a Simple Moving Averages plot of Morgan Stanley for the year 2015.
In [35]:
bank_stocks['MS']['Close'].ix['2015-01-01':'2016-01-01'].ta_plot(study='sma', periods=[13,21,55],title='Simple Moving Averages')
Create a Bollinger Band Plot for Bank of America for the year 2015.
In [36]:
bank_stocks['BAC']['Close'].ix['2015-01-01':'2016-01-01'].ta_plot(study='boll')