Get Stock Data

In this exercise, we will download some stock ticker data and process it with pandas.

Install pandas_datareader with conda.


In [ ]:
import pandas as pd
from pandas_datareader import data, wb  
import datetime

# We will look at stock prices over the past year, starting at January 1, 2016
start = datetime.datetime(2016,1,1)
end = datetime.date.today()
 
# Let's get Apple stock data; Apple's ticker symbol is AAPL
# First argument is the series we want, second is the source ("yahoo" for Yahoo! Finance), third is the start date, fourth is the end date
apple = data.get_data_yahoo("AAPL", start, end)
apple.head()

Pandas Basics

  • Take first N values: df.head(N)
  • Get data for one column: df['column_name'] or df.column_name.
  • Get data for multiple columns of data: df[['col1', 'col2']]
  • Get data by a condition on the index: df.query('Date>2017')
  • Get data by a condition on the column: df[df.Close>140]
  • Summarize the data: df.describe()
  • Aggregate the data by 60 days: df.groupby(pd.TimeGrouper(freq='60D')).mean()

In [ ]:


In [ ]:


In [ ]:


In [ ]:


In [ ]:


In [ ]:

Plot Daily Trend

Capture the daily Low, High, and Close and plot those values.


In [ ]:
import pylab as plt
%matplotlib inline
plt.rcParams['figure.figsize'] = (15, 9)   # Change the size of plots
 
## your plot code here

Add More Stocks

Grab the closing values from two more stocks.


In [ ]:
microsoft = data.get_data_yahoo("MSFT", start, end)
google = data.get_data_yahoo("GOOG", start, end)
 
stocks = pd.DataFrame({"AAPL": apple["Adj Close"],
                      "MSFT": microsoft["Adj Close"],
                      "GOOG": google["Adj Close"]})
 
stocks.head()

Plot Price of all three stocks


In [ ]:
# plot here

Apply Rolling Window

Read http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.rolling.html and apply a 20-day rolling averaging window to the apple adjusted close.


In [ ]:
# stocks['AAPL']...

Profit

We'd like to normalize the share price so that we can fit all of the stock trends on one plot. To do that let's normalize by the price on the start date for each stock. Check the http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.apply.html documentation to apply this scaling to each column.

Plot resulting profit time series.


In [ ]:
# use apply function
stock_return = stocks # change this line

stock_return.head()

In [ ]:
# plot returns

S&P 500

So far we manually specified a few ticker symbols. In order to do larger scale analysis, we need to pull data for more tickers. To do that, we need to find a data source with the ticker names. Fortunately, Wikipedia has all of the S&P tickers in a html table at https://en.wikipedia.org/wiki/List_of_S%26P_500_companies. Use pd.read_html to grab this ticker data and then loop through all ticker and pull down the stock data for each ticker.


In [ ]:
url = 'https://en.wikipedia.org/wiki/List_of_S%26P_500_companies'
## use sp500 = pd.read_html...

In [ ]:
## Loop through tickers

Profit

Create a new dataframe that calculates the profit for each stock and plot this profit across all stocks.


In [ ]:
## Use apply again
# stock_return_sp500 = df_sp500 # change this line

Profit Histogram

Plot the histogram of the profit.


In [ ]:
# plot hist

Use Pandas to find high-performance stocks

Retrun a list with all stock that doubled over the time period.


In [ ]:
# get high-performance stocks

In [ ]: