Warming up

Value of an European Call Option

The Black-Scholes-Merton model is a mathematical model applicable in financial markets containing derivative investment instruments for the theoretical estimation of European-style options' price. According to such model, the index level at maturity of a call option can be modeled as it follows with z normally distributed

$$S_T(z)=S_0e^{(r-0.5\sigma^2)T+\sigma\sqrt{T}z}$$

where

  • $S_0$ is the initial stock index level
  • K is the strike price of the European call option
  • T is the Time-to-maturity in year
  • r is the riskless short rate
  • $\sigma^2$ is the volatility

Let's assume the following values

  • $S_0 = 105$
  • K = 109
  • T = 1
  • r = 6%
  • $\sigma^2$ = 22%

In order to calculate the present value of the call option, it is possible to apply a Monte Carlo approach according to the following procedure:

  • Let's consider R realizations of the random variable z from a standard normal distribution
  • Compute related index levels at maturity $S_T(i)$ for $z(i)\in\{1,2,...,R\}$
  • Compute the call option value $h_T(i)=\max{(S_T(i)–K,0)}$.
  • Compute the present value of the call option, i.e.
$$C_0=e^{-rT}{R}^{-1}\sum_{i=1}^{R} h_T{(i)}$$

In [2]:
import numpy as np

def bsm(S0,r,sigma,T,K,R = 100000 , seed=500):
    np.random.seed(seed)
    z = np.random.standard_normal(R) 
    ST = S0 * np.exp(( r - 0.5 * sigma ** 2) * T + sigma * np.sqrt(T) * z)
    hT = np.maximum(ST - K, 0)
    C0 = np.exp(-r * T) * np.sum(hT) / R
    return C0

In [3]:
import time
tm = time.time()
C0 = bsm(S0=105,r=0.06,sigma=0.22,T=1.0,K=109,R = 100000 , seed=500)
pm = time.time() - tm
print("Value of European Call Option: {0:.4g}".format(C0)+" - time[{0:.4g} secs]".format(pm))


Value of European Call Option: 10.31 - time[0.01564 secs]

Let's see how much time is necessary for 70,000,000 iterations intead of 100,000 iterations.


In [4]:
tm = time.time()
C0 = bsm(S0=105,r=0.06,sigma=0.22,T=1.0,K=109,R = 70000000 , seed=500)
pm = time.time() - tm
print("Value of European Call Option: {0:.4g}".format(C0)+" - time[{0:.4g} secs]".format(pm))


Value of European Call Option: 10.32 - time[6.039 secs]

Let's see how we can speed up the computation with the numexpr package.


In [5]:
import numexpr as ne
def bsm_ne(S0,r,sigma,T,K,R = 70000000 , seed=500):
    np.random.seed(seed)
    z = np.random.standard_normal(R) 
    ST = ne.evaluate('S0 * exp(( r - 0.5 * sigma ** 2) * T + sigma * sqrt(T) * z)')
    hT = np.maximum(ST - K, 0)
    C0 = np.exp(-r * T) * np.sum(hT) / R
    return C0

In [6]:
tm = time.time()
C0 = bsm_ne(S0=105,r=0.06,sigma=0.22,T=1.0,K=109,R = 70000000 , seed=500)
pm = time.time() - tm
print("Value of the European Call Option: {0:.4g}".format(C0)+" - time[{0:.4g} secs]".format(pm))


Value of the European Call Option: 10.32 - time[4.821 secs]

Key Factors for Evaluating the Performance of a Portfolio

The daily return of a stock is easily computable as it follows $$dr(t)=\frac{P(t)}{P(t-1)}-1$$ Similarly, the cumulative return of a stock is easily computable as it follows $$cr(t)=\frac{P(t)}{P(0)}-1$$ What is it P(t)? There are basically 2 options for this value, i.e.

  • the adjusted close price of a stock typically indicated in financial feed as Adj Close, or
  • the close price of a stock typically indicated in financial feed as Adj.

We take the adjusted close price (see What Hedge Funds Really Do to understand why).

Typically for evaluating the performance of a portfolio key factors to focus on are

  1. Cumulative return
  2. Average daily return
  3. Rsk (Standard deviation of daily return)
  4. Sharpe ratio

We will see how to compute and plot these factors.

Get financial data

Functions from pandas.io.data and pandas.io.ga extract data from various Internet sources into a DataFrame The following sources are supported:

  • Yahoo! Finance
  • Google Finance
  • St.Louis FED (FRED)
  • Kenneth French’s data library
  • World Bank
  • Google Analytics

For further info see pandas documentation


In [9]:
import numpy as np
import pandas as pd
import pandas.io.data as web

df_final = web.DataReader(['GOOG','SPY'], data_source='yahoo',
                      start='1/21/2010', end='4/15/2016')

print(df_final)
print(df_final.shape)


<class 'pandas.core.panel.Panel'>
Dimensions: 6 (items) x 1570 (major_axis) x 2 (minor_axis)
Items axis: Open to Adj Close
Major_axis axis: 2010-01-21 00:00:00 to 2016-04-15 00:00:00
Minor_axis axis: GOOG to SPY
(6, 1570, 2)

In [10]:
df_final.ix[:,:,'SPY'].head()


Out[10]:
Open High Low Close Volume Adj Close
Date
2010-01-21 113.919998 114.269997 111.559998 111.699997 344859600.0 97.864196
2010-01-22 111.199997 111.739998 109.089996 109.209999 345942400.0 95.682624
2010-01-25 110.209999 110.410004 109.410004 109.769997 186937500.0 96.173257
2010-01-26 109.339996 110.470001 109.040001 109.309998 211168800.0 95.770236
2010-01-27 109.169998 110.080002 108.330002 109.830002 271863600.0 96.225829

In [11]:
print(type(df_final.ix[:,:,'SPY']))
print("\n>>> null values:"+str(pd.isnull(df_final.ix[:,:,'GOOG']).sum().sum()))


<class 'pandas.core.frame.DataFrame'>

>>> null values:0

In [12]:
df_final = web.DataReader(['GOOG','SPY'], data_source='yahoo',
                      start='1/21/1999', end='4/15/2016')
df_final.ix[:,:,'GOOG'].head()


Out[12]:
Open High Low Close Volume Adj Close
Date
1999-01-21 NaN NaN NaN NaN NaN NaN
1999-01-22 NaN NaN NaN NaN NaN NaN
1999-01-25 NaN NaN NaN NaN NaN NaN
1999-01-26 NaN NaN NaN NaN NaN NaN
1999-01-27 NaN NaN NaN NaN NaN NaN

In [13]:
print(type(df_final.ix[:,:,'GOOG']))
print("\n>>> null values:"+str(pd.isnull(df_final.ix[:,:,'GOOG']).sum().sum()))


<class 'pandas.core.frame.DataFrame'>

>>> null values:8412

There is a couple of observations to be done:

  1. calling pandas.io.data with multiple stocks gets a pandas.core.panel.Panel instead of a pandas.DataFrame but filtering to specific axis (e.g. Google) we get pandas.core.frame.DataFrame
  2. pandas.io.data does not handle missing values

Hence, we can define the following functions.


In [14]:
import matplotlib.pyplot as plt
def get_data(symbols, 
             add_ref=True,
             data_source='yahoo',
             price='Adj Close',
             start='1/21/2010', 
             end='4/15/2016'):
    """Read stock data (adjusted close) for given symbols from."""
    
    if add_ref and 'SPY' not in symbols:  # add SPY for reference, if absent
        symbols.insert(0, 'SPY')

    df = web.DataReader(symbols, 
                        data_source=data_source,
                        start=start, 
                        end=end)
    
    return df[price,:,:]


get_data(symbols=['GOOG','SPY']).tail()


Out[14]:
GOOG SPY
Date
2016-04-11 736.099976 202.964506
2016-04-12 743.090027 204.854670
2016-04-13 751.719971 206.923911
2016-04-14 753.200012 206.933854
2016-04-15 759.000000 206.705048

Also, notice that it is not necessary to perform an initial join with the data range of interest filtering out non trading days, as padas does it for us , i.e.


In [15]:
df_stock = get_data(symbols=['GOOG','SPY'],start='1/21/1999',end='4/15/2016')
print(">> Trading days from pandas:"+str(df_stock.shape[0]))
dates = pd.date_range('1/21/1999', '4/15/2016')
df = pd.DataFrame(index=dates)
print(">> Calendar days:"+str(df.shape[0]))
df = df.join(df_stock)
print(">> After join:"+str(df.shape[0]))
df = df.dropna(subset=["SPY"])
print(">> After removing non trading days:"+str(df.shape[0]))


>> Trading days from pandas:4337
>> Calendar days:6295
>> After join:6295
>> After removing non trading days:4337

Plotting stock prices


In [16]:
ax = get_data(symbols=['GOOG','SPY','IBM','GLD'],start='1/21/1999', end='4/15/2016').plot(title="Stock Data", fontsize=9)
ax.set_xlabel("Date")
ax.set_ylabel("Price")
plt.show()


Imputing missing values

As clear from above plot, we need to handle missing values.


In [17]:
def fill_missing_values(df_data):
    """Fill missing values in data frame, in place."""
    df_data.fillna(method='ffill',inplace=True)
    df_data.fillna(method='backfill',inplace=True)
    return df_data

ax = fill_missing_values(get_data(symbols=['GOOG','SPY','IBM','GLD'],
                             start='1/21/1999', 
                             end='4/15/2016')).plot(title="Stock Data", fontsize=9)
ax.set_xlabel("Date")
ax.set_ylabel("Price")
plt.show()


Normalizing prices


In [18]:
def normalize_data(df):
    return df/df.ix[0,:]

ax = normalize_data(
    fill_missing_values(
        get_data(
            symbols=['GOOG','SPY','IBM','GLD'],
                             start='1/21/1999', 
                             end='4/15/2016'))).plot(title="Stock Data", fontsize=9)
ax.set_xlabel("Date")
ax.set_ylabel("Normalized price")
plt.show()


Rolling statistics

  • Notice that pandas.rolling_mean has been deprecated for DataFrame and will be removed in a future version. Hence, we will replace it with DataFrame.rolling(center=False,window=20).mean()
  • Notice that pd.rolling_std has been deprecated for DataFrame and will be removed in a future version. Hence, we will replace it with DataFrame.rolling(center=False,window=20).std()

In [19]:
df = fill_missing_values(
        get_data(
            symbols=['GOOG','SPY','IBM','GLD'],
                             start='4/21/2015', 
                             end='7/15/2016'))

# 1. Computing rolling mean using a 20-day window 
rm_df = pd.DataFrame.rolling(df, window=20).mean()

ax = rm_df.plot(title="Rolling Mean")
ax.set_xlabel("Date")
ax.set_ylabel("Price")
plt.show()

# 2. Computing rolling standard deviation using a 20-day window 
rstd_df = pd.DataFrame.rolling(df, window=20).std()

ax = rstd_df.plot(title="Rolling Standard Deviation")
ax.set_xlabel("Date")
ax.set_ylabel("Price")
plt.show()


# 3. Compute upper and lower bands
def get_bollinger_bands(rm, rstd):
    """Return upper and lower Bollinger Bands."""
    upper_band, lower_band = rm + 2 * rstd, rm - 2 * rstd
    return upper_band, lower_band

upper_band, lower_band = get_bollinger_bands(rm_df, rstd_df)

# Plot raw SPY values, rolling mean and Bollinger Bands
ax = df['SPY'].plot(title="Bollinger Bands",label='SPY')
rm_df['SPY'].plot(label='Rolling mean', ax=ax)
upper_band['SPY'].plot(label='upper band', ax=ax)
lower_band['SPY'].plot(label='lower band', ax=ax)

# Add axis labels and legend
ax.set_xlabel("Date")
ax.set_ylabel("Price")
ax.legend(loc='lower left')
plt.show()


Daily returns

There are two ways to compute the daily return of a stock with pandas. We check they produce same results and plot them.


In [20]:
def compute_daily_returns_2(df):
    """Compute and return the daily return values."""
    # Note: Returned DataFrame must have the same number of rows
    daily_returns = df.copy()
    daily_returns[1:] = (df[1:]/df[:-1].values) - 1
    daily_returns.ix[0,:] = 0 
    return daily_returns

def compute_daily_returns(df):
    """Compute and return the daily return values."""
    # Note: Returned DataFrame must have the same number of rows
    daily_returns = (df / df.shift(1)) - 1 
    daily_returns.ix[0,:] = 0 
    return daily_returns

pd.util.testing.assert_frame_equal(
    compute_daily_returns(
        fill_missing_values(
            get_data(
                symbols=['GOOG','SPY','IBM','GLD'],
                start='4/21/2016', 
                end='7/15/2016'))), 
    compute_daily_returns_2(
        fill_missing_values(
            get_data(
                symbols=['GOOG','SPY','IBM','GLD'],
                start='4/21/2016', 
                end='7/15/2016'))))

ax = compute_daily_returns(fill_missing_values(
        get_data(
            symbols=['GOOG','SPY','IBM','GLD'],
                             start='4/21/2016', 
                             end='7/15/2016'))).plot(title="Daily returns")

ax.set_xlabel("Date")
ax.set_ylabel("Daily return")
plt.show()



In [106]:
df = compute_daily_returns(fill_missing_values(get_data(
            symbols=['SPY'], 
            start='4/21/2000', 
            end='7/15/2016')))

plt.hist(df['SPY'],bins=30,color='c',label=['Daily return'])
plt.axvline(df['SPY'].mean(), color='b', linestyle='dashed', linewidth=2 , label='Mean')
plt.axvline(-df['SPY'].std(), color='r', linestyle='dashed', linewidth=2 , label='Std')
plt.axvline(df['SPY'].std(), color='r', linestyle='dashed', linewidth=2 )
plt.title('SPY daily return distribution')
plt.xlabel('Daily return')
plt.grid(True)
plt.legend()

plt.show()


Cumulative returns


In [107]:
def cumulative_returns(df):
    return df/df.ix[0,:] - 1 

ax = cumulative_returns(fill_missing_values(
        get_data(
            symbols=['GOOG','SPY','IBM','GLD'],
                             start='4/21/2016', 
                             end='7/15/2016'))).plot(title="Cumulative returns")

ax.set_xlabel("Date")
ax.set_ylabel("Cumulative return")
plt.show()


Sharpe Ratio

Sharpe ratio is a way to examine the performance of an investment by adjusting for its risk. The ratio measures the excess return (or risk premium) per unit of deviation in an investment asset or a trading strategy, typically referred to as risk (and is a deviation risk measure), named after William F. Sharpe.

The ex-ante Sharpe ratio is defined as: $$S=\frac{E\{R_p-R_f\}}{Std\{R_p-R_f\}}$$

where $R_p$ is the asset return, $R_f$ is the return on a benchmark asset, such as the risk free rate of return. $E\{R_p-R_f\}$ is the expected value of the excess of the asset return over the benchmark return, and $Std\{R_p-R_f\}$ is the standard deviation of the asset return.

The ex-post Sharpe ratio uses the same equation as the one above but with realized returns of the asset and benchmark rather than expected returns.

Examples of risk free rates are

  • LIBOR
  • Interest rate of 3 month treasury bill
  • 0% (approximation used a lot in last years)

Using daily returns as $R_p$ the ex-post Sharpe ratio is computable as: $$S=\frac{mean\{R_p^{daily}-R_f^{daily}\}}{Std\{R_p^{daily}\}}$$

as $R_f^{daily}$ is typically costant for several months it is possible to remove from the denominator. Also, $R_f^{daily}$ is typically approximable with 0% but it is possible to compute given $R_f^{yearly}$ as it follows remembering there are 252 trading days in a year

$$R_f^{daily}=\sqrt[252]{1+R_f^{yearly}}$$

Sharpe ratio is typically an annual measure. This means if we are using different sample frequencies need to apply the following formula: $$S^{annual}=\sqrt[2]{SPY} \times S$$

where $SPR$ is the number of samples per year considered in computing $S$, e.g.

  • $S^{annual}=\sqrt[2]{252} \times S^{daily}$
  • $S^{annual}=\sqrt[2]{52} \times S^{weekly}$
  • $S^{annual}=\sqrt[2]{12} \times S^{monthly}$

In [135]:
def sharpe_ratio(df,sample_freq='d',risk_free_rate=0.0):
    
    sr = (df - risk_free_rate).mean() / df.std()
    
    if sample_freq == 'd':
        sr = sr * np.sqrt(252)
    elif sample_freq == 'w':
        sr = sr * np.sqrt(52)
    elif sample_freq == 'm':
        sr = sr * np.sqrt(12)
    else:
        raise Exception('unkown sample frequency :'+str(sample_freq))
        
    return sr

In [143]:
# Sharpe ratio
sharpe_ratio(
    compute_daily_returns(
        fill_missing_values(
            get_data(
                symbols=['GOOG','SPY','IBM','GLD'],
                             start='4/21/2015', 
                             end='7/15/2016'))))


Out[143]:
GLD     0.559034
GOOG    0.978098
IBM     0.174419
SPY     0.358931
dtype: float64

Summary

For evaluating the performance of a portfolio key factors to focus on are

  1. Cumulative return
  2. Average daily return
  3. Rsk (Standard deviation of daily return)
  4. Sharpe ratio

In [162]:
df = fill_missing_values(get_data(symbols=['GOOG','SPY','IBM','GLD'],
                             start='4/21/2015', 
                             end='7/15/2016'))

In [163]:
# 1. Cumulative return
cumulative_returns(df).ix[-1,:]


Out[163]:
GLD     0.099324
GOOG    0.348103
IBM     0.017407
SPY     0.057167
Name: 2016-07-15 00:00:00, dtype: float64

In [164]:
# 2. Average daily return
compute_daily_returns(df).mean()


Out[164]:
GLD     0.000352
GOOG    0.001113
IBM     0.000155
SPY     0.000229
dtype: float64

In [165]:
# 3. Rsk (Standard deviation of daily return)
compute_daily_returns(df).std()


Out[165]:
GLD     0.010003
GOOG    0.018065
IBM     0.014062
SPY     0.010117
dtype: float64

In [166]:
# 4. Sharpe ratio
sharpe_ratio(compute_daily_returns(df))


Out[166]:
GLD     0.559034
GOOG    0.978098
IBM     0.174419
SPY     0.358931
dtype: float64