I recently wrote an introductory post on simulating short rates in the Hull-White model. A question on the QuantLib forum raised some interesting points about the convergence of Hull-White model simulations. In this post, I discuss the convergence of Monte-Carlo simulations using the Hull-White model.
The Hull-White Short Rate Model is defined as:
$$ dr_t = (\theta(t) - a r_t)\,dt + \sigma\, dW_t, $$
where $a$ and $\sigma$ are constants, and $\theta(t)$ is chosen in order to fit the input term structure of interest rates. Here we use QuantLib to show how to simulate the Hull-White model and investigate some of its properties.
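Before turning to QuantLib, it may help to see the dynamics in plain NumPy. The sketch below is a minimal Euler-Maruyama discretization of the SDE above with a constant $\theta$ (so the rate simply mean-reverts to $\theta/a$); the parameter values are illustrative only and are not taken from the QuantLib setup that follows.

```python
import numpy as np

def simulate_hull_white_euler(r0, a, sigma, theta, T, timestep, num_paths, seed=42):
    """Euler-Maruyama discretization of dr = (theta(t) - a*r) dt + sigma dW.

    theta is a callable so that any term-structure fit could be plugged in;
    here we only use a constant, purely for illustration.
    """
    dt = T / timestep
    rng = np.random.default_rng(seed)
    rates = np.empty((num_paths, timestep + 1))
    rates[:, 0] = r0
    for j in range(timestep):
        t = j * dt
        dW = rng.standard_normal(num_paths) * np.sqrt(dt)
        rates[:, j + 1] = rates[:, j] + (theta(t) - a * rates[:, j]) * dt + sigma * dW
    return rates

# With constant theta = a*b, the short rate mean-reverts towards b = theta/a.
a, sigma = 0.1, 0.01
paths = simulate_hull_white_euler(r0=0.05, a=a, sigma=sigma,
                                  theta=lambda t: a * 0.05, T=15.0,
                                  timestep=180, num_paths=500)
print(paths.shape)  # (500, 181)
```

Starting the paths at $r_0 = \theta/a = 0.05$ keeps the sample mean pinned near 0.05 for all maturities, which makes the effect of $\sigma$ alone easy to see.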
The variables used in this post are described below:
- `timestep`: the number of steps used to discretize the time grid
- `hw_process`: the object that defines the Hull-White process
- `length`: the time span of the simulation in years
- `low_discrepancy`: a boolean that chooses Sobol low-discrepancy random sequences or not
- `brownian_bridge`: a boolean that chooses a Brownian bridge for path generation
- `num_paths`: the number of paths in the simulation
- `a`: the constant parameter $a$ in the Hull-White model
- `sigma`: the constant parameter $\sigma$ in the Hull-White model that describes volatility
In [1]:
# <!-- collapse=True -->
import QuantLib as ql
import matplotlib.pyplot as plt
import numpy as np
from scipy.integrate import simps, cumtrapz, romb
%matplotlib inline
import math
todays_date = ql.Date(15, 1, 2015)
ql.Settings.instance().evaluationDate = todays_date
The get_path_generator
function creates a path generator. It takes as inputs the timestep, the Hull-White process object, the length of the simulation, and flags choosing low-discrepancy sequences and the Brownian bridge.
In [2]:
# <!-- collapse=True -->
def get_path_generator(timestep, hw_process, length,
                       low_discrepancy=False, brownian_bridge=True):
    """
    Returns a path generator
    """
    if low_discrepancy:
        usg = ql.UniformLowDiscrepancySequenceGenerator(timestep)
        rng = ql.GaussianLowDiscrepancySequenceGenerator(usg)
        seq = ql.GaussianSobolPathGenerator(
            hw_process, length, timestep, rng, brownian_bridge)
    else:
        usg = ql.UniformRandomSequenceGenerator(timestep, ql.UniformRandomGenerator())
        rng = ql.GaussianRandomSequenceGenerator(usg)
        seq = ql.GaussianPathGenerator(
            hw_process, length, timestep, rng, brownian_bridge)
    return seq
The generate_paths
function uses the generic path generator produced by the get_path_generator
function to return a tuple of the array of points in the time grid and a matrix of the generated short rates.
In [3]:
# <!-- collapse=True -->
def generate_paths(num_paths, timestep, seq):
    arr = np.zeros((num_paths, timestep+1))
    for i in range(num_paths):
        sample_path = seq.next()
        path = sample_path.value()
        time = [path.time(j) for j in range(len(path))]
        value = [path[j] for j in range(len(path))]
        arr[i, :] = np.array(value)
    return np.array(time), arr
The generate_paths_zero_price
function is essentially a wrapper around get_path_generator
and generate_paths
, taking all the required raw inputs. It returns the average of the zero prices from all the paths at different points in time. I wrote it this way so that I can conveniently change all the required inputs and easily plot the results.
In [4]:
# <!-- collapse=True -->
def generate_paths_zero_price(spot_curve_handle, a, sigma, timestep, length,
                              num_paths, avg_grid_array, low_discrepancy=False,
                              brownian_bridge=True):
    """
    This function returns a tuple (T_array, F_array), where T_array is the array
    of points in the time grid, and F_array is the array of the average of zero
    prices observed from the simulation.
    """
    hw_process = ql.HullWhiteProcess(spot_curve_handle, a, sigma)
    seq = get_path_generator(
        timestep, hw_process, length, low_discrepancy, brownian_bridge
    )
    time, paths = generate_paths(num_paths, timestep, seq)
    # integrate the short rate up to (and including) grid point j
    avgs = [(time[j], np.mean([math.exp(-simps(paths[i][0:j+1], time[0:j+1]))
                               for i in range(num_paths)]))
            for j in avg_grid_array
            ]
    return zip(*avgs)


def generate_paths_discount_factors(spot_curve_handle, a, sigma, timestep, length,
                                    num_paths, avg_grid_array, low_discrepancy=False,
                                    brownian_bridge=True):
    """
    This function returns a tuple (T_array, S_matrix), where T_array is the array
    of points in the time grid, and S_matrix is the matrix of the discount
    factors for each path at the different points in the time grid.
    """
    hw_process = ql.HullWhiteProcess(spot_curve_handle, a, sigma)
    seq = get_path_generator(
        timestep, hw_process, length, low_discrepancy, brownian_bridge
    )
    time, paths = generate_paths(num_paths, timestep, seq)
    arr = np.zeros((num_paths, len(avg_grid_array)))
    for i in range(num_paths):
        arr[i, :] = [np.exp(-simps(paths[i][0:j+1], time[0:j+1]))
                     for j in avg_grid_array]
    t_array = [time[j] for j in avg_grid_array]
    return t_array, arr


def V(t, T, a, sigma):
    """ Variance of the integral of short rates, used below"""
    return sigma*sigma/a/a*(T-t + 2.0/a*math.exp(-a*(T-t)) -
                            1.0/(2.0*a)*math.exp(-2.0*a*(T-t)) - 3.0/(2.0*a))
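As a quick sanity check of the discounting step used in the functions above: for a flat short-rate path the Simpson's-rule integral must reproduce the exact discount factor. The snippet below is self-contained (it does not need QuantLib) and guards against newer SciPy versions that renamed `simps` to `simpson`.

```python
import numpy as np
try:
    from scipy.integrate import simps
except ImportError:  # newer SciPy versions renamed simps to simpson
    from scipy.integrate import simpson as simps

# For a flat short-rate path r(u) = 0.05, exp(-simps(r, t)) must reproduce
# exp(-0.05*T): the integrand is constant, so the quadrature is exact.
T = 15.0
time = np.linspace(0.0, T, 181)
flat_path = np.full_like(time, 0.05)
df = np.exp(-simps(flat_path, time))
print(df, np.exp(-0.05 * T))  # both 0.4723665...
```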
In order to understand the convergence of Monte-Carlo simulations for the Hull-White model, let us compare the market discount factor,
$$ P^M(t, T) = \exp\left(-\int_t^T f^{M}(t,u)\, du\right), $$
with the expectation of the discount factors from the sample of Monte-Carlo paths,
$$ P^{MC}(t,T) = E_t\left\{ e^{-\int_t^T r^{MC}(u)\, du} \right\}. $$
Here $f^{M}(t, T)$ is the instantaneous forward rate implied by the market, and $r^{MC}(s)$ is the instantaneous short rate from the Monte-Carlo simulations. The error in the Monte-Carlo simulation can be defined as
$$ \epsilon(T) = P^M(0,T) - P^{MC}(0,T). $$
As a first step, let us look at the plots of $\epsilon(T)$ for different values of $a$ and $\sigma$.
In [5]:
# <!-- collapse=True -->
# Here we vary sigma with fixed a and observe the error epsilon
# define constants
num_paths = 500
sigma_array = np.arange(0.01, 0.1, 0.03)
a = 0.1
timestep = 180
length = 15  # in years
forward_rate = 0.05
day_count = ql.Thirty360()
avg_grid_array = np.arange(12, timestep+1, 12)
# generate spot curve
spot_curve = ql.FlatForward(
    todays_date,
    ql.QuoteHandle(ql.SimpleQuote(forward_rate)), day_count
)
spot_curve_handle = ql.YieldTermStructureHandle(spot_curve)
# initialize plots
figure, axis = plt.subplots()
plots = []
zero_price_theory = np.array([spot_curve.discount(j*float(length)/float(timestep))
                              for j in avg_grid_array])
for sigma in sigma_array:
    term, zero_price_empirical = generate_paths_zero_price(
        spot_curve_handle, a, sigma, timestep, length, num_paths,
        avg_grid_array
    )
    plots += axis.plot(
        term, np.abs(zero_price_theory - np.array(zero_price_empirical)),
        lw=2, alpha=0.6,
        label="$\sigma=$"+str(sigma)
    )
# plot legend
labels = [p.get_label() for p in plots]
legend = axis.legend(plots, labels, loc=0)
axis.set_xlabel("T (years)", size=12)
axis.set_ylabel("|$\epsilon(T)$|", size=12)
axis.set_title("Discount Factor Error for $a=$%0.2f and Varying $\sigma$" % a, size=14)
plt.show()
The above plot illustrates that for $\sigma=0.01$ the Monte-Carlo simulation converges well, and that the convergence gradually deteriorates as $\sigma$ increases towards $a$.
In [6]:
# <!-- collapse=True -->
# Here we vary a with fixed sigma and observe the error epsilon
# define constants
num_paths = 500
sigma = 0.1
a_array = np.arange(0.1, 0.51, 0.1)
timestep = 180
length = 15  # in years
forward_rate = 0.05
day_count = ql.Thirty360()
avg_grid_array = np.arange(12, timestep+1, 12)
# generate spot curve
spot_curve = ql.FlatForward(
    todays_date,
    ql.QuoteHandle(ql.SimpleQuote(forward_rate)), day_count
)
spot_curve_handle = ql.YieldTermStructureHandle(spot_curve)
# initialize plots
figure, axis = plt.subplots()
plots = []
zero_price_theory = np.array([spot_curve.discount(j*float(length)/float(timestep))
                              for j in avg_grid_array])
for a in a_array:
    term, zero_price_empirical = generate_paths_zero_price(
        spot_curve_handle, a, sigma, timestep, length, num_paths,
        avg_grid_array
    )
    plots += axis.plot(
        term, np.abs(zero_price_theory - np.array(zero_price_empirical)),
        lw=2, alpha=0.6,
        label="a="+str(a)
    )
# plot legend
labels = [p.get_label() for p in plots]
legend = axis.legend(plots, labels, loc=0)
axis.set_xlabel("T (years)", size=12)
axis.set_ylabel("|$\epsilon(T)$|", size=12)
axis.set_title("Discount Factor Error for $\sigma$=%0.2f and Varying $a$" % sigma, size=14)
plt.show()
The above plot illustrates that for $a=0.1$ the convergence of the Monte-Carlo simulation is poor, and that it gradually improves as $a$ grows larger than $\sigma$.
From the plots above, we observe that convergence is good when the ratio $\sigma/a < 1$, and deteriorates as the ratio $\sigma/a$ increases above unity. Now let us try to put this observation on a theoretical footing using the Hull-White model.
The Monte-Carlo approach estimates the market discount factor as the expectation of the discount factors from each Monte-Carlo path. If the distribution of discount factors has a standard deviation $\sigma_D$, then the error in our estimate of $P^{MC}(t,T)$ using $N$ paths will be of the order of $$\epsilon(t,T) \approx \frac{\sigma_D}{\sqrt{N}}. $$
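The $1/\sqrt{N}$ scaling is easy to demonstrate numerically. The sketch below uses plain lognormal samples as a stand-in for discount factors (the parameters `mu` and `s` are arbitrary illustrative values): multiplying the sample size by 100 should cut the RMS error by roughly a factor of 10.

```python
import numpy as np

# Estimate E[Y] for a lognormal Y = exp(X) with increasing sample sizes;
# the RMS error over repeated trials should shrink roughly as 1/sqrt(N).
rng = np.random.default_rng(0)
mu, s = -0.75, 0.2
true_mean = np.exp(mu + s * s / 2.0)

def rms_error(N, trials=500):
    estimates = np.exp(rng.normal(mu, s, size=(trials, N))).mean(axis=1)
    return np.sqrt(np.mean((estimates - true_mean) ** 2))

ratio = rms_error(100) / rms_error(10000)
print(ratio)  # close to sqrt(10000/100) = 10
```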
In other words, two factors are at play in our Monte-Carlo estimate: the number of Monte-Carlo paths $N$ and the standard deviation $\sigma_D$ of the distribution of discount factors. Using more Monte-Carlo paths leads to improved convergence, but at the same time $\sigma_D$ has to be relatively small for us to get a good estimate.
The integral of the short rates can be shown to be normally distributed (see Brigo and Mercurio, Interest Rate Models: Theory and Practice, 2nd edition, p. 75), and is given as $$ \int_t^T r(u)\, du \,\Big|\, \mathcal{F}_t \sim \mathcal{N}\left(B(t,T)[r(t)-\alpha(t)] + \ln\frac{P^M(0,t)}{P^M(0,T)} + \frac{1}{2}[V(0,T) - V(0,t)],\ V(t,T)\right)$$
where, \begin{eqnarray} B(t,T) &=& \frac{1}{a} \left[ 1 - e^{-a(T-t)}\right] \\ V(t,T) &=& \frac{\sigma^2}{a^2}\left[ T - t + \frac{2}{a}e^{-a(T-t)} - \frac{1}{2a}e^{-2a(T-t)} - \frac{3}{2a}\right] \end{eqnarray}
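The closed form for $V(t,T)$ can be cross-checked numerically: the variance of the integrated short rate is $\sigma^2 \int_t^T B(u,T)^2\, du$, which a simple quadrature reproduces. The sketch below restates $B$ and $V$ from the formulas above and compares against a midpoint-rule integral (parameter values are illustrative).

```python
import math
import numpy as np

def B(t, T, a):
    # B(t,T) = (1 - exp(-a (T-t))) / a
    return (1.0 - math.exp(-a * (T - t))) / a

def V(t, T, a, sigma):
    # closed form for the variance of the integrated short rate
    return sigma * sigma / a / a * (T - t + 2.0 / a * math.exp(-a * (T - t))
                                    - 1.0 / (2.0 * a) * math.exp(-2.0 * a * (T - t))
                                    - 3.0 / (2.0 * a))

# Cross-check: Var(int_t^T r(u) du) = sigma^2 * int_t^T B(u,T)^2 du,
# evaluated here with a midpoint-rule quadrature.
a, sigma, t, T = 0.1, 0.02, 0.0, 15.0
n = 200000
h = (T - t) / n
u = t + h * (np.arange(n) + 0.5)
quad = sigma**2 * np.sum(((1.0 - np.exp(-a * (T - u))) / a) ** 2) * h
print(quad, V(t, T, a, sigma))  # both near 0.1685
```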
Based on this result, the discount factor from the Monte-Carlo simulation of short rates,
$$ P^{MC}(t, T) = \exp\left(- \int_t^T r(u)\, du \,\Big|\, \mathcal{F}_t \right), $$
will have a log-normal distribution with standard deviation
$$ \sigma_D(t,T) = P^M(t,T)\sqrt{e^{V(t,T)} - 1}. $$
This follows from the fact that if $X$ is a normally distributed random variable with mean $\mu$ and standard deviation $\sigma$, then the log-normally distributed $Y=e^X$ satisfies: \begin{eqnarray} E(Y) &=& e^{\mu+\sigma^2/2}\\ Var(Y) &=& (e^{\sigma^2} -1 )E(Y)^2 \end{eqnarray}
In [7]:
# <!-- collapse=True -->
#define constants
num_paths = 500
sigma = 0.02
a = 0.1
timestep = 180
length = 15 # in years
forward_rate = 0.05
day_count = ql.Thirty360()
avg_grid_array = np.arange(1, timestep+1, 12)
# generate spot curve
spot_curve = ql.FlatForward(
todays_date,
ql.QuoteHandle(ql.SimpleQuote(forward_rate)), day_count
)
spot_curve_handle = ql.YieldTermStructureHandle(spot_curve)
term, discount_factor_matrix = generate_paths_discount_factors(
spot_curve_handle, a, sigma, timestep, length, num_paths,
avg_grid_array
)
vol = [np.var(discount_factor_matrix[:, i]) for i in range(len(term))]
l1 = plt.plot(term, 100*np.sqrt(vol), "b", lw=2, alpha=0.6, label="Empirical")
vol_theory = [100*np.sqrt(math.exp(V(0, T, a, sigma))-1.0) *
              spot_curve_handle.discount(T) for T in term]
l2 = plt.plot(term, vol_theory, "r--", lw=2, alpha=0.6, label="Theory")
plots = l1 + l2
labels = [p.get_label() for p in plots]
legend = plt.legend(plots, labels, loc=0)
plt.xlabel("Time (Years)", size=12)
plt.ylabel("$\sigma_D(0,T)$ (%)", size=12)
plt.title("Standard Deviation of Discount Factors "
          "(a=%0.2f, $\sigma$=%0.2f)" % (a, sigma), size=14)
plt.show()
The plot above compares the standard deviation of the discount factors $\sigma_D$ from the closed-form expression with the Monte-Carlo estimate. The empirical estimate is in agreement with the theoretical expectation. We can estimate the value of $\sigma_D$ in the asymptotic limit $T\rightarrow\infty$:
$$ \sigma_D(0, T) \approx P^M(0,T)\, e^{f^M(0,T)/a-\sigma^2/(4a^3)}\sqrt{e^{\sigma^2 T/a^2} - 1}. $$
The exponential term, $e^{\sigma^2 T/a^2}$, can become very large when $\sigma^2 T/a^2$ grows above 1. Thus we can expect good convergence as long as $\sigma^2 T/a^2$ remains small or close to zero for the maturities $T$ of interest.
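Evaluating this criterion for the parameter regimes plotted earlier (at the longest maturity, $T = 15$ years) ties the theory back to the plots: the well-converging case $(\sigma, a) = (0.01, 0.1)$ gives a small ratio, the poorly converging case $(0.1, 0.1)$ gives a very large one, and $(0.1, 0.5)$ sits comfortably below 1.

```python
# The convergence criterion sigma^2 * T / a^2 for parameter pairs used in
# the plots above, at the longest maturity T = 15 years.
T = 15.0
for a, sigma in [(0.1, 0.01), (0.1, 0.1), (0.5, 0.1)]:
    print("a=%0.2f sigma=%0.2f ratio=%0.2f" % (a, sigma, sigma**2 * T / a**2))
# ratios: 0.15, 15.00, 0.60
```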
The above result suggests that if the parameters $\sigma$ and $a$ are not chosen carefully (i.e., if $\sigma/a > 1$), then the convergence of the simulation will be poor and the results untrustworthy.