This session delves more deeply into the NumPy and SciPy libraries. NumPy itself is mainly the ndarray object, which has many features; we will cover its indexing and methods. SciPy builds on NumPy by adding a host of numerical algorithms. To make the session more fun, we will use NumPy and SciPy to solve an applied problem: a finite element solution of a simple RBC model. If time permits, we will then try to vectorize this problem to get some speed-up.
In economics we are often faced with large, non-linear problems that we would like to solve numerically. We do this in one of two ways: locally or globally. A local approximation is the simplest method and is taught in most first-year graduate macro courses in the form of linearizations; it can also be accomplished through perturbation methods. However, we are sometimes interested in the solution far from the steady state. In this case, we look for global solutions. Most graduate students are familiar with value- or policy-function iteration, which solves for a policy function over the entire state space. Another global method is called "Finite Elements", a member of the family of "Projection Methods".
Projection methods start from the assumption of a parameterized policy function. This reduces the problem from an infinite-dimensional one to a finite-dimensional one. The name Finite Elements comes from the idea that, given this approximation, the solution will be inexact; instead, we require a Galerkin weighted-residual condition to hold exactly, which we can write as an integral equation. We then break the state space into "elements", over which we integrate. In this way, we break our problem down into a sequence of smaller problems.
We'll work with the simplest example from McGrattan, 1993, the first economics paper on Finite Elements. Consider an economy with a representative agent who must choose a capital stock each period. Productivity follows a finite-state Markov chain, capital depreciates at rate $\delta$, and preferences are CRRA with risk-aversion parameter $\gamma$. The agent's problem can be written as
$$ \begin{equation*} \begin{aligned} & \underset{\{ c_t,k_{t+1} \}_{t=0}^\infty}{\text{max}} & & \mathbb{E} \sum_{t=0}^{\infty} \beta^t \frac{c_t^{1-\gamma}}{1-\gamma} \\ & \text{s.t.} & & c_t + i_t = y_t = a_tk_t^\alpha \\ & & & k_{t+1} = (1 - \delta)k_t + i_t \\ & & & a_t \in \{a_1, \dots , a_I\} \\ & & & \mathbb{P}[a_{t+1} = a_j | a_t = a_i] = \Pi_{i,j} \end{aligned} \end{equation*} $$We can solve this problem to arrive at the Euler equation and law of motion for capital (the steps are left to the reader):
$$ \begin{align*} c_t^{-\gamma} &= \beta \sum_{j=1}^{I} \Pi_{ij} \left[ c_{t+1}^{-\gamma} \left(1 - \delta + \alpha a_jk_{t+1}^{\alpha - 1} \right) \right]\\ k_{t+1} &= a_ik_t^\alpha +(1 - \delta)k_t - c_t \end{align*} $$for each current state $i$, where the sum over $j$ runs over next period's possible states.
At this point we cannot solve this problem exactly, as it is highly non-linear. To simplify it, we can assume that the capital decision rule can be approximated by a parameterized function:
$$ \begin{align*} \hat{k}_{t+1} &= \sum_{l = 1}^{L} N_l(k_t)\kappa_l \end{align*} $$where the $\kappa_l$ are constants to be determined. (In practice we will let the coefficients depend on the current productivity state $i$ as well, writing $\kappa_l^i$.)
We don't need to make many assumptions about the functions $N_l(\cdot)$, other than that they form a complete basis for the state space. The easiest and most natural choice is a polynomial basis, e.g. Chebyshev polynomials. However, those functions are not very transparent to the first-time reader, so we will use linear basis functions:
\begin{align*} N_l(k) &= \left\{ \begin{array}{lll} \frac{k - k_{l-1}}{k_l - k_{l-1}} & k_{l-1}\leq k \leq k_l \\ \frac{k_{l+1} - k}{k_{l+1} - k_l} & k_{l}\leq k \leq k_{l + 1} \\ 0 & \text{else} \end{array} \right. \end{align*}This implies that capital can be approximated as a linear interpolant between two constants:
\begin{align*} \hat{k}(k) &= \frac{k_{l+1} - k}{k_{l+1} - k_l}\kappa_l^i + \frac{k - k_{l}}{k_{l + 1} - k_{l}}\kappa_{l+1}^i & k\in[k_l, k_{l+1}] \end{align*}Given this, we can calculate $c_t$, $c_{t+1}$, and $\hat{k}_{t+1}$ conditional on a value for $\kappa$. However, our Euler equation is no longer exact. But we can write the Galerkin weak-form integral equation (where $R$ is the Euler residual):
\begin{align*} \sum_{i = 1}^{I}\int_{k_1}^{k_L}w(k,i)R(k,i;\kappa)dk =0 \end{align*}We define the weighting function to be the same linear interpolator: \begin{align*} \sum_{i = 1}^{I} \sum_{a = 1}^{L} w_a^i \left \{\sum_{l=1}^{L - 1} \int_{k_l}^{k_{l+1}}N_a(k)R(k,i;\kappa)dk\right \} =0 \end{align*}If this holds for all values of $w$, then everything inside the braces must be zero. Finally, we arrive at the system of equations we need to solve:
\begin{align*} \sum_{l=1}^{L - 1} \int_{k_l}^{k_{l+1}}N_a(k)R(k,i;\kappa)dk = 0 & & \forall i, a \end{align*}There are a few things to notice about this system.
We'll approximate the integrals by Gaussian quadrature and use a built-in solver to find the solution.
Before we set out to solve our problem numerically, we are going to write down EXACTLY what we need to solve. This will allow us to avoid a lot of head scratching along the way. Then we'll try to speed up our code by using NumPy's vectorized functions.
First of all, notice that for a given $a$, $N_a(\cdot)$ is zero on all but two elements, so almost all of the terms vanish. Thus, we have the following system of equations:
\begin{align*} \sum_{l=a-1}^{a} \int_{k_l}^{k_{l+1}}N_a(k)R(k,i;\kappa)dk = 0 & & \forall i, a \end{align*}Next, recall that using Gaussian quadrature (which we can generate using a NumPy function) with $M$ quadrature points, we have
\begin{align*} \sum_{l=a-1}^{a} \sum_{m=1}^{M} N_a(k_l^m)R(k_l^m,i;\kappa)\omega^m = 0 & & \forall i, a \end{align*}This is a much simplified expression of what we need to solve. Our first shot at the solution will proceed as follows: we'll loop over all $i\in \{1,...,I\}$ and all $l\in \{1, ...,L-1\}$ and, on each element, evaluate the residual at each quadrature point and add its weighted contribution to the system of equations.
If that doesn't seem super clear, it will once we get to programming!
In [1]:
#To avoid problems in the future, import everything now
import numpy as np
import scipy.optimize
import scipy.linalg
import time
In [3]:
def mcgratten_weighted_residual(coeff):
"""
A looped function of the McGratten problem.
Variables:
coeff : ndarray; a vector of coefficients for the parameterized
policy function
Returns:
ndarray containing the residuals.
"""
#Unpack the arguments
alpha, delta, I, L, k, abcissas, M, a, PIE = args
coeff = coeff.reshape(I, L)
#Initialize the residuals
RES = np.zeros((I*L))
#Construct the vector of residual functions
for i in range(0, I):
for l in range(0, L - 1):
#Compute f at all of the quadrature points on the element
for m in range(0, M):
x = abcissas[m]
kt = scalup(x, k[l+1], k[l])
u = weights[m]
bs1 = 0.5*(1 - x)
bs2 = 0.5*(1 + x)
#Compute capital next period by projection
kp = coeff[i, l]*bs1 + coeff[i, l+1]*bs2
#Compute consumption
y = a[i]*kt**alpha
c = y + (1 - delta)*kt - kp
#Find the element for next periods capital stock
ii = 0
for h in range(0, L - 1):
if kp >= k[h]:
ii = h
k1p = k[ii]
k2p = k[ii + 1]
#Calculate next period capital policy
x = scaldwn(kp, k2p, k1p)
bs1p = 0.5*(1 - x)
bs2p = 0.5*(1 + x)
#Initialize summation Variables
sum1 = 0.0
#Compute the summation and derivative simultaneously
for h in range(0, I):
kpp = coeff[h, ii]*bs1p + coeff[h, ii+1]*bs2p
yp = a[h]*kp**alpha
cp = yp + (1 - delta)*kp - kpp
dudcp = cp**(-gamma)
dydkp = alpha*a[h]*kp**(alpha - 1)
sum1 += PIE[i, h]*dudcp*(dydkp + (1 - delta))
ind = l + i*L
#Generate the full residual entry: quadrature weight x residual x interpolator
res = (beta*sum1 - c**(-gamma))*u
RES[ind] += res*bs1
RES[ind+1] += res*bs2
return RES
There are several things you should immediately notice about the structure of this function.
First, it's a function! I defined the residual function so that I could pass it to a numerical optimizer.
Second, you'll notice that it is only a function of the coefficients. This is to simplify passing it to the numerical optimizer. Oftentimes we will have functions with additional arguments or keyword arguments; many SciPy optimizers can handle this, but some cannot. To avoid problems, I like to define a tuple object containing the values that do not change during optimization, such as preference parameters, discount rates, etc. (an alternative pattern is sketched just below).
Third, this is not very Pythonic. There is a looped structure that can be very slow. Our goal later will be to use vectors to speed this up, but for now we'll talk about what the function is doing.
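As an aside, another common pattern for fixing parameters, when you'd rather not rely on the global namespace, is functools.partial. Here is a minimal sketch with a made-up residual function (this is an alternative, not the approach used below):
In [ ]:
import functools

def toy_resid(x, alpha, beta):
    #A hypothetical residual depending on two fixed parameters
    return x**alpha - beta

#Bind the fixed parameters once; the result is a function of x alone,
#which any optimizer that expects a single-argument function can consume
f = functools.partial(toy_resid, alpha=2.0, beta=4.0)
print(f(np.array([1.0, 2.0])))  #[-3.  0.]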
Let's break the function into sections that we can study individually.
In the next few sections the code I present will not be stand-alone, but just repetition of parts of the above function for discussion.
In [ ]:
#Unpack the arguments
alpha, delta, I, L, k, abcissas, M, a, PIE = args
coeff = coeff.reshape(I, L)
#Initialize the residuals
RES = np.zeros((I*L))
The first thing we do is "unpack" the arguments. We need to define an args tuple in the global namespace that contains all of these parameters: alpha is the capital share, delta the depreciation rate, I the number of discrete states in the Markov chain, L the number of points in the partition of capital, k the partition itself, abcissas a set of Gaussian abscissas for approximating the integral, M the number of quadrature points per element, a the different values the Markov chain can take, and PIE the Markov transition matrix.
Next we initialize the vector that will hold our residuals. There are $IL$ equations in $IL$ unknowns.
In [ ]:
#Construct the vector of residual functions
for i in range(0, I):
for l in range(0, L - 1):
Next, we loop over all elements, calculating the residuals. There are $I(L-1)$ elements in our problem.
In [1]:
#Compute f at all of the quadrature points on the element
for m in range(0, M):
This isn't as unexpected as it seems. In order to approximate the integral over each element, we use Gaussian quadrature, which approximates an integral by a weighted sum of the integrand evaluated at well-chosen points (the rule is exact for polynomials up to a given degree). When you ask NumPy for quadrature points and weights, you specify the number of points and it returns a set of points on $[-1, 1]$ and corresponding weights. It is then your job to scale these to your state space and evaluate the function to be integrated at each point.
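Before moving on, here is a quick, self-contained sanity check of the quadrature rule itself (a minimal sketch, separate from the model): integrating $x^2$ on $[-1, 1]$, where the exact answer is $2/3$:
In [ ]:
#Integrate x**2 on [-1, 1] by Gauss-Legendre quadrature
pts, wts = np.polynomial.legendre.leggauss(5)
print(np.sum(wts*pts**2))  #approximately 0.6667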
To this end, we need to define some functions that will map our quadrature points into the corresponding state space and back. Here are those functions:
In [4]:
def scalup(points, upper, lower):
"""
A linear transformation for state variables: [-1,1]-> [k_low, k_high].
Variables:
points : ndarray; a vector of points on [-1,1]
upper : ndarray; a vector of upper bounds for the element in which
each point is to be projected.
lower : ndarray; a vector of lower bounds for the element in which
each point is to be projected.
Returns:
ndarray containing the transformed points
"""
return (points + 1.)*(upper - lower)/2 + lower
def scaldwn(points, upper, lower):
"""
A linear transformation for state variables: [k_low, k_high] -> [-1,1].
Variables:
points : ndarray; a vector of points on [k_low, k_high]
upper : ndarray; a vector of upper bounds for the element in which
each point is to be projected.
lower : ndarray; a vector of lower bounds for the element in which
each point is to be projected.
Returns:
ndarray containing the transformed points
"""
return 2*(points - lower)/(upper - lower) - 1
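Here is a quick round trip to see these two functions in action (a sketch with made-up element bounds):
In [ ]:
#Map the Gauss-Legendre nodes onto a made-up element [2.0, 3.0] and back
pts, wts = np.polynomial.legendre.leggauss(3)
scaled = scalup(pts, 3.0, 2.0)
print(scaled)                     #the nodes now live on [2.0, 3.0]
print(scaldwn(scaled, 3.0, 2.0))  #recovers the original nodes on [-1, 1]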
Now, we can loop through the $M$ quadrature points on each element to calculate the residual function.
In [2]:
x = abcissas[m]
kt = scalup(x, k[l+1], k[l])
u = weights[m]
bs1 = 0.5*(1 - x)
bs2 = 0.5*(1 + x)
Given that we are on element $(i, l)$, studying point $m$ in the list of quadrature points, we need to find our value of the state and calculate the linear interpolators.
These lines do just that: they take x from the list of abcissas, scale it to the state space on the current element, define the weight u, and calculate the interpolators.
In [ ]:
#Compute capital next period by projection
kp = coeff[i, l]*bs1 + coeff[i, l+1]*bs2
#Compute consumption
y = a[i]*kt**alpha
c = y + (1 - delta)*kt - kp
Now, given the state and the current solution for the coefficients, we can calculate the policy function for next period capital. Given this, we use the law of motion for capital and the production function to calculate consumption.
In [ ]:
#Find the element for next periods capital stock
ii = 0
for h in range(0, L - 1):
if kp >= k[h]:
ii = h
k1p = k[ii]
k2p = k[ii + 1]
In order to calculate the conditional expectation, we need to calculate the policy function for capital next period as well. That is, in order to find $c_{t+1}$, we need to calculate $k_{t+2}$. Given our assumption of linear interpolators, we then need to find the element on which to calculate next period's capital.
To do this, we loop through the lower bounds of each element and ask the question: "Is $k_{t+1}$ greater than this lower bound?" If so, we consider that to be the lower bound of the element for next period's policy function.
Once we are done, we then define the bounds for next period's element.
NOTE: The assumption of linear interpolators is what makes this step necessary. There are other bases, e.g. Chebyshev polynomials, which are not functions of the bounds of the element, so you do not need to do this look-up operation. In particular, in several dimensions this look-up could become quite a pain. However, linear interpolators are the most conceptually approachable, so they are the easiest to start with.
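As an aside, NumPy ships a routine that performs exactly this kind of look-up. Here is a sketch of how np.searchsorted could replace the loop above (an alternative, not what the function actually uses; points landing exactly on a node may map to either adjacent element, which is harmless here):
In [ ]:
#For a sorted grid k, searchsorted returns the insertion index, so
#subtracting one gives the index of the element's lower bound;
#np.clip keeps the index inside the grid when kp falls outside it.
#This also works unchanged when kp is an array of points.
ii = np.clip(np.searchsorted(k, kp) - 1, 0, L - 2)
k1p = k[ii]
k2p = k[ii + 1]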
In [ ]:
#Calculate next period capital policy
x = scaldwn(kp, k2p, k1p)
bs1p = 0.5*(1 - x)
bs2p = 0.5*(1 + x)
#Initialize summation Variables
sum1 = 0.0
#Compute the summation and derivative simultaneously
for h in range(0, I):
kpp = coeff[h, ii]*bs1p + coeff[h, ii+1]*bs2p
yp = a[h]*kp**alpha
cp = yp + (1 - delta)*kp - kpp
dudcp = cp**(-gamma)
dydkp = alpha*a[h]*kp**(alpha - 1)
sum1 += PIE[i, h]*dudcp*(dydkp + (1 - delta))
Given the value of $k_{t+1}$ and the bounds of the element, you can calculate the linear interpolator functions bs1p and bs2p.
Now, you need to calculate the capital policy, production, consumption, marginal utility, and marginal productivity of capital for all possible realizations of the state in $t+1$. To do this, loop over the indices $h = 0, ... , I-1$ and calculate all of these values. Then multiply by the probability of each outcome and add it to the sum.
After the loop you will have calculated the conditional expectation:
$$ \sum_{h=1}^{I} \Pi_{ih} \left[ c_{t+1}^{-\gamma} \left(1 - \delta + \alpha a_hk_{t+1}^{\alpha - 1} \right) \right] $$
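As a tiny numerical illustration of this expectation (with hypothetical two-state numbers, not the model's calibration):
In [ ]:
#Expected marginal return next period in a made-up two-state example
PIE_demo = np.array([[0.7, 0.3],
                     [0.4, 0.6]])
rhs_demo = np.array([1.02, 1.05])  #c'^(-gamma)*(1 - delta + alpha*a*k'^(alpha-1)) in each state
i_demo = 0                         #current state
print(sum(PIE_demo[i_demo, h]*rhs_demo[h] for h in range(2)))  #0.7*1.02 + 0.3*1.05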
In [ ]:
ind = l + i*L
#Generate the full residual entry: quadrature weight x residual x interpolator
res = (beta*sum1 - c**(-gamma))*u
RES[ind] += res*bs1
RES[ind+1] += res*bs2
Now that you have calculated the functional value at quadrature point $m$ on element $(i, l)$, you can multiply by the weight and the interpolator. Finally, add it to the residual vector. You'll notice that you are adding to two entries of the residual vector simultaneously. This is because each linear interpolator is non-zero on two adjacent elements.
Now that we've gone through how this code works, let's see it in action. First, we're going to define some functions to generate a steady state and a transition matrix (a seriously sloppy function, but be nice), and define some parameters. Then we'll calculate a partition over the state and some quadrature points. Next, we'll make an initial guess for the value of capital at the nodes (this is the same as the coefficient values) as a fraction of output. Finally, we package all of our arguments into a tuple and run a Jacobian-free Newton-Krylov solver (I like this one both for its speed and its verbosity).
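If you haven't used newton_krylov before, here is a minimal sketch of the call on a toy system (hypothetical, just to show the interface; the RBC setup below follows the same pattern):
In [ ]:
#Solve the toy system x**2 = [1, 4] from a rough starting guess
def toy_residual(x):
    return x**2 - np.array([1.0, 4.0])

print(scipy.optimize.newton_krylov(toy_residual, np.array([0.5, 3.0])))  #approximately [1. 2.]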
In [8]:
def steady(alpha, beta, delta):
"""
A function to compute the steady state of the simple RBC model.
Variables:
alpha : float; capital share
beta : float; time discount
delta : float; depreciation rate
"""
k_star = ((1 - (1 - delta)*beta)/(beta * alpha))**(1/(alpha - 1))
return k_star, k_star**alpha - delta*k_star
def explicit_transition(states):
"""
A function to generate an explicit transition matrix.
Variables:
states : int; number of finite states in the chain
"""
if states == 5:
PIE = np.array([[0.1, 0.1, 0.6, 0.1, 0.1],
[0.1, 0.1, 0.6, 0.1, 0.1],
[0.1, 0.1, 0.6, 0.1, 0.1],
[0.1, 0.1, 0.6, 0.1, 0.1],
[0.1, 0.1, 0.6, 0.1, 0.1]])
return PIE
# Step 1: Calibrate Parameters
gamma = 2.0
beta = 0.98
alpha = 0.3
delta = 0.9
# Step 2: Compute steady state
k_star, c_star = steady(alpha, beta, delta)
# Step 3: Generate the state space
#Define upper and lower bound of the region of interest around the steady state
kmin = 0.5*k_star
kmax = 1.5*k_star
#Define the finite states for the markov chain
I = 5
PIE = explicit_transition(I)
a = np.array([0.95, 0.975, 1.0, 1.025, 1.05])
#Generate a state space vector over k
L = 11
k = np.linspace(kmin, kmax, num=L)
# Step 4: Calculate quadrature weights and abcissas
#Specify number of points in the integral on each element [k_j, k_j+1]
M = 10
#Generate abscissas and weights
abcissas, weights = np.polynomial.legendre.leggauss(M)
# Step 5: Guess and solve
#Generate initial guess as one tenth of production at all nodes
coeff0 = np.outer(a, k**alpha)*0.1
#Pack up all of the arguments to pass to the solver
args = (alpha, delta, I, L, k, abcissas, M, a, PIE)
time0 = time.time()
res = scipy.optimize.newton_krylov(mcgratten_weighted_residual, coeff0,
method='lgmres', verbose=True, maxiter=1000,
line_search='armijo')
print(time.time() - time0)
In [9]:
print(res.shape)
In [34]:
def policy(a_p, k_p, coeff, k, a):
"""
A function to calculate the capital policy at given points.
Variables:
    a_p : ndarray; productivity values for each point (entries
          of the vector a)
    k_p : ndarray; capital values for each point on
          [k_low, k_high]
    coeff : ndarray; the matrix of coefficients
    k : ndarray; a vector containing the
        partition of the capital state.
    a : ndarray; a vector containing the
        partition of the productivity state.
Returns:
    ndarray containing the evaluated policy function
"""
#Find the elements for the points
ll = np.array([np.where(point >= np.append([0], k[:-1]))[0][-1]
               for point in k_p])
ll[ll > 0] -= 1
ii = np.array([np.where(point == a)[0][0] for point in a_p])
#Define element bounds for next period
k1p = k[ll]
k2p = k[ll + 1]
#Calculate the interpolators
x = scaldwn(k_p, k2p, k1p)
bs1 = 0.5*(1 - x)
bs2 = 0.5*(1 + x)
#Return the policy function at the given points
return coeff[ii, ll]*bs1 + coeff[ii, ll+1]*bs2
In [46]:
#Plot the policy function over the state space
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
fig = plt.figure()
ax = fig.add_subplot(projection="3d")
a_points, k_points = np.meshgrid(a, k)
kp = policy(np.reshape(a_points, (I*L)), np.reshape(k_points, (I*L)),
np.array(res), k, a)
kp = np.reshape(kp, (L, I))
ax.plot_surface(a_points, k_points, kp, alpha=0.3)
plt.show()
We can see that our policy function is essentially linear. We knew that, by definition, it would be piecewise linear. We can also increase the number of points in the partition to see whether the shape changes:
In [42]:
#Increase the number of points
L = 50
k = np.linspace(kmin, kmax, num=L)
coeff0 = np.outer(a, k**alpha)*0.1
#Pack up all of the arguments to pass to the solver
args = (alpha, delta, I, L, k, abcissas, M, a, PIE)
time0 = time.time()
res = scipy.optimize.newton_krylov(mcgratten_weighted_residual, coeff0,
method='lgmres', verbose=True, maxiter=1000,
line_search='armijo')
print(time.time() - time0)
We can see that we don't get much change. Re-run the earlier cell to get back the old parameters, and we'll move on to vectorizing this function.
Given our function, we now want to try to speed it up; as you can see from the previous runs, it was quite slow. Vectorizing it will be a good chance to see how much NumPy can increase the speed of your code.
The easiest way (for me) to vectorize something is to work around the loops. Each loop can be thought of as a set of stacked operations, which we can try to do in a single pass, avoiding the loop. If you work in this way, it is easiest to start from the inner-most loop and work out. That is, we have loops over i, l, m, and h. We'll start with the loop over h and work our way back.
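To fix ideas before we touch the model, here is a toy version of the pattern we'll apply over and over (made-up arrays, not from the problem):
In [ ]:
#A toy accumulation loop...
v = np.array([0.2, 0.3, 0.5])  #e.g. probabilities
w = np.array([1.0, 2.0, 3.0])  #e.g. outcomes
total = 0.0
for h in range(3):
    total += v[h]*w[h]
#...and its vectorized equivalent: a single dot product
print(total, np.dot(v, w))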
Here is the loop we would like to vectorize:
In [50]:
#Compute the summation and derivative simultaneously
for h in range(0, I):
kpp = coeff[h, ii]*bs1p + coeff[h, ii+1]*bs2p
yp = a[h]*kp**alpha
cp = yp + (1 - delta)*kp - kpp
dudcp = cp**(-gamma)
dydkp = alpha*a[h]*kp**(alpha - 1)
sum1 += PIE[i, h]*dudcp*(dydkp + (1 - delta))
First, we should ask the question, "What is this loop doing?" It is calculating the conditional expectation of returns on capital next period, discounted by marginal utility.
Given the markov chain randomness in this model, this lends itself very easily to the dot product. We can use NumPy's automatic array operations (which are natively element-wise) to calculate a vector of next period returns ($1 - \delta + \alpha a_ik_{t+1}^{\alpha - 1}$) and then take the dot product with the corresponding row from $\Pi$.
Here is the code; take a look and try to figure out what it is doing, and then we will discuss it:
In [52]:
kpp = coeff[:, ii]*bs1p + coeff[:, ii+1]*bs2p
yp = a*kp**alpha
cp = yp + (1 - delta)*kp*np.ones(I) - kpp
dudcp = cp**(-gamma)
dydkp = alpha*a*kp**(alpha-1)
sum1 = np.dot(PIE[i, :], dudcp*(dydkp + 1 - delta))
Got it? Here's how I think of what this is doing. First, notice the parts of the loop that do not change: bs1p, bs2p, and kp. You can treat these as scalars. So you can read the policy-function calculation, kpp, as taking a linear combination of the vectors of coefficients, weighted by bs1p and bs2p.
Then you calculate a vector of outputs, each entry corresponding to a different realization of $a$.
Next, calculate consumption, but notice that kp does not vary, so multiply it by a vector of ones. Note: NumPy's broadcasting rules could easily handle this, but I find it clearer to include the vector so those reading your code understand:
In [55]:
temp = 1.
temp2 = np.array([1, 2, 3])
print(temp2 + temp)
Finally, you simply calculate $\partial_c u(c_{t+1})$ and $\partial_k y(k_{t+1})$ as vectors and compute the expectation with a straightforward dot product.
Here's the full code for the modified function, which I'll call mcgratten_weighted_residual_vec_1:
In [57]:
def mcgratten_weighted_residual_vec_1(coeff):
"""
A partially vectorized function of the McGratten problem.
Variables:
coeff : ndarray; a vector of coefficients for the parameterized
policy function
Returns:
ndarray containing the residuals.
"""
#Unpack the arguments
alpha, delta, I, L, k, abcissas, M, a, PIE = args
coeff = coeff.reshape(I, L)
#Initialize the residuals
RES = np.zeros((I*L))
#Construct the vector of residual functions
for i in range(0, I):
for l in range(0, L - 1):
#Compute f at all of the quadrature points on the element
for m in range(0, M):
x = abcissas[m]
kt = scalup(x, k[l+1], k[l])
u = weights[m]
bs1 = 0.5*(1 - x)
bs2 = 0.5*(1 + x)
#Compute capital next period by projection
kp = coeff[i, l]*bs1 + coeff[i, l+1]*bs2
#Compute consumption
y = a[i]*kt**alpha
c = y + (1 - delta)*kt - kp
#Find the element for next periods capital stock
ii = 0
for h in range(0, L - 1):
if kp >= k[h]:
ii = h
k1p = k[ii]
k2p = k[ii + 1]
#Calculate next period capital policy
x = scaldwn(kp, k2p, k1p)
bs1p = 0.5*(1 - x)
bs2p = 0.5*(1 + x)
#Initialize summation Variables
sum1 = 0.0
#Compute the summation and derivative simultaneously
kpp = coeff[:, ii]*bs1p + coeff[:, ii+1]*bs2p
yp = a*kp**alpha
cp = yp + (1 - delta)*kp*np.ones(I) - kpp
dudcp = cp**(-gamma)
dydkp = alpha*a*kp**(alpha-1)
sum1 = np.dot(PIE[i, :], dudcp*(dydkp + 1 - delta))
ind = l + i*L
#Generate the full residual entry: quadrature weight x residual x interpolator
res = (beta*sum1 - c**(-gamma))*u
RES[ind] += res*bs1
RES[ind+1] += res*bs2
return RES
Let's run the two optimizations side by side and see the difference in speed, as well as take a norm of the difference in the answers to test for accuracy:
In [64]:
time0 = time.time()
res = scipy.optimize.newton_krylov(mcgratten_weighted_residual, coeff0,
method='lgmres', verbose=True, maxiter=1000,
line_search='armijo')
print(time.time() - time0)
time0 = time.time()
res1 = scipy.optimize.newton_krylov(mcgratten_weighted_residual_vec_1, coeff0,
method='lgmres', verbose=True,
maxiter=1000, line_search='armijo')
print(time.time() - time0)
print(np.linalg.norm(res - res1))
It's actually slower, by about 50%! Why would we ever want to do this!? What a world...?!
Ok, don't panic. The reason you get a slowdown here is that the vector dot product and broadcasting carry overhead that makes them slower for small arrays. Where you'll get a speedup is when the arrays become large. The only way to see that in this case would be to increase the number of discrete states, which is a heavy change. We'll have an easier time on further loops, and you'll see the benefits then.
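You can see this overhead for yourself with a quick (and entirely machine-dependent) timing sketch:
In [ ]:
#Compare a Python accumulation loop against np.dot for a small and a
#large vector; the exact numbers will depend on your machine
import timeit
for n in (5, 100000):
    v = np.random.rand(n)
    w = np.random.rand(n)
    t_loop = timeit.timeit(lambda: sum(v[j]*w[j] for j in range(n)), number=20)
    t_dot = timeit.timeit(lambda: np.dot(v, w), number=20)
    print(n, t_loop, t_dot)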
Overall, that's the process! We'll now go through all the other loops, vectorizing the operations.
What we'll do is take our previously vectorized function and try to vectorize it further. Here is the loop we want to vectorize:
In [ ]:
for m in range(0, M):
x = abcissas[m]
kt = scalup(x, k[l+1], k[l])
u = weights[m]
bs1 = 0.5*(1 - x)
bs2 = 0.5*(1 + x)
#Compute capital next period by projection
kp = coeff[i, l]*bs1 + coeff[i, l+1]*bs2
#Compute consumption
y = a[i]*kt**alpha
c = y + (1 - delta)*kt - kp
#Find the element for next periods capital stock
ii = 0
for h in range(0, L - 1):
if kp >= k[h]:
ii = h
k1p = k[ii]
k2p = k[ii + 1]
#Calculate next period capital policy
x = scaldwn(kp, k2p, k1p)
bs1p = 0.5*(1 - x)
bs2p = 0.5*(1 + x)
#Initialize summation Variables
sum1 = 0.0
#Compute the summation and derivative simultaneously
kpp = coeff[:, ii]*bs1p + coeff[:, ii+1]*bs2p
yp = a*kp**alpha
cp = yp + (1 - delta)*kp*np.ones(I) - kpp
dudcp = cp**(-gamma)
dydkp = alpha*a*kp**(alpha-1)
sum1 = np.dot(PIE[i, :], dudcp*(dydkp + 1 - delta))
ind = l + i*L
#Generate the full residual entry: quadrature weight x residual x interpolator
res = (beta*sum1 - c**(-gamma))*u
RES[ind] += res*bs1
RES[ind+1] += res*bs2
You'll notice here a pesky sub-routine that is trying to find indices for the next period's element:
In [ ]:
#Find the element for next periods capital stock
ii = 0
for h in range(0, L - 1):
if kp >= k[h]:
ii = h
k1p = k[ii]
k2p = k[ii + 1]
This is going to be tough to vectorize, but we can do it! Let's see what it is doing. It takes some value, kp, and returns the index of the next lowest value in another object, k. To do this vectorially we'll use the numpy.where function. Let's look at an example. Given a value of kp, we would like to return the index of the next lowest value in k. If kp $= 0.2$, that index is $4$:
In [67]:
kp = 0.2
print(np.where(kp >= k))
k
Notice, though, that np.where gives a vector of indices to all of the values of k where the condition is true. It also returns a tuple. Thus, we need to index down to the value we want: the interior of the tuple, and then the last entry in the vector:
In [68]:
print(np.where(kp >= k)[0][-1])
However, what if our algorithm steps out of the state space? For example, what if kp is zero:
In [70]:
kp = 0.0
print(np.where(kp >= k)[0][-1])
Oh no! This is happening because there are no values in k for which the condition is true. What we can do is prepend a zero to the array and cut off the upper point (this will avoid a similar problem on the upper bound). This gives an answer that is correct at zero, but shifted up by one everywhere else:
In [72]:
kp = np.array(0.2)
print(np.where(kp >= np.append([0], k[:-1]))[0][-1])
kp = np.array(0.0)
print(np.where(kp >= np.append([0], k[:-1]))[0][-1])
So, we can just subtract one from all values that are greater than zero. Anticipating the need to make this work for a whole set of points in kp, we can also vectorize the entire calculation using a list comprehension:
In [78]:
kp = np.array([0, 0.2])
ii = np.array([np.where(point >= np.append([0], k[:-1]))[0][-1]
for point in kp])
ii[ii > 0] -= 1
print(ii)
Now that we've gone over how to use np.where to find the element, we can vectorize the loop in pretty much the same way as before. Here is the vectorized form of the loop:
In [82]:
#For demonstration purposes
l, i = 0, 0
coeff = coeff0
RES = np.zeros((I*L))
#Compute f at all of the quadrature points on the element
x = abcissas
kt = scalup(x, k[l+1], k[l])
u = weights
bs1 = 0.5*(1 - x)
bs2 = 0.5*(1 + x)
#Compute capital next period by projection
kp = coeff[i, l]*bs1 + coeff[i, l + 1]*bs2
#Compute consumption
y = a[i]*kt**alpha
c = y + (1 - delta)*kt - kp
#Find indices for next period's capital stock for each point
ii = np.array([np.where(point >= np.append([0], k[:-1]))[0][-1]
for point in kp])
ii[ii > 0] -= 1
#Find elements for next period's capital stock
#NOTE: Here I'm using an array as an index
k1p = k[ii]
k2p = k[ii + 1]
#Calculate next period's policy
x = scaldwn(kp, k2p, k1p)
bs1p = 0.5*(1 - x)
bs2p = 0.5*(1 + x)
#Initialize the summation vector
sum1 = np.zeros((M))
#Compute all of the sums simultaneously
#NOTE: You can slice AND index by an array at the same time!
#NOTE: This is a good time to talk about broadcasting
#The resulting matrix contains columns corresponding to each
#abcissa, and rows corresponding to possible states in period
#t+1
kpp = coeff[:, ii]*bs1p + coeff[:, ii+1]*bs2p
yp = np.outer(a, kp**alpha)
cp = yp + (1 - delta)*np.outer(np.ones(I), kp) - kpp
dudcp = cp**(-gamma)
dydkp = np.outer(a, kp**(alpha - 1))*alpha
sum1 = np.dot(PIE[i, :], dudcp*(dydkp + 1 - delta))
#Calculate the residual functions and fill the vector
ind = l + i*L
res = (beta*sum1 - c**(-gamma))*u
RES[ind] += np.dot(res, bs1)
RES[ind + 1] += np.dot(res, bs2)
The first section is fairly straightforward, but the second section deserves some discussion. Instead of building a single vector, we want to build M vectors simultaneously. To this end, kpp is a matrix whose columns correspond to the different quadrature points:
In [86]:
kpp
Similarly, yp is a matrix of the same size, as is cp. Additionally, the sum1 variable is now a vector:
In [101]:
sum1
In the end, we simply use this vector to calculate the integral for each element using a dot product and then add it to the residual:
In [102]:
#Calculate the residual functions and fill the vector
ind = l + i*L
res = (beta*sum1 - c**(-gamma))*u
RES[ind] += np.dot(res, bs1)
RES[ind + 1] += np.dot(res, bs2)
Here is the whole function in one cell, as well as running and comparing our results for the looped and vectorized function:
In [104]:
def mcgratten_weighted_residual_vec_2(coeff):
"""
A partially vectorized function of the McGratten problem.
Variables:
coeff : ndarray; a vector of coefficients for the parameterized
policy function
Returns:
ndarray containing the residuals.
"""
#Unpack the arguments
alpha, delta, I, L, k, abcissas, M, a, PIE = args
coeff = coeff.reshape(I, L)
#Initialize the residuals
RES = np.zeros((I*L))
#Construct the vector of residual functions
for i in range(0, I):
for l in range(0, L - 1):
#Compute f at all of the quadrature points on the element
x = abcissas
kt = scalup(x, k[l+1], k[l])
u = weights
bs1 = 0.5*(1 - x)
bs2 = 0.5*(1 + x)
#Compute capital next period by projection
kp = coeff[i, l]*bs1 + coeff[i, l + 1]*bs2
#Compute consumption
y = a[i]*kt**alpha
c = y + (1 - delta)*kt - kp
#Find indices for next period's capital stock for each point
ii = np.array([np.where(point >= np.append([0], k[:-1]))[0][-1]
for point in kp])
ii[ii > 0] -= 1
#Find elements for next period's capital stock
#NOTE: Here I'm using an array as an index
k1p = k[ii]
k2p = k[ii + 1]
#Calculate next period's policy
x = scaldwn(kp, k2p, k1p)
bs1p = 0.5*(1 - x)
bs2p = 0.5*(1 + x)
#Initialize the summation vector
sum1 = np.zeros((M))
#Compute all of the sums simultaneously
#NOTE: You can slice AND index by an array at the same time!
#NOTE: This is a good time to talk about broadcasting
#The resulting matrix contains columns corresponding to each
#abcissa, and rows corresponding to possible states in period
#t+1
kpp = coeff[:, ii]*bs1p + coeff[:, ii+1]*bs2p
yp = np.outer(a, kp**alpha)
cp = yp + (1 - delta)*np.outer(np.ones(I), kp) - kpp
dudcp = cp**(-gamma)
dydkp = np.outer(a, kp**(alpha - 1))*alpha
sum1 = np.dot(PIE[i, :], dudcp*(dydkp + 1 - delta))
#Calculate the residual functions and fill the vector
ind = l + i*L
res = (beta*sum1 - c**(-gamma))*u
RES[ind] += np.dot(res, bs1)
RES[ind + 1] += np.dot(res, bs2)
return RES
In [105]:
time0 = time.time()
res = scipy.optimize.newton_krylov(mcgratten_weighted_residual, coeff0,
method='lgmres', verbose=True, maxiter=1000,
line_search='armijo')
print(time.time() - time0)
time0 = time.time()
res2 = scipy.optimize.newton_krylov(mcgratten_weighted_residual_vec_2, coeff0,
method='lgmres', verbose=True,
maxiter=1000, line_search='armijo')
print(time.time() - time0)
print(np.linalg.norm(res - res2))
Huzzah! We got a speed-up! Now we can look at how this change affects speed as we vary the number of quadrature points, M. When we do that, we also need to regenerate the weights and abcissas, as well as redefine the args tuple:
In [106]:
M = 100
#Generate abscissas and weights
abcissas, weights = np.polynomial.legendre.leggauss(M)
#Pack up all of the arguments to pass to the solver
args = (alpha, delta, I, L, k, abcissas, M, a, PIE)
time0 = time.time()
res = scipy.optimize.newton_krylov(mcgratten_weighted_residual, coeff0,
method='lgmres', verbose=True, maxiter=1000,
line_search='armijo')
print(time.time() - time0)
time0 = time.time()
res2 = scipy.optimize.newton_krylov(mcgratten_weighted_residual_vec_2, coeff0,
method='lgmres', verbose=True,
maxiter=1000, line_search='armijo')
print(time.time() - time0)
print(np.linalg.norm(res - res2))
The solution is now achieved in half the time using 100 quadrature points. We should see similar gains as we vectorize the other loops. Let's reset the arguments so we don't have problems later, then we'll work on the next loop.
In [117]:
M = 10
#Generate abscissas and weights
abcissas, weights = np.polynomial.legendre.leggauss(M)
#Pack up all of the arguments to pass to the solver
args = (alpha, delta, I, L, k, abcissas, M, a, PIE)
In [110]:
for l in range(0, L - 1):
#Compute f at all of the quadrature points on the element
x = abcissas
kt = scalup(x, k[l+1], k[l])
u = weights
bs1 = 0.5*(1 - x)
bs2 = 0.5*(1 + x)
#Compute capital next period by projection
kp = coeff[i, l]*bs1 + coeff[i, l + 1]*bs2
#Compute consumption
y = a[i]*kt**alpha
c = y + (1 - delta)*kt - kp
#Find indices for next period's capital stock for each point
ii = np.array([np.where(point >= np.append([0], k[:-1]))[0][-1]
for point in kp])
ii[ii > 0] -= 1
#Find elements for next period's capital stock
#NOTE: Here I'm using an array as an index
k1p = k[ii]
k2p = k[ii + 1]
#Calculate next period's policy
x = scaldwn(kp, k2p, k1p)
bs1p = 0.5*(1 - x)
bs2p = 0.5*(1 + x)
#Initialize the summation vector
sum1 = np.zeros((M))
#Compute all of the sums simultaneously
#NOTE: You can slice AND index by an array at the same time!
#NOTE: This is a good time to talk about broadcasting
#The resulting matrix contains columns corresponding to each
#abcissa, and rows corresponding to possible states in period
#t+1
kpp = coeff[:, ii]*bs1p + coeff[:, ii+1]*bs2p
yp = np.outer(a, kp**alpha)
cp = yp + (1 - delta)*np.outer(np.ones(I), kp) - kpp
dudcp = cp**(-gamma)
dydkp = np.outer(a, kp**(alpha - 1))*alpha
sum1 = np.dot(PIE[i, :], dudcp*(dydkp + 1 - delta))
#Calculate the residual functions and fill the vector
ind = l + i*L
res = (beta*sum1 - c**(-gamma))*u
RES[ind] += np.dot(res, bs1)
RES[ind + 1] += np.dot(res, bs2)
Now we are going to have to build a series of longer vectors. This is going to be slightly more complex, as we'll need to use Kronecker products to repeat vectors, but once you get the hang of it, this is fairly simple. Here's the vectorized loop:
In [127]:
#Vectorize element loop along k axis
#Scale and calculate capital
x = np.kron(np.ones(L - 1), abcissas)
k1 = np.kron(k[:-1], np.ones(M))
k2 = np.kron(k[1:], np.ones(M))
kt = scalup(x, k2, k1)
u = np.kron(np.ones(L - 1), weights)
bs1 = 0.5*(1 - x)
bs2 = 0.5*(1 + x)
Bs1 = 0.5*(1 - abcissas)
Bs2 = 0.5*(1 + abcissas)
#Compute capital next period by projection
kp = np.kron(coeff[i, :-1], np.ones(M))*bs1\
+ np.kron(coeff[i, 1:], np.ones(M))*bs2
#Compute consumption
y = a[i]*kt**alpha
c = y + (1 - delta)*kt - kp
#Find indices for next period's element bounds
ii = np.array([np.where(point >= np.append([0], k[:-1]))[0][-1]
for point in kp])
ii[ii > 0] -= 1
#Define element bounds for next period
k1p = k[ii]
k2p = k[ii + 1]
#Calculate policy next period
x = scaldwn(kp, k2p, k1p)
bs1p = 0.5*(1 - x)
bs2p = 0.5*(1 + x)
#Initialize the summation
sum1 = np.zeros(((L-1)*M))
#Compute all of the sums simultaneously using linear algebra
kpp = coeff[:, ii]*bs1p + coeff[:, ii+1]*bs2p
yp = np.outer(a, kp**alpha)
cp = yp + (1 - delta)*np.outer(np.ones(I), kp) - kpp
dudcp = cp**(-gamma)
dydkp = np.outer(a, kp**(alpha - 1))*alpha
sum1 = np.dot(PIE[i, :], dudcp*(dydkp + 1 - delta))
#Calculate the residual functions
res = (beta*sum1 - c**(-gamma))*u
res = res.reshape(L-1, M)
bs1 = bs1.reshape(L-1, M)
bs2 = bs2.reshape(L-1, M)
RES[i*L: (i+1)*L-1] += np.dot(res, Bs1)
RES[i*L + 1: (i+1)*L] += np.dot(res, Bs2)
You'll immediately notice the use of Kronecker products everywhere. This simply stacks all of the loops performed before into one big vector of length $M(L-1)$, so that there is one entry for every abscissa on every element of $k$.
Let's look at the first few lines to get a sense of this:
In [131]:
x = np.kron(np.ones(L - 1), abcissas)
k1 = np.kron(k[:-1], np.ones(M))
k2 = np.kron(k[1:], np.ones(M))
kt = scalup(x, k2, k1)
print(x[:20])
print(k1[:20])
print(k2[:20])
print(kt[:20])
u = np.kron(np.ones(L - 1), weights)
bs1 = 0.5*(1 - x)
bs2 = 0.5*(1 + x)
Bs1 = 0.5*(1 - abcissas)
Bs2 = 0.5*(1 + abcissas)
So x is just the abcissas vector repeated over and over. Then k1 and k2 define lower and upper bounds, respectively, for each element. Finally, kt contains the scaled abscissa points. Notice that these do not repeat, as each entry corresponds to a different point on a different element.
We can continue constructing our solution in a similar way throughout.
Next, the computation of kp deserves some attention:
In [134]:
kp = np.kron(coeff[i, :-1], np.ones(M))*bs1\
+ np.kron(coeff[i, 1:], np.ones(M))*bs2
print(bs1)
print(np.kron(coeff[i, :-1], np.ones(M)))
print(kp)
This operation just repeats each coefficient $M$ times in order to carry out an element-wise multiplication with the interpolator. This is the same calculation as before, but carried out $L-1$ times simultaneously.
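As an aside, taking a Kronecker product with a vector of ones is just repetition, so np.repeat is an equivalent (and arguably more self-documenting) way to write the same operation:
In [ ]:
#np.kron with a ones vector repeats each entry M times...
print(np.kron(np.array([1.0, 2.0, 3.0]), np.ones(2)))
#...which is exactly what np.repeat does
print(np.repeat(np.array([1.0, 2.0, 3.0]), 2))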
Next let's see how our index finder is doing:
In [133]:
#Find indices for next period's element bounds
ii = np.array([np.where(point >= np.append([0], k[:-1]))[0][-1]
for point in kp])
ii[ii > 0] -= 1
print(ii)
It seems to be working just fine. You'll notice that all of the values in kp are below the lower bound of the state space, so we're simply forcing them back into the lowest element.
Finally, let's take a look at the conditional expectation:
In [135]:
#Define element bounds for next period
k1p = k[ii]
k2p = k[ii + 1]
#Calculate policy next period
x = scaldwn(kp, k2p, k1p)
bs1p = 0.5*(1 - x)
bs2p = 0.5*(1 + x)
#Initialize the summation
sum1 = np.zeros(((L-1)*M))
#Compute all of the sums simultaneously using linear algebra
kpp = coeff[:, ii]*bs1p + coeff[:, ii+1]*bs2p
yp = np.outer(a, kp**alpha)
cp = yp + (1 - delta)*np.outer(np.ones(I), kp) - kpp
dudcp = cp**(-gamma)
dydkp = np.outer(a, kp**(alpha - 1))*alpha
sum1 = np.dot(PIE[i, :], dudcp*(dydkp + 1 - delta))
In [139]:
print(kpp.shape)
print(yp.shape)
print(cp.shape)
print(sum1.shape)
You'll notice that we just repeated the same code as before. Instead of $M$ columns, the kpp matrix now contains $M(L-1)$ columns, one for every quadrature point on every element. Similarly, sum1 now contains the conditional expectations at all of the quadrature points on all of the elements in the capital direction.
Finally, we want to calculate the residuals. First of all, notice that we are going to have some trouble adding entries to the residual vector, since the loop was only over $0...L-2$, while the length of our RES vector is a function of $L$. Why is this? At the lowest and highest nodes, we only add one contribution to the residual vector, since there is no overlap with a neighboring element. To deal with this we have to do some fancy slicing.
As a test case, let's create a vector of size $IL$ and try to recover only the entries that we would like to fill. Given a value for i in $\{0, ..., I-1\}$, we would like to fill all of the rows corresponding to $l + iL$ for all $l$ in $\{0,...,L-2\}$, and then the same rows plus one. Thus, we simply slice the test vector accordingly:
In [150]:
i = 0
test = np.arange(I*L)
print(test)
print(test[i*L:(i+1)*(L-1)])
#Will this work as we loop through i?
print([test[i*L:(i+1)*(L-1)] for i in range(0,I)])
#NO! Why? The upper bound is advancing at a rate L-1 while the lower bound moves at L
print([test[i*L:(i+1)*L-1] for i in range(0,I)])
#That gives us the result we wanted! What about plus one?
print([test[i*L+1:(i+1)*L] for i in range(0,I)])
So, on each pass, given a value for i, we know how to fill RES. But how do we calculate the integrals? First, notice that bs1 just repeats over and over. So, if we reshape the res vector into a matrix, we can carry out a dot product with a single interpolator. Here's my implementation:
In [153]:
res = (beta*sum1 - c**(-gamma))*u
res = res.reshape(L-1, M)
RES[i*L: (i+1)*L-1] += np.dot(res, Bs1)
RES[i*L + 1: (i+1)*L] += np.dot(res, Bs2)
Ok, here's the full function and a comparison with the older version:
In [156]:
def mcgratten_weighted_residual_vec_3(coeff):
"""
A partially vectorized function of the McGratten problem.
Variables:
coeff : ndarray; a vector of coefficients for the parameterized
policy function
Returns:
ndarray containing the residuals.
"""
#Unpack the arguments
alpha, delta, I, L, k, abcissas, M, a, PIE = args
coeff = coeff.reshape(I, L)
#Initialize the residuals
RES = np.zeros((I*L))
#Construct the vector of residual functions
for i in range(0, I):
#Vectorize element loop along k axis
#Scale and calculate capital
x = np.kron(np.ones(L - 1), abcissas)
kt = scalup(x, np.kron(k[1:], np.ones(M)), np.kron(k[:-1], np.ones(M)))
u = np.kron(np.ones(L - 1), weights)
bs1 = 0.5*(1 - x)
bs2 = 0.5*(1 + x)
Bs1 = 0.5*(1 - abcissas)
Bs2 = 0.5*(1 + abcissas)
#Compute capital next period by projection
kp = np.kron(coeff[i, :-1], np.ones(M))*bs1\
+ np.kron(coeff[i, 1:], np.ones(M))*bs2
#Compute consumption
y = a[i]*kt**alpha
c = y + (1 - delta)*kt - kp
#Find indices for next period's element bounds
ii = np.array([np.where(point >= np.append([0], k[:-1]))[0][-1]
for point in kp])
ii[ii > 0] -= 1
#Define element bounds for next period
k1p = k[ii]
k2p = k[ii + 1]
#Calculate policy next period
x = scaldwn(kp, k2p, k1p)
bs1p = 0.5*(1 - x)
bs2p = 0.5*(1 + x)
#Initialize the summation
sum1 = np.zeros(((L-1)*M))
#Compute all of the sums simultaneously using linear algebra
kpp = coeff[:, ii]*bs1p + coeff[:, ii+1]*bs2p
yp = np.outer(a, kp**alpha)
cp = yp + (1 - delta)*np.outer(np.ones(I), kp) - kpp
dudcp = cp**(-gamma)
dydkp = np.outer(a, kp**(alpha - 1))*alpha
sum1 = np.dot(PIE[i, :], dudcp*(dydkp + 1 - delta))
#Calculate the residual functions
res = (beta*sum1 - c**(-gamma))*u
res = res.reshape(L-1, M)
RES[i*L: (i+1)*L-1] += np.dot(res, Bs1)
RES[i*L + 1: (i+1)*L] += np.dot(res, Bs2)
return RES
In [158]:
time0 = time.time()
res = scipy.optimize.newton_krylov(mcgratten_weighted_residual, coeff0,
method='lgmres', verbose=True, maxiter=1000,
line_search='armijo')
print(time.time() - time0)
time0 = time.time()
res3 = scipy.optimize.newton_krylov(mcgratten_weighted_residual_vec_3, coeff0,
method='lgmres', verbose=True,
maxiter=1000, line_search='armijo')
print(time.time() - time0)
print(np.linalg.norm(res - res3))
This is clearly faster, even for lower values of $M$ and $L$, but let's see how it performs as these increase:
In [160]:
#Generate a state space vector over k
L = 101
k = np.linspace(kmin, kmax, num=L)
#Specify number of points in the integral on each element [k_j, k_j+1]
M = 10
#Generate abscissas and weights
abcissas, weights = np.polynomial.legendre.leggauss(M)
#Generate initial guess as one tenth of production at all nodes
coeff0 = np.outer(a, k**alpha)*0.1
#Pack up all of the arguments to pass to the solver
args = (alpha, delta, I, L, k, abcissas, M, a, PIE)
time0 = time.time()
res = scipy.optimize.newton_krylov(mcgratten_weighted_residual, coeff0,
method='lgmres', verbose=True, maxiter=1000,
line_search='armijo')
print(time.time() - time0)
time0 = time.time()
res3 = scipy.optimize.newton_krylov(mcgratten_weighted_residual_vec_3, coeff0,
method='lgmres', verbose=True,
maxiter=1000, line_search='armijo')
print(time.time() - time0)
print(np.linalg.norm(res - res3))
On my home computer with $L = 101$, I got a four-times speed-up! Huzzah!
Let's reset our parameters:
In [163]:
#Generate a state space vector over k
L = 11
k = np.linspace(kmin, kmax, num=L)
#Specify number of points in the integral on each element [k_j, k_j+1]
M = 10
#Generate abscissas and weights
abcissas, weights = np.polynomial.legendre.leggauss(M)
#Generate initial guess as one tenth of production at all nodes
coeff0 = np.outer(a, k**alpha)*0.1
#Pack up all of the arguments to pass to the solver
args = (alpha, delta, I, L, k, abcissas, M, a, PIE)
In [164]:
for i in range(0, I):
#Vectorize element loop along k axis
#Scale and calculate capital
x = np.kron(np.ones(L - 1), abcissas)
kt = scalup(x, np.kron(k[1:], np.ones(M)), np.kron(k[:-1], np.ones(M)))
u = np.kron(np.ones(L - 1), weights)
bs1 = 0.5*(1 - x)
bs2 = 0.5*(1 + x)
Bs1 = 0.5*(1 - abcissas)
Bs2 = 0.5*(1 + abcissas)
#Compute capital next period by projection
kp = np.kron(coeff[i, :-1], np.ones(M))*bs1\
+ np.kron(coeff[i, 1:], np.ones(M))*bs2
#Compute consumption
y = a[i]*kt**alpha
c = y + (1 - delta)*kt - kp
#Find indices for next period's element bounds
ii = np.array([np.where(point >= np.append([0], k[:-1]))[0][-1]
for point in kp])
ii[ii > 0] -= 1
#Define element bounds for next period
k1p = k[ii]
k2p = k[ii + 1]
#Calculate policy next period
x = scaldwn(kp, k2p, k1p)
bs1p = 0.5*(1 - x)
bs2p = 0.5*(1 + x)
#Initialize the summation
sum1 = np.zeros(((L-1)*M))
#Compute all of the sums simultaneously using linear algebra
kpp = coeff[:, ii]*bs1p + coeff[:, ii+1]*bs2p
yp = np.outer(a, kp**alpha)
cp = yp + (1 - delta)*np.outer(np.ones(I), kp) - kpp
dudcp = cp**(-gamma)
dydkp = np.outer(a, kp**(alpha - 1))*alpha
sum1 = np.dot(PIE[i, :], dudcp*(dydkp + 1 - delta))
#Calculate the residual functions
res = (beta*sum1 - c**(-gamma))*u
res = res.reshape(L-1, M)
RES[i*L: (i+1)*L-1] += np.dot(res, Bs1)
RES[i*L + 1: (i+1)*L] += np.dot(res, Bs2)
And just to move things along, here is the vectorized version:
In [10]:
H = I*(L - 1)
#Vectorize all loops
#Scale and calculate capital
x = np.kron(np.ones(H), abcissas)
kt = scalup(x, np.kron(k[1:], np.ones(I*M)), np.kron(k[:-1], np.ones(I*M)))
u = np.kron(np.ones(H), weights)
bs1 = 0.5*(1 - x)
bs2 = 0.5*(1 + x)
#NOTE: bs is constant across elements, so simplify calculations later
Bs1 = 0.5*(1 - abcissas)
Bs2 = 0.5*(1 + abcissas)
#Compute capital next period by projection
kp = np.kron(coeff[:, :-1].reshape(H,), np.ones((M)))*bs1\
+ np.kron(coeff[:, 1:].reshape(H,), np.ones((M,)))*bs2
#Compute consumption
y = np.kron(a, np.ones((L - 1)*M))*kt**alpha
c = y + (1 - delta)*kt - kp
#Find indices for next periods element bounds
ii = np.array([np.where(point >= np.append([0], k[:-1]))[0][-1]
for point in kp])
ii[ii > 0] -= 1
#Define element bounds for next period
k1p = k[ii]
k2p = k[ii + 1]
#Calculate policy next period
x = scaldwn(kp, k2p, k1p)
bs1p = 0.5*(1 - x)
bs2p = 0.5*(1 + x)
#Initialize the summation
sum1 = np.zeros((H*M))
#Compute the sums
#Calculate next period's policy, output, and consumption
kpp = coeff[:, ii]*bs1p + coeff[:, ii+1]*bs2p
yp = np.outer(a, kp**alpha)
cp = yp + (1 - delta)*np.outer(np.ones(I), kp) - kpp
#Calculate marginal product of capital and marginal utility
dudcp = cp**(-gamma)
dydkp = np.outer(a, kp**(alpha - 1))*alpha
rhs = dudcp*(dydkp + 1 - delta)
#Generate a block diagonal matrix for the expectation
mats = [rhs[:, i*M*(L - 1): (i+1)*M*(L - 1)] for i in range(0, I)]
temp1 = scipy.linalg.block_diag(*mats)
temp2 = PIE.reshape(I*I)
sum1 = np.dot(temp2, temp1)
#Calculate the residual functions
res = (beta*sum1 - c**(-gamma))*u
res = res.reshape((H, M))
RES[[i for i in range(0, I*L) if i not in range((L - 1),I*L,L)]] += np.dot(res, Bs1)
RES[[i for i in range(0, I*L) if i not in range(0, I*L - 1, L)]] += np.dot(res, Bs2)
This code looks almost exactly the same, and this is one of the benefits of vectorization. An increase in dimensions from one to two can be tricky, but from two to three or three to four is usually trivial. Let's look at exactly what's going on.
First, we build the vector kp the same way as before, except now it is longer. One point worth noting is the reshape operation:
In [170]:
kp = np.kron(coeff[:, :-1].reshape(H,), np.ones((M)))*bs1\
+ np.kron(coeff[:, 1:].reshape(H,), np.ones((M,)))*bs2
print(kp.shape)
In order to stack all of the loops as we did before, we reshape the coefficient matrix. It is important to think about how this reshape operation works. The default is to let the last (or rightmost) index vary the fastest. Let's see an example:
In [175]:
test = np.arange(9)
print(test)
print(test.reshape(3,3))
print(test.reshape(3,3).reshape(9,))
So, as you can see, the reshape method (or the corresponding function) takes items from the first row of a matrix, then the second, and so on. Conversely, when reshaping a vector into a matrix, it fills the first row first, then the second row, etc.
Looking back at kp, the reshape is just taking the rows of the coeff matrix and stacking them end-to-end to make one long vector. This is exactly what we want.
The next important point is the calculation of the integrals. You'll notice that we construct the rhs matrix, which is $I \times M(L-1)I$. We need to take the dot product of the first row of $\Pi$ with the first $M(L-1)$ columns of rhs, the second row with the second block of columns, and so forth. This presents a very odd computational challenge, but here is my solution:
We can transform the rhs matrix into a block-diagonal matrix, where each block corresponds to $M(L-1)$ columns of the original matrix. This matrix will be $I^2\times M(L-1)I$. We can then reshape the $\Pi$ matrix into a vector; using the reshape method, this stacks the rows end to end. Finally, taking the dot product of these two gives us the desired result!
That's carried out in these lines, where we first create a list of arrays, then pass them to scipy.linalg.block_diag as a star-args argument (allowing you to pass a dynamic number of matrices), and finally reshape and take the dot product:
In [180]:
#Generate a block diagonal matrix for the expectation
mats = [rhs[:, i*M*(L - 1): (i+1)*M*(L - 1)] for i in range(0, I)]
temp1 = scipy.linalg.block_diag(*mats)
temp2 = PIE.reshape(I*I)
sum1 = np.dot(temp2, temp1)
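To convince yourself that this really computes each block's expectation, here is a tiny self-contained check with made-up numbers ($I = 2$ states and three columns per block):
In [ ]:
#Verify the block-diagonal trick on a small made-up example
PIE_demo = np.array([[0.7, 0.3],
                     [0.4, 0.6]])
rhs_demo = np.arange(12.0).reshape(2, 6)  #two blocks of three columns
mats_demo = [rhs_demo[:, j*3:(j+1)*3] for j in range(2)]
sum_demo = np.dot(PIE_demo.reshape(4), scipy.linalg.block_diag(*mats_demo))
#Compare against computing each block's expectation directly
direct = np.concatenate([np.dot(PIE_demo[j], rhs_demo[:, j*3:(j+1)*3])
                         for j in range(2)])
print(np.allclose(sum_demo, direct))  #True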
The last point to make is that we need to fill only a subset of the rows of RES. In two lines, we would like to fill first the rows that aren't at the upper bound of capital, and then those that aren't at the lower bound. This presents a need for some truly fancy slicing.
First, notice that in the first case we want the numbers zero through $L-2$, then $L$ through $2L-2$, and so on. Put differently, we want all of the indices, less those corresponding to $l = L - 1$. We can do this with a fancy slice and range. Check it out:
In [208]:
#Here are all the indices
print([i for i in range(0, I*L)])
#Here are all the indices we don't want
#NOTE: The third argument of range is the step size
print(list(range(L - 1, I*L, L)))
#Here are all the indices we do want!
print([i for i in range(0, I*L) if i not in range((L - 1),I*L,L)])
#And, on the second line, we want all of these indices
print([i for i in range(0, I*L) if i not in range(0, I*L - 1, L)])
Thus, we can reshape the res object and fill the corresponding rows of RES using a dot product:
In [210]:
res = (beta*sum1 - c**(-gamma))*u
res = res.reshape((H, M))
RES[[i for i in range(0, I*L) if i not in range((L - 1),I*L,L)]] += np.dot(res, Bs1)
RES[[i for i in range(0, I*L) if i not in range(0, I*L - 1, L)]] += np.dot(res, Bs2)
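As an aside, np.setdiff1d can build the same index sets a little more directly. This sketch is equivalent to the fancy slices above:
In [ ]:
#Equivalent index construction with np.setdiff1d: drop the upper-bound
#nodes for the bs1 fill, and the lower-bound nodes for the bs2 fill
rows1 = np.setdiff1d(np.arange(I*L), np.arange(L - 1, I*L, L))
rows2 = np.setdiff1d(np.arange(I*L), np.arange(0, I*L - 1, L))
RES[rows1] += np.dot(res, Bs1)
RES[rows2] += np.dot(res, Bs2)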
Here is the full function with a comparison to the original:
In [1]:
def mcgratten_weighted_residual_vec(coeff):
"""
A vectorized function of the McGratten problem.
Variables:
coeff : ndarray; a vector of coefficients for the parameterized
policy function
Returns:
ndarray containing the residuals.
"""
#Unpack the arguments
alpha, delta, I, L, k, abcissas, M, a, PIE = args
#Reshape coefficients in case they are vector
coeff = coeff.reshape(I, L)
#Initialize the residuals
RES = np.zeros((I*L))
#For readability define the number of rows
H = I*(L - 1)
#Construct the vector of residual functions
#Vectorize all loops
#Scale and calculate capital
x = np.kron(np.ones(H), abcissas)
kt = scalup(x, np.kron(k[1:], np.ones(I*M)), np.kron(k[:-1], np.ones(I*M)))
u = np.kron(np.ones(H), weights)
bs1 = 0.5*(1 - x)
bs2 = 0.5*(1 + x)
#NOTE: bs is constant across elements, so simplify calculations later
Bs1 = 0.5*(1 - abcissas)
Bs2 = 0.5*(1 + abcissas)
#Compute capital next period by projection
kp = np.kron(coeff[:, :-1].reshape(H,), np.ones((M)))*bs1\
+ np.kron(coeff[:, 1:].reshape(H,), np.ones((M,)))*bs2
#Compute consumption
y = np.kron(a, np.ones((L - 1)*M))*kt**alpha
c = y + (1 - delta)*kt - kp
#Find indices for next periods element bounds
ii = np.array([np.where(point >= np.append([0], k[:-1]))[0][-1]
for point in kp])
ii[ii > 0] -= 1
#Define element bounds for next period
k1p = k[ii]
k2p = k[ii + 1]
#Calculate policy next period
x = scaldwn(kp, k2p, k1p)
bs1p = 0.5*(1 - x)
bs2p = 0.5*(1 + x)
#Initialize the summation
sum1 = np.zeros((H*M))
#Compute the sums
#Calculate next period's policy, output, and consumption
kpp = coeff[:, ii]*bs1p + coeff[:, ii+1]*bs2p
yp = np.outer(a, kp**alpha)
cp = yp + (1 - delta)*np.outer(np.ones(I), kp) - kpp
#Calculate marginal product of capital and marginal utility
dudcp = cp**(-gamma)
dydkp = np.outer(a, kp**(alpha - 1))*alpha
rhs = dudcp*(dydkp + 1 - delta)
#Generate a block diagonal matrix for the expectation
mats = [rhs[:, i*M*(L - 1): (i+1)*M*(L - 1)] for i in range(0, I)]
temp1 = scipy.linalg.block_diag(*mats)
temp2 = PIE.reshape(I*I)
sum1 = np.dot(temp2, temp1)
#Calculate the residual functions
res = (beta*sum1 - c**(-gamma))*u
res = res.reshape((H, M))
RES[[i for i in range(0, I*L) if i not in
range((L - 1), I*L, L)]] += np.dot(res, Bs1)
RES[[i for i in range(0, I*L) if i not in
range(0, I*L - 1, L)]] += np.dot(res, Bs2)
return RES
In [9]:
time0 = time.time()
res = scipy.optimize.newton_krylov(mcgratten_weighted_residual, coeff0,
method='lgmres', verbose=True, maxiter=1000,
line_search='armijo')
print(time.time() - time0)
time0 = time.time()
res_vec = scipy.optimize.newton_krylov(mcgratten_weighted_residual_vec, coeff0,
method='lgmres', verbose=True,
maxiter=1000, line_search='armijo')
print(time.time() - time0)
print(np.linalg.norm(res - res_vec))
You'll notice that the norm is non-zero... ahhhhh!!!! That is your homework: figure out where the mistake is and fix it!