Many methods have been suggested for this problem, and I focus on the most popular algorithm of them.
Note that we only consider the computation methods for dynamic game in discrete time model.
Pakes and McGuire (1994) algorithm, which we call PM94, is the algortihm for exact computing equilibrium in the dynamic oligopoly model, which is developed in Ericson and Pakes (1995), which we call EP95, and many researchers use this in both empirical and theoretical works.
But PM94 lacks the consideration about multiple equilibria, which is inevitable in most of game theoretic models, and it is computationally burdensome when the state space of the model has a lot of states. Furthermore the convergence of this algorithm is not guaranteed.
Then more sophisticated methos have been developed in order to these problems, and we introduce these in the other lectures.
In the following, first we descrive the dynamic oliigopoly model of EP95 in order to clarify the problem. And then we explain the algorithm with the Julia code.
In this section I explain the model in consideretion.
EP95 modeled the dynamic oligopoly, in which there are incumbent firms and potential entrant firms for a market. In the begging of each period the incumbent firms decide whether to exit or not by comparing the random scrap value with the continuation value of its own, and the potential entrant firms choose if they entry the market by considering the random setup costs and the continuation value. After their decisions the incumbents compete in the static market of differentiated products. And then they carry out investment for the sake of improving the future state. In the last their decisions about entry/exit are implemented before the next period.
Note that the model we descrive here is a little different from the original one because these alterations offer the condition for equilibrium existence and decrease the computational burden of the algorithms. See Doraszelski and Satterthwaite (2010) and Doraszelski and Pakes (2007) about this point.
In the following we set up the notations of the model for each part.
Denote each incumbent firm by $i$. And the number of incumbents is set to $n$.
We assume that the scrap value is private information, then the decisions about exit of the other firms are stochastic from the perspective of a firm. Thus we can set the probability of remaining of each firm. This is true for the setup costs of the potential entrants.
Denote each potential entrant by $i$.
Then the set of possible industry structures is
$$S = \left\{ \left( \omega_1, \omega_2, \dots, \omega_n \right): \omega_i \in \Omega, n < \bar{n} \right\}$$where $\bar{n}$ is the maximum number of incumbents. And denote the states other than firm $i$ as $\omega_{-i}$
We can reduce the state space by using the symmetry and anonymity assumptions, which are descrived in detial in Doraszelski and Pakes (2007) and in the ch6 of Doraszelski and Satterthwaite (2010). In the end we have the the below compact state space.
$$S^{o} = \left\{ (\omega_i, s): \omega_i \in \Omega, s = (s_1, s_2, \dots, s_{\bar{\omega}}), s_{\omega}\in Z^{+}, \sum_{\omega \in \Omega} s_{\omega} \leq \bar{n} \right\}$$where $s_{\omega}$ denotes the number of incumbents whose state is $\omega$.
In other words, the state space of this model can be characterized by a particular firm's state and the vector of the number of firms for each state in the symmetry and anonymity assumptions.
And the two assumptions make every functions in this model not specific to each firm.
$\omega_i$ represents the quality of the products of firm $i$ in the market. And the mean utility of the product $i$ is given by $g(\omega_i)$. Then the utility for consumer $k$ from buying product $i$ is as follows.
$$u_{i,k} = g(\omega_i) - p_i + \epsilon_{i,k}$$where $p_i$ is the price the firm can choose optimally and $\epsilon_{i,k}$ is taste difference among consumers. And if $(\epsilon_{0,k}, \dots, \epsilon_{n,k})$ identically independently follow extreme value distribution across products and individuals, the demand for each product ($q_i$)can be expressed as below.
$$q_i(p_1, \dots, p_n:\omega) = M\frac{\exp(g(\omega_i)-p_i)}{1 + \sum_j \exp(g(\omega_j) - p_j)}$$where $M$ is the number of consumers.
Then by using the above inverse demand function the optimal price for firm $i$ solves the below maximization problem.
$$\max_{p_i} q_i(p_1, \dots, p_n:\omega) (p_i -c)$$So we can solve the system constructed by the first order condition of each firm. Denote the optimal price by $p_i^{*}$.
Thus the profits for firm $i$ in the current state can be written as
$$\pi(\omega_i, \omega_{-i}) = q_i(p_1^{*}(\omega), \dots, p_n^{*}(\omega) : \omega) (p_i^{*}(\omega) - c)$$Denote the next period state by $\omega_i^{'}$, the realization of investment by $\nu_i$, and the industry wide shock by $\eta$, then the transition of state can be descrived as follows.
$$\omega_i^{'} = \omega_i + \nu_i - \eta$$In this model we restrict $\nu_i \in \left\{ 0,1\right\}$, whose the probability is
$$Pr(\nu | x_i) = \begin{cases} \frac{\alpha x_i}{1+\alpha x_i}\ \text{if}\ \nu = 1 \\[10pt] \frac{1}{1+\alpha x_i}\ \text{if}\ \nu = 0 \end{cases}$$And $\eta \in \left\{ 0,1\right\}$, where $\eta = 1$ with the probability $\delta$.
In short, the investment increase the probability of moving to the improved state while it does not always increase the state.
The potential entrants who enter the market experience $\omega_i^{'} = \omega^e + \nu_i - \eta$, where $\omega^e$ is the initial state for the entrants defined exogenously.
At the beginning of each period, when incumbents draw random scrap values, firm $i$ has a value function which satisfies the below Bellman equation, where $\beta$ is the discount factor.
$$ V(\omega_i, \omega_{-i}, \phi) = \pi(\omega_i, \omega_{-i}) + \max \left\{ \phi, \max_{x_i} \beta E\left[ V(\omega_i^{'}, \omega_{-i}^{'}, \phi^{'})\ |\ \omega_i, \omega_{-i}, x_i\right] -x_i\right\} $$Note that the expectation operator is over the probability distribution of possible next period values.
Now we define the function $W(\nu | \omega_i, \omega_{-i})$ as the expected value of the firm conditional on the outcome of its investment. This means
$$W(\nu\ |\ \omega_i, \omega_{-i}) \equiv \sum_{\omega_{-i}^{'}, \eta} \int_{\phi} V(\omega_i + \nu - \eta, \omega_{-i}^{'}, \phi^{'}) \ \mathrm{d}F_{\phi}(\phi^{'})\ q(\omega_{-i}^{'} \ |\ \omega_i, \omega_{-i}, \eta)\ p(\eta)$$By using this we can transform the expectation term in the Bellman equation as follows
$$E\left[ V(\omega_i^{'}, \omega_{-i}^{'}, \phi^{'})\ |\ \omega_i, \omega_{-i}, x_i\right] = \sum_{\nu} W(\nu\ |\ \omega_i, \omega_{-i}) p(\nu\ |\ x_i)$$Then the Bellman equation is written as below
$$V(\omega_i, \omega_{-i}, \phi) = \pi(\omega_i, \omega_{-i}) + \max \left\{ \phi, \max_{x_i} \beta \sum_{\nu} W(\nu\ |\ \omega_i, \omega_{-i}) p(\nu\ |\ x_i) - x_i\right\}$$Now if we know about $W$, we determine the optimal investment by Kuhn - Tucker condition of the above,
$$x_i\left( \beta \sum_{\nu} W(\nu\ |\ \omega_i, \omega_{-i}) \frac{\partial p(\nu\ |\ x_i)}{\partial x_i} - 1 \right) = 0$$In this setting we solve the above as follows
$$x(\omega_i, \omega_{-i}) = \max \left\{ 0, \frac{-1 + \sqrt{\beta\alpha (W(1 | \omega_i, \omega_{-i}) - W(0 | \omega_i, \omega_{-i}))}}{\alpha} \right\}$$Given this optimal investment policy, we next consdier the problem of whether to exit or remain. Because the firms who face this problem must experience the current competition in the market and get the benefit $\pi(\omega_i. \omega_{-i})$, the decision of exiting does not involve the payoff and instead consider only about the scrap value and the continuation value.
Let $\chi(\omega_i, \omega_{-i}, \phi)$ be the indicator function which takes $1$ only if the firm continues. Now this satisfies the below
$$\newcommand{\argmax}{\mathop{\rm arg~max}\limits} \chi(\omega_i, \omega_{-i}, \phi) = \argmax_{\chi \in \left\{0,1\right\}}\ (1 - \chi) \phi + \chi \left( \beta \sum_{\nu} W(\nu\ |\ \omega_i, \omega_{-i}) p(\nu\ |\ x(\omega_i, \omega_{-i})) - x(\omega_i, \omega_{-i})\right)$$We know the distribution of $\phi$, then the probability of $\chi(\omega_i, \omega_{-i}, \phi) = 1$, i.e. the probability that the optimal policy is remainig, denoted as $r(\omega_i, \omega_{-i})$, can be obtained as below.
$$r(\omega_i, \omega_{-i}) = F_{\phi} \left( \beta \sum_{\nu} W(\nu\ |\ \omega_i, \omega_{-i}) p(\nu\ |\ x(\omega_i, \omega_{-i})) - x(\omega_i, \omega_{-i}) \right)$$A potential entrants who chooses enter pays a setup cost drawn from $F_{\psi}$ and becomes an incumbent in the next period. While the preparing period, it can invest $x_i^e$. The problem for entrants is analogous to the one of incumbents, so we omit the detail and show the important equations.
Note that the competitors' state for the potential entrants is $\omega$.
The value function is as follows, where the function $W^e$ is defined analogously in the incumbent's problem.
$$V^e(\omega, \psi) = \max \left\{ 0, max_{x_i^e} -\psi - x_i^e + \beta \sum_{\nu} W^e(\nu\ |\ \omega) p(\nu\ |\ x_i^e)\right\}$$Let $x_i^e(\omega)$ be the optimal initial investment policy calculated by the same logic. Given this, we can construct the optimal entry policy as follows.
$$\chi^e(\omega, \psi) = \argmax_{\chi \in \left\{ 0,1\right\}} \chi \left( -\psi - x^e(\omega) + \beta \sum_{\nu} W^e(\nu\ |\ \omega) p(\nu\ |\ x^e(\omega)) \right)$$Then the entry probability is
$$r^e(\omega) = F_{\psi}\left( - x^e(\omega) + \beta \sum_{\nu} W^e(\nu\ |\ \omega) p(\nu\ |\ x^e(\omega))\right)$$First we have to define the action in this model, where firms decide both exit or entry and the investment.
Let $u(\omega, \phi) = (\chi(\omega, \phi), x(\omega))$ be the policy function for the incumbents and the same is defined for the potential entrants analogously.
The equilibrium we try to compute is a symmetric Markov Perfect Equilirium (MPE), which involves the pair of value function and policy function s.t.
The above condition is also applied to the entrants side.
Thus we compute the value function and the policy function which satisfy the above condition by using the following algorithm.
This algorithm uses the expected value function, which is the function of the state whose value is the expectation of the value function over the distribution of the scrap value for the incumbents and setup cost for the potential entrants.
Thus the expected value function is defined
$$V(\omega_i, \omega_{-i}) \equiv \int_{\phi} V(\omega_i, \omega_{-i}, \phi)\ \mathrm{d}F_{\phi}(\phi)$$Then we can write the Bellman equation, by using the optimal policies, in the below form
$$V(\omega_i, \omega_{-i}) = \pi(\omega_i, \omega_{-i}) + (1-r(\omega_i, \omega_{-i}))\ \phi(\omega, \omega_{-i}) + r(\omega_i, \omega_{-i}) \left\{ \beta \sum_{\nu} W(\nu\ |\ \omega_i, \omega_{-i})\ p(\nu\ |\ x(\omega_i, \omega_{i}))- x(\omega_i, \omega_{-i}) \right\}$$And now the policy is the pair of the probability of entry and the investment amount, $(r(\omega_i, \omega_{-i}), x(\omega_i, \omega_{-i}))$.
Note that $\phi(\omega_i, \omega_{-i})$ is the expectation of the scrap value conditional on exiting in state $(\omega_i, \omega_{-i})$. This causes endogeneity.
$$\phi(\omega_i, \omega_{-i}) = E_{\phi}\left[ \phi\ |\ \chi(\omega_i, \omega_{-i}, \phi)=0 \right] = \int_{\phi > F_{\phi}^{-1}(r(\omega_i, \omega_{-i}))} \ \phi \ \mathrm{d}F_{\phi}(\phi)$$PM94 proceeds as follows.
Note that this algorithm improves the functions for every state in one iteration. So this is called a "synchronous" algorithm.
We explain each part of iteration in the following.
Recall that the definition of $W$. In order to calculate this we have to know the transition probability of competitors states, i.e. $q(\omega_{-i}^{'}\ |\ \omega_i, \omega_{-i}, \eta)$. This is constructed by the state transition probabilities of each incumbent firm and the entry probability of $\epsilon^{'}$ firms, which are calculated as follows respecively.
$$p^{l}(\omega_i^{'}\ |\ \omega_i, \omega_{-i}, \eta) = \begin{cases} 1 - r^l(\omega_i, \omega_{-i})\ &\text{if}\ \omega_i^{'} = \phi \\[10pt] r^l(\omega_i, \omega_{-i}) P(\nu\ |\ x^l(\omega_i, \omega_{-i}))\ &\text{if}\ \omega_i^{'} = \omega_i + 1 - \eta\\[10pt] r^l(\omega_i, \omega_{-i}) (1-P(\nu\ |\ x^l(\omega_i, \omega_{-i})))\ &\text{if}\ \omega_i^{'} = \omega_i - \eta \end{cases}$$where state $\phi$ means the firm is a potential entrant.
And
$$r^{e,l}(\epsilon^{'}, \omega) = \dbinom{\epsilon}{\epsilon^{'}} \left[ r^{e,l}(\omega)\right]^{\epsilon^{'}} \ \left[ 1-r^{e,l}(\omega)\right]^{\epsilon- \epsilon^{'}}$$Then the competitors' state transition probability is written in below form. Note that $\omega_{j}^{'}$ is the components of $\omega_{-i}^{'}$
$$q^{l+1}(\omega_{-i}^{'}\ |\ \omega_i, \omega_{-i}, \eta) = \Pi_{j\neq i}\ p^l(\omega_j^{'}\ |\ \omega_j, \omega_{-j}, \eta) \ \Pi_{j=1}^{\epsilon^{'}}P(\nu_j\ |\ x^{e,l}(\omega))\ r^{e,l}(\epsilon^{'}, \omega)$$Now by the difinition of $W$, we get the below improvement equation.
$$W^{l+1}(\nu\ |\ \omega_i, \omega_{-i}) = \sum_{\omega_{-i}^{'}\ ,\ \eta} V^l(\omega_i + \nu - \eta, \omega_{-i}^{'})\ q^{l+1}(\omega_{-i}^{'}\ |\ \omega_i, \omega_{-i}, \eta)\ p(\eta)$$By Kuhn - Tucker condition, we have obtained the optimal investment policy as descrived above and therefore we use it as an improvement equation.
$$x^{l+1}(\omega_i, \omega_{-i}) = \max \left\{ 0, \frac{-1 + \sqrt{\beta\alpha (W^{l+1}(1 | \omega_i, \omega_{-i}) - W^{l+1}(0 | \omega_i, \omega_{-i}))}}{\alpha} \right\}$$The remain probability has been already expressed by the distribution function of scrap values. We get the improvement through this equation.
$$r^{l+1}(\omega_i, \omega_{-i}) = F_{\phi} \left( \beta \sum_{\nu} W^{l+1}(\nu\ |\ \omega_i, \omega_{-i}) p(\nu\ |\ x^{l+1}(\omega_i, \omega_{-i})) - x^{l+1}(\omega_i, \omega_{-i}) \right)$$Because the expected value function satisfies the above equation, so we can update the expected value function as follows.
$$V^{l+1}(\omega_i, \omega_{-i}) = \pi(\omega_i, \omega_{-i}) + (1-r^{l+1}(\omega_i, \omega_{-i}))\ \phi(\omega, \omega_{-i}) + r^{l+1}(\omega_i, \omega_{-i}) \left\{ \beta \sum_{\nu} W^{l+1}(\nu\ |\ \omega_i, \omega_{-i})\ p(\nu\ |\ x^{l+1}(\omega_i, \omega_{i}))- x^{l+1}(\omega_i, \omega_{-i}) \right\}$$Part A ~ C can be applied to the entrants problem without difficulty and we get the improvement about the functions for the entrants as above.
Stopping time problem always matters. You see this issue in detail in Pakes and McGuire (1994) or Doraszelski and Judd (2012).
Here we use the below norm in the criterion
$$\left\| \frac{V^l - V^{l-1}}{1 + |V^l|} \right\| = \max_{\omega \in S^o}\ \left|\frac{V^l(\omega) - V^{l-1}(\omega)}{1 + |V^l(\omega)|}\right|$$And we set the predetermined stopping criterion in the norm where we stop the algorithm if the norm between the current value function and the next one is smaller than it.
We finish explaing the algorithm, so go on to see the Julia code for this.
Most of the default parameter value is the same as in Pakes and McGuire (1994), and the distribution of scrap value and setup cost is set to exponential distribution with parameters $\lambda_1, \lambda_2$ for each.
The conditional expectation with parameter valaue $\lambda$ used in this model is calculated as below.
$$E_{\phi}\left[ \phi\ |\ \chi(\omega_i, \omega_{-i}, \phi)=0 \right] = \int_{\phi > F_{\phi}^{-1}(r(\omega_i, \omega_{-i}))} \ \phi \ \mathrm{d}F_{\phi}(\phi) =\frac{1}{\lambda\ \exp(-\lambda \ F_{\phi}^{-1}(r(\omega_i, \omega_{-i})))} = \frac{1}{\lambda\ (1-r(\omega_i, \omega_{-i}))}$$And the quality specific mean utility function is spcified as follows.
$$g(\omega_i) = \begin{cases} 3\omega_i - 4 \ &\text{if}\ \omega_i \leq 5 \\[10pt] 12 + {\rm log}\ (2 - \exp(16 - 3\omega_i))\ &\text{if}\ \omega_i > 5 \end{cases}$$
In [1]:
# PM94 algorithm
# Default parameter values are as below
# N = 4, M = 5, c = 5, alpha = 3, delta = 0.7, beta = 0.925, max_state = 18, omega_e = 3, lambda_1 = 3, lambda_2 = 2
# Initial guess about the value function for the incumbent is set to pi/(1-beta), where the pi denote the payoff specific to the state.
# Other initial guesses are set to 0 for all state.
using Combinatorics
using Iterators
using StatsBase
using NLsolve
max_state = 18 # state = 1 means the firm is a potential entrant
N = 4
c = 5
M = 5
# First we caculate the payoff function in the static market
# Define the mean utility function
function mean_utility(state)
if state <= 5
return 3*state - 4
else
return 12 + log(2-exp(16-3*state))
end
end
mean_utils = [mean_utility(state) for state in 2:max_state]
# Define the optimal pricing function
function quality_ladder_price(state)
if state_space[state, 1] == 1.0
return 0
else
n = sum(state_space[state,:] .!= 1.0)
incumbents_state = state_space[state, state_space[state,:] .!= 1.0]
function f!(x, fvec)
summation = sum([exp(mean_utils[convert(Int64, k-1)] - x[j]) for (j, k) in enumerate(incumbents_state)])
for i in 1:n
fvec[i] = x[i] - (1 + summation)/(1 + summation - exp(mean_utils[convert(Int64, incumbents_state[i]-1)] - x[i])) - c
end
end
return nlsolve(f!, ones(n))
end
end
# Make state space
num_dist = binomial(N - 1 + max_state -1, N-1)
num_states = max_state *num_dist
distributions = collect(with_replacement_combinations(collect(1:max_state), N-1))
state_space = zeros((num_states, N))
for i in 1:max_state
state_space[(i-1)*num_dist+1:i*num_dist, 1] = zeros(num_dist) + i
for j in 1:num_dist
state_space[(i-1)*num_dist+j, 2:N] = distributions[j]
end
end
# Calculate the optimal payoff from the market
payoff = ones(num_states)
for state in 1:num_states
if quality_ladder_price(state) == 0
payoff[state] = 0
else
incumbents_state = state_space[state, state_space[state,:] .!= 1.0]
optimal_prices = quality_ladder_price(state).zero
summation = sum([exp(mean_utils[convert(Int64, k-1)] - optimal_prices[j]) for (j, k) in enumerate(incumbents_state)])
demand = exp(mean_utils[convert(Int64, incumbents_state[1] - 1)] - optimal_prices[1])/(1 + summation)
profit = M * demand * (optimal_prices[1] - c)
payoff[state] = profit
end
end
In [ ]:
# PM94's algorithm
beta = 0.925
delta = 0.7
alpha = 3
omega_e = 3
# Define the success probability of investment
function success(alpha, state, inv_policy)
inv = inv_policy[state]
return (alpha * inv)/(1 + alpha * inv)
end
# State transition probability of incumbents
# Return is the array whose elements are the tansition probabilities for the possible next period state
# etaで条件付けしてるから3要素
function incum_state_trans(state, r, eta)
trans_prob = ones(3)
own_state = state_space[state, 1]
success_prob = success(alpha, state, incum_inv_policy)
trans_prob[1] = 1- r[state]
if eta == 1
# eta = 1の時は、2つ目の要素が自分より一つ下のstateに映る確率、3つ目が自分に留まり続ける確率
# この中で自分が最低のquality、つまり2の時とそれ以外の時で場合分け
if own_state = 2.0
trans_prob[2] = 0
trans_prob[3] = r[state]
else
trans_prob[2] = r[state] * (1- success_prob)
trans_prob[3] = r[state] * success_prob
end
else
# eta = 0の時は、2つ目の要素が今のstateに留まる確率、3つ目が自分より一つ上のstateに移る確率
# この中で自分が最低のmax_stateの時とそれ以外の時で場合分け
if own_state = 18.0
trans_prob[2] = r[state]
trans_prob[3] = 0
else
trans_prob[2] = r[state] * (1- success_prob)
trans_prob[3] = r[state] * success_prob
end
end
return trans_prob
end
# State transition probability of entrants
function ent_state_trans(state, r_e, eta)
trans_prob = ones(3)
success_prob = success(alpha, state, ent_inv_policy)
trans_prob[1] = 1- r_e[state]
trans_prob[2] = r_e[state] * (1 - success_prob)
trans_prob[3] = r_e[state] * success_prob
return trans_prob
end
# Flatten function
function flatten(x, y)
state = start(x)
if state==false
push!(y, x)
else
while !done(x, state)
(item, state) = next(x, state)
flatten(item, y)
end
end
y
end
flatten(x)=flatten(x,Array(Any, 0))
# Dist_index
function d_index(dist)
dist_index = 0
for k in 1:N-1
dist_index += (dist[k] - 1) * (max_state^(N-1-k))
end
return dist_index
end
# Calculate the transition probability to each distribution
function dist_trans(comp_trans, destination)
b_1 = product(comp_trans[N-2, :], comp_trans[N-1, :])
b_2 = product(destination[N-2, :], destination[N-1, :])
for i in 2:N-2
b_1 = product(comp_trans[N-1-i, :], b_1)
b_2 = product(destination[N-1-i, :], b_2)
end
combination_comp = ones(((N-1)^(N-1), N-1))
combination_dest = ones(((N-1)^(N-1), N-1))
for (j,p) in enumerate(b_1)
combination_comp[j, :] = flatten(p)
end
for (j,p) in enumerate(b_2)
combination_dest[j, :] = sort(flatten(p))
end
# distをdist indexに変換
ind = ones(((N-1)^(N-1), 1))
for k in 1:(N-1)^(N-1)
ind[k, 1] = d_index(combination_dest[k, :])
end
probs = prod(combination_comp, 2)
temp = vcat(ind, probs)
dist_prob = sortrows(temp, by=x->x[1])
dist_prob = vcat(dist_prob, cumsum(dist_prob[:, 2]))
# 各要素の確率の和を出す。Statsbaseのcountmapを使用
num = countmap(dist_prob[:, 1])
counter = 0
i = 1
a = ones((counts(dist_prob, min(dist_prob[:, 1]):max(dist_prob[:, 1])), 2))
for w in distinct(dist_prob[:, 1])
counter += num[w]
a[i, 1] = w
a[i, 2] = dist_prob[counter, 2]
i += 1
end
return a
end
# Competitors' state transition
function comp_trans(r, r_e)
# 辞書を使う
q = Dict{Int64, Dict}()
for i in 1:num_states
# この姉弟がいいのかはちょっとわからない
q[i] = Dict{Int64, AbstractArray}()
end
comp_trans = zeros((N-1, 3))
# 遷移先のstateを入れる
destinations = zeros((N-1, 3))
for i in 1:num_states
for eta in 0:1
for j in 2:N
if j == 2
comp_state = sort([state_space[i, 1], state_space[i, j+1:]])
else
comp_state = sort([state_space[i, 1], state_space[i, 2:j-1], state_space[i, j+1:]])
end
# stateの値からindexを作る
dist_index = 0
for k in 1:N-1
dist_index += (comp_state[k] - 1) * (max_state^(N-1-k))
end
state_index = (state_space[i, j] - 1) * (max_state^(N - 1)) + dist_index
if state_space[i, j] == 1
own_trans = ent_state_trans(state_index, r_e, eta)
if eta == 1
dest = [1, omega_e-1,omega_e]
elseif eta == 0
dest = [1, omega_e, omega_e+1]
end
else
own_trans = incum_state_trans(state_index, r, eta)
if eta == 1
dest = [1, state_space[i,j]-1,state_space[i,j]]
elseif eta == 0
dest = [1, state_space[i,j], state_space[i,j]+1]
end
end
comp_trans[j-1, :] = own_trans
destination[j-1, :] = dest
end
# dist_trans functionを使う(上で定義)
a = dist_trans(comp_trans, destination)
q[i][eta] = a
end
end
return q
end
# Improving W function
function imp_w(r, r_e, incum_value, delta)
q = comp_trans(r, r_e)
end
# Set initial value
incum_value = ones(num_states)
for i in 1:num_states
incum_value[i] = payoff[i]/(1 - beta)
end
incum_policy = ones(num_states)
incum_inv_policy = zeros(num_states)
ent_value = zeros(num_states)
ent_policy = zeros(num_states)
ent_inv_policy = zeros(num_states)
W = zeros((num_states, 2))
In [ ]: