Equilibrium Computation in Dynamic Game 1

Many methods have been suggested for this problem, and I focus on the most popular algorithm of them.

Note that we only consider the computation methods for dynamic game in discrete time model.

Pakes and McGuire (1994) algorithm, which we call PM94, is the algortihm for exact computing equilibrium in the dynamic oligopoly model, which is developed in Ericson and Pakes (1995), which we call EP95, and many researchers use this in both empirical and theoretical works.

But PM94 lacks the consideration about multiple equilibria, which is inevitable in most of game theoretic models, and it is computationally burdensome when the state space of the model has a lot of states. Furthermore the convergence of this algorithm is not guaranteed.

Then more sophisticated methos have been developed in order to these problems, and we introduce these in the other lectures.

In the following, first we descrive the dynamic oliigopoly model of EP95 in order to clarify the problem. And then we explain the algorithm with the Julia code.

EP95's model

In this section I explain the model in consideretion.

EP95 modeled the dynamic oligopoly, in which there are incumbent firms and potential entrant firms for a market. In the begging of each period the incumbent firms decide whether to exit or not by comparing the random scrap value with the continuation value of its own, and the potential entrant firms choose if they entry the market by considering the random setup costs and the continuation value. After their decisions the incumbents compete in the static market of differentiated products. And then they carry out investment for the sake of improving the future state. In the last their decisions about entry/exit are implemented before the next period.

Note that the model we descrive here is a little different from the original one because these alterations offer the condition for equilibrium existence and decrease the computational burden of the algorithms. See Doraszelski and Satterthwaite (2010) and Doraszelski and Pakes (2007) about this point.

In the following we set up the notations of the model for each part.

Incumbent firms

Denote each incumbent firm by $i$. And the number of incumbents is set to $n$.

level of investment : $x_i$
scrap value : $\phi_i \sim \text{i.i.d.}\ F_{\phi}$ where $F_{\phi}$ is the known distribution function.
probability of remaining : $r_i$

We assume that the scrap value is private information, then the decisions about exit of the other firms are stochastic from the perspective of a firm. Thus we can set the probability of remaining of each firm. This is true for the setup costs of the potential entrants.

Potential entrants

Denote each potential entrant by $i$.

number of entrants : $\epsilon$
initial investment : $x_i^e$
setup cost : $\psi_i \sim \text{i.i.d.}\ F_{\psi}$ where $F_{\psi}$ is the known distribution function.
entry probability : $r_i^e$

State

state space : $\Omega = \left\{ 1,2,3, \dots, \bar{\omega} \right\}$ where $\bar{\omega}$ is the upper boud for the state.
industry structure : $\omega = \left( \omega_1, \omega_2, \dots, \omega_n \right)$ for each $\omega_i \in \Omega$ where $\omega_i$ is the state of the firm $i$.

Then the set of possible industry structures is

$$S = \left\{ \left( \omega_1, \omega_2, \dots, \omega_n \right): \omega_i \in \Omega, n < \bar{n} \right\}$$

where $\bar{n}$ is the maximum number of incumbents. And denote the states other than firm $i$ as $\omega_{-i}$

We can reduce the state space by using the symmetry and anonymity assumptions, which are descrived in detial in Doraszelski and Pakes (2007) and in the ch6 of Doraszelski and Satterthwaite (2010). In the end we have the the below compact state space.

$$S^{o} = \left\{ (\omega_i, s): \omega_i \in \Omega, s = (s_1, s_2, \dots, s_{\bar{\omega}}), s_{\omega}\in Z^{+}, \sum_{\omega \in \Omega} s_{\omega} \leq \bar{n} \right\}$$

where $s_{\omega}$ denotes the number of incumbents whose state is $\omega$.

In other words, the state space of this model can be characterized by a particular firm's state and the vector of the number of firms for each state in the symmetry and anonymity assumptions.

And the two assumptions make every functions in this model not specific to each firm.

Product market competition

$\omega_i$ represents the quality of the products of firm $i$ in the market. And the mean utility of the product $i$ is given by $g(\omega_i)$. Then the utility for consumer $k$ from buying product $i$ is as follows.

$$u_{i,k} = g(\omega_i) - p_i + \epsilon_{i,k}$$

where $p_i$ is the price the firm can choose optimally and $\epsilon_{i,k}$ is taste difference among consumers. And if $(\epsilon_{0,k}, \dots, \epsilon_{n,k})$ identically independently follow extreme value distribution across products and individuals, the demand for each product ($q_i$)can be expressed as below.

$$q_i(p_1, \dots, p_n:\omega) = M\frac{\exp(g(\omega_i)-p_i)}{1 + \sum_j \exp(g(\omega_j) - p_j)}$$

where $M$ is the number of consumers.

Then by using the above inverse demand function the optimal price for firm $i$ solves the below maximization problem.

$$\max_{p_i} q_i(p_1, \dots, p_n:\omega) (p_i -c)$$

So we can solve the system constructed by the first order condition of each firm. Denote the optimal price by $p_i^{*}$.

Thus the profits for firm $i$ in the current state can be written as

$$\pi(\omega_i, \omega_{-i}) = q_i(p_1^{*}(\omega), \dots, p_n^{*}(\omega) : \omega) (p_i^{*}(\omega) - c)$$

Investment and state transition

Denote the next period state by $\omega_i^{'}$, the realization of investment by $\nu_i$, and the industry wide shock by $\eta$, then the transition of state can be descrived as follows.

$$\omega_i^{'} = \omega_i + \nu_i - \eta$$

In this model we restrict $\nu_i \in \left\{ 0,1\right\}$, whose the probability is

$$Pr(\nu | x_i) = \begin{cases} \frac{\alpha x_i}{1+\alpha x_i}\ \text{if}\ \nu = 1 \\[10pt] \frac{1}{1+\alpha x_i}\ \text{if}\ \nu = 0 \end{cases}$$

And $\eta \in \left\{ 0,1\right\}$, where $\eta = 1$ with the probability $\delta$.

In short, the investment increase the probability of moving to the improved state while it does not always increase the state.

The potential entrants who enter the market experience $\omega_i^{'} = \omega^e + \nu_i - \eta$, where $\omega^e$ is the initial state for the entrants defined exogenously.

Incumbent's problem

At the beginning of each period, when incumbents draw random scrap values, firm $i$ has a value function which satisfies the below Bellman equation, where $\beta$ is the discount factor.

$$ V(\omega_i, \omega_{-i}, \phi) = \pi(\omega_i, \omega_{-i}) + \max \left\{ \phi, \max_{x_i} \beta E\left[ V(\omega_i^{'}, \omega_{-i}^{'}, \phi^{'})\ |\ \omega_i, \omega_{-i}, x_i\right] -x_i\right\} $$

Note that the expectation operator is over the probability distribution of possible next period values.

Now we define the function $W(\nu | \omega_i, \omega_{-i})$ as the expected value of the firm conditional on the outcome of its investment. This means

$$W(\nu\ |\ \omega_i, \omega_{-i}) \equiv \sum_{\omega_{-i}^{'}, \eta} \int_{\phi} V(\omega_i + \nu - \eta, \omega_{-i}^{'}, \phi^{'}) \ \mathrm{d}F_{\phi}(\phi^{'})\ q(\omega_{-i}^{'} \ |\ \omega_i, \omega_{-i}, \eta)\ p(\eta)$$

By using this we can transform the expectation term in the Bellman equation as follows

$$E\left[ V(\omega_i^{'}, \omega_{-i}^{'}, \phi^{'})\ |\ \omega_i, \omega_{-i}, x_i\right] = \sum_{\nu} W(\nu\ |\ \omega_i, \omega_{-i}) p(\nu\ |\ x_i)$$

Then the Bellman equation is written as below

$$V(\omega_i, \omega_{-i}, \phi) = \pi(\omega_i, \omega_{-i}) + \max \left\{ \phi, \max_{x_i} \beta \sum_{\nu} W(\nu\ |\ \omega_i, \omega_{-i}) p(\nu\ |\ x_i) - x_i\right\}$$

Now if we know about $W$, we determine the optimal investment by Kuhn - Tucker condition of the above,

$$x_i\left( \beta \sum_{\nu} W(\nu\ |\ \omega_i, \omega_{-i}) \frac{\partial p(\nu\ |\ x_i)}{\partial x_i} - 1 \right) = 0$$

In this setting we solve the above as follows

$$x(\omega_i, \omega_{-i}) = \max \left\{ 0, \frac{-1 + \sqrt{\beta\alpha (W(1 | \omega_i, \omega_{-i}) - W(0 | \omega_i, \omega_{-i}))}}{\alpha} \right\}$$

Given this optimal investment policy, we next consdier the problem of whether to exit or remain. Because the firms who face this problem must experience the current competition in the market and get the benefit $\pi(\omega_i. \omega_{-i})$, the decision of exiting does not involve the payoff and instead consider only about the scrap value and the continuation value.

Let $\chi(\omega_i, \omega_{-i}, \phi)$ be the indicator function which takes $1$ only if the firm continues. Now this satisfies the below

$$\newcommand{\argmax}{\mathop{\rm arg~max}\limits} \chi(\omega_i, \omega_{-i}, \phi) = \argmax_{\chi \in \left\{0,1\right\}}\ (1 - \chi) \phi + \chi \left( \beta \sum_{\nu} W(\nu\ |\ \omega_i, \omega_{-i}) p(\nu\ |\ x(\omega_i, \omega_{-i})) - x(\omega_i, \omega_{-i})\right)$$

We know the distribution of $\phi$, then the probability of $\chi(\omega_i, \omega_{-i}, \phi) = 1$, i.e. the probability that the optimal policy is remainig, denoted as $r(\omega_i, \omega_{-i})$, can be obtained as below.

$$r(\omega_i, \omega_{-i}) = F_{\phi} \left( \beta \sum_{\nu} W(\nu\ |\ \omega_i, \omega_{-i}) p(\nu\ |\ x(\omega_i, \omega_{-i})) - x(\omega_i, \omega_{-i}) \right)$$

Entrant's problem

A potential entrants who chooses enter pays a setup cost drawn from $F_{\psi}$ and becomes an incumbent in the next period. While the preparing period, it can invest $x_i^e$. The problem for entrants is analogous to the one of incumbents, so we omit the detail and show the important equations.

Note that the competitors' state for the potential entrants is $\omega$.

The value function is as follows, where the function $W^e$ is defined analogously in the incumbent's problem.

$$V^e(\omega, \psi) = \max \left\{ 0, max_{x_i^e} -\psi - x_i^e + \beta \sum_{\nu} W^e(\nu\ |\ \omega) p(\nu\ |\ x_i^e)\right\}$$

Let $x_i^e(\omega)$ be the optimal initial investment policy calculated by the same logic. Given this, we can construct the optimal entry policy as follows.

$$\chi^e(\omega, \psi) = \argmax_{\chi \in \left\{ 0,1\right\}} \chi \left( -\psi - x^e(\omega) + \beta \sum_{\nu} W^e(\nu\ |\ \omega) p(\nu\ |\ x^e(\omega)) \right)$$

Then the entry probability is

$$r^e(\omega) = F_{\psi}\left( - x^e(\omega) + \beta \sum_{\nu} W^e(\nu\ |\ \omega) p(\nu\ |\ x^e(\omega))\right)$$

What we compute

First we have to define the action in this model, where firms decide both exit or entry and the investment.

Let $u(\omega, \phi) = (\chi(\omega, \phi), x(\omega))$ be the policy function for the incumbents and the same is defined for the potential entrants analogously.

The equilibrium we try to compute is a symmetric Markov Perfect Equilirium (MPE), which involves the pair of value function and policy function s.t.

Given the value function, the policy function satisfy the optimal conditions for both entry / exit policy and investment policy.
Given the policy function, the value function satisfy the Bellman equation.

The above condition is also applied to the entrants side.

Thus we compute the value function and the policy function which satisfy the above condition by using the following algorithm.

PM94's exact algorithm

This algorithm uses the expected value function, which is the function of the state whose value is the expectation of the value function over the distribution of the scrap value for the incumbents and setup cost for the potential entrants.

Thus the expected value function is defined

$$V(\omega_i, \omega_{-i}) \equiv \int_{\phi} V(\omega_i, \omega_{-i}, \phi)\ \mathrm{d}F_{\phi}(\phi)$$

Then we can write the Bellman equation, by using the optimal policies, in the below form

$$V(\omega_i, \omega_{-i}) = \pi(\omega_i, \omega_{-i}) + (1-r(\omega_i, \omega_{-i}))\ \phi(\omega, \omega_{-i}) + r(\omega_i, \omega_{-i}) \left\{ \beta \sum_{\nu} W(\nu\ |\ \omega_i, \omega_{-i})\ p(\nu\ |\ x(\omega_i, \omega_{i}))- x(\omega_i, \omega_{-i}) \right\}$$

And now the policy is the pair of the probability of entry and the investment amount, $(r(\omega_i, \omega_{-i}), x(\omega_i, \omega_{-i}))$.

Note that $\phi(\omega_i, \omega_{-i})$ is the expectation of the scrap value conditional on exiting in state $(\omega_i, \omega_{-i})$. This causes endogeneity.

$$\phi(\omega_i, \omega_{-i}) = E_{\phi}\left[ \phi\ |\ \chi(\omega_i, \omega_{-i}, \phi)=0 \right] = \int_{\phi > F_{\phi}^{-1}(r(\omega_i, \omega_{-i}))} \ \phi \ \mathrm{d}F_{\phi}(\phi)$$

PM94 proceeds as follows.

Set the initial values for policy function and value function, i.e. fulfill the value of $r^0, x^o, V^0$ for each state.
In $l$ iteration, implement the below sequence $(0\leq l)$ for every state.
1. Calculate $W^{l+1}$ from $r^l$, $x^{l}$, $V^l$ and the entrants counterparts of them.
2. Improve $x^l$ by using $W^{l+1}$ and denote $x^{l+1}$.
3. Improve $r^l$ and $V^l$ by using $W^{l+1}$ and $x^{l+1}$ and denote $r^{l+1}$ and $V^{l+1}$ respectively.
4. The above three computation is applied to the entrants' problem analogously.
5. Check how much value function improves
If the improvement satisfies the pre determined criterion, we end the iteration. If not, repeat the sequence of 2.

Note that this algorithm improves the functions for every state in one iteration. So this is called a "synchronous" algorithm.

We explain each part of iteration in the following.

Part A

Recall that the definition of $W$. In order to calculate this we have to know the transition probability of competitors states, i.e. $q(\omega_{-i}^{'}\ |\ \omega_i, \omega_{-i}, \eta)$. This is constructed by the state transition probabilities of each incumbent firm and the entry probability of $\epsilon^{'}$ firms, which are calculated as follows respecively.

$$p^{l}(\omega_i^{'}\ |\ \omega_i, \omega_{-i}, \eta) = \begin{cases} 1 - r^l(\omega_i, \omega_{-i})\ &\text{if}\ \omega_i^{'} = \phi \\[10pt] r^l(\omega_i, \omega_{-i}) P(\nu\ |\ x^l(\omega_i, \omega_{-i}))\ &\text{if}\ \omega_i^{'} = \omega_i + 1 - \eta\\[10pt] r^l(\omega_i, \omega_{-i}) (1-P(\nu\ |\ x^l(\omega_i, \omega_{-i})))\ &\text{if}\ \omega_i^{'} = \omega_i - \eta \end{cases}$$

where state $\phi$ means the firm is a potential entrant.

And

$$r^{e,l}(\epsilon^{'}, \omega) = \dbinom{\epsilon}{\epsilon^{'}} \left[ r^{e,l}(\omega)\right]^{\epsilon^{'}} \ \left[ 1-r^{e,l}(\omega)\right]^{\epsilon- \epsilon^{'}}$$

Then the competitors' state transition probability is written in below form. Note that $\omega_{j}^{'}$ is the components of $\omega_{-i}^{'}$

$$q^{l+1}(\omega_{-i}^{'}\ |\ \omega_i, \omega_{-i}, \eta) = \Pi_{j\neq i}\ p^l(\omega_j^{'}\ |\ \omega_j, \omega_{-j}, \eta) \ \Pi_{j=1}^{\epsilon^{'}}P(\nu_j\ |\ x^{e,l}(\omega))\ r^{e,l}(\epsilon^{'}, \omega)$$

Now by the difinition of $W$, we get the below improvement equation.

$$W^{l+1}(\nu\ |\ \omega_i, \omega_{-i}) = \sum_{\omega_{-i}^{'}\ ,\ \eta} V^l(\omega_i + \nu - \eta, \omega_{-i}^{'})\ q^{l+1}(\omega_{-i}^{'}\ |\ \omega_i, \omega_{-i}, \eta)\ p(\eta)$$

Part B

By Kuhn - Tucker condition, we have obtained the optimal investment policy as descrived above and therefore we use it as an improvement equation.

$$x^{l+1}(\omega_i, \omega_{-i}) = \max \left\{ 0, \frac{-1 + \sqrt{\beta\alpha (W^{l+1}(1 | \omega_i, \omega_{-i}) - W^{l+1}(0 | \omega_i, \omega_{-i}))}}{\alpha} \right\}$$

Part C

The remain probability has been already expressed by the distribution function of scrap values. We get the improvement through this equation.

$$r^{l+1}(\omega_i, \omega_{-i}) = F_{\phi} \left( \beta \sum_{\nu} W^{l+1}(\nu\ |\ \omega_i, \omega_{-i}) p(\nu\ |\ x^{l+1}(\omega_i, \omega_{-i})) - x^{l+1}(\omega_i, \omega_{-i}) \right)$$

Because the expected value function satisfies the above equation, so we can update the expected value function as follows.

$$V^{l+1}(\omega_i, \omega_{-i}) = \pi(\omega_i, \omega_{-i}) + (1-r^{l+1}(\omega_i, \omega_{-i}))\ \phi(\omega, \omega_{-i}) + r^{l+1}(\omega_i, \omega_{-i}) \left\{ \beta \sum_{\nu} W^{l+1}(\nu\ |\ \omega_i, \omega_{-i})\ p(\nu\ |\ x^{l+1}(\omega_i, \omega_{i}))- x^{l+1}(\omega_i, \omega_{-i}) \right\}$$

Part D

Part A ~ C can be applied to the entrants problem without difficulty and we get the improvement about the functions for the entrants as above.

Part E

Stopping time problem always matters. You see this issue in detail in Pakes and McGuire (1994) or Doraszelski and Judd (2012).

Here we use the below norm in the criterion

$$\left\| \frac{V^l - V^{l-1}}{1 + |V^l|} \right\| = \max_{\omega \in S^o}\ \left|\frac{V^l(\omega) - V^{l-1}(\omega)}{1 + |V^l(\omega)|}\right|$$

And we set the predetermined stopping criterion in the norm where we stop the algorithm if the norm between the current value function and the next one is smaller than it.

Code

We finish explaing the algorithm, so go on to see the Julia code for this.

Most of the default parameter value is the same as in Pakes and McGuire (1994), and the distribution of scrap value and setup cost is set to exponential distribution with parameters $\lambda_1, \lambda_2$ for each.

The conditional expectation with parameter valaue $\lambda$ used in this model is calculated as below.

$$E_{\phi}\left[ \phi\ |\ \chi(\omega_i, \omega_{-i}, \phi)=0 \right] = \int_{\phi > F_{\phi}^{-1}(r(\omega_i, \omega_{-i}))} \ \phi \ \mathrm{d}F_{\phi}(\phi) =\frac{1}{\lambda\ \exp(-\lambda \ F_{\phi}^{-1}(r(\omega_i, \omega_{-i})))} = \frac{1}{\lambda\ (1-r(\omega_i, \omega_{-i}))}$$

And the quality specific mean utility function is spcified as follows.

$$g(\omega_i) = \begin{cases} 3\omega_i - 4 \ &\text{if}\ \omega_i \leq 5 \\[10pt] 12 + {\rm log}\ (2 - \exp(16 - 3\omega_i))\ &\text{if}\ \omega_i > 5 \end{cases}$$



In [1]:

    
# PM94 algorithm
# Default parameter values are as below
# N = 4, M = 5, c = 5, alpha = 3, delta = 0.7, beta = 0.925, max_state = 18, omega_e = 3, lambda_1 = 3, lambda_2 = 2
# Initial guess about the value function for the incumbent is set to pi/(1-beta), where the pi denote the payoff specific to the state.
# Other initial guesses are set to 0 for all state.

using Combinatorics
using Iterators
using StatsBase
using NLsolve

max_state = 18 # state = 1 means the firm is a potential entrant
N = 4
c = 5
M = 5

# First we caculate the payoff function in the static market

# Define the mean utility function
function mean_utility(state)
    if state <= 5
        return 3*state - 4
    else
        return 12 + log(2-exp(16-3*state))
    end
end
mean_utils = [mean_utility(state) for state in 2:max_state]

# Define the optimal pricing function
function quality_ladder_price(state)
    if state_space[state, 1] == 1.0
        return 0
    
    else
        n = sum(state_space[state,:] .!= 1.0)
        incumbents_state = state_space[state, state_space[state,:] .!= 1.0]

        function f!(x, fvec)
            summation = sum([exp(mean_utils[convert(Int64, k-1)] - x[j]) for (j, k) in enumerate(incumbents_state)])
            for i in 1:n
                fvec[i] = x[i] - (1 + summation)/(1 + summation - exp(mean_utils[convert(Int64, incumbents_state[i]-1)] - x[i])) - c
            end
        end

        return nlsolve(f!, ones(n))
    end
end

# Make state space
num_dist = binomial(N - 1 + max_state -1, N-1)
num_states = max_state *num_dist
distributions = collect(with_replacement_combinations(collect(1:max_state), N-1))
state_space = zeros((num_states, N))
for i in 1:max_state
    state_space[(i-1)*num_dist+1:i*num_dist, 1] = zeros(num_dist) + i
    
    for j in 1:num_dist
        state_space[(i-1)*num_dist+j, 2:N] = distributions[j]
    end
end

# Calculate the optimal payoff from the market
payoff = ones(num_states)
for state in 1:num_states
    if quality_ladder_price(state) == 0
        payoff[state] = 0
    
    else
        incumbents_state = state_space[state, state_space[state,:] .!= 1.0]
        optimal_prices = quality_ladder_price(state).zero
        summation = sum([exp(mean_utils[convert(Int64, k-1)] - optimal_prices[j]) for (j, k) in enumerate(incumbents_state)])
        demand = exp(mean_utils[convert(Int64, incumbents_state[1] - 1)] - optimal_prices[1])/(1 + summation)
        profit = M * demand * (optimal_prices[1] - c)
        payoff[state] = profit
    end
end









    



INFO: Precompiling module Combinatorics.
INFO: Recompiling stale cache file /Users/susu/.julia/lib/v0.5/StatsBase.ji for module StatsBase.
INFO: Recompiling stale cache file /Users/susu/.julia/lib/v0.5/NaNMath.ji for module NaNMath.
INFO: Recompiling stale cache file /Users/susu/.julia/lib/v0.5/NLsolve.ji for module NLsolve.



In [ ]:

    
# PM94's algorithm
beta = 0.925
delta = 0.7
alpha = 3
omega_e = 3

# Define the success probability of investment
function success(alpha, state, inv_policy)
    inv = inv_policy[state]
    return (alpha * inv)/(1 + alpha * inv)
end

# State transition probability of incumbents
# Return is the array whose elements are the tansition probabilities for the possible next period state
# etaで条件付けしてるから3要素
function incum_state_trans(state, r, eta)
    trans_prob = ones(3)
    own_state = state_space[state, 1]
    success_prob = success(alpha, state, incum_inv_policy)
    trans_prob[1] = 1- r[state]
    if eta == 1
        # eta = 1の時は、2つ目の要素が自分より一つ下のstateに映る確率、3つ目が自分に留まり続ける確率
        # この中で自分が最低のquality、つまり2の時とそれ以外の時で場合分け
        if own_state = 2.0
            trans_prob[2] = 0
            trans_prob[3] = r[state]
        else
            trans_prob[2] = r[state] * (1- success_prob)
            trans_prob[3] = r[state] * success_prob
        end
    else
        # eta = 0の時は、2つ目の要素が今のstateに留まる確率、3つ目が自分より一つ上のstateに移る確率
        # この中で自分が最低のmax_stateの時とそれ以外の時で場合分け
        if own_state = 18.0
            trans_prob[2] = r[state]
            trans_prob[3] = 0
        else
            trans_prob[2] = r[state] * (1- success_prob)
            trans_prob[3] = r[state] * success_prob
        end
    end
    return trans_prob
end

# State transition probability of entrants
function ent_state_trans(state, r_e, eta)
    trans_prob = ones(3)
    success_prob = success(alpha, state, ent_inv_policy)
    trans_prob[1] = 1- r_e[state]
    trans_prob[2] = r_e[state] * (1 - success_prob)
    trans_prob[3] = r_e[state] * success_prob
    return trans_prob
end

# Flatten function
function flatten(x, y)
    state = start(x)
    if state==false
        push!(y, x)
    else
        while !done(x, state) 
          (item, state) = next(x, state) 
          flatten(item, y)
        end 
    end
    y
end
flatten(x)=flatten(x,Array(Any, 0))

# Dist_index
function d_index(dist)
    dist_index = 0
    for k in 1:N-1
        dist_index += (dist[k] - 1) * (max_state^(N-1-k))
    end
    return dist_index
end

# Calculate the transition probability to each distribution
function dist_trans(comp_trans, destination)
    b_1 = product(comp_trans[N-2, :], comp_trans[N-1, :])
    b_2 = product(destination[N-2, :], destination[N-1, :])
    for i in 2:N-2
        b_1 = product(comp_trans[N-1-i, :], b_1)
        b_2 = product(destination[N-1-i, :], b_2)
    end

    combination_comp = ones(((N-1)^(N-1), N-1))
    combination_dest = ones(((N-1)^(N-1), N-1))
    
    for (j,p) in enumerate(b_1)
        combination_comp[j, :] = flatten(p)
    end
    
    for (j,p) in enumerate(b_2)
        combination_dest[j, :] = sort(flatten(p))
    end
    
    # distをdist indexに変換
    ind = ones(((N-1)^(N-1), 1))
    for k in 1:(N-1)^(N-1)
        ind[k, 1] = d_index(combination_dest[k, :])
    end
    
    probs = prod(combination_comp, 2)
    temp = vcat(ind, probs)
    dist_prob = sortrows(temp, by=x->x[1])
    dist_prob = vcat(dist_prob, cumsum(dist_prob[:, 2]))
    # 各要素の確率の和を出す。Statsbaseのcountmapを使用
    num = countmap(dist_prob[:, 1])
    counter = 0
    i = 1
    a = ones((counts(dist_prob, min(dist_prob[:, 1]):max(dist_prob[:, 1])), 2))
    for w in distinct(dist_prob[:, 1])
        counter += num[w]
        a[i, 1] = w
        a[i, 2] = dist_prob[counter, 2]
        i += 1
    end
    return a 
end

# Competitors' state transition
function comp_trans(r, r_e)
    # 辞書を使う
    q = Dict{Int64, Dict}()
    for i in 1:num_states
        # この姉弟がいいのかはちょっとわからない
        q[i] = Dict{Int64, AbstractArray}()
    end
    comp_trans = zeros((N-1, 3))
    # 遷移先のstateを入れる
    destinations = zeros((N-1, 3))
    for i in 1:num_states
        for eta in 0:1
            for j in 2:N

                if j == 2
                    comp_state = sort([state_space[i, 1], state_space[i, j+1:]])
                else
                    comp_state = sort([state_space[i, 1], state_space[i, 2:j-1], state_space[i, j+1:]])
                end
                
                # stateの値からindexを作る
                dist_index = 0
                for k in 1:N-1
                    dist_index += (comp_state[k] - 1) * (max_state^(N-1-k))
                end
                state_index = (state_space[i, j] - 1) * (max_state^(N - 1)) + dist_index

                if state_space[i, j] == 1
                    own_trans = ent_state_trans(state_index, r_e, eta)
                    if eta == 1
                        dest = [1, omega_e-1,omega_e]
                    elseif eta == 0
                        dest = [1, omega_e, omega_e+1]
                    end
                        
                else
                    own_trans = incum_state_trans(state_index, r, eta)
                    if eta == 1
                        dest = [1, state_space[i,j]-1,state_space[i,j]]
                    elseif eta == 0
                        dest = [1, state_space[i,j], state_space[i,j]+1]
                    end
                    
                end
                comp_trans[j-1, :] = own_trans
                destination[j-1, :] = dest
            end 
            # dist_trans functionを使う（上で定義）
            a = dist_trans(comp_trans, destination)
            q[i][eta] = a
        end
    end
    return q
end

# Improving W function
function imp_w(r, r_e, incum_value, delta)
    q = comp_trans(r, r_e)
    
    
end

# Set initial value
incum_value = ones(num_states)
for i in 1:num_states
    incum_value[i] = payoff[i]/(1 - beta)
end
incum_policy = ones(num_states)
incum_inv_policy = zeros(num_states)
ent_value = zeros(num_states)
ent_policy = zeros(num_states)
ent_inv_policy = zeros(num_states)
W = zeros((num_states, 2))



In [ ]: