Prisoner's Dilemma Game: Experiment 3

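Overview of the experiments: README.md

Experiment 1: perfect monitoring
Experiment 2: imperfect public monitoring
Experiment 3: imperfect private monitoring (Oyama seminar strategies)
Experiment 4: imperfect private monitoring (Kandori seminar strategies)
Experiment 5: imperfect private monitoring (Kandori and Oyama seminar strategies)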

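Payoff table

<table align="center", style="text-align:center;">
<tr><th>Own action \ Opponent's action</th><th>Action 0 (active)</th><th>Action 1 (inactive)</th></tr>
<tr><th>Action 0 (active)</th><td>4, 4</td><td>0, 5</td></tr>
<tr><th>Action 1 (inactive)</th><td>5, 0</td><td>2, 2</td></tr>
</table>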
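The payoff table can be read as a lookup indexed by (own action, opponent's action). The following is a minimal sketch, assuming that action 0 (active) plays the role of cooperation and action 1 (inactive) the role of defection; the helper names `stage_payoffs` and `is_prisoners_dilemma` are illustrative and do not come from the `play` module.

```python
# Payoff matrix from the table: rows = own action, columns = opponent's action.
# Assumption: action 0 (active) = cooperate, action 1 (inactive) = defect.
PAYOFF = [[4, 0],
          [5, 2]]


def stage_payoffs(own_action, opponent_action):
    """Return (own payoff, opponent's payoff) for one stage game."""
    return (PAYOFF[own_action][opponent_action],
            PAYOFF[opponent_action][own_action])


def is_prisoners_dilemma(payoff):
    """Check T > R > P > S and 2R > T + S for a symmetric 2x2 game."""
    reward, sucker = payoff[0]
    temptation, punishment = payoff[1]
    return (temptation > reward > punishment > sucker
            and 2 * reward > temptation + sucker)


print(stage_payoffs(0, 1))           # (0, 5)
print(is_prisoners_dilemma(PAYOFF))  # True
```

With these numbers, defecting is the dominant stage-game action (5 > 4 and 2 > 0), while mutual cooperation gives each player more than mutual defection (4 > 2).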



In [1]:

    
#-*- encoding: utf-8 -*-
%matplotlib inline
import matplotlib as mpl
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import numpy as np
import pandas as pd
import scipy.stats as stats
pd.set_option('display.precision', 4)
import sys
sys.path.append('./user_strategies')
# Use a font that supports Japanese text
mpl.rcParams['font.family'] = 'Osaka'
import play as pl
from Iida_perfect_monitoring import Iida_pm
from Iida_imperfect_public import Iida_ipm
from Iida_imperfect_private import Iida_iprm
from kato import KatoStrategy
from ikegami_perfect import Self_Centered_perfect
from ikegami_imperfect_public import Self_Centered_public
from ikegami_imperfect_private import Self_Centered_private
from mhanami_Public_Strategy import PubStrategy
from mhanami_Imperfect_Public_Strategy import ImPubStrategy
from mhanami_Imperfect_Private_Strategy import ImPrivStrategy
from tsuyoshi import GrimTrigger
from gistfile1 import MyStrategy
from beeleb_Strategy import beeleb
from oyama import OyamaPerfectMonitoring, OyamaImperfectPublicMonitoring, OyamaImperfectPrivateMonitoring
from ogawa import ogawa
from yamagishi_impd import yamagishi
from kandori import *

Test

Tests for each strategy



In [2]:

    
import unittest

class TestStrategies(unittest.TestCase):
    def setUp(self):
        self.Strategies = [Iida_pm, Iida_ipm, Iida_iprm, KatoStrategy, Self_Centered_perfect, \
                          Self_Centered_public, Self_Centered_private, PubStrategy, ImPubStrategy, ImPrivStrategy, \
                          MyStrategy, beeleb, OyamaPerfectMonitoring, \
                           OyamaImperfectPublicMonitoring, OyamaImperfectPrivateMonitoring, \
                          ogawa, yamagishi, GrimTrigger, Strategy1, Strategy2, Strategy3, Strategy4, Strategy5,
                    Strategy6, Strategy7, Strategy8, Strategy9, Strategy10,
                    Strategy11, Strategy12, Strategy13, Strategy14, Strategy15,
                    Strategy16, Strategy17, Strategy18, Strategy19, Strategy20, 
                    Strategy21, Strategy22, Strategy23, Strategy24, ] # add your own strategy classes here
        self.case1 = "Signal is empty(period 1)"
        self.case2 = [0, 1]
        self.case3 = [1, 0]
        self.case4 = [0, 1, 0, 1, 0, 0, 1]
        self.seed = 222
        self.RandomState = np.random.RandomState(self.seed)


    # Test with case1: no signal has been received yet (period 1)
    def test1(self):
        print("testcase:", self.case1)
        for Strategy in self.Strategies:
            rst = Strategy(self.RandomState).play()
            self.assertIsNotNone(rst, Strategy.__module__)
            self.assertIn(rst, (0, 1), Strategy.__module__)

    # Test with the signal sequence case2, checking the action in every period
    def test2(self):
        print("testcase:", self.case2)
        for Strategy in self.Strategies:
            S = Strategy(self.RandomState)
            for signal in self.case2:
                rst = S.play()
                S.get_signal(signal)
                self.assertIsNotNone(rst, Strategy.__module__)
                self.assertIn(rst, (0, 1), Strategy.__module__)

    # Test with the signal sequence case3, checking only the last action played
    def test3(self):
        print("testcase:", self.case3)
        for Strategy in self.Strategies:
            S = Strategy(self.RandomState)
            for signal in self.case3:
                rst = S.play()
                S.get_signal(signal)
            
            self.assertIsNotNone(rst, S.__module__)
            self.assertIn(rst, (0, 1), S.__module__)

    # Test with the signal sequence case4, checking the action in every period
    def test4(self):
        print("testcase:", self.case4)
        for Strategy in self.Strategies:
            S = Strategy(self.RandomState)
            for signal in self.case4:
                rst = S.play()
                S.get_signal(signal)
                self.assertIsNotNone(rst, S.__module__)
                self.assertIn(rst, (0, 1), S.__module__)



In [3]:

    
suite = unittest.TestLoader().loadTestsFromTestCase(TestStrategies)
unittest.TextTestRunner().run(suite)









    



....





    



testcase: Signal is empty(period 1)
testcase: [0, 1]
testcase: [1, 0]
testcase: [0, 1, 0, 1, 0, 0, 1]






    



----------------------------------------------------------------------
Ran 4 tests in 0.005s

OK






    Out[3]:





<unittest.runner.TextTestResult run=4 errors=0 failures=0>

Test: OK

Experiment setup



In [4]:

    
payoff = np.array([[4, 0], [5, 2]])
seed = 282
rs = np.random.RandomState(seed)
discount_v = 0.97
repeat = 1000
ts_length = rs.geometric(p=1-discount_v, size=1000)

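Distribution of the lengths (number of periods) of the 1000 sessions

The probability mass function is $P(x)=0.03(0.97)^{x-1}, x=1, 2, ...$, with mean $1/0.03 \approx 33.33$. The mass function is monotonically decreasing, and the distribution function gives $F(33) = P(X \leq 33) = 1 - 0.97^{33} \approx 0.634$.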
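The figures quoted for the geometric distribution can be checked with a few lines of standard-library Python. This is a sketch for verification only: the notebook itself draws the lengths with `rs.geometric`, and the names `geom_pmf`, `geom_cdf` and `draw_length` are introduced here for illustration.

```python
import random

DELTA = 0.97        # continuation probability (discount factor)
P_STOP = 1 - DELTA  # probability that the session ends after a given period


def geom_pmf(x, p=P_STOP):
    """P(X = x) for a geometric distribution on x = 1, 2, ..."""
    return p * (1 - p) ** (x - 1)


def geom_cdf(x, p=P_STOP):
    """P(X <= x)."""
    return 1 - (1 - p) ** x


def draw_length(rng, p=P_STOP):
    """Draw one session length: keep playing until the stop event occurs."""
    length = 1
    while rng.random() >= p:
        length += 1
    return length


print(round(1 / P_STOP, 2))    # 33.33 (theoretical mean)
print(round(geom_cdf(33), 3))  # 0.634

rng = random.Random(0)  # arbitrary seed, unrelated to the notebook's seed
sample = [draw_length(rng) for _ in range(20000)]
print(sum(sample) / len(sample))  # sample mean, should be close to 33.33
```

Because the distribution is right-skewed, well over half of the sessions are shorter than the mean length.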



In [5]:

    
print("Summary statistics:")
print(pd.DataFrame(ts_length, columns=["ts_length"]).describe())

print("\nSessions shorter than 33.33 periods: {0}%".format(ts_length[ts_length <= 33].size / 10))

fig, ax = plt.subplots(figsize=(20, 5))
plt.title("Distribution of session lengths (1000 sessions)")
# actual histogram
plt.hist(ts_length, bins=np.max(ts_length)-1, color='#4488FF')
# theoretical pmf, scaled to the expected number of sessions out of 1000
x = np.arange(1, np.max(ts_length))
plt.plot(x, stats.geom.pmf(x, 1-discount_v)*1000, linewidth=2, color='green', label="theoretical pmf (mean=33.33)")
mu = np.mean(ts_length)
sigma = np.var(ts_length)
plt.xlabel("ts_length")
plt.ylabel("number of session")
ax.text(35, 30, r'''$\mu$={0:.3f}, $\sigma^2$={1:.3f}'''.format(mu, sigma), ha = 'left', va = 'bottom', size=15)
ax.grid(True)
ax.axvline(x=mu, linewidth=2, color='red', label="average")
plt.legend()
plt.show()









    



Summary statistics:
       ts_length
count   1000.000
mean      32.856
std       33.934
min        1.000
25%        9.000
50%       23.000
75%       45.000
max      287.000

Sessions shorter than 33.33 periods: 63.6%

Trimming the longest 5% and the shortest 5% of the sessions gives:



In [6]:

    
trimmed_ts_length = np.sort(ts_length)[50:950]
print("Summary statistics:")
print(pd.DataFrame(trimmed_ts_length, columns=["trimmed_ts_length"]).describe())

# share of the 900 remaining sessions
print("\nSessions shorter than 33.33 periods: {0:.1f}%".format(100 * np.mean(trimmed_ts_length <= 33)))

mu = np.mean(trimmed_ts_length)
sigma = np.var(trimmed_ts_length)

fig, ax = plt.subplots(figsize=(20, 5))
plt.title("Distribution of session lengths (900 sessions)")

# actual histogram
plt.hist(trimmed_ts_length, bins=np.max(trimmed_ts_length)-1, color='#4488FF')

# theoretical pmf, scaled to the expected number of sessions out of the original 1000
x = np.arange(1, np.max(trimmed_ts_length))
plt.plot(x, stats.geom.pmf(x, 1-discount_v)*1000, linewidth=2, color='green', label="theoretical pmf (mean=33.33)")

plt.xlabel("trimmed_ts_length")
plt.ylabel("number of session")
ax.text(30, 30, r'''$\mu$={0:.3f}, $\sigma^2$={1:.3f}'''.format(mu, sigma), ha = 'left', va = 'bottom', size=14)
ax.grid(True)
ax.axvline(x=mu, linewidth=2, color='red')
plt.legend()
plt.show()



Summary statistics:
       trimmed_ts_length
count            900.000
mean              28.756
std               22.608
min                2.000
25%               10.000
50%               23.000
75%               42.000
max              100.000

Sessions shorter than 33.33 periods: 65.1%

The remaining sessions range from 2 to 100 periods.

Case1: perfect monitoring

Raw results (csv): contest1/data
Strategies: user_strategies
Strategy automata: contest1/automaton1.pdf



In [7]:

    
strategies = [Iida_pm, PubStrategy, KatoStrategy, Self_Centered_perfect,
                       GrimTrigger, MyStrategy, beeleb, OyamaPerfectMonitoring, ogawa, yamagishi]
game = pl.RepeatedMatrixGame(payoff, strategies, signal=None, ts_length=ts_length, repeat=1000)
game.play(mtype="perfect", random_seed=seed, record=False)









    



Start
The object has 10 strategy functions below
--------------------------------------------------
1. Iida_perfect_monitoring.Iida_pm
2. mhanami_Public_Strategy.PubStrategy
3. kato.KatoStrategy
4. ikegami_perfect.Self_Centered_perfect
5. tsuyoshi.GrimTrigger
6. gistfile1.MyStrategy
7. beeleb_Strategy.beeleb
8. oyama.OyamaPerfectMonitoring
9. ogawa.ogawa
10. yamagishi_impd.yamagishi
--------------------------------------------------
Repeats: 1000
Total time series length: 32856

Score table:
Scores averaged with equal weight per session
[[ 0.     3.396  2.514  3.952  3.807  4.146  4.146  3.394  3.588  3.803]
 [ 3.519  0.     2.428  4.     4.     4.     4.     4.     3.315  4.   ]
 [ 2.912  2.234  0.     2.292  3.408  3.973  3.814  2.234  3.641  2.906]
 [ 3.463  4.     2.46   0.     4.     4.     4.     4.     3.459  4.   ]
 [ 3.292  4.     1.893  4.     0.     4.     4.     4.     3.374  4.   ]
 [ 3.415  4.     2.31   4.     4.     0.     4.     4.     3.479  4.   ]
 [ 3.415  4.     2.416  4.     4.     4.     0.     4.     3.534  4.   ]
 [ 3.518  4.     2.428  4.     4.     4.     4.     0.     3.315  4.   ]
 [ 3.257  3.254  2.501  3.904  3.792  3.897  3.815  3.254  0.     3.643]
 [ 3.784  4.     2.69   4.     4.     4.     4.     4.     3.595  0.   ]]

Scores averaged with equal weight per stage game
[[ 0.     2.794  2.198  3.695  3.627  4.285  4.285  2.788  3.107  3.702]
 [ 3.055  0.     2.17   4.     4.     4.     4.     4.     2.82   4.   ]
 [ 2.491  2.037  0.     2.066  3.418  3.595  3.211  2.037  3.044  2.524]
 [ 3.009  4.     2.194  0.     4.     4.     4.     4.     2.926  4.   ]
 [ 2.569  4.     1.402  4.     0.     4.     4.     4.     2.708  4.   ]
 [ 2.859  4.     1.93   4.     4.     0.     4.     4.     2.908  4.   ]
 [ 2.859  4.     2.186  4.     4.     4.     0.     4.     3.072  4.   ]
 [ 3.051  4.     2.17   4.     4.     4.     4.     0.     2.82   4.   ]
 [ 2.805  2.767  2.276  3.618  3.684  3.666  3.418  2.767  0.     3.217]
 [ 3.686  4.     2.418  4.     4.     4.     4.     4.     3.157  0.   ]]

Ranking:
1. "yamagishi_impd.yamagishi" -> session-weighted average: 3.786, stage-game-weighted average: 3.696
2. "ikegami_perfect.Self_Centered_perfect" -> session-weighted average: 3.709, stage-game-weighted average: 3.570
3. "beeleb_Strategy.beeleb" -> session-weighted average: 3.707, stage-game-weighted average: 3.569
4. "mhanami_Public_Strategy.PubStrategy" -> session-weighted average: 3.696, stage-game-weighted average: 3.561
5. "oyama.OyamaPerfectMonitoring" -> session-weighted average: 3.696, stage-game-weighted average: 3.560
6. "gistfile1.MyStrategy" -> session-weighted average: 3.689, stage-game-weighted average: 3.522
7. "Iida_perfect_monitoring.Iida_pm" -> session-weighted average: 3.638, stage-game-weighted average: 3.387
8. "tsuyoshi.GrimTrigger" -> session-weighted average: 3.618, stage-game-weighted average: 3.409
9. "ogawa.ogawa" -> session-weighted average: 3.480, stage-game-weighted average: 3.135
10. "kato.KatoStrategy" -> session-weighted average: 3.046, stage-game-weighted average: 2.714

Summary

Datetime	2015-12-04-18-23-58
Monitoring type	perfect
RandomSeed	282
Repeats	1000
Average ts_length	32.856
Number of strategies	10

Str_numbers	Strategy name	Average(session based)	Rank(session based)	Average(stage based)	Rank(stage based)	Notes
10	yamagishi_impd.yamagishi	3.785542333	1	3.69566526	1	TFT
4	ikegami_perfect.Self_Centered_perfect	3.709163441	2	3.569823202	2	30%
7	beeleb_Strategy.beeleb	3.707132242	3	3.56862944	3
2	mhanami_Public_Strategy.PubStrategy	3.695745756	4	3.560563942	4	TFT'
8	oyama.OyamaPerfectMonitoring	3.695585282	5	3.560097259	5	GT
6	gistfile1.MyStrategy	3.689240046	6	3.521959459	6	TFT'
1	Iida_perfect_monitoring.Iida_pm	3.638486141	7	3.386765144	8
5	tsuyoshi.GrimTrigger	3.617644214	8	3.408783784	7	TFT'
9	ogawa.ogawa	3.479693556	9	3.13531775	9
3	kato.KatoStrategy	3.046090794	10	2.713737386	10

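TFT = Tit for Tat, GT = GrimTrigger, 30% = play D if at least 30% of all past signals are B, otherwise play C
TFT' = a variant of Tit for Tat (e.g. with a probabilistic branching condition or additional states)

Overall, average payoffs are close to 4, i.e. cooperation is achieved to a considerable degree.
TFT (strategy 10) earned the highest average payoff (3.786 per session, against 3.709 for the runner-up), whereas the strategies that always play D at regular intervals (strategies 9 and 3) earned low average payoffs.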
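The strategy types named in the notes can be sketched against the interface exercised by `TestStrategies` (a constructor taking a random state, `play()` returning 0 or 1, and `get_signal(signal)`). The two classes below are hypothetical illustrations, not the participants' submissions; they assume that signal 1 corresponds to "B" and action 1 to "D".

```python
class TitForTat:
    """Play 0 (C) in the first period, then repeat the most recent signal."""

    def __init__(self, random_state=None):
        self.random_state = random_state  # unused; kept to match the interface
        self.last_signal = 0

    def play(self):
        return self.last_signal

    def get_signal(self, signal):
        self.last_signal = signal


class ThresholdStrategy:
    """Play 1 (D) when the share of 1-signals so far is at least `threshold`."""

    def __init__(self, random_state=None, threshold=0.3):
        self.random_state = random_state  # unused; kept to match the interface
        self.threshold = threshold
        self.signals = []

    def play(self):
        if not self.signals:
            return 0  # no information yet: start with C
        bad_share = sum(self.signals) / len(self.signals)
        return 1 if bad_share >= self.threshold else 0

    def get_signal(self, signal):
        self.signals.append(signal)


def run(strategy, signals):
    """Feed a signal sequence and collect the action chosen in each period."""
    actions = []
    for signal in signals:
        actions.append(strategy.play())
        strategy.get_signal(signal)
    return actions


case4 = [0, 1, 0, 1, 0, 0, 1]
print(run(TitForTat(), case4))          # [0, 0, 1, 0, 1, 0, 0]
print(run(ThresholdStrategy(), case4))  # [0, 0, 1, 1, 1, 1, 1]
```

On the same signal sequence, the threshold rule keeps defecting once the share of bad signals has crossed 30%, while Tit for Tat returns to C after each good signal.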

Distribution of session averages by strategy

Box plot. Red line: median; blue box: 25th to 75th percentile.



In [8]:

    
rounds = 1000 * 2
strategies = 10
max_ts = 100

# Load the results
df = pd.read_csv('./contest1/data/set_result.csv', index_col=[0, 1], header=[0, 1])

# rows: 1000*2 sessions for each of the 9 opponents, columns: players (average payoff per session)
average_matrix = np.zeros((rounds*(strategies-1), strategies), dtype=float)

for s in range(1, strategies+1):
    for i, opponent in enumerate(df[str(s)].columns.values):
        average_matrix[i*rounds:(i+1)*rounds, s-1] = df[str(s)][str(opponent)]

averages = np.zeros(strategies, dtype=float)
stds = np.zeros(strategies, dtype=float)
ranking = np.zeros(strategies, dtype=int)
for i in range(strategies):
    averages[i] = average_matrix[:, i].mean()
    stds[i] = average_matrix[:, i].std()
ranking = np.argsort(averages)[::-1]+1

fig, ax = plt.subplots(figsize=(20, 12))
bp = ax.boxplot(average_matrix, 0, '')
plt.grid()
plt.xlabel('Strategy number')
plt.ylabel('Average payoff per session')
ax.set_xlim([0, strategies+0.5])
ax.set_ylim([-0.1, 5.8])
plt.title('Distribution of average session payoffs by strategy')
ax.text(0.4, 5.3, "ranking\nave\nstd", ha = 'center', va = 'center', color="black", size=14)
for i in range(strategies):
    ax.text(i+1, 5.3, "{0:.0f}\n{1:.3f}\n{2:.3f}"
            .format(np.where(ranking == i+1)[0][0]+1, averages[i], stds[i]), ha = 'center', va = 'center', color="black", size=14)

plt.show()

Summary statistics

str number	1	2	3	4	5	6	7	8	9	10
rank	7	4	10	2	8	6	3	5	9	1
count	18000	18000	18000	18000	18000	18000	18000	18000	18000	18000
mean	3.638486	3.695746	3.046091	3.709163	3.617644	3.68924	3.707132	3.695585	3.479694	3.785542
std	0.649774	0.604642	0.874775	0.58606	0.7979	0.634951	0.588227	0.604742	0.672512	0.469952
min	1.801394	2	2.003484	2	0.828571	1.41791	2	2	2	2
25%	3.3	4	2.142857	4	4	4	4	4	2.866667	4
50%	4	4	3	4	4	4	4	4	3.892857	4
75%	4	4	3.931034	4	4	4	4	4	4	4
max	4.466899	4	4.5	4	4	4	4	4	4.4	4

The top-ranked strategy, TFT, has the smallest standard deviation. Overall, the smaller the variance, the higher the rank.
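The relationship between dispersion and rank can be quantified with a Spearman rank correlation computed from the `rank` and `std` rows of the table above. This is a small sketch in plain Python; the helper names are illustrative, and the formula used assumes there are no ties, which holds for these ten values.

```python
def ranks(values):
    """Rank values in ascending order (1 = smallest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def spearman(x, y):
    """Spearman rank correlation for samples without ties."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


# Values for strategies 1 to 10, copied from the table above.
payoff_rank = [7, 4, 10, 2, 8, 6, 3, 5, 9, 1]
std = [0.649774, 0.604642, 0.874775, 0.58606, 0.7979,
       0.634951, 0.588227, 0.604742, 0.672512, 0.469952]

print(round(spearman(std, payoff_rank), 3))  # 0.988
```

Only strategies 5 and 9 swap places between the two orderings, which is why the coefficient is so close to 1.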

Average payoff by session length



In [9]:

    
rounds = 1000 * 2
strategies = 10
max_ts = 100

# Load the results
df = pd.read_csv('./contest1/data/set_result.csv', index_col=[0, 1], header=[0, 1])

# Sort by ts_length in ascending order (ordered_df is not used below)
ordered_df = df.sort_index(level="ts_length")

# rows: players, columns: average payoff for ts_length = 1 to 100
average_matrix = np.zeros((strategies, max_ts), dtype=float)

for t in range(1, max_ts+1):
    df_t = df.iloc[df.index.get_level_values('ts_length') == t]
    for s in range(1, strategies+1):
        average = df_t[str(s)].mean().mean()
        average_matrix[s-1, t-1] = average

fig, ax = plt.subplots(figsize=(20, 10))
plt.title("average payoff trend")
plt.xlabel("ts_length")
plt.ylabel("average payoff")
t_list = [i for i in range(1, max_ts+1)]

for s in range(1, strategies+1):
    if s in [10, 8, 4]:
        pass
    else:
        average_list = average_matrix[s-1]
        plt.plot(t_list, average_list, color='#bbbbbb')

plt.plot(t_list, average_matrix[10-1], color='red', linewidth=2, label="10 (TFT)")
plt.plot(t_list, average_matrix[8-1], color='blue', linewidth=2, label="8 (GrimTrigger)")
plt.plot(t_list, average_matrix[4-1], color='green', linewidth=2, label="4 (30%)")
plt.legend()
plt.show()

In long sessions, TFT sustains cooperation better than the other strategies.
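Session length also explains why the score tables report two different averages. Reading the labels literally, the session-weighted score averages the per-session mean payoffs, while the stage-game-weighted score divides the total payoff by the total number of periods, so long sessions count for more. The sketch below illustrates the difference with made-up payoff sequences; the function names are hypothetical and this is not the implementation used in `play`.

```python
def session_weighted_mean(sessions):
    """Average of per-session mean payoffs (each session has weight 1)."""
    per_session = [sum(s) / len(s) for s in sessions]
    return sum(per_session) / len(per_session)


def stage_weighted_mean(sessions):
    """Total payoff divided by total periods (each stage game has weight 1)."""
    total_payoff = sum(sum(s) for s in sessions)
    total_periods = sum(len(s) for s in sessions)
    return total_payoff / total_periods


# Made-up example: a short cooperative session and a long one that breaks down.
sessions = [[4, 4], [4, 0, 2, 2, 2, 2, 2, 2]]

print(session_weighted_mean(sessions))  # 3.0
print(stage_weighted_mean(sessions))    # 2.4
```

When cooperation tends to break down in long sessions, the stage-game-weighted average falls below the session-weighted one, which matches the direction of the gap seen in the rankings.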

Case2: imperfect public monitoring

Raw results (csv): contest2/data
Strategies: user_strategies
Strategy automata: contest2/automaton2.pdf



In [10]:

    
# Return the public signal: whether the project succeeded or failed
def public_signal(actions, random_state):
    prob = random_state.uniform()
    if actions[0] == 0 and actions[1] == 0:
        return 0 if prob < 0.9 else 1
    elif (actions[0] == 0 and actions[1] == 1) or (actions[0] == 1 and actions[1] == 0):
        return 0 if prob < 0.5 else 1
    elif actions[0] == 1 and actions[1] == 1:
        return 0 if prob < 0.2 else 1
    else:
        raise ValueError

strategies = [Iida_ipm, ImPubStrategy, KatoStrategy, Self_Centered_public, GrimTrigger,
              MyStrategy, beeleb, OyamaImperfectPublicMonitoring, ogawa, yamagishi]
game = pl.RepeatedMatrixGame(payoff, strategies, signal=public_signal, ts_length=ts_length, repeat=1000)
game.play(mtype="public", random_seed=seed, record=False)









    



Start
The object has 10 strategy functions below
--------------------------------------------------
1. Iida_imperfect_public.Iida_ipm
2. mhanami_Imperfect_Public_Strategy.ImPubStrategy
3. kato.KatoStrategy
4. ikegami_imperfect_public.Self_Centered_public
5. tsuyoshi.GrimTrigger
6. gistfile1.MyStrategy
7. beeleb_Strategy.beeleb
8. oyama.OyamaImperfectPublicMonitoring
9. ogawa.ogawa
10. yamagishi_impd.yamagishi
--------------------------------------------------
Repeats: 1000
Total time series length: 32856

Score table:
Scores averaged with equal weight per session
[[ 0.     1.283  2.175  3.073  3.767  4.099  4.08   3.118  3.46   3.329]
 [ 3.076  0.     2.473  2.593  4.045  4.412  4.283  3.088  3.678  3.19 ]
 [ 3.005  1.684  0.     2.824  3.821  4.251  4.185  3.113  3.621  3.202]
 [ 3.117  1.604  2.296  0.     3.749  4.046  4.014  3.225  3.423  3.369]
 [ 2.743  0.636  1.742  3.017  0.     3.999  4.062  2.927  3.319  3.42 ]
 [ 2.636  0.392  1.661  3.114  3.576  0.     3.99   2.912  3.296  3.455]
 [ 2.621  0.478  1.753  3.128  3.539  3.969  0.     2.954  3.342  3.43 ]
 [ 3.107  1.275  2.253  3.175  3.745  4.071  3.997  0.     3.511  3.344]
 [ 2.892  0.881  2.005  3.175  3.668  4.049  4.002  3.126  0.     3.392]
 [ 3.002  1.207  2.043  3.103  3.737  3.959  3.998  3.133  3.362  0.   ]]

Scores averaged with equal weight per stage game
[[ 0.     1.493  1.927  2.67   3.716  4.148  4.027  2.523  2.939  3.105]
 [ 2.761  0.     2.166  2.2    3.898  4.285  3.929  2.488  3.087  2.96 ]
 [ 2.725  1.89   0.     2.427  3.791  4.238  3.992  2.568  3.125  2.99 ]
 [ 2.759  1.867  2.126  0.     3.673  4.053  3.943  2.714  2.892  3.194]
 [ 2.177  0.735  1.311  2.705  0.     3.99   4.053  2.206  2.618  3.303]
 [ 2.009  0.477  1.185  2.87   3.463  0.     3.983  2.179  2.533  3.352]
 [ 2.058  0.714  1.396  3.039  3.435  3.963  0.     2.304  2.652  3.346]
 [ 2.817  1.675  2.14   2.97   3.746  4.126  3.923  0.     3.064  3.155]
 [ 2.571  1.275  1.866  2.931  3.656  4.111  3.942  2.646  0.     3.2  ]
 [ 2.577  1.36   1.749  2.841  3.684  3.936  3.939  2.544  2.789  0.   ]]

Ranking:
1. "mhanami_Imperfect_Public_Strategy.ImPubStrategy" -> session-weighted average: 3.427, stage-game-weighted average: 3.086
2. "kato.KatoStrategy" -> session-weighted average: 3.301, stage-game-weighted average: 3.083
3. "ikegami_imperfect_public.Self_Centered_public" -> session-weighted average: 3.205, stage-game-weighted average: 3.025
4. "oyama.OyamaImperfectPublicMonitoring" -> session-weighted average: 3.164, stage-game-weighted average: 3.069
5. "Iida_imperfect_public.Iida_ipm" -> session-weighted average: 3.154, stage-game-weighted average: 2.950
6. "yamagishi_impd.yamagishi" -> session-weighted average: 3.060, stage-game-weighted average: 2.824
7. "ogawa.ogawa" -> session-weighted average: 3.021, stage-game-weighted average: 2.911
8. "tsuyoshi.GrimTrigger" -> session-weighted average: 2.874, stage-game-weighted average: 2.566
9. "beeleb_Strategy.beeleb" -> session-weighted average: 2.802, stage-game-weighted average: 2.545
10. "gistfile1.MyStrategy" -> session-weighted average: 2.781, stage-game-weighted average: 2.450

Summary

Datetime	2015-12-04-19-27-41
Monitoring type	public
RandomSeed	282
Repeats	1000
Average ts_length	32.856
Number of strategies	10

Str_numbers	Strategy name	Average(session based)	Rank(session based)	Average(stage based)	Rank(stage based)	Notes
2	mhanami_Imperfect_Public_Strategy.ImPubStrategy	3.426505902	1	3.085798636	1	ALLD
3	kato.KatoStrategy	3.300688557	2	3.082721235	2
4	ikegami_imperfect_public.Self_Centered_public	3.204767877	3	3.024595541	4	25%
8	oyama.OyamaImperfectPublicMonitoring	3.164308602	4	3.06851446	3	GT'
1	Iida_imperfect_public.Iida_ipm	3.15370844	5	2.949672646	5
10	yamagishi_impd.yamagishi	3.060373179	6	2.824435922	7	TFT
9	ogawa.ogawa	3.02123959	7	2.910746896	6
5	tsuyoshi.GrimTrigger	2.873939966	8	2.566333225	8	TFT'
7	beeleb_Strategy.beeleb	2.80155086	9	2.545187079	9
6	gistfile1.MyStrategy	2.781473904	10	2.45014271	10	TFT'

Strategy 2 (ALLD, always defect) and strategy 3 (which plays D at regular intervals) ranked highest.
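One factor that may contribute to this outcome is the noise in the public signal: under the probabilities in `public_signal`, signal 1 appears 10% of the time even when both players choose action 0, so a strategy that reacts to signal 1 can start punishing by mistake. The sketch below re-implements the signal with the standard library to show how often this happens; `public_signal_sketch` and `bad_signal_rate` are illustrative names, and the link to the ranking is a conjecture rather than a result from the experiment.

```python
import random

# P(signal = 0) for each action profile, taken from public_signal.
P_SIGNAL_0 = {(0, 0): 0.9, (0, 1): 0.5, (1, 0): 0.5, (1, 1): 0.2}


def public_signal_sketch(actions, rng):
    """Draw the public signal (0 or 1) for a pair of actions."""
    return 0 if rng.random() < P_SIGNAL_0[tuple(actions)] else 1


def bad_signal_rate(actions, rng, n=100000):
    """Estimate the frequency of signal 1 for a fixed action profile."""
    return sum(public_signal_sketch(actions, rng) for _ in range(n)) / n


rng = random.Random(1)  # arbitrary seed
for profile in [(0, 0), (0, 1), (1, 1)]:
    print(profile, bad_signal_rate(profile, rng))

# Probability of at least one signal 1 in 33 periods of mutual action 0.
print(round(1 - 0.9 ** 33, 2))  # 0.97
```

Over a session of average length, at least one misleading signal is therefore almost certain even if both players never defect.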

Distribution of session averages by strategy

Box plot. Red line: median; blue box: 25th to 75th percentile.



In [11]:

    
rounds = 1000 * 2
strategies = 10
max_ts = 100

# Load the results
df = pd.read_csv('./contest2/data/set_result.csv', index_col=[0, 1], header=[0, 1])

# rows: 1000*2 sessions for each of the 9 opponents, columns: players (average payoff per session)
average_matrix = np.zeros((rounds*(strategies-1), strategies), dtype=float)

for s in range(1, strategies+1):
    for i, opponent in enumerate(df[str(s)].columns.values):
        average_matrix[i*rounds:(i+1)*rounds, s-1] = df[str(s)][str(opponent)]

averages = np.zeros(strategies, dtype=float)
stds = np.zeros(strategies, dtype=float)
ranking = np.zeros(strategies, dtype=int)
for i in range(strategies):
    averages[i] = average_matrix[:, i].mean()
    stds[i] = average_matrix[:, i].std()
ranking = np.argsort(averages)[::-1]+1

fig, ax = plt.subplots(figsize=(20, 12))
bp = ax.boxplot(average_matrix, 0, '')
plt.grid()
plt.xlabel('Strategy number')
plt.ylabel('Average payoff per session')
ax.set_xlim([0, strategies+0.5])
ax.set_ylim([-0.1, 5.8])
plt.title('Distribution of average session payoffs by strategy')
ax.text(0.4, 5.3, "ranking\nave\nstd", ha = 'center', va = 'center', color="black", size=14)
for i in range(strategies):
    ax.text(i+1, 5.3, "{0:.0f}\n{1:.3f}\n{2:.3f}"
            .format(np.where(ranking == i+1)[0][0]+1, averages[i], stds[i]), ha = 'center', va = 'center', color="black", size=14)

plt.show()

Summary statistics

str number	1	2	3	4	5	6	7	8	9	10
rank	5	1	2	3	8	10	9	4	7	6
count	18000	18000	18000	18000	18000	18000	18000	18000	18000	18000
mean	3.153708	3.426506	3.300689	3.204768	2.87394	2.781474	2.801551	3.164309	3.02124	3.060373
std	1.068773	0.989293	0.981721	0.932929	1.311979	1.35814	1.319016	1.042236	1.180943	1.091448
min	0	2.010453	0	0	0	0	0	0	0	0
25%	2.215385	2.545455	2.496503	2.425827	1.653061	1.555556	1.672578	2.406039	2	2
50%	3.545455	3.3125	3.416667	3.5	3.44	3.384615	3.353095	3.563063	3.532292	3.558846
75%	4	4.25	4.25	4	4	4	4	4	4	4
max	4.9	5	4.97619	4.97561	4.857143	4.333333	4.103448	4.981481	4.659574	4.8

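Unlike in experiment 1, the top-ranked strategy does not have the smallest standard deviation (strategy 4, ranked 3rd, does). The five strategies with the smallest standard deviations nevertheless occupy the top five ranks.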

Average payoff by session length



In [12]:

    
rounds = 1000 * 2
strategies = 10
max_ts = 100

# Load the results
df = pd.read_csv('./contest2/data/set_result.csv', index_col=[0, 1], header=[0, 1])

# Sort by ts_length in ascending order (ordered_df is not used below)
ordered_df = df.sort_index(level="ts_length")

# rows: players, columns: average payoff for ts_length = 1 to 100
average_matrix = np.zeros((strategies, max_ts), dtype=float)

for t in range(1, max_ts+1):
    df_t = df.iloc[df.index.get_level_values('ts_length') == t]
    for s in range(1, strategies+1):
        average = df_t[str(s)].mean().mean()
        average_matrix[s-1, t-1] = average

fig, ax = plt.subplots(figsize=(20, 10))
plt.title("average payoff trend")
plt.xlabel("ts_length")
plt.ylabel("average payoff")
t_list = [i for i in range(1, max_ts+1)]

for s in range(1, strategies+1):
    if s in [2, 8, 4, 10]:
        pass
    else:
        average_list = average_matrix[s-1]
        plt.plot(t_list, average_list, color='#bbbbbb')

plt.plot(t_list, average_matrix[2-1], color='red', linewidth=2, label="2 (ALLD)")
plt.plot(t_list, average_matrix[4-1], color='green', linewidth=2, label="4 (25%)")
plt.plot(t_list, average_matrix[8-1], color='blue', linewidth=2, label="8 (GT’)")
plt.plot(t_list, average_matrix[10-1], color='orange', linewidth=2, label="10 (TFT)")
plt.legend()
plt.show()

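The top-ranked strategies earn stable average payoffs regardless of session length.
ALLD earns particularly high average payoffs in short sessions, which is likely why it ranked first.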

Case3: imperfect private monitoring (Oyama seminar strategies only)

Raw results (csv): contest3/data
Strategies: user_strategies
Strategy automata: contest3/automaton3.pdf



In [13]:

    
# Return, with noise, a signal of whether the *opponent* cooperated or defected
def private_signal(actions, random_state):
    pattern = [[0, 0], [0, 1], [1, 0], [1, 1]]
    # e.g. if the actual actions are (0, 1), the signal is most likely (1, 0)
    signal_probs = [[.9, .02, .02, .06], [.02, .06, .9, .02], [.02, .9, .06, .02], [.06, .02, .02, .9]]
    p = random_state.uniform()
    if actions[0] == 0 and actions[1] == 0:
        return [0, 0] if p < 0.9 else [0, 1] if p < 0.92 else [1, 0] if p < 0.94 else [1, 1]
    elif actions[0] == 0 and actions[1] == 1:
        return [1, 0] if p < 0.9 else [0, 0] if p < 0.92 else [1, 1] if p < 0.94 else [0, 1]
    elif actions[0] == 1 and actions[1] == 0:
        return [0, 1] if p < 0.9 else [1, 1] if p < 0.92 else [0, 0] if p < 0.94 else [1, 0]
    elif actions[0] == 1 and actions[1] == 1:
        return [1, 1] if p < 0.9 else [1, 0] if p < 0.92 else [0, 1] if p < 0.94 else [0, 0]
    else:
        raise ValueError

strategies = [Iida_iprm, ImPrivStrategy, KatoStrategy, Self_Centered_private, GrimTrigger,
              MyStrategy, beeleb, OyamaImperfectPrivateMonitoring, ogawa, yamagishi]
game = pl.RepeatedMatrixGame(payoff, strategies, signal=private_signal, ts_length=ts_length, repeat=1000)
game.play(mtype="private", random_seed=seed, record=False)









    



Start
The object has 10 strategy functions below
--------------------------------------------------
1. Iida_imperfect_private.Iida_iprm
2. mhanami_Imperfect_Private_Strategy.ImPrivStrategy
3. kato.KatoStrategy
4. ikegami_imperfect_private.Self_Centered_private
5. tsuyoshi.GrimTrigger
6. gistfile1.MyStrategy
7. beeleb_Strategy.beeleb
8. oyama.OyamaImperfectPrivateMonitoring
9. ogawa.ogawa
10. yamagishi_impd.yamagishi
--------------------------------------------------
Repeats: 1000
Total time series length: 32856

Score table:
Scores averaged with equal weight per session
[[ 0.     3.476  2.318  2.954  3.56   4.004  3.984  3.414  3.465  3.333]
 [ 3.241  0.     2.406  3.338  3.626  3.974  3.977  3.582  3.507  3.36 ]
 [ 2.799  3.373  0.     2.264  3.404  3.963  3.823  3.155  3.538  2.849]
 [ 3.288  3.475  2.398  0.     3.68   3.84   3.76   3.472  3.267  3.19 ]
 [ 2.922  3.555  1.92   2.801  0.     3.904  3.823  3.516  3.283  3.479]
 [ 3.11   3.934  2.205  3.255  3.567  0.     3.998  3.687  3.429  3.656]
 [ 3.126  3.935  2.336  3.402  3.487  3.985  0.     3.674  3.503  3.679]
 [ 3.227  3.634  2.375  3.249  3.524  3.924  3.89   0.     3.407  3.415]
 [ 3.122  3.688  2.42   3.09   3.543  3.916  3.832  3.506  0.     3.39 ]
 [ 3.384  3.422  2.478  3.086  3.656  3.96   4.022  3.525  3.48   0.   ]]

Scores averaged with equal weight per stage game
[[ 0.     3.172  1.946  2.491  3.394  4.003  3.886  3.075  2.939  3.125]
 [ 2.899  0.     2.21   3.203  3.526  3.959  3.964  3.363  3.047  3.141]
 [ 2.63   2.803  0.     2.049  3.392  3.699  3.253  2.733  2.991  2.542]
 [ 2.989  3.268  2.164  0.     3.554  3.772  3.596  3.183  2.792  2.924]
 [ 2.417  3.345  1.457  2.285  0.     3.862  3.71   3.313  2.621  3.345]
 [ 2.684  3.902  1.785  3.092  3.439  0.     3.999  3.609  2.814  3.597]
 [ 2.759  3.911  2.113  3.365  3.336  3.982  0.     3.535  3.017  3.629]
 [ 2.895  3.399  2.115  3.043  3.365  3.891  3.77   0.     2.909  3.217]
 [ 2.817  3.267  2.2    2.719  3.452  3.752  3.471  3.134  0.     3.048]
 [ 3.193  3.172  2.204  2.795  3.543  3.948  3.996  3.292  3.008  0.   ]]

Ranking:
1. "beeleb_Strategy.beeleb" -> session-weighted average: 3.458, stage-game-weighted average: 3.294
2. "yamagishi_impd.yamagishi" -> session-weighted average: 3.446, stage-game-weighted average: 3.239
3. "mhanami_Imperfect_Private_Strategy.ImPrivStrategy" -> session-weighted average: 3.446, stage-game-weighted average: 3.257
4. "gistfile1.MyStrategy" -> session-weighted average: 3.427, stage-game-weighted average: 3.213
5. "oyama.OyamaImperfectPrivateMonitoring" -> session-weighted average: 3.405, stage-game-weighted average: 3.178
6. "Iida_imperfect_private.Iida_iprm" -> session-weighted average: 3.390, stage-game-weighted average: 3.115
7. "ogawa.ogawa" -> session-weighted average: 3.390, stage-game-weighted average: 3.095
8. "ikegami_imperfect_private.Self_Centered_private" -> session-weighted average: 3.375, stage-game-weighted average: 3.138
9. "tsuyoshi.GrimTrigger" -> session-weighted average: 3.245, stage-game-weighted average: 2.928
10. "kato.KatoStrategy" -> session-weighted average: 3.241, stage-game-weighted average: 2.899

Summary

Datetime	2015-12-05-00-14-04
Monitoring type	private
RandomSeed	282
Repeats	1000
Average ts_length	32.856
Number of strategies	10

Str_numbers	Strategy name	Average(session based)	Rank(session based)	Average(stage based)	Rank(stage based)	Notes
7	beeleb_Strategy.beeleb	3.458461687	1	3.293973027	1
10	yamagishi_impd.yamagishi	3.445953725	2	3.239141168	3	TFT
2	mhanami_Imperfect_Private_Strategy.ImPrivStrategy	3.445605137	3	3.256924154	2	2T2FT
6	gistfile1.MyStrategy	3.426852855	4	3.213336647	4	TFT'
8	oyama.OyamaImperfectPrivateMonitoring	3.405064024	5	3.178198469	5	TFT'
1	Iida_imperfect_private.Iida_iprm	3.389575102	6	3.114596015	7
9	ogawa.ogawa	3.389554064	7	3.095465398	8
4	ikegami_imperfect_private.Self_Centered_private	3.374520784	8	3.137879433	6	20%
5	tsuyoshi.GrimTrigger	3.244853934	9	2.928183251	9	TFT'
3	kato.KatoStrategy	3.240895862	10	2.899218475	10

Distribution of session-average payoffs by strategy

Box plot. Red line: median; blue box: 25th to 75th percentile.



In [14]:

    
rounds = 1000 * 2
strategies = 10
max_ts = 100

# Load the results
df = pd.read_csv('./contest3/data/set_result.csv', index_col=[0, 1], header=[0, 1])

# rows: 1000*2 sessions for each of the 9 opponents, columns: players (average payoff per session)
average_matrix = np.zeros((rounds*(strategies-1), strategies), dtype=float)

for s in range(1, strategies+1):
    for i, opponent in enumerate(df[str(s)].columns.values):
        average_matrix[i*rounds:(i+1)*rounds, s-1] = df[str(s)][str(opponent)]

averages = np.zeros(strategies, dtype=float)
stds = np.zeros(strategies, dtype=float)
ranking = np.zeros(strategies, dtype=int)
for i in range(strategies):
    averages[i] = average_matrix[:, i].mean()
    stds[i] = average_matrix[:, i].std()
ranking = np.argsort(averages)[::-1]+1

fig, ax = plt.subplots(figsize=(20, 12))
bp = ax.boxplot(average_matrix, 0, '')
plt.grid()
plt.xlabel('Strategy number')
plt.ylabel('Average payoff per session')
ax.set_xlim([0, strategies+0.5])
ax.set_ylim([-0.1, 5.8])
plt.title('Distribution of average session payoffs by strategy')
ax.text(0.4, 5.3, "ranking\nave\nstd", ha = 'center', va = 'center', color="black", size=14)
for i in range(strategies):
    ax.text(i+1, 5.3, "{0:.0f}\n{1:.3f}\n{2:.3f}"
            .format(np.where(ranking == i+1)[0][0]+1, averages[i], stds[i]), ha = 'center', va = 'center', color="black", size=14)

plt.show()

Summary statistics

str number	1	2	3	4	5	6	7	8	9	10
rank	6	3	10	8	9	4	1	5	7	2
count	18000	18000	18000	18000	18000	18000	18000	18000	18000	18000
mean	3.389575	3.445605	3.240896	3.374521	3.244854	3.426853	3.458462	3.405064	3.389554	3.445954
std	0.767754	0.693915	0.811377	0.70892	0.901283	0.80159	0.729059	0.720129	0.714496	0.699362
min	1.230769	1.2	1.416667	1.333333	0.571429	0.615385	0.666667	0.571429	0.8	1.333333
25%	2.875	2.857143	2.526316	2.707154	2.695652	3	2.962963	2.84	2.792308	2.952381
50%	3.666667	3.911111	3.195387	3.666667	3.607143	3.947114	3.917526	3.758621	3.652618	3.793103
75%	4	4	4	4	4	4	4	4	4	4
max	4.8	4.4	4.8	4.857143	4.8	4.25	4.038462	4.8	4.428571	4.5

Average payoff by session length



In [15]:

    
rounds = 1000 * 2
strategies = 10
max_ts = 100

# Load the results
df = pd.read_csv('./contest3/data/set_result.csv', index_col=[0, 1], header=[0, 1])

# Sort by ts_length in ascending order (ordered_df is not used below)
ordered_df = df.sort_index(level="ts_length")

# rows: players, columns: average payoff for ts_length = 1 to 100
average_matrix = np.zeros((strategies, max_ts), dtype=float)

for t in range(1, max_ts+1):
    df_t = df.iloc[df.index.get_level_values('ts_length') == t]
    for s in range(1, strategies+1):
        average = df_t[str(s)].mean().mean()
        average_matrix[s-1, t-1] = average

fig, ax = plt.subplots(figsize=(20, 10))
plt.title("average payoff trend")
plt.xlabel("ts_length")
plt.ylabel("average payoff")
t_list = [i for i in range(1, max_ts+1)]

for s in range(1, strategies+1):
    if s in [2, 7, 4, 10]:
        pass
    else:
        average_list = average_matrix[s-1]
        plt.plot(t_list, average_list, color='#bbbbbb')

plt.plot(t_list, average_matrix[7-1], color='red', linewidth=2, label="7")
plt.plot(t_list, average_matrix[10-1], color='orange', linewidth=2, label="10 (TFT)")
plt.plot(t_list, average_matrix[2-1], color='blue', linewidth=2, label="2 (2T2FT)")
plt.plot(t_list, average_matrix[4-1], color='green', linewidth=2, label="4 (20%)")
plt.legend()
plt.show()

Cooperation becomes harder to sustain as sessions get longer. This generally happens when TFT plays against TFT (discussed later).
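The decline for TFT against TFT can be reproduced in a small standalone simulation. The sketch below assumes a plain Tit for Tat (start with action 0, then copy the last private signal) and a simplified re-implementation of the private signal with the same error probabilities as `private_signal` (0.9 both correct, 0.02 for each single error, 0.06 both wrong). The function names are illustrative, and the numbers it prints are not the experiment's results.

```python
import random

PAYOFF = [[4, 0], [5, 2]]  # rows: own action, columns: opponent's action


def private_signal_sketch(actions, rng):
    """Return [signal of player 0, signal of player 1] about the opponent."""
    p = rng.random()
    flip0 = 0.90 <= p < 0.92 or p >= 0.94  # player 0 misreads the opponent
    flip1 = p >= 0.92                      # player 1 misreads the opponent
    return [actions[1] ^ flip0, actions[0] ^ flip1]


def tft_session_average(length, rng):
    """Player 0's average payoff in one TFT-vs-TFT session of given length."""
    actions = [0, 0]
    total = 0
    for _ in range(length):
        total += PAYOFF[actions[0]][actions[1]]
        # Tit for Tat: each player's next action is the signal just received.
        actions = private_signal_sketch(actions, rng)
    return total / length


def mean_payoff(length, rng, sessions=2000):
    """Average of tft_session_average over many simulated sessions."""
    return sum(tft_session_average(length, rng) for _ in range(sessions)) / sessions


rng = random.Random(0)  # arbitrary seed
for length in [2, 10, 30, 100]:
    print(length, round(mean_payoff(length, rng), 3))
```

A single misread signal moves the pair away from mutual cooperation, and nothing in Tit for Tat pulls them back, so the average payoff drifts from 4 toward the average over all four action profiles (2.75) as sessions get longer.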

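Case4: imperfect private monitoring (Kandori seminar strategies only)

Raw results (csv): contest4/data
Strategies: user_strategies
Strategy automata: contest4/automaton4.pdf
Comparison of the original Kandori seminar experiment with the Oyama seminar rerun (differences in session-average payoffs for each matchup): contest4/神取ゼミ実験_尾山ゼミ再実験比較.xlsx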



In [16]:

    
# Return, with noise, a signal of whether the *opponent* cooperated or defected
def private_signal(actions, random_state):
    pattern = [[0, 0], [0, 1], [1, 0], [1, 1]]
    # e.g. if the actual actions are (0, 1), the signal is most likely (1, 0)
    signal_probs = [[.9, .02, .02, .06], [.02, .06, .9, .02], [.02, .9, .06, .02], [.06, .02, .02, .9]]
    p = random_state.uniform()
    if actions[0] == 0 and actions[1] == 0:
        return [0, 0] if p < 0.9 else [0, 1] if p < 0.92 else [1, 0] if p < 0.94 else [1, 1]
    elif actions[0] == 0 and actions[1] == 1:
        return [1, 0] if p < 0.9 else [0, 0] if p < 0.92 else [1, 1] if p < 0.94 else [0, 1]
    elif actions[0] == 1 and actions[1] == 0:
        return [0, 1] if p < 0.9 else [1, 1] if p < 0.92 else [0, 0] if p < 0.94 else [1, 0]
    elif actions[0] == 1 and actions[1] == 1:
        return [1, 1] if p < 0.9 else [1, 0] if p < 0.92 else [0, 1] if p < 0.94 else [0, 0]
    else:
        raise ValueError

strategies = [Strategy1, Strategy2, Strategy3, Strategy4, Strategy5,
                    Strategy6, Strategy7, Strategy8, Strategy9, Strategy10,
                    Strategy11, Strategy12, Strategy13, Strategy14, Strategy15,
                    Strategy16, Strategy17, Strategy18, Strategy19, Strategy20, 
                    Strategy21, Strategy22, Strategy23, Strategy24]
    
game = pl.RepeatedMatrixGame(payoff, strategies, signal=private_signal, ts_length=ts_length, repeat=1000)
game.play(mtype="private", random_seed=seed, record=False)

Start
The object has 24 strategy functions below
--------------------------------------------------
1. kandori.Strategy1
2. kandori.Strategy2
3. kandori.Strategy3
4. kandori.Strategy4
5. kandori.Strategy5
6. kandori.Strategy6
7. kandori.Strategy7
8. kandori.Strategy8
9. kandori.Strategy9
10. kandori.Strategy10
11. kandori.Strategy11
12. kandori.Strategy12
13. kandori.Strategy13
14. kandori.Strategy14
15. kandori.Strategy15
16. kandori.Strategy16
17. kandori.Strategy17
18. kandori.Strategy18
19. kandori.Strategy19
20. kandori.Strategy20
21. kandori.Strategy21
22. kandori.Strategy22
23. kandori.Strategy23
24. kandori.Strategy24
--------------------------------------------------
Repeats: 1000
Total time series length: 32856

Score table:
Scores averaged with each session given weight 1
[[ 0.     3.703  3.614  3.793  3.834  3.909  3.041  3.817  2.363  3.03
   3.513  1.965  1.682  3.667  3.171  3.703  3.388  2.67   3.162  3.396
   3.69   3.774  4.     2.011]
 [ 3.533  0.     3.686  3.59   3.66   3.935  3.075  3.598  2.473  3.066
   3.64   1.645  1.577  3.718  3.124  3.783  3.335  2.484  3.218  3.414
   3.766  3.799  3.972  1.873]
 [ 3.538  3.777  0.     3.617  3.688  3.946  3.075  3.596  2.419  3.059
   3.613  1.707  1.651  3.719  3.17   3.777  3.29   2.652  3.212  3.436
   3.767  3.81   3.981  2.027]
 [ 3.156  3.161  3.161  0.     3.636  3.748  3.43   3.981  3.177  3.379
   3.895  1.902  1.671  3.501  2.671  3.161  3.768  3.17   3.654  2.874
   3.272  3.474  3.983  1.577]
 [ 2.397  3.332  3.203  4.065  0.     3.754  3.314  3.91   3.007  3.278
   3.836  1.875  1.241  3.365  2.747  3.332  3.615  1.892  3.434  3.025
   3.341  3.382  4.053  1.28 ]
 [ 3.076  3.599  3.498  3.921  3.649  0.     3.346  3.678  3.113  3.292
   3.825  1.911  1.284  3.633  2.869  3.599  3.596  2.328  3.575  3.174
   3.609  3.627  3.981  1.367]
 [ 2.855  3.146  3.079  4.021  3.748  3.782  0.     3.957  2.502  3.05
   3.715  2.044  2.073  3.458  2.948  3.146  3.379  2.808  3.148  3.038
   3.239  3.644  3.998  2.063]
 [ 3.17   3.096  3.028  3.991  3.655  3.627  3.416  0.     3.187  3.363
   3.871  2.018  1.535  3.278  2.69   3.096  3.755  2.682  3.656  2.858
   3.146  3.199  3.982  1.496]
 [ 2.515  2.761  2.66   4.073  3.802  3.938  2.87   4.078  0.     2.874
   3.702  2.125  2.278  3.435  2.886  2.761  3.426  2.341  2.798  2.82
   2.903  3.554  4.064  2.369]
 [ 2.839  3.135  3.062  4.027  3.773  3.764  3.067  3.948  2.503  0.     3.712
   2.03   2.052  3.427  2.95   3.135  3.364  2.783  3.133  3.034  3.21
   3.611  4.006  2.057]
 [ 3.131  3.1    3.048  4.009  3.564  3.819  3.334  3.978  2.997  3.285  0.
   1.96   1.972  3.539  2.716  3.1    3.626  2.826  3.497  2.862  3.211
   3.626  3.952  1.525]
 [ 2.889  3.347  3.242  4.056  4.07   3.471  2.785  3.801  2.218  2.816
   3.361  0.     2.359  3.25   3.326  3.347  2.865  2.781  2.684  3.322
   3.314  3.275  4.07   2.884]
 [ 3.448  3.601  3.491  3.479  4.124  4.059  2.846  3.682  2.264  2.875
   3.028  2.24   0.     3.655  3.518  3.601  2.865  2.989  2.722  3.574
   3.624  3.945  4.124  2.75 ]
 [ 3.321  3.687  3.589  3.825  3.693  3.929  3.233  3.609  2.86   3.186
   3.736  1.889  1.548  0.     3.065  3.687  3.476  2.574  3.447  3.305
   3.674  3.738  3.987  1.896]
 [ 3.111  3.578  3.428  3.704  3.885  3.891  3.034  3.609  2.52   3.019
   3.579  1.761  1.606  3.46   0.     3.578  3.206  2.153  3.079  3.327
   3.556  3.653  4.022  1.952]
 [ 3.533  3.783  3.686  3.59   3.66   3.935  3.075  3.598  2.473  3.066
   3.64   1.645  1.577  3.718  3.124  0.     3.335  2.484  3.218  3.414
   3.766  3.799  3.972  1.873]
 [ 3.256  3.248  3.184  3.962  3.682  3.705  3.235  3.963  2.856  3.199
   3.706  2.159  2.08   3.499  2.908  3.248  0.     2.986  3.368  3.038
   3.334  3.401  3.978  1.96 ]
 [ 3.489  3.69   3.605  3.86   3.939  4.001  3.026  3.584  2.284  3.027
   3.462  1.981  1.985  3.73   3.403  3.69   3.215  0.     3.104  3.525
   3.697  3.923  4.047  2.865]
 [ 3.074  3.241  3.2    3.987  3.631  3.829  3.091  3.982  2.46   3.07
   3.695  2.09   2.163  3.597  2.941  3.241  3.425  2.97   0.     3.057
   3.349  3.791  3.942  2.081]
 [ 3.324  3.696  3.58   3.624  3.787  3.921  3.059  3.583  2.5    3.041
   3.604  1.713  1.584  3.593  3.155  3.696  3.242  2.256  3.129  0.     3.684
   3.734  3.999  1.929]
 [ 3.484  3.764  3.677  3.672  3.673  3.939  3.146  3.612  2.557  3.108
   3.678  1.716  1.564  3.705  3.123  3.764  3.401  2.551  3.3    3.397  0.
   3.788  3.981  1.885]
 [ 3.454  3.765  3.634  3.868  3.657  3.927  3.298  3.547  2.947  3.249
   3.761  1.905  1.348  3.64   3.055  3.765  3.423  2.26   3.552  3.372
   3.734  0.     3.972  1.826]
 [ 2.873  3.548  3.442  3.991  3.639  3.835  3.426  3.973  3.175  3.371
   3.888  1.875  1.24   3.584  2.747  3.548  3.765  2.303  3.628  3.07
   3.558  3.578  0.     1.279]
 [ 3.314  3.652  3.358  3.342  4.149  4.069  2.751  3.41   2.316  2.774
   3.377  1.67   1.84   3.346  3.428  3.652  2.832  1.753  2.674  3.523
   3.578  3.689  4.149  0.   ]]

Scores averaged with each stage game given weight 1
[[ 0.     3.652  3.533  3.575  3.9    3.908  2.762  3.428  2.42   2.771
   3.176  1.852  1.442  3.597  3.189  3.652  3.017  2.232  2.79   3.345
   3.64   3.775  4.019  2.362]
 [ 3.202  0.     3.641  3.351  3.587  3.915  2.869  2.897  2.652  2.869
   3.491  1.348  1.173  3.666  3.065  3.756  3.047  1.9    2.972  3.362
   3.743  3.775  3.963  2.017]
 [ 3.251  3.746  0.     3.379  3.625  3.921  2.86   2.902  2.592  2.848
   3.44   1.426  1.272  3.655  3.091  3.746  2.983  2.172  2.944  3.366
   3.729  3.785  3.97   2.193]
 [ 2.667  2.946  2.92   0.     3.56   3.621  3.355  3.903  3.456  3.295
   3.856  1.746  1.419  3.367  2.571  2.946  3.662  3.024  3.592  2.695
   3.089  3.417  3.979  1.897]
 [ 1.615  3.21   3.02   4.074  0.     3.645  3.211  3.485  3.225  3.174
   3.752  1.713  0.798  3.191  2.63   3.21   3.45   1.169  3.331  2.887
   3.205  3.28   4.061  1.428]
 [ 2.482  3.561  3.427  3.874  3.575  0.     3.234  3.101  3.379  3.176
   3.744  1.737  0.85   3.573  2.785  3.561  3.357  1.609  3.503  3.09
   3.57   3.598  3.974  1.532]
 [ 2.354  2.903  2.823  3.986  3.685  3.693  0.     3.613  2.683  2.854
   3.562  1.878  1.806  3.31   2.813  2.903  3.121  2.458  2.911  2.846
   3.026  3.593  3.985  2.232]
 [ 2.819  3.076  2.971  3.963  3.644  3.553  3.231  0.     3.345  3.182
   3.748  1.963  1.553  3.186  2.86   3.076  3.549  2.476  3.49   2.929
   3.109  3.285  3.982  2.404]
 [ 2.205  2.711  2.606  3.957  3.627  3.82   2.762  3.756  0.     2.756
   3.483  1.961  1.952  3.398  2.772  2.711  3.127  2.223  2.726  2.745
   2.879  3.667  3.966  2.327]
 [ 2.318  2.902  2.812  3.99   3.713  3.667  2.861  3.575  2.677  0.     3.563
   1.857  1.777  3.27   2.811  2.902  3.098  2.441  2.903  2.84   2.987
   3.555  3.996  2.226]
 [ 2.687  2.805  2.72   4.013  3.448  3.727  3.201  3.839  3.19   3.153  0.
   1.815  1.752  3.383  2.575  2.805  3.424  2.374  3.35   2.637  2.966
   3.59   3.932  1.714]
 [ 2.546  3.328  3.193  3.862  3.937  3.278  2.564  3.128  2.282  2.606
   3.094  0.     2.09   3.101  3.256  3.328  2.506  2.65   2.415  3.282
   3.273  3.19   3.937  3.207]
 [ 3.131  3.534  3.385  3.172  4.102  4.025  2.58   2.971  2.289  2.624
   2.672  2.09   0.     3.556  3.502  3.534  2.464  2.792  2.411  3.524
   3.543  3.963  4.102  3.224]
 [ 2.905  3.65   3.516  3.726  3.628  3.9    3.08   2.962  3.116  3.033
   3.628  1.672  1.16   0.     2.981  3.65   3.237  2.033  3.316  3.234
   3.639  3.711  3.979  2.067]
 [ 2.567  3.518  3.321  3.507  3.82   3.824  2.85   2.788  2.689  2.838
   3.425  1.511  1.186  3.351  0.     3.518  2.937  1.663  2.867  3.243
   3.473  3.608  3.992  2.132]
 [ 3.202  3.756  3.641  3.351  3.587  3.915  2.869  2.897  2.652  2.869
   3.491  1.348  1.173  3.666  3.065  0.     3.047  1.9    2.972  3.362
   3.743  3.775  3.963  2.017]
 [ 2.846  3.007  2.913  3.941  3.629  3.502  3.059  3.771  2.991  3.025
   3.54   2.044  1.89   3.303  2.816  3.007  0.     2.604  3.149  2.869
   3.124  3.252  3.969  2.216]
 [ 3.159  3.604  3.498  3.792  3.996  3.997  2.768  2.919  2.394  2.787
   3.193  1.749  1.667  3.635  3.408  3.604  2.825  0.     2.772  3.491
   3.615  3.938  4.062  3.182]
 [ 2.592  2.98   2.919  3.946  3.567  3.794  2.882  3.767  2.65   2.875
   3.537  1.96   1.921  3.487  2.82   2.98   3.178  2.592  0.     2.871
   3.14   3.764  3.933  2.253]
 [ 2.864  3.659  3.495  3.391  3.723  3.884  2.866  2.801  2.678  2.854
   3.444  1.439  1.176  3.505  3.067  3.659  2.96   1.7    2.908  0.     3.626
   3.707  3.981  2.084]
 [ 3.124  3.739  3.629  3.478  3.598  3.913  2.953  2.947  2.771  2.922
   3.533  1.432  1.168  3.653  3.037  3.739  3.132  1.976  3.087  3.329  0.
   3.768  3.969  2.03 ]
 [ 3.062  3.732  3.573  3.812  3.581  3.903  3.188  2.78   3.281  3.133
   3.697  1.721  0.887  3.569  2.97   3.732  3.2    1.549  3.485  3.307
   3.704  0.     3.964  1.959]
 [ 2.181  3.49   3.358  3.989  3.563  3.768  3.357  3.852  3.46   3.298
   3.842  1.713  0.798  3.499  2.628  3.49   3.672  1.569  3.58   2.955
   3.501  3.531  0.     1.427]
 [ 2.868  3.589  3.255  2.982  4.061  3.968  2.582  2.312  2.408  2.604
   3.167  1.322  1.288  3.238  3.308  3.589  2.54   1.415  2.498  3.444
   3.51   3.634  4.063  0.   ]]

Ranking:
1. "kandori.Strategy18" -> session-based average: 3.354, stage-based average: 3.220
2. "kandori.Strategy13" -> session-based average: 3.326, stage-based average: 3.182
3. "kandori.Strategy22" -> session-based average: 3.259, stage-based average: 3.121
4. "kandori.Strategy14" -> session-based average: 3.259, stage-based average: 3.123
5. "kandori.Strategy1" -> session-based average: 3.256, stage-based average: 3.132
6. "kandori.Strategy3" -> session-based average: 3.240, stage-based average: 3.082
7. "kandori.Strategy21" -> session-based average: 3.238, stage-based average: 3.084
8. "kandori.Strategy2" -> session-based average: 3.216, stage-based average: 3.055
9. "kandori.Strategy16" -> session-based average: 3.216, stage-based average: 3.055
10. "kandori.Strategy17" -> session-based average: 3.216, stage-based average: 3.064
11. "kandori.Strategy19" -> session-based average: 3.213, stage-based average: 3.061
12. "kandori.Strategy6" -> session-based average: 3.198, stage-based average: 3.056
13. "kandori.Strategy12" -> session-based average: 3.197, stage-based average: 3.046
14. "kandori.Strategy20" -> session-based average: 3.193, stage-based average: 3.020
15. "kandori.Strategy4" -> session-based average: 3.191, stage-based average: 3.086
16. "kandori.Strategy23" -> session-based average: 3.189, stage-based average: 3.066
17. "kandori.Strategy7" -> session-based average: 3.167, stage-based average: 3.002
18. "kandori.Strategy15" -> session-based average: 3.161, stage-based average: 2.984
19. "kandori.Strategy11" -> session-based average: 3.160, stage-based average: 3.004
20. "kandori.Strategy24" -> session-based average: 3.159, stage-based average: 2.941
21. "kandori.Strategy10" -> session-based average: 3.158, stage-based average: 2.989
22. "kandori.Strategy8" -> session-based average: 3.122, stage-based average: 3.104
23. "kandori.Strategy9" -> session-based average: 3.088, stage-based average: 2.963
24. "kandori.Strategy5" -> session-based average: 3.073, stage-based average: 2.903
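The two averages in the ranking weight the same payoffs differently. The sketch below illustrates the distinction on toy data. The function names are hypothetical and this is not the code of the `play` module, whose exact computation is not shown here.

```python
import numpy as np

def session_based_average(session_payoffs):
    """Average the per-session mean payoffs, so each session has weight 1."""
    return float(np.mean([np.mean(p) for p in session_payoffs]))

def stage_based_average(session_payoffs):
    """Average over all stage games, so each period has weight 1."""
    total = sum(float(np.sum(p)) for p in session_payoffs)
    periods = sum(len(p) for p in session_payoffs)
    return total / periods

# Toy data: one 1-period session and one 3-period session.
sessions = [[4], [2, 2, 2]]
print(session_based_average(sessions))  # 3.0
print(stage_based_average(sessions))    # 2.5
```

Long sessions count for more under the stage-based average, which is one reason it can fall below the session-based average when payoffs decline in longer sessions.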

Summary

Datetime	2015-12-05-01-07-37
Monitoring type	private
RandomSeed	282
Repeats	1000
Average ts_length	32.856
Number of strategies	24
Str_numbers	Strategy name	Average(session based)	Rank(session based)	Average(stage based)	Rank(trimmed)	Notes
18	kandori.Strategy18	3.35352416	1	3.219810292	1	WSLS'
13	kandori.Strategy13	3.326308014	2	3.182248494	2	CCDDDD
22	kandori.Strategy22	3.259068663	3	3.121244482	4
14	kandori.Strategy14	3.258886509	4	3.122727237	3	WSLS'
1	kandori.Strategy1	3.256299103	5	3.132024724	5
3	kandori.Strategy3	3.240387724	6	3.082433491	6	WSLS'
21	kandori.Strategy21	3.238405281	7	3.083776638	7	WSLS'
2	kandori.Strategy2	3.215812884	8	3.054822228	9	WSLS
16	kandori.Strategy16	3.215812884	9	3.054822228	10	WSLS
17	kandori.Strategy17	3.215547504	10	3.063675088	8	TFT'
19	kandori.Strategy19	3.213334955	11	3.06115156	11	TFT
6	kandori.Strategy6	3.197763649	12	3.056192503	12	WSLS'
12	kandori.Strategy12	3.197073568	13	3.045809911	16
20	kandori.Strategy20	3.192768288	14	3.020367533	13	WSLS'
4	kandori.Strategy4	3.191465329	15	3.086214152	14
23	kandori.Strategy23	3.188569289	16	3.06617612	15
7	kandori.Strategy7	3.166979223	17	3.001625671	17	TFT'
15	kandori.Strategy15	3.161225612	18	2.983885545	19	WSLS'
11	kandori.Strategy11	3.159787981	19	3.004255063	18	TFT'
24	kandori.Strategy24	3.158548933	20	2.940998137	21
10	kandori.Strategy10	3.157508886	21	2.988733446	20	TFT'
8	kandori.Strategy8	3.121529725	22	3.104081314	22	HIST
9	kandori.Strategy9	3.088360193	23	2.962527525	24	STFT
5	kandori.Strategy5	3.072941197	24	2.902704555	23

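CCDDDD: plays C in the first two periods and D in every period after that
STFT: Tit for Tat that starts with D
HIST: plays D if signal B has been observed n or more times in the past, and C otherwise
WSLS: Win Stay Lose Shift. As an automaton: [Figure: WSLS automaton]
WSLS': WSLS with added randomization or extra states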
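The deterministic rules in this legend can be sketched as small functions. This is an illustration under assumptions: the function names and the `(own_prev, signal_prev, t)` interface are hypothetical, WSLS is given in its usual textbook form, and the actual seminar strategies in `user_strategies` may differ in detail.

```python
C, D = 0, 1  # action 0 is cooperation, action 1 is defection

def tft(own_prev, signal_prev, t):
    """Tit for Tat: cooperate first, then repeat the last observed signal."""
    return C if t == 0 else signal_prev

def stft(own_prev, signal_prev, t):
    """Suspicious Tit for Tat: like TFT but starts with D."""
    return D if t == 0 else signal_prev

def ccdddd(own_prev, signal_prev, t):
    """Cooperate in the first two periods, defect afterwards."""
    return C if t < 2 else D

def wsls(own_prev, signal_prev, t):
    """Win Stay Lose Shift: keep the action after a good outcome, switch after a bad one.

    Observing C from the opponent is the good outcome, so the next action
    is C exactly when the own action and the observed signal coincide.
    """
    if t == 0:
        return C
    return C if own_prev == signal_prev else D

print([ccdddd(None, None, t) for t in range(4)])  # [0, 0, 1, 1]
print(wsls(D, D, 1))  # 0
```

After mutual defection WSLS returns to cooperation, which is what lets two WSLS players recover from a single wrong signal.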

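Comparison of the original Kandori seminar experiment and the Oyama seminar rerun

Rank	Original experiment		Rerun
Rank	Strategy	Payoff	Strategy	Payoff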
1	18	3.356	18	3.354
2	13	3.316	13	3.326
3	22	3.263	22	3.259
4	14	3.260	14	3.259
5	1	3.255	1	3.256
6	3	3.238	3	3.240
7	21	3.227	21	3.238
8	16	3.217	2	3.216
9	19	3.217	16	3.216
10	2	3.217	17	3.216
11	17	3.215	19	3.213
12	6	3.205	6	3.198
13	4	3.192	12	3.197
14	23	3.190	20	3.193
15	12	3.187	4	3.191
16	20	3.187	23	3.189
17	11	3.164	7	3.167
18	7	3.161	15	3.161
19	15	3.151	11	3.160
20	10	3.148	24	3.159
21	24	3.140	10	3.158
22	8	3.129	8	3.122
23	9	3.084	9	3.088
24	5	3.048	5	3.073
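How closely the two rankings agree can be checked directly from the table. The sketch below copies the strategy numbers in rank order from the two columns above; the helper name `rank_agreement` is hypothetical.

```python
# Strategy numbers in rank order, copied from the comparison table.
original = [18, 13, 22, 14, 1, 3, 21, 16, 19, 2, 17, 6,
            4, 23, 12, 20, 11, 7, 15, 10, 24, 8, 9, 5]
rerun = [18, 13, 22, 14, 1, 3, 21, 2, 16, 17, 19, 6,
         12, 20, 4, 23, 7, 15, 11, 24, 10, 8, 9, 5]

def rank_agreement(order_a, order_b):
    """Return (number of strategies with identical rank, largest rank shift)."""
    rank_b = {s: r for r, s in enumerate(order_b)}
    shifts = [abs(r - rank_b[s]) for r, s in enumerate(order_a)]
    return sum(shift == 0 for shift in shifts), max(shifts)

same_rank, max_shift = rank_agreement(original, rerun)
print(same_rank, max_shift)  # 11 2
```

Eleven of the 24 strategies keep the same rank and no strategy moves by more than two places.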

Distribution of session averages by strategy



In [17]:

    
rounds = 1000 * 2
strategies = 24
max_ts = 100

# Load the results
df = pd.read_csv('./contest4/data/set_result.csv', index_col=[0, 1], header=[0, 1])

# Rows: session-average payoffs of the 1000*2 sessions against each opponent; columns: players
average_matrix = np.zeros((rounds*(strategies-1), strategies), dtype=float)

for s in range(1, strategies+1):
    for i, opponent in enumerate(df[str(s)].columns.values):
        average_matrix[i*rounds:(i+1)*rounds, s-1] = df[str(s)][str(opponent)]

averages = np.zeros(strategies, dtype=float)
stds = np.zeros(strategies, dtype=float)
ranking = np.zeros(strategies, dtype=int)
for i in range(strategies):
    averages[i] = average_matrix[:, i].mean()
    stds[i] = average_matrix[:, i].std()
ranking = np.argsort(averages)[::-1]+1

fig, ax = plt.subplots(figsize=(22, 12))
bp = ax.boxplot(average_matrix, notch=False, sym='')
plt.grid()
plt.xlabel('Strategy number')
plt.ylabel('Average payoff per session')
ax.set_xlim([0, strategies+0.5])
ax.set_ylim([-0.1, 5.8])
plt.title('Distribution of session-average payoffs by strategy')
ax.text(0.1, 5.3, "ranking\nave\nstd", ha = 'center', va = 'center', color="black", size=14)
for i in range(strategies):
    ax.text(i+1, 5.3, "{0:.0f}\n{1:.3f}\n{2:.3f}"
            .format(np.where(ranking == i+1)[0][0]+1, averages[i], stds[i]), ha = 'center', va = 'center', color="black", size=14)

plt.show()

Summary statistics

str number	1	2	3	4	5	6	7	8	9	10	11	12
ranking	5	8	6	15	24	12	17	22	23	21	19	13
count	46000	46000	46000	46000	46000	46000	46000	46000	46000	46000	46000	46000
mean	3.256299	3.215813	3.240388	3.191465	3.072941	3.197764	3.166979	3.12153	3.08836	3.157509	3.159788	3.197074
std	0.837516	0.879935	0.850714	0.855328	0.974213	0.939815	0.765376	0.953146	0.818489	0.76951	0.845308	0.690475
min	0	0	0	0	0	0	0	0	1	0	0	0
25%	2.694303	2.705882	2.704918	2.658537	2.666667	2.864865	2.6	2.6	2.4	2.6	2.515152	2.7
50%	3.548387	3.541918	3.5625	3.466667	3.303571	3.52381	3.210526	3.416667	2.9375	3.195122	3.378078	3.194444
75%	4	3.964286	3.969697	4	4	4	3.947368	4	3.777778	3.923077	4	3.545455
max	4.923077	4.916667	4.909091	4.166667	4.25	4.777778	4.75	4.104895	5	4.75	4.333333	4.777778

str number	13	14	15	16	17	18	19	20	21	22	23	24
ranking	2	4	18	9	10	1	11	14	7	3	16	20
count	46000	46000	46000	46000	46000	46000	46000	46000	46000	46000	46000	46000
mean	3.326308	3.258887	3.161226	3.215813	3.215548	3.353524	3.213335	3.192768	3.238405	3.259069	3.188569	3.158549
std	0.714311	0.847242	0.874287	0.879935	0.753468	0.742637	0.748892	0.878785	0.870167	0.896169	0.963549	0.937062
min	0	0	0	0	0	0	0	0	0	0	0	1
25%	2.75	2.875	2.68265	2.705882	2.661578	2.892204	2.611111	2.692308	2.769231	2.95	2.846154	2.522727
50%	3.466667	3.55	3.375	3.541918	3.333333	3.571429	3.333333	3.482759	3.555556	3.587932	3.533333	3.325581
75%	3.97619	4	3.849486	3.964286	4	4	4	3.909091	4	3.965517	4	3.742424
max	4.875	4.916667	4.909091	4.916667	4.4	4.923077	4.666667	4.916667	4.916667	4.916667	4.2	5
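The rows of this table (count, mean, std, min, quartiles, max) have the same shape as the output of pandas `DataFrame.describe()`. How the original table was produced is not shown, so the following is only a sketch on synthetic data; `summarize` is a hypothetical helper.

```python
import numpy as np
import pandas as pd

def summarize(matrix):
    """Per-column summary statistics, with columns labeled 1..n like the strategy numbers."""
    columns = range(1, matrix.shape[1] + 1)
    return pd.DataFrame(matrix, columns=columns).describe()

# Synthetic stand-in for average_matrix: 200 sessions, 3 strategies.
rng = np.random.default_rng(0)
toy_matrix = rng.uniform(0, 5, size=(200, 3))
summary = summarize(toy_matrix)
print(summary)
```

Note that `describe()` reports the sample standard deviation (`ddof=1`), while `numpy.ndarray.std` defaults to `ddof=0`, so the two can differ slightly.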

Average payoff as a function of the number of periods



In [18]:

    
rounds = 1000 * 2
strategies = 24
max_ts = 100

# Load the results
df = pd.read_csv('./contest4/data/set_result.csv', index_col=[0, 1], header=[0, 1])

# Sort by ts_length
ordered_df = df.sort_index(level="ts_length")

# Rows: players; columns: average payoff when ts_length is 1 to 100 periods
average_matrix = np.zeros((strategies, max_ts), dtype=float)

for t in range(1, max_ts+1):
    df_t = df.iloc[df.index.get_level_values('ts_length') == t]
    for s in range(1, strategies+1):
        average = df_t[str(s)].mean().mean()
        average_matrix[s-1, t-1] = average

fig, ax = plt.subplots(figsize=(20, 10))
plt.title("average payoff trend")
plt.xlabel("ts_length")
plt.ylabel("average payoff")
t_list = [i for i in range(1, max_ts+1)]

for s in range(1, strategies+1):
    if s in [18, 13, 2, 19, 9]:
        pass
    else:
        average_list = average_matrix[s-1]
        plt.plot(t_list, average_list, color='#bbbbbb')

plt.plot(t_list, average_matrix[18-1], color='red', linewidth=2, label="18 (WSLS’)")
plt.plot(t_list, average_matrix[13-1], color='orange', linewidth=2, label="13 (CCDDDD)")
plt.plot(t_list, average_matrix[2-1], color='blue', linewidth=2, label="2 (WSLS)")
plt.plot(t_list, average_matrix[19-1], color='green', linewidth=2, label="19 (TFT)")
plt.plot(t_list, average_matrix[9-1], color='purple', linewidth=2, label="9 (STFT)")
plt.legend()
plt.show()

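The top-ranked WSLS' (Strategy18) is a variant of WSLS made more robust against ALLD (discussed later). Second place (Strategy13, CCDDDD) is essentially ALLD after its first two periods.
Strategy18: [Figure: Strategy18 automaton]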

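Trimmed mean

Starting from the session-based average, exclude the 5% of sessions with the fewest periods and the 5% with the most periods, then take the mean.
Note: when there are ties at an end of the interval, the weights are adjusted (e.g. if the sessions ranked 48th to 52nd have lengths 1, 2, 2, 2, 3, the sum of the average payoffs of the 49th to 51st is multiplied by 1/3).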
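The idea can be sketched as a weighted mean in which sessions are ordered by length and each group of equal-length sessions receives weight equal to the number of its rank positions that fall inside the kept interval. This is a hypothetical helper written for illustration, not the notebook's `trim_mean`, and it may differ from that function slightly at the boundaries.

```python
import numpy as np

def length_trimmed_mean(lengths, payoffs, keep=0.9):
    """Mean payoff after trimming the shortest and longest sessions.

    lengths[i] is the number of periods of session i and payoffs[i] its
    average payoff. Sessions of equal length are treated as tied, so a
    group that straddles a boundary contributes its group mean with a
    partial weight.
    """
    lengths = np.asarray(lengths)
    payoffs = np.asarray(payoffs, dtype=float)
    n = lengths.size
    lower = n * (1 - keep) / 2   # rank position where the kept interval starts
    upper = n - lower            # rank position where it ends
    total = 0.0
    start = 0
    for t in np.unique(lengths):         # lengths in ascending order
        mask = lengths == t
        end = start + int(mask.sum())    # group occupies positions [start, end)
        overlap = max(0.0, min(end, upper) - max(start, lower))
        total += overlap * payoffs[mask].mean()
        start = end
    return total / (upper - lower)

# Two sessions of length 1 and two of length 2; keep the central half.
print(length_trimmed_mean([1, 1, 2, 2], [0, 4, 2, 2], keep=0.5))
```

Because the weights are overlaps of rank intervals, they sum to the size of the kept interval by construction, so no separate tie correction is needed.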



In [19]:

    
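def trim_mean(ts_length, aves, width):
    """Mean of aves over the central `width` share of sessions, ordered by length.

    ts_length: array of session lengths; aves[t-1]: average payoff at length t.
    """
    size = ts_length.size
    # Count how many sessions have each length
    hist = {}
    for t in ts_length:
        hist[t] = hist.get(t, 0) + 1

    # Rank positions that bound the kept interval
    lower_b = size * (1-width) / 2
    upper_b = size * (1 - (1-width)/2)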

    s = 0
    total = 0
    for ts, num in sorted(hist.items()):
        old_s = s
        s += num
        if old_s <= lower_b < s:
            total += (s-lower_b) * aves[ts-1]

        elif old_s <= upper_b < s:
            total += (upper_b-old_s+1) * aves[ts-1]

        elif lower_b <= s <= upper_b:
            total += num * aves[ts-1]

        elif s > upper_b:
            break

    return total / (size * width)

rounds = 1000 * 2
strategies = 24
max_ts = ts_length.max()
    
# Load the results
df = pd.read_csv('./contest4/data/set_result.csv', index_col=[0, 1], header=[0, 1])

# Sort by ts_length
ordered_df = df.sort_index(level="ts_length")

# Rows: players; columns: average payoff when ts_length is 1 to max_ts periods
average_matrix = np.zeros((strategies, max_ts), dtype=float)
for t in range(1, max_ts+1):
    df_t = df.iloc[df.index.get_level_values('ts_length') == t]
    for s in range(1, strategies+1):
        average = df_t[str(s)].mean().mean()
        average_matrix[s-1, t-1] = average

for i in range(strategies):
    print(trim_mean(ts_length, average_matrix[i], 0.9))

3.25028511707
3.21096357851
3.23663967616
3.18188857519
3.06308019426
3.19319331055
3.15884430904
3.09671877158
3.03083932669
3.14890921282
3.15402173936
3.1780469466
3.32768294501
3.25616068341
3.15145888378
3.21096357851
3.21302083987
3.35422360163
3.20715433858
3.18669779767
3.23519375673
3.25493501108
3.18058918396
3.10146601311

Str_numbers	Average(session based)	Rank(session based)	Average(stage based)	Rank(stage based)	Average(90% trimmed)	Rank(trimmed)	Notes
18	3.35352416	1	3.219810292	1	3.354223602	1	WSLS'
13	3.326308014	2	3.182248494	2	3.327682945	2	CCDDDD
22	3.259068663	3	3.121244482	5	3.254935011	4
14	3.258886509	4	3.122727237	4	3.256160683	3	WSLS'
1	3.256299103	5	3.132024724	3	3.250285117	5
3	3.240387724	6	3.082433491	9	3.236639676	6	WSLS'
21	3.238405281	7	3.083776638	8	3.235193757	7	WSLS'
2	3.215812884	8	3.054822228	14	3.210963579	9	WSLS
16	3.215812884	9	3.054822228	15	3.210963579	10	WSLS
17	3.215547504	10	3.063675088	11	3.21302084	8	TFT'
19	3.213334955	11	3.06115156	12	3.207154339	11	TFT
6	3.197763649	12	3.056192503	13	3.193193311	12	WSLS'
12	3.197073568	13	3.045809911	16	3.178046947	16
20	3.192768288	14	3.020367533	17	3.186697798	13	WSLS'
4	3.191465329	15	3.086214152	7	3.181888575	14
23	3.188569289	16	3.06617612	10	3.180589184	15
7	3.166979223	17	3.001625671	19	3.158844309	17	TFT'
15	3.161225612	18	2.983885545	21	3.151458884	19	WSLS'
11	3.159787981	19	3.004255063	18	3.154021739	18	TFT'
24	3.158548933	20	2.940998137	23	3.101466013	21
10	3.157508886	21	2.988733446	20	3.148909213	20	TFT'
8	3.121529725	22	3.104081314	6	3.096718772	22	HIST
9	3.088360193	23	2.962527525	22	3.030839327	24	STFT
5	3.072941197	24	2.902704555	24	3.063080194	23

The trimmed means are almost the same as the session-based averages.

Case5: imperfect private monitoring (Oyama seminar + Kandori seminar strategies)

Raw result data (csv): contest5/data
Strategies: user_strategies
Strategy automata: contest5/automaton5.pdf



In [20]:

    
# Return (with noise) a signal of whether the opponent cooperated or defected
def private_signal(actions, random_state):
    pattern = [[0, 0], [0, 1], [1, 0], [1, 1]]
    # For example, if the actual actions are (0, 1), the signal (1, 0) is the most likely
    signal_probs = [[.9, .02, .02, .06], [.02, .06, .9, .02], [.02, .9, .06, .02], [.06, .02, .02, .9]]
    p = random_state.uniform()
    if actions[0] == 0 and actions[1] == 0:
        return [0, 0] if p < 0.9 else [0, 1] if p < 0.92 else [1, 0] if p < 0.94 else [1, 1]
    elif actions[0] == 0 and actions[1] == 1:
        return [1, 0] if p < 0.9 else [0, 0] if p < 0.92 else [1, 1] if p < 0.94 else [0, 1]
    elif actions[0] == 1 and actions[1] == 0:
        return [0, 1] if p < 0.9 else [1, 1] if p < 0.92 else [0, 0] if p < 0.94 else [1, 0]
    elif actions[0] == 1 and actions[1] == 1:
        return [1, 1] if p < 0.9 else [1, 0] if p < 0.92 else [0, 1] if p < 0.94 else [0, 0]
    else:
        raise ValueError

strategies = [Strategy1, Strategy2, Strategy3, Strategy4, Strategy5,
                    Strategy6, Strategy7, Strategy8, Strategy9, Strategy10,
                    Strategy11, Strategy12, Strategy13, Strategy14, Strategy15,
                    Strategy16, Strategy17, Strategy18, Strategy19, Strategy20, 
                    Strategy21, Strategy22, Strategy23, Strategy24, 
                    Iida_iprm, KatoStrategy, Self_Centered_private, ImPrivStrategy,
                    GrimTrigger, MyStrategy, beeleb, OyamaImperfectPrivateMonitoring, ogawa, yamagishi]
    
game = pl.RepeatedMatrixGame(payoff, strategies, signal=private_signal, ts_length=ts_length, repeat=1000)
game.play(mtype="private", random_seed=seed, record=False)

Start
The object has 34 strategy functions below
--------------------------------------------------
1. kandori.Strategy1
2. kandori.Strategy2
3. kandori.Strategy3
4. kandori.Strategy4
5. kandori.Strategy5
6. kandori.Strategy6
7. kandori.Strategy7
8. kandori.Strategy8
9. kandori.Strategy9
10. kandori.Strategy10
11. kandori.Strategy11
12. kandori.Strategy12
13. kandori.Strategy13
14. kandori.Strategy14
15. kandori.Strategy15
16. kandori.Strategy16
17. kandori.Strategy17
18. kandori.Strategy18
19. kandori.Strategy19
20. kandori.Strategy20
21. kandori.Strategy21
22. kandori.Strategy22
23. kandori.Strategy23
24. kandori.Strategy24
25. Iida_imperfect_private.Iida_iprm
26. kato.KatoStrategy
27. ikegami_imperfect_private.Self_Centered_private
28. mhanami_Imperfect_Private_Strategy.ImPrivStrategy
29. tsuyoshi.GrimTrigger
30. gistfile1.MyStrategy
31. beeleb_Strategy.beeleb
32. oyama.OyamaImperfectPrivateMonitoring
33. ogawa.ogawa
34. yamagishi_impd.yamagishi
--------------------------------------------------
Repeats: 1000
Total time series length: 32856

Score table:
Scores averaged with each session given weight 1
[[ 0.     3.703  3.614 ...,  3.45   3.291  3.162]
 [ 3.533  0.     3.686 ...,  3.388  3.225  3.218]
 [ 3.538  3.777  0.    ...,  3.365  3.208  3.212]
 ..., 
 [ 3.249  3.207  3.149 ...,  0.     3.407  3.415]
 [ 3.08   3.169  3.079 ...,  3.506  0.     3.39 ]
 [ 3.074  3.241  3.2   ...,  3.525  3.48   0.   ]]

Scores averaged with each stage game given weight 1
[[ 0.     3.652  3.533 ...,  3.091  2.788  2.79 ]
 [ 3.202  0.     3.641 ...,  3.082  2.566  2.972]
 [ 3.251  3.746  0.    ...,  3.05   2.572  2.944]
 ..., 
 [ 2.834  2.976  2.886 ...,  0.     2.909  3.217]
 [ 2.711  3.082  2.966 ...,  3.134  0.     3.048]
 [ 2.592  2.98   2.919 ...,  3.292  3.008  0.   ]]

Ranking:
1. "ikegami_imperfect_private.Self_Centered_private" -> session-based average: 3.368, stage-based average: 3.220
2. "mhanami_Imperfect_Private_Strategy.ImPrivStrategy" -> session-based average: 3.349, stage-based average: 3.216
3. "Iida_imperfect_private.Iida_iprm" -> session-based average: 3.330, stage-based average: 3.153
4. "kandori.Strategy18" -> session-based average: 3.292, stage-based average: 3.085
5. "kandori.Strategy17" -> session-based average: 3.283, stage-based average: 3.111
6. "kandori.Strategy4" -> session-based average: 3.283, stage-based average: 3.158
7. "kandori.Strategy19" -> session-based average: 3.277, stage-based average: 3.107
8. "yamagishi_impd.yamagishi" -> session-based average: 3.277, stage-based average: 3.107
9. "gistfile1.MyStrategy" -> session-based average: 3.267, stage-based average: 3.116
10. "tsuyoshi.GrimTrigger" -> session-based average: 3.264, stage-based average: 3.085
11. "kandori.Strategy1" -> session-based average: 3.262, stage-based average: 3.077
12. "kandori.Strategy23" -> session-based average: 3.262, stage-based average: 3.107
13. "kandori.Strategy11" -> session-based average: 3.260, stage-based average: 3.099
14. "oyama.OyamaImperfectPrivateMonitoring" -> session-based average: 3.256, stage-based average: 3.087
15. "kandori.Strategy14" -> session-based average: 3.253, stage-based average: 3.058
16. "ogawa.ogawa" -> session-based average: 3.248, stage-based average: 3.083
17. "beeleb_Strategy.beeleb" -> session-based average: 3.246, stage-based average: 3.117
18. "kandori.Strategy8" -> session-based average: 3.240, stage-based average: 3.180
19. "kandori.Strategy22" -> session-based average: 3.231, stage-based average: 3.034
20. "kandori.Strategy6" -> session-based average: 3.225, stage-based average: 3.030
21. "kandori.Strategy7" -> session-based average: 3.225, stage-based average: 3.038
22. "kandori.Strategy13" -> session-based average: 3.224, stage-based average: 3.020
23. "kandori.Strategy21" -> session-based average: 3.216, stage-based average: 3.000
24. "kandori.Strategy10" -> session-based average: 3.214, stage-based average: 3.022
25. "kandori.Strategy3" -> session-based average: 3.207, stage-based average: 2.985
26. "kato.KatoStrategy" -> session-based average: 3.194, stage-based average: 3.033
27. "kandori.Strategy16" -> session-based average: 3.187, stage-based average: 2.962
28. "kandori.Strategy2" -> session-based average: 3.187, stage-based average: 2.962
29. "kandori.Strategy12" -> session-based average: 3.182, stage-based average: 2.985
30. "kandori.Strategy5" -> session-based average: 3.152, stage-based average: 2.947
31. "kandori.Strategy20" -> session-based average: 3.149, stage-based average: 2.918
32. "kandori.Strategy9" -> session-based average: 3.132, stage-based average: 2.986
33. "kandori.Strategy15" -> session-based average: 3.118, stage-based average: 2.887
34. "kandori.Strategy24" -> session-based average: 3.018, stage-based average: 2.755

Summary

Datetime	2015-11-30-18-01-45
Monitoring type	private
RandomSeed	282
Repeats	1000
Average ts_length	32.856
Number of strategies	34
Str_numbers	Strategy name	Average(session based)	Rank(session based)	Average(stage based)	Rank(stage based)	Notes
27	ikegami_imperfect_private.Self_Centered_private	3.36832832	1	3.220361024	1	20%
28	mhanami_Imperfect_Private_Strategy.ImPrivStrategy	3.348553889	2	3.216393297	2	2T2FT
25	Iida_imperfect_private.Iida_iprm	3.32968985	3	3.15335652	5
18	kandori.Strategy18	3.292214224	4	3.084777652	15	WSLS'
17	kandori.Strategy17	3.283194443	5	3.11118305	8	TFT'
4	kandori.Strategy4	3.282928401	6	3.158213803	4
19	kandori.Strategy19	3.277143381	7	3.106785994	10	TFT
34	yamagishi_impd.yamagishi	3.277143381	8	3.106785994	11	TFT
30	gistfile1.MyStrategy	3.266900169	9	3.116454907	7	TFT'
29	tsuyoshi.GrimTrigger	3.26353101	10	3.085325036	14	TFT'
1	kandori.Strategy1	3.261856949	11	3.076891541	17
23	kandori.Strategy23	3.261667595	12	3.107009651	9
11	kandori.Strategy11	3.260228791	13	3.098884204	12	TFT'
32	oyama.OyamaImperfectPrivateMonitoring	3.255871528	14	3.087366082	13	TFT'
14	kandori.Strategy14	3.25337761	15	3.058440504	18	WSLS'
33	ogawa.ogawa	3.247716335	16	3.083111982	16
31	beeleb_Strategy.beeleb	3.245998144	17	3.117079764	6
8	kandori.Strategy8	3.239536248	18	3.180036763	3	HIST
22	kandori.Strategy22	3.231422435	19	3.033872324	20
6	kandori.Strategy6	3.225318551	20	3.030270289	22	WSLS'
7	kandori.Strategy7	3.225300065	21	3.037891239	19	TFT'
13	kandori.Strategy13	3.223658906	22	3.020115324	24	CCDDDD
21	kandori.Strategy21	3.216136749	23	2.999892552	25	WSLS'
10	kandori.Strategy10	3.214150715	24	3.022255517	23	TFT'
3	kandori.Strategy3	3.207011964	25	2.984919502	27	WSLS'
26	kato.KatoStrategy	3.193996582	26	3.032555744	21
16	kandori.Strategy16	3.186837934	27	2.962482753	29	WSLS
2	kandori.Strategy2	3.186837934	28	2.962482753	30	WSLS
12	kandori.Strategy12	3.182024045	29	2.98471429	28
5	kandori.Strategy5	3.151694623	30	2.947040253	31
20	kandori.Strategy20	3.149316992	31	2.918115136	32	WSLS'
9	kandori.Strategy9	3.131695859	32	2.986149848	26	STFT
15	kandori.Strategy15	3.118259636	33	2.886726561	33	WSLS'
24	kandori.Strategy24	3.017841689	34	2.754503121	34

Distribution of session averages by strategy



In [21]:

    
rounds = 1000 * 2
strategies = 34
max_ts = 100

# Load the results
df = pd.read_csv('./contest5/data/set_result.csv', index_col=[0, 1], header=[0, 1])

# Rows: session-average payoffs of the 1000*2 sessions against each opponent; columns: players
average_matrix = np.zeros((rounds*(strategies-1), strategies), dtype=float)

for s in range(1, strategies+1):
    for i, opponent in enumerate(df[str(s)].columns.values):
        average_matrix[i*rounds:(i+1)*rounds, s-1] = df[str(s)][str(opponent)]

averages = np.zeros(strategies, dtype=float)
stds = np.zeros(strategies, dtype=float)
ranking = np.zeros(strategies, dtype=int)
for i in range(strategies):
    averages[i] = average_matrix[:, i].mean()
    stds[i] = average_matrix[:, i].std()
ranking = np.argsort(averages)[::-1]+1

fig, ax = plt.subplots(figsize=(28, 12))
bp = ax.boxplot(average_matrix, notch=False, sym='')
plt.grid()
plt.xlabel('Strategy number')
plt.ylabel('Average payoff per session')
ax.set_xlim([0, strategies+0.5])
ax.set_ylim([-0.1, 5.8])
plt.title('Distribution of session-average payoffs by strategy')
ax.text(0.1, 5.3, "ranking\nave\nstd", ha = 'center', va = 'center', color="black", size=14)
for i in range(strategies):
    ax.text(i+1, 5.3, "{0:.0f}\n{1:.3f}\n{2:.3f}"
            .format(np.where(ranking == i+1)[0][0]+1, averages[i], stds[i]), ha = 'center', va = 'center', color="black", size=14)

plt.show()

Summary statistics

str number	1	2	3	4	5	6	7	8	9	10	11	12
rank	11	28	25	6	30	20	21	18	32	24	13	29
count	66000	66000	66000	66000	66000	66000	66000	66000	66000	66000	66000	66000
mean	3.261857	3.186838	3.207012	3.282928	3.151695	3.225319	3.2253	3.239536	3.131696	3.214151	3.260229	3.182024
std	0.820985	0.901844	0.871601	0.838379	0.971843	0.941597	0.764253	0.904268	0.847703	0.771735	0.815696	0.743275
min	0	0	0	0	0	0	0	0	1	0	0	0
25%	2.67681	2.627406	2.638741	2.736842	2.7	2.863481	2.650262	2.701754	2.407407	2.638889	2.666667	2.6
50%	3.535401	3.514286	3.52	3.6	3.375	3.56697	3.309091	3.578947	3	3.285714	3.51835	3.16
75%	4	4	4	4	4	4	4	4	3.894737	4	4	3.641026
max	4.923077	4.916667	4.909091	4.166667	4.25	4.777778	4.75	4.111111	5	4.75	4.333333	4.818182

str number	13	14	15	16	17	18	19	20	21	22	23	24
rank	22	15	33	27	5	4	7	31	23	19	12	34
count	66000	66000	66000	66000	66000	66000	66000	66000	66000	66000	66000	66000
mean	3.223659	3.253378	3.11826	3.186838	3.283194	3.292214	3.277143	3.149317	3.216137	3.231422	3.261668	3.017842
std	0.734062	0.861139	0.900098	0.901844	0.741322	0.766913	0.738413	0.90132	0.886843	0.927439	0.946034	1.013924
min	0	0	0	0	0	0	0	0	0	0	0	1
25%	2.583333	2.833333	2.585366	2.627406	2.714286	2.72093	2.681818	2.6	2.703019	2.861538	2.918919	2.333333
50%	3.333333	3.555556	3.333333	3.514286	3.457143	3.520833	3.454545	3.416667	3.538462	3.578947	3.627451	3.153846
75%	3.85	4	3.857143	4	4	4	4	3.916667	4	4	4	3.666667
max	4.894737	4.916667	4.909091	4.916667	4.4	4.923077	4.666667	4.916667	4.923077	4.933333	4.2	5

str number	25	26	27	28	29	30	31	32	33	34
rank	3	26	1	2	10	9	17	14	16	8
count	66000	66000	66000	66000	66000	66000	66000	66000	66000	66000
mean	3.32969	3.193997	3.368328	3.348554	3.263531	3.2669	3.245998	3.255872	3.247716	3.277143
std	0.746131	0.760182	0.690123	0.700335	0.857724	0.879561	0.856646	0.793313	0.779358	0.738413
min	0	0	0	0	0	0	0	0	0	0
25%	2.9	2.577778	2.829268	2.888889	2.857143	2.75	2.608696	2.692308	2.680851	2.681818
50%	3.458333	3.214286	3.531915	3.507692	3.555556	3.604167	3.58209	3.47619	3.41791	3.454545
75%	4	3.794118	4	4	4	4	4	4	4	4
max	4.9	4.8	4.923077	4.4	4.923077	4.25	4.1	4.9	4.5	4.666667

Change in average payoff with the number of periods



In [22]:

    
rounds = 1000 * 2
strategies = 34
max_ts = 100

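# Load the results
df = pd.read_csv('./contest5/data/set_result.csv', index_col=[0, 1], header=[0, 1])

# Sort by ts_length (DataFrame.sortlevel is deprecated in favor of sort_index)
ordered_df = df.sort_index(level="ts_length")

# Rows: players; columns: average payoff when ts_length is 1 to 100 periods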
average_matrix = np.zeros((strategies, max_ts), dtype=float)

for t in range(1, max_ts+1):
    df_t = df.iloc[df.index.get_level_values('ts_length') == t]

    for s in range(1, strategies+1):
        average = df_t[str(s)].mean().mean()
        average_matrix[s-1, t-1] = average

fig, ax = plt.subplots(figsize=(20, 10))
plt.title("average payoff trend")
plt.xlabel("ts_length")
plt.ylabel("average payoff")
t_list = [i for i in range(1, max_ts+1)]

highlighted = [27, 28, 19, 18, 13, 9, 8]
for s in range(1, strategies+1):
    # Draw the non-highlighted strategies in gray; the highlighted ones are drawn in color below
    if s not in highlighted:
        plt.plot(t_list, average_matrix[s-1], color='#bbbbbb')

plt.plot(t_list, average_matrix[27-1], color='red', linewidth=2, label="27 (20%)")
plt.plot(t_list, average_matrix[28-1], color='blue', linewidth=2, label="28 (2T2FT)")
plt.plot(t_list, average_matrix[19-1], color='magenta', linewidth=2, label="19 (TFT)")
plt.plot(t_list, average_matrix[18-1], color='green', linewidth=2, label="18 (WSLS')")
plt.plot(t_list, average_matrix[13-1], color='purple', linewidth=2, label="13 (CCDDDD)")
plt.plot(t_list, average_matrix[9-1], color='brown', linewidth=2, label="9 (STFT)")
plt.plot(t_list, average_matrix[8-1], color='orange', linewidth=2, label="8 (HIST)")

plt.legend()
plt.show()

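Trimmed mean

Starting from the session-based average, we exclude the 5% of sessions with the fewest periods and the 5% with the most periods, and average the rest.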
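
The idea can be sketched in a few lines, assuming one entry per session with its length and its average payoff. The names `session_lengths`, `session_payoffs`, and `trimmed_session_mean` are hypothetical and do not appear in the notebook; this is an illustration of the definition, not the notebook's own implementation.

```python
import numpy as np

def trimmed_session_mean(session_lengths, session_payoffs, width=0.9):
    """Mean payoff over the central `width` share of sessions, ordered by length."""
    lengths = np.asarray(session_lengths)
    payoffs = np.asarray(session_payoffs, dtype=float)
    n = lengths.size
    cut = int(round(n * (1 - width) / 2))       # sessions dropped at each end
    order = np.argsort(lengths, kind="stable")  # shortest sessions first
    kept = order[cut:n - cut]
    return payoffs[kept].mean()

# Toy data (illustrative only): 20 sessions whose payoff rises with length
lengths = np.arange(1, 21)
payoffs = lengths / 10.0
print(trimmed_session_mean(lengths, payoffs, width=0.9))
```

Sessions that tie in length at a cut point are kept or dropped by their original order in this sketch, whereas the notebook's `trim_mean` works on per-length averages and weights them by the number of sessions of each length.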



In [23]:

    
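def trim_mean(ts_length, aves, width):
    # ts_length: session lengths; aves[t-1]: average payoff of the sessions of length t;
    # width: share of sessions kept (0.9 drops the shortest 5% and the longest 5%)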
    size = ts_length.size
    hist = {}
    for t in ts_length:
        hist[t] = hist.get(t, 0) + 1

    lower_b = size * (1-width) / 2
    upper_b = size * (1 - (1-width)/2)

    s = 0
    total = 0
    for ts, num in sorted(hist.items()):
        old_s = s
        s += num
        if old_s <= lower_b < s:
            total += (s-lower_b) * aves[ts-1]

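        elif old_s <= upper_b < s:
            # NOTE: the +1 gives this bin one extra session of weight, so the
            # weights sum to size*width + 1 while the divisor below is size*width
            total += (upper_b-old_s+1) * aves[ts-1]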

        elif lower_b <= s <= upper_b:
            total += num * aves[ts-1]

        elif s > upper_b:
            break

    return total / (size * width)


rounds = 1000 * 2
strategies = 34
max_ts = ts_length.max()
    
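# Load the results
df = pd.read_csv('./contest5/data/set_result.csv', index_col=[0, 1], header=[0, 1])

# Sort by ts_length (DataFrame.sortlevel is deprecated in favor of sort_index)
ordered_df = df.sort_index(level="ts_length")

# Rows: players; columns: average payoff when ts_length is 1 to max_ts periods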
average_matrix = np.zeros((strategies, max_ts), dtype=float)
for t in range(1, max_ts+1):
    df_t = df.iloc[df.index.get_level_values('ts_length') == t]
    for s in range(1, strategies+1):
        average = df_t[str(s)].mean().mean()
        average_matrix[s-1, t-1] = average

for i in range(strategies):
    print(trim_mean(ts_length, average_matrix[i], 0.9))









    



3.25836725707
3.18221523428
3.20307270078
3.27601772122
3.14555469052
3.22311223864
3.21824876442
3.2229728734
3.07622321391
3.20662985621
3.25599931942
3.16186412531
3.21869016874
3.25197005584
3.10642827444
3.18221523428
3.28166651639
3.29192924383
3.2717703079
3.14148753929
3.21351405913
3.22710146175
3.25718310891
2.95367429804
3.33064159671
3.17322680745
3.36569290401
3.34637606734
3.26061739471
3.261474645
3.23668854089
3.25168590556
3.24281557534
3.2717703079

Str_numbers	Average (session based)	Rank (session based)	Average (stage based)	Rank (stage based)	Average (90% trimmed)	Rank (trimmed)	Notes
27	3.368	1	3.220	1	3.366	1	20%
28	3.349	2	3.216	2	3.346	2	2T2FT
25	3.330	3	3.153	5	3.331	3
18	3.292	4	3.085	15	3.292	4	WSLS'
17	3.283	5	3.111	8	3.282	5	TFT'
4	3.283	6	3.158	4	3.276	6
19	3.277	7	3.107	10	3.272	8	TFT
34	3.277	8	3.107	11	3.272	7	TFT
30	3.267	9	3.116	7	3.261	9	2TFT'
29	3.264	10	3.085	14	3.261	10	TFT'
1	3.262	11	3.077	17	3.258	11
23	3.262	12	3.107	9	3.257	12
11	3.260	13	3.099	12	3.256	13	TFT'
32	3.256	14	3.087	13	3.252	15	TFT'
14	3.253	15	3.058	18	3.252	14	WSLS'
33	3.248	16	3.083	16	3.243	16
31	3.246	17	3.117	6	3.237	17
8	3.240	18	3.180	3	3.223	20	HIST
22	3.231	19	3.034	20	3.227	18
6	3.225	20	3.030	22	3.223	19	WSLS'
7	3.225	21	3.038	19	3.218	22	TFT'
13	3.224	22	3.020	24	3.219	21	CCDDDD
21	3.216	23	3.000	25	3.214	23	WSLS'
10	3.214	24	3.022	23	3.207	24	TFT'
3	3.207	25	2.985	27	3.203	25	WSLS'
26	3.194	26	3.033	21	3.173	28
16	3.187	27	2.962	29	3.182	27	WSLS
2	3.187	28	2.962	30	3.182	26	WSLS
12	3.182	29	2.985	28	3.162	29
5	3.152	30	2.947	31	3.146	30
20	3.149	31	2.918	32	3.141	31	WSLS'
9	3.132	32	2.986	26	3.076	33	STFT
15	3.118	33	2.887	33	3.106	32	WSLS'
24	3.018	34	2.755	34	2.954	34

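The trimmed means are almost identical to the session-based averages.

Verification

Verification 1: The relationship among TFT, WSLS, and ALLD

In the imperfect private monitoring experiments, many of the strategies resembled WSLS or TFT. We therefore run experiments with just three types of strategy, these two plus ALLD, and use them to try to explain the results of Experiments 4 and 5.

TFT×2, WSLS×2

First, we run an imperfect private monitoring experiment with four strategies: two TFT and two WSLS.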



In [24]:

    
class TFT(object):
    def __init__(self, random_state=None):
        if random_state is None:
            random_state = np.random.RandomState()
        self.random_state = random_state
        self.signal = 0

    def play(self):
        return self.signal

    def get_signal(self, signal):
        self.signal = signal


class WSLS(object):
    def __init__(self, random_state=None):
        if random_state is None:
            random_state = np.random.RandomState()
        self.random_state = random_state
        self.my_action = 0
        self.signal = 0

    def play(self):
        if self.signal == 1:
            self.my_action = 1 - self.my_action
            return self.my_action
        else:
            return self.my_action

    def get_signal(self, signal):
        self.signal = signal
        

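# Return (with noise) each player's signal about the *opponent's* action: cooperate (0) or defect (1)
def private_signal(actions, random_state):
    pattern = [[0, 0], [0, 1], [1, 0], [1, 1]]
    # For example, if the actual actions are (0, 1), the signal is most likely (1, 0)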
    signal_probs = [[.9, .02, .02, .06], [.02, .06, .9, .02], [.02, .9, .06, .02], [.06, .02, .02, .9]]
    p = random_state.uniform()
    if actions[0] == 0 and actions[1] == 0:
        return [0, 0] if p < 0.9 else [0, 1] if p < 0.92 else [1, 0] if p < 0.94 else [1, 1]
    elif actions[0] == 0 and actions[1] == 1:
        return [1, 0] if p < 0.9 else [0, 0] if p < 0.92 else [1, 1] if p < 0.94 else [0, 1]
    elif actions[0] == 1 and actions[1] == 0:
        return [0, 1] if p < 0.9 else [1, 1] if p < 0.92 else [0, 0] if p < 0.94 else [1, 0]
    elif actions[0] == 1 and actions[1] == 1:
        return [1, 1] if p < 0.9 else [1, 0] if p < 0.92 else [0, 1] if p < 0.94 else [0, 0]
    else:
        raise ValueError

strategies = [TFT, TFT, WSLS, WSLS]

game = pl.RepeatedMatrixGame(payoff, strategies, signal=private_signal, ts_length=ts_length, repeat=1000)
game.play(mtype="private", random_seed=seed, record=False)









    



Start
The object has 4 strategy functions below
--------------------------------------------------
1. __main__.TFT
2. __main__.TFT
3. __main__.WSLS
4. __main__.WSLS
--------------------------------------------------
Repeats: 1000
Total time series length: 32856

Score table:
各セッションを重率1で平均した得点
[[ 0.     3.225  3.241  3.241]
 [ 3.225  0.     3.241  3.241]
 [ 3.218  3.218  0.     3.783]
 [ 3.218  3.218  3.783  0.   ]]

各ステージゲームを重率1で平均した得点
[[ 0.     2.965  2.98   2.98 ]
 [ 2.965  0.     2.98   2.98 ]
 [ 2.972  2.972  0.     3.756]
 [ 2.972  2.972  3.756  0.   ]]

Ranking:
1. "__main__.WSLS" -> セッションを重率1で平均: 3.406, ステージゲームを重率1で平均: 3.233
2. "__main__.WSLS" -> セッションを重率1で平均: 3.406, ステージゲームを重率1で平均: 3.233
3. "__main__.TFT" -> セッションを重率1で平均: 3.236, ステージゲームを重率1で平均: 2.975
4. "__main__.TFT" -> セッションを重率1で平均: 3.236, ステージゲームを重率1で平均: 2.975

Score table (session average)

	TFT	TFT	WSLS	WSLS
TFT	0	3.225	3.241	3.241
TFT	3.225	0	3.241	3.241
WSLS	3.218	3.218	0	3.783
WSLS	3.218	3.218	3.783	0

WSLS: session average 3.406
TFT: session average 3.236

So WSLS earns the higher average payoff. Why?

TFT:
WSLS:

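(∵) Both WSLS and TFT keep cooperating as long as no erroneous signal is received. We classify the matches into three cases:

1. TFT vs TFT
2. WSLS vs TFT
3. WSLS vs WSLS

1. TFT vs TFT
If both players receive an erroneous signal in the same period, both keep playing D from the next period on (until the next erroneous signal).
If only one player receives an erroneous signal, the two players take turns from the next period on: one plays D while the other plays C, and they swap every period.
2. WSLS vs TFT
If both players receive an erroneous signal in the same period, then from the next period on TFT plays D→D→C→D→D→C→… and WSLS plays D→C→D→D→C→D→…, and play stays in this cycle.
If only one player receives an erroneous signal, play settles into the same cycle.
3. WSLS vs WSLS
If both players receive an erroneous signal in the same period, both defect once and then return to cooperation.
If only one player receives an erroneous signal, play goes (C, D)→(D, D)→(C, C) and again returns to cooperation.

In other words, WSLS vs WSLS is robust to signal errors, so two copies of the strategy find it easy to cooperate with each other, and the score in that match is what pushed up the WSLS average payoff. Indeed, the score table shows that only case 3 has a markedly higher average payoff (the scores in cases 1 and 2 are nearly the same).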
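
The paths described above can be checked with a small noise-free trace. This is a sketch: `tft_next`, `wsls_next`, and `trace` are hypothetical helpers that restate the update rules of the `TFT` and `WSLS` classes, with 0 = C and 1 = D. Both players start from C, observe the given (possibly wrong) first signals, and receive correct signals afterwards.

```python
def tft_next(my_action, signal):
    # TFT plays whatever the last signal about the opponent was
    return signal

def wsls_next(my_action, signal):
    # Same rule as the WSLS class: switch after a bad signal, otherwise stay
    return 1 - my_action if signal == 1 else my_action

def trace(rule_a, rule_b, first_signals, periods=7):
    """Action pairs (player A first) after both start at C and see `first_signals`."""
    a, b = 0, 0
    sig_a, sig_b = first_signals
    path = []
    for _ in range(periods):
        a, b = rule_a(a, sig_a), rule_b(b, sig_b)
        path.append("CD"[a] + "CD"[b])
        sig_a, sig_b = b, a  # from here on each player sees the true action
    return " ".join(path)

print("TFT  vs TFT,  both wrong:", trace(tft_next, tft_next, (1, 1)))
print("TFT  vs TFT,  one wrong: ", trace(tft_next, tft_next, (1, 0)))
print("WSLS vs TFT,  both wrong:", trace(wsls_next, tft_next, (1, 1)))
print("WSLS vs TFT,  one wrong: ", trace(wsls_next, tft_next, (1, 0)))
print("WSLS vs WSLS, both wrong:", trace(wsls_next, wsls_next, (1, 1)))
print("WSLS vs WSLS, one wrong: ", trace(wsls_next, wsls_next, (0, 1)))
```

Only the WSLS vs WSLS traces end in repeated `CC`; the other pairings stay in mutual defection or in a cycle until another signal error occurs.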

TFT×2, WSLS×2, ALLD

Next, we add ALLD to the experiment above and run an imperfect private monitoring experiment with five strategies: two TFT, two WSLS, and one ALLD.



In [25]:

    
class ALLD(object):
    def __init__(self, random_state=None):
        if random_state is None:
            random_state = np.random.RandomState()
        self.random_state = random_state

    def play(self):
        return 1

    def get_signal(self, signal):
        pass

    
strategies = [TFT, TFT, WSLS, WSLS, ALLD]
game = pl.RepeatedMatrixGame(payoff, strategies, signal=private_signal, ts_length=ts_length, repeat=1000)
game.play(mtype="private", random_seed=seed, record=False)









    



Start
The object has 5 strategy functions below
--------------------------------------------------
1. __main__.TFT
2. __main__.TFT
3. __main__.WSLS
4. __main__.WSLS
5. __main__.ALLD
--------------------------------------------------
Repeats: 1000
Total time series length: 32856

Score table:
各セッションを重率1で平均した得点
[[ 0.     3.225  3.241  3.241  1.625]
 [ 3.225  0.     3.241  3.241  1.625]
 [ 3.218  3.218  0.     3.783  0.932]
 [ 3.218  3.218  3.783  0.     0.932]
 [ 2.562  2.562  3.602  3.602  0.   ]]

各ステージゲームを重率1で平均した得点
[[ 0.     2.965  2.98   2.98   1.78 ]
 [ 2.965  0.     2.98   2.98   1.78 ]
 [ 2.972  2.972  0.     3.756  0.984]
 [ 2.972  2.972  3.756  0.     0.984]
 [ 2.33   2.33   3.524  3.524  0.   ]]

Ranking:
1. "__main__.ALLD" -> セッションを重率1で平均: 3.082, ステージゲームを重率1で平均: 2.927
2. "__main__.TFT" -> セッションを重率1で平均: 2.833, ステージゲームを重率1で平均: 2.676
3. "__main__.TFT" -> セッションを重率1で平均: 2.833, ステージゲームを重率1で平均: 2.676
4. "__main__.WSLS" -> セッションを重率1で平均: 2.788, ステージゲームを重率1で平均: 2.671
5. "__main__.WSLS" -> セッションを重率1で平均: 2.788, ステージゲームを重率1で平均: 2.671

Score table (session average):

	TFT	TFT	WSLS	WSLS	ALLD
TFT	0	3.225	3.241	3.241	1.625
TFT	3.225	0	3.241	3.241	1.625
WSLS	3.218	3.218	0	3.783	0.932
WSLS	3.218	3.218	3.783	0	0.932
ALLD	2.562	2.562	3.602	3.602	0

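ALLD: session average 3.082
TFT: session average 2.833
WSLS: session average 2.788

Consider the matches of TFT and WSLS against ALLD.

TFT vs ALLD
Unless an erroneous signal occurs, TFT plays C→D→D→D→….
If TFT receives an erroneous (good) signal, it cooperates for one period and then goes back to defecting. As a result, TFT earns somewhat less than ALLD.
WSLS vs ALLD
Unless an erroneous signal occurs, WSLS keeps repeating C→D→C→D→…. ALLD therefore extracts a large payoff from WSLS.

Because ALLD is a major weakness of WSLS, TFT ranks relatively higher in this game than in the case with only WSLS and TFT.

Based on these two results, we now interpret the results of Experiments 4 and 5.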
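
The size of this effect can be worked out without noise. The sketch below uses the payoff table from the top of the notebook (4 for mutual cooperation, 2 for mutual defection, 5 and 0 for unilateral defection); `PAYOFF`, `noise_free_averages`, `tft_path`, and `wsls_path` are hypothetical names introduced for illustration.

```python
# Row player's stage payoff, indexed [own action][opponent action]; 0 = C, 1 = D
PAYOFF = [[4, 0], [5, 2]]

def noise_free_averages(own_path, periods):
    """Average payoffs (player, ALLD) when the player follows `own_path` against ALLD."""
    player_total = 0
    alld_total = 0
    for t in range(periods):
        a = own_path(t)
        player_total += PAYOFF[a][1]  # ALLD always plays D
        alld_total += PAYOFF[1][a]
    return player_total / float(periods), alld_total / float(periods)

def tft_path(t):
    return 0 if t == 0 else 1   # C, D, D, D, ...

def wsls_path(t):
    return t % 2                # C, D, C, D, ...

print("TFT  vs ALLD:", noise_free_averages(tft_path, 100))
print("WSLS vs ALLD:", noise_free_averages(wsls_path, 100))
```

Without noise WSLS averages 1.0 against ALLD while ALLD averages 3.5, close to the simulated stage-game averages of 0.984 and 3.524 in the score table above. TFT loses only the first period.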

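Experiment 4

In Experiment 4, of the 24 strategies, 6 resembled TFT, 9 resembled WSLS, and 1 resembled ALLD.
First place went to a WSLS-type strategy and second place to the ALLD-type strategy; overall, WSLS-type strategies earned high payoffs and TFT-type strategies low payoffs. Aggregating the average payoffs by strategy type gives:

Strategy type	Average payoff (session average)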
WSLS	3.230509666
TFT	3.166907904
ALLD	3.326308014

To reproduce this, we run the experiment with TFT×6, WSLS×9, and ALLD×1:



In [26]:

    
strategies = [TFT] * 6 + [WSLS] * 9 + [ALLD]
game = pl.RepeatedMatrixGame(payoff, strategies, signal=private_signal, ts_length=ts_length, repeat=1000)
game.play(mtype="private", random_seed=seed, record=False)









    



Start
The object has 16 strategy functions below
--------------------------------------------------
1. __main__.TFT
2. __main__.TFT
3. __main__.TFT
4. __main__.TFT
5. __main__.TFT
6. __main__.TFT
7. __main__.WSLS
8. __main__.WSLS
9. __main__.WSLS
10. __main__.WSLS
11. __main__.WSLS
12. __main__.WSLS
13. __main__.WSLS
14. __main__.WSLS
15. __main__.WSLS
16. __main__.ALLD
--------------------------------------------------
Repeats: 1000
Total time series length: 32856

Score table:
各セッションを重率1で平均した得点
[[ 0.     3.225  3.225  3.225  3.225  3.225  3.241  3.241  3.241  3.241
   3.241  3.241  3.241  3.241  3.241  1.625]
 [ 3.225  0.     3.225  3.225  3.225  3.225  3.241  3.241  3.241  3.241
   3.241  3.241  3.241  3.241  3.241  1.625]
 [ 3.225  3.225  0.     3.225  3.225  3.225  3.241  3.241  3.241  3.241
   3.241  3.241  3.241  3.241  3.241  1.625]
 [ 3.225  3.225  3.225  0.     3.225  3.225  3.241  3.241  3.241  3.241
   3.241  3.241  3.241  3.241  3.241  1.625]
 [ 3.225  3.225  3.225  3.225  0.     3.225  3.241  3.241  3.241  3.241
   3.241  3.241  3.241  3.241  3.241  1.625]
 [ 3.225  3.225  3.225  3.225  3.225  0.     3.241  3.241  3.241  3.241
   3.241  3.241  3.241  3.241  3.241  1.625]
 [ 3.218  3.218  3.218  3.218  3.218  3.218  0.     3.783  3.783  3.783
   3.783  3.783  3.783  3.783  3.783  0.932]
 [ 3.218  3.218  3.218  3.218  3.218  3.218  3.783  0.     3.783  3.783
   3.783  3.783  3.783  3.783  3.783  0.932]
 [ 3.218  3.218  3.218  3.218  3.218  3.218  3.783  3.783  0.     3.783
   3.783  3.783  3.783  3.783  3.783  0.932]
 [ 3.218  3.218  3.218  3.218  3.218  3.218  3.783  3.783  3.783  0.     3.783
   3.783  3.783  3.783  3.783  0.932]
 [ 3.218  3.218  3.218  3.218  3.218  3.218  3.783  3.783  3.783  3.783  0.
   3.783  3.783  3.783  3.783  0.932]
 [ 3.218  3.218  3.218  3.218  3.218  3.218  3.783  3.783  3.783  3.783
   3.783  0.     3.783  3.783  3.783  0.932]
 [ 3.218  3.218  3.218  3.218  3.218  3.218  3.783  3.783  3.783  3.783
   3.783  3.783  0.     3.783  3.783  0.932]
 [ 3.218  3.218  3.218  3.218  3.218  3.218  3.783  3.783  3.783  3.783
   3.783  3.783  3.783  0.     3.783  0.932]
 [ 3.218  3.218  3.218  3.218  3.218  3.218  3.783  3.783  3.783  3.783
   3.783  3.783  3.783  3.783  0.     0.932]
 [ 2.562  2.562  2.562  2.562  2.562  2.562  3.602  3.602  3.602  3.602
   3.602  3.602  3.602  3.602  3.602  0.   ]]

各ステージゲームを重率1で平均した得点
[[ 0.     2.965  2.965  2.965  2.965  2.965  2.98   2.98   2.98   2.98
   2.98   2.98   2.98   2.98   2.98   1.78 ]
 [ 2.965  0.     2.965  2.965  2.965  2.965  2.98   2.98   2.98   2.98
   2.98   2.98   2.98   2.98   2.98   1.78 ]
 [ 2.965  2.965  0.     2.965  2.965  2.965  2.98   2.98   2.98   2.98
   2.98   2.98   2.98   2.98   2.98   1.78 ]
 [ 2.965  2.965  2.965  0.     2.965  2.965  2.98   2.98   2.98   2.98
   2.98   2.98   2.98   2.98   2.98   1.78 ]
 [ 2.965  2.965  2.965  2.965  0.     2.965  2.98   2.98   2.98   2.98
   2.98   2.98   2.98   2.98   2.98   1.78 ]
 [ 2.965  2.965  2.965  2.965  2.965  0.     2.98   2.98   2.98   2.98
   2.98   2.98   2.98   2.98   2.98   1.78 ]
 [ 2.972  2.972  2.972  2.972  2.972  2.972  0.     3.756  3.756  3.756
   3.756  3.756  3.756  3.756  3.756  0.984]
 [ 2.972  2.972  2.972  2.972  2.972  2.972  3.756  0.     3.756  3.756
   3.756  3.756  3.756  3.756  3.756  0.984]
 [ 2.972  2.972  2.972  2.972  2.972  2.972  3.756  3.756  0.     3.756
   3.756  3.756  3.756  3.756  3.756  0.984]
 [ 2.972  2.972  2.972  2.972  2.972  2.972  3.756  3.756  3.756  0.     3.756
   3.756  3.756  3.756  3.756  0.984]
 [ 2.972  2.972  2.972  2.972  2.972  2.972  3.756  3.756  3.756  3.756  0.
   3.756  3.756  3.756  3.756  0.984]
 [ 2.972  2.972  2.972  2.972  2.972  2.972  3.756  3.756  3.756  3.756
   3.756  0.     3.756  3.756  3.756  0.984]
 [ 2.972  2.972  2.972  2.972  2.972  2.972  3.756  3.756  3.756  3.756
   3.756  3.756  0.     3.756  3.756  0.984]
 [ 2.972  2.972  2.972  2.972  2.972  2.972  3.756  3.756  3.756  3.756
   3.756  3.756  3.756  0.     3.756  0.984]
 [ 2.972  2.972  2.972  2.972  2.972  2.972  3.756  3.756  3.756  3.756
   3.756  3.756  3.756  3.756  0.     0.984]
 [ 2.33   2.33   2.33   2.33   2.33   2.33   3.524  3.524  3.524  3.524
   3.524  3.524  3.524  3.524  3.524  0.   ]]

Ranking:
1. "__main__.WSLS" -> セッションを重率1で平均: 3.367, ステージゲームを重率1で平均: 3.257
2. "__main__.WSLS" -> セッションを重率1で平均: 3.367, ステージゲームを重率1で平均: 3.257
3. "__main__.WSLS" -> セッションを重率1で平均: 3.367, ステージゲームを重率1で平均: 3.257
4. "__main__.WSLS" -> セッションを重率1で平均: 3.367, ステージゲームを重率1で平均: 3.257
5. "__main__.WSLS" -> セッションを重率1で平均: 3.367, ステージゲームを重率1で平均: 3.257
6. "__main__.WSLS" -> セッションを重率1で平均: 3.367, ステージゲームを重率1で平均: 3.257
7. "__main__.WSLS" -> セッションを重率1で平均: 3.367, ステージゲームを重率1で平均: 3.257
8. "__main__.WSLS" -> セッションを重率1で平均: 3.367, ステージゲームを重率1で平均: 3.257
9. "__main__.WSLS" -> セッションを重率1で平均: 3.367, ステージゲームを重率1で平均: 3.257
10. "__main__.ALLD" -> セッションを重率1で平均: 3.186, ステージゲームを重率1で平均: 3.046
11. "__main__.TFT" -> セッションを重率1で平均: 3.128, ステージゲームを重率1で平均: 2.895
12. "__main__.TFT" -> セッションを重率1で平均: 3.128, ステージゲームを重率1で平均: 2.895
13. "__main__.TFT" -> セッションを重率1で平均: 3.128, ステージゲームを重率1で平均: 2.895
14. "__main__.TFT" -> セッションを重率1で平均: 3.128, ステージゲームを重率1で平均: 2.895
15. "__main__.TFT" -> セッションを重率1で平均: 3.128, ステージゲームを重率1で平均: 2.895
16. "__main__.TFT" -> セッションを重率1で平均: 3.128, ステージゲームを重率1で平均: 2.895

WSLS: session average 3.367
ALLD: session average 3.186
TFT: session average 3.128

This is because the high payoff WSLS earns in WSLS vs WSLS matches outweighs the payoff it loses to ALLD.
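
Because every player of a given type earns the same pairwise scores, the ranking follows directly from the composition of the population. The sketch below assumes a round robin in which each player's score is the unweighted average over its opponents; `PAIR` and `type_averages` are hypothetical names, and the numbers in `PAIR` are the session-average scores printed in the score table above.

```python
# PAIR[row type][column type]: the row type's session-average score against the column type
PAIR = {
    "TFT":  {"TFT": 3.225, "WSLS": 3.241, "ALLD": 1.625},
    "WSLS": {"TFT": 3.218, "WSLS": 3.783, "ALLD": 0.932},
    "ALLD": {"TFT": 2.562, "WSLS": 3.602},
}

def type_averages(counts):
    """Average score of one player of each type, given the number of players per type."""
    n_opponents = sum(counts.values()) - 1
    result = {}
    for me in counts:
        total = 0.0
        for other, n_other in counts.items():
            k = n_other - 1 if other == me else n_other  # a player does not meet itself
            if k > 0:
                total += k * PAIR[me][other]
        result[me] = total / n_opponents
    return result

for name, value in sorted(type_averages({"TFT": 6, "WSLS": 9, "ALLD": 1}).items()):
    print(name, round(value, 3))
```

With 6 TFT, 9 WSLS, and 1 ALLD this reproduces the session averages in the Ranking output above (3.367 for WSLS, 3.186 for ALLD, 3.128 for TFT), up to the rounding of the pairwise scores. Changing the counts shows how the ranking shifts with the share of each type.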

Aggregating the score table by strategy type gives:

Averages by type
	WSLS	TFT	ALLD	Other kandori	total average
WSLS	3.121197699	3.1719659	1.597288599	3.241082345	3.230509666
TFT	3.156388186	2.745320385	2.102810448	3.289310268	3.16691979
ALLD	3.568115816	2.766558346		3.474086488	3.326308014
Other kandori	2.843888455	2.874368906	1.404262478	2.482345208	2.773649588

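The 8 Kandori-seminar strategies that are not of WSLS, TFT, or ALLD type (Other kandori) do not affect the three types very much. The three types alone therefore approximate the original experiment.

In general, WSLS earns a high payoff in an environment with many WSLS players and few ALLD players.
Strategy 18 in particular is more robust against ALLD than plain WSLS, which is probably why it took first place.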

Strategy18:

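Experiment 5

In Experiment 5, of the 34 strategies, 11 resembled TFT, 9 resembled WSLS, and 1 resembled ALLD.
First place went to the strategy "play D if at least 20% of past signals are B, otherwise play C", and second place to 2T2FT. Overall, TFT-type strategies earned high payoffs, while ALLD-type and WSLS-type strategies earned low payoffs.

The payoffs aggregated by type are:

Strategy type	Average payoff (session average)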
WSLS	3.20392351
TFT	3.254883021
ALLD	3.223658906

As in the case of Experiment 4, we run the experiment with TFT×11, WSLS×9, and ALLD×1:



In [27]:

    
strategies = [TFT] * 11 + [WSLS] * 9 + [ALLD]
game = pl.RepeatedMatrixGame(payoff, strategies, signal=private_signal, ts_length=ts_length, repeat=1000)
game.play(mtype="private", random_seed=seed, record=False)









    



Start
The object has 21 strategy functions below
--------------------------------------------------
1. __main__.TFT
2. __main__.TFT
3. __main__.TFT
4. __main__.TFT
5. __main__.TFT
6. __main__.TFT
7. __main__.TFT
8. __main__.TFT
9. __main__.TFT
10. __main__.TFT
11. __main__.TFT
12. __main__.WSLS
13. __main__.WSLS
14. __main__.WSLS
15. __main__.WSLS
16. __main__.WSLS
17. __main__.WSLS
18. __main__.WSLS
19. __main__.WSLS
20. __main__.WSLS
21. __main__.ALLD
--------------------------------------------------
Repeats: 1000
Total time series length: 32856

Score table:
各セッションを重率1で平均した得点
[[ 0.     3.225  3.225  3.225  3.225  3.225  3.225  3.225  3.225  3.225
   3.225  3.241  3.241  3.241  3.241  3.241  3.241  3.241  3.241  3.241
   1.625]
 [ 3.225  0.     3.225  3.225  3.225  3.225  3.225  3.225  3.225  3.225
   3.225  3.241  3.241  3.241  3.241  3.241  3.241  3.241  3.241  3.241
   1.625]
 [ 3.225  3.225  0.     3.225  3.225  3.225  3.225  3.225  3.225  3.225
   3.225  3.241  3.241  3.241  3.241  3.241  3.241  3.241  3.241  3.241
   1.625]
 [ 3.225  3.225  3.225  0.     3.225  3.225  3.225  3.225  3.225  3.225
   3.225  3.241  3.241  3.241  3.241  3.241  3.241  3.241  3.241  3.241
   1.625]
 [ 3.225  3.225  3.225  3.225  0.     3.225  3.225  3.225  3.225  3.225
   3.225  3.241  3.241  3.241  3.241  3.241  3.241  3.241  3.241  3.241
   1.625]
 [ 3.225  3.225  3.225  3.225  3.225  0.     3.225  3.225  3.225  3.225
   3.225  3.241  3.241  3.241  3.241  3.241  3.241  3.241  3.241  3.241
   1.625]
 [ 3.225  3.225  3.225  3.225  3.225  3.225  0.     3.225  3.225  3.225
   3.225  3.241  3.241  3.241  3.241  3.241  3.241  3.241  3.241  3.241
   1.625]
 [ 3.225  3.225  3.225  3.225  3.225  3.225  3.225  0.     3.225  3.225
   3.225  3.241  3.241  3.241  3.241  3.241  3.241  3.241  3.241  3.241
   1.625]
 [ 3.225  3.225  3.225  3.225  3.225  3.225  3.225  3.225  0.     3.225
   3.225  3.241  3.241  3.241  3.241  3.241  3.241  3.241  3.241  3.241
   1.625]
 [ 3.225  3.225  3.225  3.225  3.225  3.225  3.225  3.225  3.225  0.     3.225
   3.241  3.241  3.241  3.241  3.241  3.241  3.241  3.241  3.241  1.625]
 [ 3.225  3.225  3.225  3.225  3.225  3.225  3.225  3.225  3.225  3.225  0.
   3.241  3.241  3.241  3.241  3.241  3.241  3.241  3.241  3.241  1.625]
 [ 3.218  3.218  3.218  3.218  3.218  3.218  3.218  3.218  3.218  3.218
   3.218  0.     3.783  3.783  3.783  3.783  3.783  3.783  3.783  3.783
   0.932]
 [ 3.218  3.218  3.218  3.218  3.218  3.218  3.218  3.218  3.218  3.218
   3.218  3.783  0.     3.783  3.783  3.783  3.783  3.783  3.783  3.783
   0.932]
 [ 3.218  3.218  3.218  3.218  3.218  3.218  3.218  3.218  3.218  3.218
   3.218  3.783  3.783  0.     3.783  3.783  3.783  3.783  3.783  3.783
   0.932]
 [ 3.218  3.218  3.218  3.218  3.218  3.218  3.218  3.218  3.218  3.218
   3.218  3.783  3.783  3.783  0.     3.783  3.783  3.783  3.783  3.783
   0.932]
 [ 3.218  3.218  3.218  3.218  3.218  3.218  3.218  3.218  3.218  3.218
   3.218  3.783  3.783  3.783  3.783  0.     3.783  3.783  3.783  3.783
   0.932]
 [ 3.218  3.218  3.218  3.218  3.218  3.218  3.218  3.218  3.218  3.218
   3.218  3.783  3.783  3.783  3.783  3.783  0.     3.783  3.783  3.783
   0.932]
 [ 3.218  3.218  3.218  3.218  3.218  3.218  3.218  3.218  3.218  3.218
   3.218  3.783  3.783  3.783  3.783  3.783  3.783  0.     3.783  3.783
   0.932]
 [ 3.218  3.218  3.218  3.218  3.218  3.218  3.218  3.218  3.218  3.218
   3.218  3.783  3.783  3.783  3.783  3.783  3.783  3.783  0.     3.783
   0.932]
 [ 3.218  3.218  3.218  3.218  3.218  3.218  3.218  3.218  3.218  3.218
   3.218  3.783  3.783  3.783  3.783  3.783  3.783  3.783  3.783  0.     0.932]
 [ 2.562  2.562  2.562  2.562  2.562  2.562  2.562  2.562  2.562  2.562
   2.562  3.602  3.602  3.602  3.602  3.602  3.602  3.602  3.602  3.602  0.   ]]

各ステージゲームを重率1で平均した得点
[[ 0.     2.965  2.965  2.965  2.965  2.965  2.965  2.965  2.965  2.965
   2.965  2.98   2.98   2.98   2.98   2.98   2.98   2.98   2.98   2.98
   1.78 ]
 [ 2.965  0.     2.965  2.965  2.965  2.965  2.965  2.965  2.965  2.965
   2.965  2.98   2.98   2.98   2.98   2.98   2.98   2.98   2.98   2.98
   1.78 ]
 [ 2.965  2.965  0.     2.965  2.965  2.965  2.965  2.965  2.965  2.965
   2.965  2.98   2.98   2.98   2.98   2.98   2.98   2.98   2.98   2.98
   1.78 ]
 [ 2.965  2.965  2.965  0.     2.965  2.965  2.965  2.965  2.965  2.965
   2.965  2.98   2.98   2.98   2.98   2.98   2.98   2.98   2.98   2.98
   1.78 ]
 [ 2.965  2.965  2.965  2.965  0.     2.965  2.965  2.965  2.965  2.965
   2.965  2.98   2.98   2.98   2.98   2.98   2.98   2.98   2.98   2.98
   1.78 ]
 [ 2.965  2.965  2.965  2.965  2.965  0.     2.965  2.965  2.965  2.965
   2.965  2.98   2.98   2.98   2.98   2.98   2.98   2.98   2.98   2.98
   1.78 ]
 [ 2.965  2.965  2.965  2.965  2.965  2.965  0.     2.965  2.965  2.965
   2.965  2.98   2.98   2.98   2.98   2.98   2.98   2.98   2.98   2.98
   1.78 ]
 [ 2.965  2.965  2.965  2.965  2.965  2.965  2.965  0.     2.965  2.965
   2.965  2.98   2.98   2.98   2.98   2.98   2.98   2.98   2.98   2.98
   1.78 ]
 [ 2.965  2.965  2.965  2.965  2.965  2.965  2.965  2.965  0.     2.965
   2.965  2.98   2.98   2.98   2.98   2.98   2.98   2.98   2.98   2.98
   1.78 ]
 [ 2.965  2.965  2.965  2.965  2.965  2.965  2.965  2.965  2.965  0.     2.965
   2.98   2.98   2.98   2.98   2.98   2.98   2.98   2.98   2.98   1.78 ]
 [ 2.965  2.965  2.965  2.965  2.965  2.965  2.965  2.965  2.965  2.965  0.
   2.98   2.98   2.98   2.98   2.98   2.98   2.98   2.98   2.98   1.78 ]
 [ 2.972  2.972  2.972  2.972  2.972  2.972  2.972  2.972  2.972  2.972
   2.972  0.     3.756  3.756  3.756  3.756  3.756  3.756  3.756  3.756
   0.984]
 [ 2.972  2.972  2.972  2.972  2.972  2.972  2.972  2.972  2.972  2.972
   2.972  3.756  0.     3.756  3.756  3.756  3.756  3.756  3.756  3.756
   0.984]
 [ 2.972  2.972  2.972  2.972  2.972  2.972  2.972  2.972  2.972  2.972
   2.972  3.756  3.756  0.     3.756  3.756  3.756  3.756  3.756  3.756
   0.984]
 [ 2.972  2.972  2.972  2.972  2.972  2.972  2.972  2.972  2.972  2.972
   2.972  3.756  3.756  3.756  0.     3.756  3.756  3.756  3.756  3.756
   0.984]
 [ 2.972  2.972  2.972  2.972  2.972  2.972  2.972  2.972  2.972  2.972
   2.972  3.756  3.756  3.756  3.756  0.     3.756  3.756  3.756  3.756
   0.984]
 [ 2.972  2.972  2.972  2.972  2.972  2.972  2.972  2.972  2.972  2.972
   2.972  3.756  3.756  3.756  3.756  3.756  0.     3.756  3.756  3.756
   0.984]
 [ 2.972  2.972  2.972  2.972  2.972  2.972  2.972  2.972  2.972  2.972
   2.972  3.756  3.756  3.756  3.756  3.756  3.756  0.     3.756  3.756
   0.984]
 [ 2.972  2.972  2.972  2.972  2.972  2.972  2.972  2.972  2.972  2.972
   2.972  3.756  3.756  3.756  3.756  3.756  3.756  3.756  0.     3.756
   0.984]
 [ 2.972  2.972  2.972  2.972  2.972  2.972  2.972  2.972  2.972  2.972
   2.972  3.756  3.756  3.756  3.756  3.756  3.756  3.756  3.756  0.     0.984]
 [ 2.33   2.33   2.33   2.33   2.33   2.33   2.33   2.33   2.33   2.33
   2.33   3.524  3.524  3.524  3.524  3.524  3.524  3.524  3.524  3.524  0.   ]]

Ranking:
1. "__main__.WSLS" -> セッションを重率1で平均: 3.330, ステージゲームを重率1で平均: 3.186
2. "__main__.WSLS" -> セッションを重率1で平均: 3.330, ステージゲームを重率1で平均: 3.186
3. "__main__.WSLS" -> セッションを重率1で平均: 3.330, ステージゲームを重率1で平均: 3.186
4. "__main__.WSLS" -> セッションを重率1で平均: 3.330, ステージゲームを重率1で平均: 3.186
5. "__main__.WSLS" -> セッションを重率1で平均: 3.330, ステージゲームを重率1で平均: 3.186
6. "__main__.WSLS" -> セッションを重率1で平均: 3.330, ステージゲームを重率1で平均: 3.186
7. "__main__.WSLS" -> セッションを重率1で平均: 3.330, ステージゲームを重率1で平均: 3.186
8. "__main__.WSLS" -> セッションを重率1で平均: 3.330, ステージゲームを重率1で平均: 3.186
9. "__main__.WSLS" -> セッションを重率1で平均: 3.330, ステージゲームを重率1で平均: 3.186
10. "__main__.TFT" -> セッションを重率1で平均: 3.152, ステージゲームを重率1で平均: 2.913
11. "__main__.TFT" -> セッションを重率1で平均: 3.152, ステージゲームを重率1で平均: 2.913
12. "__main__.TFT" -> セッションを重率1で平均: 3.152, ステージゲームを重率1で平均: 2.913
13. "__main__.TFT" -> セッションを重率1で平均: 3.152, ステージゲームを重率1で平均: 2.913
14. "__main__.TFT" -> セッションを重率1で平均: 3.152, ステージゲームを重率1で平均: 2.913
15. "__main__.TFT" -> セッションを重率1で平均: 3.152, ステージゲームを重率1で平均: 2.913
16. "__main__.TFT" -> セッションを重率1で平均: 3.152, ステージゲームを重率1で平均: 2.913
17. "__main__.TFT" -> セッションを重率1で平均: 3.152, ステージゲームを重率1で平均: 2.913
18. "__main__.TFT" -> セッションを重率1で平均: 3.152, ステージゲームを重率1で平均: 2.913
19. "__main__.TFT" -> セッションを重率1で平均: 3.152, ステージゲームを重率1で平均: 2.913
20. "__main__.TFT" -> セッションを重率1で平均: 3.152, ステージゲームを重率1で平均: 2.913
21. "__main__.ALLD" -> セッションを重率1で平均: 3.030, ステージゲームを重率1で平均: 2.867

WSLS: session average 3.330
TFT: session average 3.152
ALLD: session average 3.030

This differs from the result of Experiment 5.

Session averages by type
	WSLS	TFT	ALLD	Other kandori	Other oyama	total average
WSLS	3.121197699	3.294184513	1.597288599	3.241082345	2.844703858	3.20392351
TFT	3.217917263	3.135976169	1.999373767	3.280819084	3.198661754	3.254883021
ALLD	3.568115816	2.947953446		3.474086488	2.80950435	3.223658906
Other kandori	3.281705328	3.384906732	1.61449243	2.850495713	2.977608253	3.203621498
Other oyama	3.290522046	3.369423514	2.085169256	3.355591673	2.611990254	3.277145846

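Re-aggregating the score table by strategy type shows that the reason Experiment 5 gave "WSLS payoff < TFT payoff" lies in the remaining 5 Oyama-seminar strategies that are not of WSLS, TFT, or ALLD type.

Verification 2: Does the strategy "play D if at least XX% of the entire past history is B" reliably earn a high payoff?

Prob := the strategy "play D if at least XX% of the entire past history is B" ranks consistently high in the experiments under every monitoring type. Why?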

TFT:
WSLS:

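Consider the case where every player's strategy is one of TFT, WSLS, ALLD, or Prob.

Prob vs TFT
Both keep cooperating unless an erroneous signal occurs.
If only Prob receives an erroneous signal, then depending on the current period, play may move to a path on which both keep playing D.
The cases where only TFT receives an erroneous signal, or where both do, are much the same.
Prob vs WSLS
Both keep cooperating unless an erroneous signal occurs.
If Prob receives an erroneous signal, then depending on the current period, Prob moves to a path on which it keeps playing D. WSLS then moves to a path on which it alternates between C and D, so Prob can take a large payoff from WSLS (just as ALLD has the advantage over WSLS).
Moreover, if only WSLS, or both players, receive an erroneous signal, then regardless of the current period WSLS keeps playing D for a while, after which play always moves to the pattern described above.
Prob vs ALLD
Play settles into a pattern in which both play D almost all the time.
Prob vs Prob
Both keep cooperating unless an erroneous signal occurs.
If an erroneous signal occurs in an early period, play moves to a pattern of mutual D. Otherwise both keep playing C.

In other words, Prob earns roughly the same payoff as its opponent against TFT, ALLD, and other Prob players, and in addition earns a large payoff against WSLS. So when every player's strategy falls into one of these four types, Prob is a reasonable strategy, whatever the percentage at which it switches to D. Which strategy actually takes first place, however, depends on the share of each type among all strategies.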
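
The dependence on the current period comes from the share of bad signals being computed over the whole history. The class below is a hypothetical sketch of such a threshold strategy, written against the same `play` / `get_signal` interface as the `TFT` and `WSLS` classes; it is not the code of the strategy that was actually submitted, and cooperating in the first period is an assumption.

```python
class Prob(object):
    """Hypothetical threshold strategy: play D while the share of bad signals is at least `threshold`."""
    def __init__(self, threshold=0.2, random_state=None):
        self.threshold = threshold
        self.random_state = random_state
        self.bad = 0     # number of bad signals (1) observed so far
        self.total = 0   # number of signals observed so far

    def play(self):
        if self.total == 0:
            return 0  # assumption: cooperate in the first period
        # Compare counts directly to avoid integer division
        return 1 if self.bad >= self.threshold * self.total else 0

    def get_signal(self, signal):
        self.total += 1
        self.bad += signal


early = Prob()
early.get_signal(1)          # a bad signal in the first period: share 1/1
late = Prob()
for s in [0] * 9 + [1]:      # the same bad signal after nine good ones: share 1/10
    late.get_signal(s)
print(early.play(), late.play())
```

A single bad signal early in a session pushes the share over the threshold, while the same signal later in a long cooperative history does not.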

Averages by type
	WSLS	TFT	ALLD	Prob	Other kandori	Other oyama	total average
WSLS	3.121197699	3.294184513	1.597288599	2.559611894	3.241082345	2.915976849	3.20392351
TFT	3.217917263	3.135976169	1.999373767	2.99366014	3.280819084	3.249912158	3.254883021
ALLD	3.568115816	2.947953446		2.536744069	3.474086488	2.87769442	3.223658906
Prob	3.553529215	3.278944542	2.286598783		3.513118831	3.178283059	3.36832832
Other kandori	3.281705328	3.384906732	1.61449243	2.727741801	2.850495713	3.040074867	3.203621498
Other oyama	3.224770254	3.392043256	2.034811875	2.927390992	3.316209883	2.55100466	3.254350228

Re-aggregating the Experiment 5 score table by type once more shows that Prob earns a reasonable payoff against TFT and ALLD, and against WSLS earns a payoff about as high as ALLD does.


