Prisoner's Dilemma Game Experiment 3, Appendix 1

Each strategy is allowed to play against itself (the case without self-play)

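Experiment overview: README.md

Experiment 1: perfect monitoring
Experiment 2: imperfect public monitoring
Experiment 3: imperfect private monitoring (Oyama seminar strategies)
Experiment 4: imperfect private monitoring (Kandori seminar strategies)
Experiment 5: imperfect private monitoring (Kandori and Oyama seminar strategies)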

Payoff table

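<table align="center", style="text-align:center;">
<tr><th>Own action \ Opponent's action</th><th>Action 0 (active)</th><th>Action 1 (inactive)</th></tr>
<tr><th>Action 0 (active)</th><td>4, 4</td><td>0, 5</td></tr>
<tr><th>Action 1 (inactive)</th><td>5, 0</td><td>2, 2</td></tr>
</table>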
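The payoff table describes a Prisoner's Dilemma: action 1 is better for a player whatever the opponent does, yet both players prefer the outcome (0, 0) to (1, 1). A minimal check of those two properties, using the same matrix that the setup cell later assigns to `payoff` (the variable names `action1_dominates` and `cooperation_is_better` are introduced here for illustration):

```python
import numpy as np

# Row player's payoffs, indexed as payoff[own_action, opponent_action].
payoff = np.array([[4, 0], [5, 2]])

# Action 1 (inactive) pays strictly more than action 0 (active)
# against either opponent action.
action1_dominates = bool(np.all(payoff[1] > payoff[0]))

# Mutual action 0 nevertheless gives each player more than mutual action 1.
cooperation_is_better = bool(payoff[0, 0] > payoff[1, 1])

print(action1_dominates, cooperation_is_better)
```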



In [21]:

    
#-*- encoding: utf-8 -*-
%matplotlib inline
from IPython.display import display, HTML
import matplotlib as mpl
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import numpy as np
import pandas as pd
import scipy.stats as stats
np.set_printoptions(precision=3)
np.set_printoptions(linewidth=300)
pd.set_option('display.max_columns', 30)
pd.set_option('display.width', 400)
pd.set_option('display.precision', 4)
import sys
sys.path.append('./user_strategies')
# Japanese font support
mpl.rcParams['font.family'] = 'Osaka'
plt.rcParams['font.size'] = 14
import play as pl
from Iida_perfect_monitoring import Iida_pm
from Iida_imperfect_public import Iida_ipm
from Iida_imperfect_private import Iida_iprm
from kato import KatoStrategy
from ikegami_perfect import Self_Centered_perfect
from ikegami_imperfect_public import Self_Centered_public
from ikegami_imperfect_private import Self_Centered_private
from mhanami_Public_Strategy import PubStrategy
from mhanami_Imperfect_Public_Strategy import ImPubStrategy
from mhanami_Imperfect_Private_Strategy import ImPrivStrategy
from tsuyoshi import GrimTrigger
from gistfile1 import MyStrategy
from beeleb_Strategy import beeleb
from oyama import OyamaPerfectMonitoring, OyamaImperfectPublicMonitoring, OyamaImperfectPrivateMonitoring
from ogawa import ogawa
from yamagishi_impd import yamagishi
from kandori import *

Test

Testing each strategy



In [6]:

    
import unittest

class TestStrategies(unittest.TestCase):
    def setUp(self):
        self.Strategies = [Iida_pm, Iida_ipm, Iida_iprm, KatoStrategy, Self_Centered_perfect, \
                          Self_Centered_public, Self_Centered_private, PubStrategy, ImPubStrategy, ImPrivStrategy, \
                          MyStrategy, beeleb, OyamaPerfectMonitoring, \
                           OyamaImperfectPublicMonitoring, OyamaImperfectPrivateMonitoring, \
                          ogawa, yamagishi, GrimTrigger, Strategy1, Strategy2, Strategy3, Strategy4, Strategy5,
                    Strategy6, Strategy7, Strategy8, Strategy9, Strategy10,
                    Strategy11, Strategy12, Strategy13, Strategy14, Strategy15,
                    Strategy16, Strategy17, Strategy18, Strategy19, Strategy20, 
                    Strategy21, Strategy22, Strategy23, Strategy24, ] # add your own strategy classes here
        self.case1 = "Signal is empty(period 1)"
        self.case2 = [0, 1]
        self.case3 = [1, 0]
        self.case4 = [0, 1, 0, 1, 0, 0, 1]
        self.seed = 222
        self.RandomState = np.random.RandomState(self.seed)


    # Test with case1 (no signal yet, period 1)
    def test1(self):
        print("testcase:", self.case1)
        for Strategy in self.Strategies:
            rst = Strategy(self.RandomState).play()
            self.assertIsNotNone(rst, Strategy.__module__)
            self.assertIn(rst, (0, 1), Strategy.__module__)

    # Test with the signals in case2
    def test2(self):
        print("testcase:", self.case2)
        for Strategy in self.Strategies:
            S = Strategy(self.RandomState)
            for signal in self.case2:
                rst = S.play()
                S.get_signal(signal)
                self.assertIsNotNone(rst, Strategy.__module__)
                self.assertIn(rst, (0, 1), Strategy.__module__)

    # Test with the signals in case3
    def test3(self):
        print("testcase:", self.case3)
        for Strategy in self.Strategies:
            S = Strategy(self.RandomState)
            for signal in self.case3:
                rst = S.play()
                S.get_signal(signal)
            
            self.assertIsNotNone(rst, S.__module__)
            self.assertIn(rst, (0, 1), S.__module__)

    # Test with the signals in case4
    def test4(self):
        print("testcase:", self.case4)
        for Strategy in self.Strategies:
            S = Strategy(self.RandomState)
            for signal in self.case4:
                rst = S.play()
                S.get_signal(signal)
                self.assertIsNotNone(rst, S.__module__)
                self.assertIn(rst, (0, 1), S.__module__)
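The tests above imply the interface a strategy class must offer: a constructor that takes a `RandomState`, a `play()` method that returns 0 or 1, and a `get_signal(signal)` method that receives the latest signal. A minimal sketch of a class that would satisfy these tests is given below. `TitForTatSketch` is a hypothetical name and is not one of the classes in `user_strategies`; it also assumes the signal is a single 0/1 value, as in the test cases.

```python
import numpy as np

class TitForTatSketch:
    """Hypothetical strategy: start with 0, then repeat the last signal."""

    def __init__(self, random_state=None):
        # Kept for interface compatibility; this deterministic sketch never draws from it.
        self.random_state = random_state if random_state is not None else np.random.RandomState()
        self.last_signal = None

    def play(self):
        # Period 1: no signal has been received yet, so play action 0.
        if self.last_signal is None:
            return 0
        return self.last_signal

    def get_signal(self, signal):
        self.last_signal = signal

# Same call pattern as test4: play, then receive the signal.
s = TitForTatSketch(np.random.RandomState(222))
moves = []
for signal in [0, 1, 0, 1, 0, 0, 1]:
    moves.append(s.play())
    s.get_signal(signal)
print(moves)
```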



In [7]:

    
suite = unittest.TestLoader().loadTestsFromTestCase(TestStrategies)
unittest.TextTestRunner().run(suite)









    



....
testcase: Signal is empty(period 1)
testcase: [0, 1]
testcase: [1, 0]
testcase: [0, 1, 0, 1, 0, 0, 1]
----------------------------------------------------------------------
Ran 4 tests in 0.004s

OK

    Out[7]:

<unittest.runner.TextTestResult run=4 errors=0 failures=0>

Test: OK

Experiment setup



In [82]:

    
payoff = np.array([[4, 0], [5, 2]])
seed = 282
rs = np.random.RandomState(seed)
discount_v = 0.97
repeat = 1000
ts_length = rs.geometric(p=1-discount_v, size=1000)
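Drawing session lengths from a geometric distribution with success probability `1 - discount_v` amounts to ending the game after each period with probability 0.03, so the expected length is 1 / 0.03, about 33.3 periods. The output below reports an average `ts_length` of 32.856. A small sketch of that relationship (the names `expected_length` and `sample_mean` are introduced here for illustration):

```python
import numpy as np

discount_v = 0.97
rs = np.random.RandomState(282)

# Each session continues with probability discount_v after every period.
ts_length = rs.geometric(p=1 - discount_v, size=1000)

# Mean of a geometric distribution with success probability p is 1 / p.
expected_length = 1 / (1 - discount_v)
sample_mean = ts_length.mean()

print(expected_length, sample_mean, ts_length.min())
```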

Case1: perfect monitoring

Case without self-play

Raw result data (csv): contest1/data
Strategies: user_strategies
Strategy automata: contest1/automaton1.pdf



In [9]:

    
strategies = [Iida_pm, PubStrategy, KatoStrategy, Self_Centered_perfect,
                       GrimTrigger, MyStrategy, beeleb, OyamaPerfectMonitoring, ogawa, yamagishi]
game = pl.RepeatedMatrixGame(payoff, strategies, signal=None, ts_length=ts_length, repeat=1000)
game.play(mtype="perfect", random_seed=seed, record=False)









    



Start
The object has 10 strategy functions below
--------------------------------------------------
1. Iida_perfect_monitoring.Iida_pm
2. mhanami_Public_Strategy.PubStrategy
3. kato.KatoStrategy
4. ikegami_perfect.Self_Centered_perfect
5. tsuyoshi.GrimTrigger
6. gistfile1.MyStrategy
7. beeleb_Strategy.beeleb
8. oyama.OyamaPerfectMonitoring
9. ogawa.ogawa
10. yamagishi_impd.yamagishi
--------------------------------------------------
Repeats: 1000
Total time series length: 32856

Score table:
Scores averaged with weight 1 per session
[[ 3.456  3.396  2.514  3.952  3.807  4.146  4.146  3.394  3.588  3.803]
 [ 3.519  4.     2.428  4.     4.     4.     4.     4.     3.315  4.   ]
 [ 2.912  2.234  2.229  2.292  3.408  3.973  3.814  2.234  3.641  2.906]
 [ 3.463  4.     2.46   4.     4.     4.     4.     4.     3.459  4.   ]
 [ 3.292  4.     1.893  4.     4.     4.     4.     4.     3.374  4.   ]
 [ 3.415  4.     2.31   4.     4.     4.     4.     4.     3.479  4.   ]
 [ 3.415  4.     2.416  4.     4.     4.     4.     4.     3.534  4.   ]
 [ 3.518  4.     2.428  4.     4.     4.     4.     4.     3.315  4.   ]
 [ 3.257  3.254  2.501  3.904  3.792  3.897  3.815  3.254  3.612  3.643]
 [ 3.784  4.     2.69   4.     4.     4.     4.     4.     3.595  4.   ]]

Scores averaged with weight 1 per stage game
[[ 2.93   2.794  2.198  3.695  3.627  4.285  4.285  2.788  3.107  3.702]
 [ 3.055  4.     2.17   4.     4.     4.     4.     4.     2.82   4.   ]
 [ 2.491  2.037  2.061  2.066  3.418  3.595  3.211  2.037  3.044  2.524]
 [ 3.009  4.     2.194  4.     4.     4.     4.     4.     2.926  4.   ]
 [ 2.569  4.     1.402  4.     4.     4.     4.     4.     2.708  4.   ]
 [ 2.859  4.     1.93   4.     4.     4.     4.     4.     2.908  4.   ]
 [ 2.859  4.     2.186  4.     4.     4.     4.     4.     3.072  4.   ]
 [ 3.051  4.     2.17   4.     4.     4.     4.     4.     2.82   4.   ]
 [ 2.805  2.767  2.276  3.618  3.684  3.666  3.418  2.767  3.182  3.217]
 [ 3.686  4.     2.418  4.     4.     4.     4.     4.     3.157  4.   ]]
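The two score tables differ only in how sessions of different length are weighted. One reading, which is an interpretation of the table captions rather than something confirmed by the `play` module, is sketched below with made-up per-period payoffs: the session-based score averages within each session first, while the stage-based score pools all stage games so that long sessions count for more.

```python
import numpy as np

# Hypothetical per-period payoffs of one player in three sessions of length 1, 3 and 6.
sessions = [np.array([5.0]), np.array([4.0, 4.0, 4.0]), np.array([2.0] * 6)]

# Session-based: every session gets weight 1, whatever its length.
session_based = np.mean([s.mean() for s in sessions])

# Stage-based: every stage game gets weight 1, so long sessions dominate.
stage_based = np.concatenate(sessions).mean()

print(session_based, stage_based)
```

In this toy data the long session has low payoffs, so the stage-based score is lower than the session-based one, the same direction as in most entries of the tables above.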

Summary

Str No.	Strategy name	Average(session based)	Rank(session based)	Average(stage based)	Rank(stage based)	Notes
Datetime	2015-12-28-05-37-48
Monitoring type	perfect
RandomSeed	282
Repeats	1000
Average ts_length	32.856
Number of strategies	10
10	yamagishi_impd.yamagishi	3.8069881	1	3.726098734	1	TFT
4	ikegami_perfect.Self_Centered_perfect	3.738247097	2	3.612840881	2	30%
7	beeleb_Strategy.beeleb	3.736419018	3	3.611766496	3
2	mhanami_Public_Strategy.PubStrategy	3.72617118	4	3.604507548	4	TFT'
8	oyama.OyamaPerfectMonitoring	3.726026754	5	3.604087533	5	GT
6	gistfile1.MyStrategy	3.720316042	6	3.569763514	6	TFT'
5	tsuyoshi.GrimTrigger	3.655879793	7	3.467905405	7	TFT'
1	Iida_perfect_monitoring.Iida_pm	3.620227933	8	3.341106647	8
9	ogawa.ogawa	3.492903964	9	3.139997261	9
3	kato.KatoStrategy	2.964379561	10	2.648450816	10
average		3.618755944		3.432652484

The session-based ranks of strategies 1 and 5 swapped; otherwise there were no major changes.

Distribution of session averages by strategy

Box plot. Red line: median; blue box: 25th to 75th percentile.



In [41]:

    
rounds = 1000 * 2
strategies = 10
max_ts = 100

# load data
df = pd.read_csv('./contest1/data/set_result.csv', index_col=[0, 1], header=[0, 1])

# rows: average payoffs over 1000*2 sessions per opponent, columns: players (strategies)
average_matrix = np.zeros((rounds*strategies, strategies), dtype=float)
for s in range(1, strategies+1):
    for i, opponent in enumerate(df[str(s)].columns.values):
        average_matrix[i*rounds:(i+1)*rounds, s-1] = df[str(s)][str(opponent)]

# boxplot
averages = np.zeros(strategies, dtype=float)
stds = np.zeros(strategies, dtype=float)
ranking = np.zeros(strategies, dtype=int)
for i in range(strategies):
    averages[i] = average_matrix[:, i].mean()
    stds[i] = average_matrix[:, i].std()
ranking = np.argsort(averages)[::-1]+1

fig, ax = plt.subplots(figsize=(20, 8))
bp = ax.boxplot(average_matrix, 0, '')
plt.grid()
plt.xlabel('Strategy number')
plt.ylabel('Average payoff per session')
ax.set_xlim([0, strategies+0.5])
ax.set_ylim([-0.1, 5.8])
plt.title('Distribution of average payoffs over all sessions, by strategy')
ax.text(0.4, 5.3, "ranking\nave\nstd", ha = 'center', va = 'center', color="black", size=14)
for i in range(strategies):
    ax.text(i+1, 5.3, "{0:.0f}\n{1:.3f}\n{2:.3f}"
            .format(np.where(ranking == i+1)[0][0]+1, averages[i], stds[i]), ha = 'center', va = 'center', color="black", size=14)
plt.show()
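The plotting cell turns average payoffs into ranks with `np.argsort(averages)[::-1]+1` followed by an `np.where` lookup per strategy. The sketch below reproduces that logic on made-up averages and compares it with an equivalent vectorized form; `ranks_where` and `ranks_vectorized` are names introduced here for illustration.

```python
import numpy as np

# Hypothetical average payoffs for strategies 1..4.
averages = np.array([3.62, 3.73, 2.96, 3.74])

# As in the notebook: strategy numbers ordered from best to worst.
order = np.argsort(averages)[::-1] + 1

# Rank of strategy i+1 is its position in that ordering.
ranks_where = [int(np.where(order == i + 1)[0][0] + 1) for i in range(len(averages))]

# Equivalent: write ranks 1..n into the positions given by the descending order.
ranks_vectorized = np.empty(len(averages), dtype=int)
ranks_vectorized[np.argsort(averages)[::-1]] = np.arange(1, len(averages) + 1)

print(order, ranks_where, ranks_vectorized)
```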

Basic statistics



In [39]:

    
# fundamental statistics
a_df = pd.DataFrame(average_matrix, columns=range(1, strategies+1))
statistics = a_df.describe()
# add ranking row
df2 = pd.DataFrame([[np.where(ranking == i+1)[0][0]+1 for i in range(strategies)]],
                   columns=range(1, strategies+1), dtype=int, index=["ranking"])
frames = [df2, statistics]
statistics = pd.concat(frames)
statistics.columns.names = ["Str No."]
display(statistics)

No major change.

Change in average payoff with the number of periods



In [40]:

    
rounds = 1000 * 2
strategies = 10
max_ts = 100

# load data
df = pd.read_csv('./contest1/data/set_result.csv', index_col=[0, 1], header=[0, 1])

# sort by ts_length
ordered_df = df.sort_index(level="ts_length")

# rows: players, columns: average payoff when ts_length is 1 to 100 periods
average_matrix = np.zeros((strategies, max_ts), dtype=float)

for t in range(1, max_ts+1):
    df_t = df.iloc[df.index.get_level_values('ts_length') == t]
    for s in range(1, strategies+1):
        average = df_t[str(s)].mean().mean()
        average_matrix[s-1, t-1] = average

fig, ax = plt.subplots(figsize=(20, 10))
plt.title("average payoff trend")
plt.xlabel("ts_length")
plt.ylabel("average payoff")
t_list = [i for i in range(1, max_ts+1)]

for s in range(1, strategies+1):
    if s in [10, 8, 4]:
        pass
    else:
        average_list = average_matrix[s-1]
        plt.plot(t_list, average_list, color='#bbbbbb')

plt.plot(t_list, average_matrix[10-1], color='red', linewidth=2, label="10 (TFT)")
plt.plot(t_list, average_matrix[8-1], color='blue', linewidth=2, label="8 (GrimTrigger)")
plt.plot(t_list, average_matrix[4-1], color='green', linewidth=2, label="4 (30%)")
plt.legend()
plt.show()
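The loop over `t` and `s` in the cell above averages payoffs by session length and by strategy. The same aggregation can be sketched with `groupby` on a toy frame shaped like the one the cell reads. Only the level name `ts_length` comes from the notebook; the other level names (`session`, `strategy`, `opponent`) and the random data are assumptions made for this example.

```python
import numpy as np
import pandas as pd

rng = np.random.RandomState(0)

# Toy stand-in for set_result.csv: rows are sessions, columns are (strategy, opponent) pairs.
index = pd.MultiIndex.from_arrays(
    [np.arange(6), [1, 1, 2, 2, 3, 3]], names=["session", "ts_length"])
columns = pd.MultiIndex.from_product(
    [["1", "2"], ["1", "2"]], names=["strategy", "opponent"])
toy = pd.DataFrame(rng.uniform(0, 5, size=(6, 4)), index=index, columns=columns)

# Mean payoff per ts_length for every (strategy, opponent) column ...
by_length = toy.groupby(level="ts_length").mean()

# ... then average over opponents to get one column per strategy.
per_strategy = by_length.T.groupby(level="strategy").mean().T

print(per_strategy)
```

Each entry of `per_strategy` corresponds to one `df_t[str(s)].mean().mean()` value in the loop.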

No major change.

Case2: imperfect public monitoring

Case without self-play

Raw result data (csv): contest2/data
Strategies: user_strategies
Strategy automata: contest2/automaton2.pdf



In [32]:

    
# Return whether the project succeeds or fails
def public_signal(actions, random_state):
    prob = random_state.uniform()
    if actions[0] == 0 and actions[1] == 0:
        return 0 if prob < 0.9 else 1
    elif (actions[0] == 0 and actions[1] == 1) or (actions[0] == 1 and actions[1] == 0):
        return 0 if prob < 0.5 else 1
    elif actions[0] == 1 and actions[1] == 1:
        return 0 if prob < 0.2 else 1
    else:
        raise ValueError

strategies = [Iida_ipm, ImPubStrategy, KatoStrategy, Self_Centered_public, GrimTrigger,
              MyStrategy, beeleb, OyamaImperfectPublicMonitoring, ogawa, yamagishi]
game = pl.RepeatedMatrixGame(payoff, strategies, signal=public_signal, ts_length=ts_length, repeat=1000)
game.play(mtype="public", random_seed=seed, record=False)
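The thresholds in `public_signal` imply that signal 0 is drawn with probability 0.9 when both players choose action 0, 0.5 when exactly one does, and 0.2 when neither does. The sketch below writes those probabilities as a lookup table and checks them by simulation. `P_SIGNAL_0` and `public_signal_table` are names introduced here, and unlike the original this version raises `KeyError` rather than `ValueError` on an invalid action profile.

```python
import numpy as np

# P(signal = 0 | action profile), read off the thresholds in public_signal.
P_SIGNAL_0 = {(0, 0): 0.9, (0, 1): 0.5, (1, 0): 0.5, (1, 1): 0.2}

def public_signal_table(actions, random_state):
    """Draw the public signal from the lookup table with a single uniform draw."""
    return 0 if random_state.uniform() < P_SIGNAL_0[tuple(actions)] else 1

# Empirical frequency of signal 0 for every action profile.
rs = np.random.RandomState(0)
freq = {a: np.mean([public_signal_table(a, rs) == 0 for _ in range(20000)])
        for a in P_SIGNAL_0}
print(freq)
```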









    



Start
The object has 10 strategy functions below
--------------------------------------------------
1. Iida_imperfect_public.Iida_ipm
2. mhanami_Imperfect_Public_Strategy.ImPubStrategy
3. kato.KatoStrategy
4. ikegami_imperfect_public.Self_Centered_public
5. tsuyoshi.GrimTrigger
6. gistfile1.MyStrategy
7. beeleb_Strategy.beeleb
8. oyama.OyamaImperfectPublicMonitoring
9. ogawa.ogawa
10. yamagishi_impd.yamagishi
--------------------------------------------------
Repeats: 1000
Total time series length: 32856

Score table:
Scores averaged with weight 1 per session
[[ 3.078  1.283  2.175  3.073  3.767  4.099  4.08   3.118  3.46   3.329]
 [ 3.076  2.     2.473  2.593  4.045  4.412  4.283  3.088  3.678  3.19 ]
 [ 3.005  1.684  2.452  2.824  3.821  4.251  4.185  3.113  3.621  3.202]
 [ 3.117  1.604  2.296  3.396  3.749  4.046  4.014  3.225  3.423  3.369]
 [ 2.743  0.636  1.742  3.017  3.613  3.999  4.062  2.927  3.319  3.42 ]
 [ 2.636  0.392  1.661  3.114  3.576  3.967  3.99   2.912  3.296  3.455]
 [ 2.621  0.478  1.753  3.128  3.539  3.969  3.995  2.954  3.342  3.43 ]
 [ 3.107  1.275  2.253  3.175  3.745  4.071  3.997  3.226  3.511  3.344]
 [ 2.892  0.881  2.005  3.175  3.668  4.049  4.002  3.126  3.542  3.392]
 [ 3.002  1.207  2.043  3.103  3.737  3.959  3.998  3.133  3.362  3.49 ]]

Scores averaged with weight 1 per stage game
[[ 2.677  1.493  1.927  2.67   3.716  4.148  4.027  2.523  2.939  3.105]
 [ 2.761  2.     2.166  2.2    3.898  4.285  3.929  2.488  3.087  2.96 ]
 [ 2.725  1.89   2.229  2.427  3.791  4.238  3.992  2.568  3.125  2.99 ]
 [ 2.759  1.867  2.126  3.291  3.673  4.053  3.943  2.714  2.892  3.194]
 [ 2.177  0.735  1.311  2.705  3.514  3.99   4.053  2.206  2.618  3.303]
 [ 2.009  0.477  1.185  2.87   3.463  3.963  3.983  2.179  2.533  3.352]
 [ 2.058  0.714  1.396  3.039  3.435  3.963  3.993  2.304  2.652  3.346]
 [ 2.817  1.675  2.14   2.97   3.746  4.126  3.923  2.769  3.064  3.155]
 [ 2.571  1.275  1.866  2.931  3.656  4.111  3.942  2.646  3.11   3.2  ]
 [ 2.577  1.36   1.749  2.841  3.684  3.936  3.939  2.544  2.789  3.384]]

Summary

Str No.	Strategy name	Average(session based)	Rank(session based)	Average(stage based)	Rank(stage based)	Notes
Datetime	2015-12-28-05-40-50
Monitoring type	public
RandomSeed	282
Repeats	1000
Average ts_length	32.856
Number of strategies	10
2	mhanami_Imperfect_Public_Strategy.ImPubStrategy	3.283855312	1	2.977218773	4	ALLD
4	ikegami_imperfect_public.Self_Centered_public	3.223912647	2	3.051190041	1	25%
3	kato.KatoStrategy	3.215843985	3	2.997391648	3
8	oyama.OyamaImperfectPublicMonitoring	3.17047641	4	3.038606343	2	GT'
1	Iida_imperfect_public.Iida_ipm	3.146127068	5	2.922405345	6
10	yamagishi_impd.yamagishi	3.103362463	6	2.8803841	7	TFT
9	ogawa.ogawa	3.073304517	7	2.930679328	5
5	tsuyoshi.GrimTrigger	2.947823872	8	2.661101473	9	TFT'
7	beeleb_Strategy.beeleb	2.920922764	9	2.689937911	8
6	gistfile1.MyStrategy	2.900052583	10	2.601404614	10	TFT'
average		3.098568162		2.875031958

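Strategy 2 (ALLD) and strategy 3 (a strategy that plays D periodically) ranked near the top.

Distribution of session averages by strategy

Box plot. Red line: median; blue box: 25th to 75th percentile.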



In [44]:

    
rounds = 1000 * 2
strategies = 10
max_ts = 100

# load data
df = pd.read_csv('./contest2/data/set_result.csv', index_col=[0, 1], header=[0, 1])

# rows: average payoffs over 1000*2 sessions per opponent, columns: players (strategies)
average_matrix = np.zeros((rounds*strategies, strategies), dtype=float)

for s in range(1, strategies+1):
    for i, opponent in enumerate(df[str(s)].columns.values):
        average_matrix[i*rounds:(i+1)*rounds, s-1] = df[str(s)][str(opponent)]

averages = np.zeros(strategies, dtype=float)
stds = np.zeros(strategies, dtype=float)
ranking = np.zeros(strategies, dtype=int)
for i in range(strategies):
    averages[i] = average_matrix[:, i].mean()
    stds[i] = average_matrix[:, i].std()
ranking = np.argsort(averages)[::-1]+1

fig, ax = plt.subplots(figsize=(20, 8))
bp = ax.boxplot(average_matrix, 0, '')
plt.grid()
plt.xlabel('Strategy number')
plt.ylabel('Average payoff per session')
ax.set_xlim([0, strategies+0.5])
ax.set_ylim([-0.1, 5.8])
plt.title('Distribution of average payoffs over all sessions, by strategy')
ax.text(0.4, 5.3, "ranking\nave\nstd", ha = 'center', va = 'center', color="black", size=14)
for i in range(strategies):
    ax.text(i+1, 5.3, "{0:.0f}\n{1:.3f}\n{2:.3f}"
            .format(np.where(ranking == i+1)[0][0]+1, averages[i], stds[i]), ha = 'center', va = 'center', color="black", size=14)

plt.show()

Basic statistics



In [45]:

    
# fundamental statistics
a_df = pd.DataFrame(average_matrix, columns=range(1, strategies+1))
statistics = a_df.describe()
# add ranking row
df2 = pd.DataFrame([[np.where(ranking == i+1)[0][0]+1 for i in range(strategies)]],
                   columns=range(1, strategies+1), dtype=int, index=["ranking"])
frames = [df2, statistics]
statistics = pd.concat(frames)
statistics.columns.names = ["Str No."]
display(statistics)

Unlike Experiment 1, there is no clear relationship between variance and rank.

Change in average payoff with the number of periods



In [46]:

    
rounds = 1000 * 2
strategies = 10
max_ts = 100

# load data
df = pd.read_csv('./contest2/data/set_result.csv', index_col=[0, 1], header=[0, 1])

# sort by ts_length
ordered_df = df.sort_index(level="ts_length")

# rows: players, columns: average payoff when ts_length is 1 to 100 periods
average_matrix = np.zeros((strategies, max_ts), dtype=float)

for t in range(1, max_ts+1):
    df_t = df.iloc[df.index.get_level_values('ts_length') == t]
    for s in range(1, strategies+1):
        average = df_t[str(s)].mean().mean()
        average_matrix[s-1, t-1] = average

fig, ax = plt.subplots(figsize=(20, 10))
plt.title("average payoff trend")
plt.xlabel("ts_length")
plt.ylabel("average payoff")
t_list = [i for i in range(1, max_ts+1)]

for s in range(1, strategies+1):
    if s in [2, 8, 4, 10]:
        pass
    else:
        average_list = average_matrix[s-1]
        plt.plot(t_list, average_list, color='#bbbbbb')

plt.plot(t_list, average_matrix[2-1], color='red', linewidth=2, label="2 (ALLD)")
plt.plot(t_list, average_matrix[4-1], color='green', linewidth=2, label="4 (25%)")
plt.plot(t_list, average_matrix[8-1], color='blue', linewidth=2, label="8 (GT')")
plt.plot(t_list, average_matrix[10-1], color='orange', linewidth=2, label="10 (TFT)")
plt.legend()
plt.show()

The top strategies earn stable average payoffs regardless of session length.
ALLD earns especially high average payoffs in short sessions, which is likely why it ranked first.

Case3: imperfect private monitoring (Oyama seminar strategies only)

Case without self-play

Raw result data (csv): contest3/data
Strategies: user_strategies
Strategy automata: contest3/automaton3.pdf



In [47]:

    
# Return (with noise) whether the "opponent's" signal is cooperative or aggressive
def private_signal(actions, random_state):
    pattern = [[0, 0], [0, 1], [1, 0], [1, 1]]
    # e.g. if the actual actions are (0, 1), the signal is most likely (1, 0)
    signal_probs = [[.9, .02, .02, .06], [.02, .06, .9, .02], [.02, .9, .06, .02], [.06, .02, .02, .9]]
    p = random_state.uniform()
    if actions[0] == 0 and actions[1] == 0:
        return [0, 0] if p < 0.9 else [0, 1] if p < 0.92 else [1, 0] if p < 0.94 else [1, 1]
    elif actions[0] == 0 and actions[1] == 1:
        return [1, 0] if p < 0.9 else [0, 0] if p < 0.92 else [1, 1] if p < 0.94 else [0, 1]
    elif actions[0] == 1 and actions[1] == 0:
        return [0, 1] if p < 0.9 else [1, 1] if p < 0.92 else [0, 0] if p < 0.94 else [1, 0]
    elif actions[0] == 1 and actions[1] == 1:
        return [1, 1] if p < 0.9 else [1, 0] if p < 0.92 else [0, 1] if p < 0.94 else [0, 0]
    else:
        raise ValueError

strategies = [Iida_iprm, ImPrivStrategy, KatoStrategy, Self_Centered_private, GrimTrigger,
              MyStrategy, beeleb, OyamaImperfectPrivateMonitoring, ogawa, yamagishi]
game = pl.RepeatedMatrixGame(payoff, strategies, signal=private_signal, ts_length=ts_length, repeat=1000)
game.play(mtype="private", random_seed=seed, record=False)
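In `private_signal`, the lists `pattern` and `signal_probs` are defined but never used; the chained conditionals encode the same probabilities by hand. A table-driven sketch that samples directly from those two lists is shown below. `private_signal_table` is a name introduced here, and because it calls `RandomState.choice` instead of a single `uniform()` draw, it follows the same distribution but will not reproduce the original's random sequence for a given seed.

```python
import numpy as np

PATTERN = [[0, 0], [0, 1], [1, 0], [1, 1]]
# Row i: distribution over PATTERN for the signal pair when the actions are PATTERN[i].
SIGNAL_PROBS = [[.9, .02, .02, .06], [.02, .06, .9, .02],
                [.02, .9, .06, .02], [.06, .02, .02, .9]]

def private_signal_table(actions, random_state):
    """Draw a pair of private signals from the table above."""
    row = PATTERN.index(list(actions))  # raises ValueError for an invalid profile
    k = random_state.choice(len(PATTERN), p=SIGNAL_PROBS[row])
    return PATTERN[k]

# Empirical check: after actions (0, 1) the signal pair is most often (1, 0).
rs = np.random.RandomState(0)
draws = [private_signal_table([0, 1], rs) for _ in range(20000)]
share_10 = np.mean([d == [1, 0] for d in draws])
print(share_10)
```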









    



Start
The object has 10 strategy functions below
--------------------------------------------------
1. Iida_imperfect_private.Iida_iprm
2. mhanami_Imperfect_Private_Strategy.ImPrivStrategy
3. kato.KatoStrategy
4. ikegami_imperfect_private.Self_Centered_private
5. tsuyoshi.GrimTrigger
6. gistfile1.MyStrategy
7. beeleb_Strategy.beeleb
8. oyama.OyamaImperfectPrivateMonitoring
9. ogawa.ogawa
10. yamagishi_impd.yamagishi
--------------------------------------------------
Repeats: 1000
Total time series length: 32856

Score table:
Scores averaged with weight 1 per session
[[ 3.155  3.476  2.318  2.954  3.56   4.004  3.984  3.414  3.465  3.333]
 [ 3.241  3.803  2.406  3.338  3.626  3.974  3.977  3.582  3.507  3.36 ]
 [ 2.799  3.373  2.318  2.264  3.404  3.963  3.823  3.155  3.538  2.849]
 [ 3.288  3.475  2.398  3.319  3.68   3.84   3.76   3.472  3.267  3.19 ]
 [ 2.922  3.555  1.92   2.801  3.597  3.904  3.823  3.516  3.283  3.479]
 [ 3.11   3.934  2.205  3.255  3.567  3.986  3.998  3.687  3.429  3.656]
 [ 3.126  3.935  2.336  3.402  3.487  3.985  3.999  3.674  3.503  3.679]
 [ 3.227  3.634  2.375  3.249  3.524  3.924  3.89   3.558  3.407  3.415]
 [ 3.122  3.688  2.42   3.09   3.543  3.916  3.832  3.506  3.567  3.39 ]
 [ 3.384  3.422  2.478  3.086  3.656  3.96   4.022  3.525  3.48   3.225]]

Scores averaged with weight 1 per stage game
[[ 2.759  3.172  1.946  2.491  3.394  4.003  3.886  3.075  2.939  3.125]
 [ 2.899  3.625  2.21   3.203  3.526  3.959  3.964  3.363  3.047  3.141]
 [ 2.63   2.803  2.123  2.049  3.392  3.699  3.253  2.733  2.991  2.542]
 [ 2.989  3.268  2.164  3.172  3.554  3.772  3.596  3.183  2.792  2.924]
 [ 2.417  3.345  1.457  2.285  3.479  3.862  3.71   3.313  2.621  3.345]
 [ 2.684  3.902  1.785  3.092  3.439  3.982  3.999  3.609  2.814  3.597]
 [ 2.759  3.911  2.113  3.365  3.336  3.982  3.999  3.535  3.017  3.629]
 [ 2.895  3.399  2.115  3.043  3.365  3.891  3.77   3.336  2.909  3.217]
 [ 2.817  3.267  2.2    2.719  3.452  3.752  3.471  3.134  3.136  3.048]
 [ 3.193  3.172  2.204  2.795  3.543  3.948  3.996  3.292  3.008  2.965]]

Summary

Str No.	Strategy name	Average(session based)	Rank(session based)	Average(stage based)	Rank(stage based)	Notes
Datetime	2015-12-28-05-44-20
Monitoring type	private
RandomSeed	282
Repeats	1000
Average ts_length	32.856
Number of strategies	10
7	beeleb_Strategy.beeleb	3.512547784	1	3.364516374	1
6	gistfile1.MyStrategy	3.482755316	2	3.29020879	3	TFT'
2	mhanami_Imperfect_Private_Strategy.ImPrivStrategy	3.481391663	3	3.293705868	2	2T2FT
10	yamagishi_impd.yamagishi	3.42390276	4	3.211745191	4	TFT
8	oyama.OyamaImperfectPrivateMonitoring	3.420404393	5	3.193991965	5	TFT'
9	ogawa.ogawa	3.407284046	6	3.099491721	7
4	ikegami_imperfect_private.Self_Centered_private	3.368936793	7	3.14127861	6	20%
1	Iida_imperfect_private.Iida_iprm	3.366075611	8	3.079022096	8
5	tsuyoshi.GrimTrigger	3.28006289	9	2.983293767	9	TFT'
3	kato.KatoStrategy	3.148614675	10	2.821551619	10
average		3.389197593		3.1478806

Distribution of session average payoffs by strategy

Box plot. Red line: median; blue box: 25th to 75th percentile.



In [48]:

    
rounds = 1000 * 2
strategies = 10
max_ts = 100

# load data
df = pd.read_csv('./contest3/data/set_result.csv', index_col=[0, 1], header=[0, 1])

# rows: average payoffs over 1000*2 sessions per opponent, columns: players (strategies)
average_matrix = np.zeros((rounds*strategies, strategies), dtype=float)

for s in range(1, strategies+1):
    for i, opponent in enumerate(df[str(s)].columns.values):
        average_matrix[i*rounds:(i+1)*rounds, s-1] = df[str(s)][str(opponent)]

averages = np.zeros(strategies, dtype=float)
stds = np.zeros(strategies, dtype=float)
ranking = np.zeros(strategies, dtype=int)
for i in range(strategies):
    averages[i] = average_matrix[:, i].mean()
    stds[i] = average_matrix[:, i].std()
ranking = np.argsort(averages)[::-1]+1

fig, ax = plt.subplots(figsize=(20, 8))
bp = ax.boxplot(average_matrix, 0, '')
plt.grid()
plt.xlabel('Strategy number')
plt.ylabel('Average payoff per session')
ax.set_xlim([0, strategies+0.5])
ax.set_ylim([-0.1, 5.8])
plt.title('Distribution of average payoffs over all sessions, by strategy')
ax.text(0.4, 5.3, "ranking\nave\nstd", ha = 'center', va = 'center', color="black", size=14)
for i in range(strategies):
    ax.text(i+1, 5.3, "{0:.0f}\n{1:.3f}\n{2:.3f}"
            .format(np.where(ranking == i+1)[0][0]+1, averages[i], stds[i]), ha = 'center', va = 'center', color="black", size=14)

plt.show()

Basic statistics



In [49]:

    
# fundamental statistics
a_df = pd.DataFrame(average_matrix, columns=range(1, strategies+1))
statistics = a_df.describe()
# add ranking row
df2 = pd.DataFrame([[np.where(ranking == i+1)[0][0]+1 for i in range(strategies)]],
                   columns=range(1, strategies+1), dtype=int, index=["ranking"])
frames = [df2, statistics]
statistics = pd.concat(frames)
statistics.columns.names = ["Str No."]
display(statistics)

Change in average payoff with the number of periods



In [50]:

    
rounds = 1000 * 2
strategies = 10
max_ts = 100

# load data
df = pd.read_csv('./contest3/data/set_result.csv', index_col=[0, 1], header=[0, 1])

# sort by ts_length
ordered_df = df.sort_index(level="ts_length")

# rows: players, columns: average payoff when ts_length is 1 to 100 periods
average_matrix = np.zeros((strategies, max_ts), dtype=float)

for t in range(1, max_ts+1):
    df_t = df.iloc[df.index.get_level_values('ts_length') == t]
    for s in range(1, strategies+1):
        average = df_t[str(s)].mean().mean()
        average_matrix[s-1, t-1] = average

fig, ax = plt.subplots(figsize=(20, 10))
plt.title("average payoff trend")
plt.xlabel("ts_length")
plt.ylabel("average payoff")
t_list = [i for i in range(1, max_ts+1)]

for s in range(1, strategies+1):
    if s in [2, 7, 4, 10]:
        pass
    else:
        average_list = average_matrix[s-1]
        plt.plot(t_list, average_list, color='#bbbbbb')

plt.plot(t_list, average_matrix[7-1], color='red', linewidth=2, label="7")
plt.plot(t_list, average_matrix[10-1], color='orange', linewidth=2, label="10 (TFT)")
plt.plot(t_list, average_matrix[2-1], color='blue', linewidth=2, label="2 (2T2FT)")
plt.plot(t_list, average_matrix[4-1], color='green', linewidth=2, label="4 (20%)")
plt.legend()
plt.show()

As sessions get longer, cooperation becomes harder to sustain. This generally happens when TFT plays against TFT (discussed later).
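A rough way to see why TFT pairs lose cooperation under noisy private signals is to simulate them. The sketch below assumes a simplified TFT (play 0 first, then repeat the last signal received about the opponent), which is not necessarily how the submitted strategies behave, and it samples signals from the same joint distribution as `private_signal`. The function name `tft_mutual_cooperation` and its parameters are introduced here for illustration.

```python
import numpy as np

PATTERN = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
SIGNAL_PROBS = np.array([[.9, .02, .02, .06], [.02, .06, .9, .02],
                         [.02, .9, .06, .02], [.06, .02, .02, .9]])

def tft_mutual_cooperation(n_sessions=2000, n_periods=50, seed=0):
    """Share of sessions in which both TFT players choose action 0, per period."""
    rng = np.random.RandomState(seed)
    cum = np.cumsum(SIGNAL_PROBS, axis=1)
    actions = np.zeros((n_sessions, 2), dtype=int)  # both players open with action 0
    rate = np.zeros(n_periods)
    for t in range(n_periods):
        rate[t] = np.mean((actions == 0).all(axis=1))
        row = 2 * actions[:, 0] + actions[:, 1]  # index of each action profile in PATTERN
        u = rng.uniform(size=n_sessions)
        # Inverse-CDF draw of a signal pair for every session.
        k = np.minimum((u[:, None] >= cum[row]).sum(axis=1), 3)
        # TFT: each player's next action is its own signal about the opponent.
        actions = PATTERN[k]
    return rate

rate = tft_mutual_cooperation()
print(rate[:5], rate[-1])
```

In this sketch a single misread signal starts a run of retaliation that the players cannot easily undo, so the mutual cooperation rate falls well below its starting value as the periods go on.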

Case4: imperfect private monitoring (Kandori seminar strategies only)

Case without self-play

Raw result data (csv): contest4/data
Strategies: user_strategies
Strategy automata: contest4/automaton4.pdf



In [51]:

    
# Return (with noise) whether the "opponent's" signal is cooperative or aggressive
def private_signal(actions, random_state):
    pattern = [[0, 0], [0, 1], [1, 0], [1, 1]]
    # e.g. if the actual actions are (0, 1), the signal is most likely (1, 0)
    signal_probs = [[.9, .02, .02, .06], [.02, .06, .9, .02], [.02, .9, .06, .02], [.06, .02, .02, .9]]
    p = random_state.uniform()
    if actions[0] == 0 and actions[1] == 0:
        return [0, 0] if p < 0.9 else [0, 1] if p < 0.92 else [1, 0] if p < 0.94 else [1, 1]
    elif actions[0] == 0 and actions[1] == 1:
        return [1, 0] if p < 0.9 else [0, 0] if p < 0.92 else [1, 1] if p < 0.94 else [0, 1]
    elif actions[0] == 1 and actions[1] == 0:
        return [0, 1] if p < 0.9 else [1, 1] if p < 0.92 else [0, 0] if p < 0.94 else [1, 0]
    elif actions[0] == 1 and actions[1] == 1:
        return [1, 1] if p < 0.9 else [1, 0] if p < 0.92 else [0, 1] if p < 0.94 else [0, 0]
    else:
        raise ValueError

strategies = [Strategy1, Strategy2, Strategy3, Strategy4, Strategy5,
                    Strategy6, Strategy7, Strategy8, Strategy9, Strategy10,
                    Strategy11, Strategy12, Strategy13, Strategy14, Strategy15,
                    Strategy16, Strategy17, Strategy18, Strategy19, Strategy20, 
                    Strategy21, Strategy22, Strategy23, Strategy24]
    
game = pl.RepeatedMatrixGame(payoff, strategies, signal=private_signal, ts_length=ts_length, repeat=1000)
game.play(mtype="private", random_seed=seed, record=False)









    



Start
The object has 24 strategy functions below
--------------------------------------------------
1. kandori.Strategy1
2. kandori.Strategy2
3. kandori.Strategy3
4. kandori.Strategy4
5. kandori.Strategy5
6. kandori.Strategy6
7. kandori.Strategy7
8. kandori.Strategy8
9. kandori.Strategy9
10. kandori.Strategy10
11. kandori.Strategy11
12. kandori.Strategy12
13. kandori.Strategy13
14. kandori.Strategy14
15. kandori.Strategy15
16. kandori.Strategy16
17. kandori.Strategy17
18. kandori.Strategy18
19. kandori.Strategy19
20. kandori.Strategy20
21. kandori.Strategy21
22. kandori.Strategy22
23. kandori.Strategy23
24. kandori.Strategy24
--------------------------------------------------
Repeats: 1000
Total time series length: 32856

Score table:
Scores averaged with weight 1 per session
[[ 3.347  3.703  3.614  3.793  3.834  3.909  3.041  3.817  2.363  3.03   3.513  1.965  1.682  3.667  3.171  3.703  3.388  2.67   3.162  3.396  3.69   3.774  4.     2.011]
 [ 3.533  3.783  3.686  3.59   3.66   3.935  3.075  3.598  2.473  3.066  3.64   1.645  1.577  3.718  3.124  3.783  3.335  2.484  3.218  3.414  3.766  3.799  3.972  1.873]
 [ 3.538  3.777  3.688  3.617  3.688  3.946  3.075  3.596  2.419  3.059  3.613  1.707  1.651  3.719  3.17   3.777  3.29   2.652  3.212  3.436  3.767  3.81   3.981  2.027]
 [ 3.156  3.161  3.161  3.997  3.636  3.748  3.43   3.981  3.177  3.379  3.895  1.902  1.671  3.501  2.671  3.161  3.768  3.17   3.654  2.874  3.272  3.474  3.983  1.577]
 [ 2.397  3.332  3.203  4.065  3.798  3.754  3.314  3.91   3.007  3.278  3.836  1.875  1.241  3.365  2.747  3.332  3.615  1.892  3.434  3.025  3.341  3.382  4.053  1.28 ]
 [ 3.076  3.599  3.498  3.921  3.649  3.908  3.346  3.678  3.113  3.292  3.825  1.911  1.284  3.633  2.869  3.599  3.596  2.328  3.575  3.174  3.609  3.627  3.981  1.367]
 [ 2.855  3.146  3.079  4.021  3.748  3.782  3.067  3.957  2.502  3.05   3.715  2.044  2.073  3.458  2.948  3.146  3.379  2.808  3.148  3.038  3.239  3.644  3.998  2.063]
 [ 3.17   3.096  3.028  3.991  3.655  3.627  3.416  3.983  3.187  3.363  3.871  2.018  1.535  3.278  2.69   3.096  3.755  2.682  3.656  2.858  3.146  3.199  3.982  1.496]
 [ 2.515  2.761  2.66   4.073  3.802  3.938  2.87   4.078  2.517  2.874  3.702  2.125  2.278  3.435  2.886  2.761  3.426  2.341  2.798  2.82   2.903  3.554  4.064  2.369]
 [ 2.839  3.135  3.062  4.027  3.773  3.764  3.067  3.948  2.503  3.048  3.712  2.03   2.052  3.427  2.95   3.135  3.364  2.783  3.133  3.034  3.21   3.611  4.006  2.057]
 [ 3.131  3.1    3.048  4.009  3.564  3.819  3.334  3.978  2.997  3.285  3.8    1.96   1.972  3.539  2.716  3.1    3.626  2.826  3.497  2.862  3.211  3.626  3.952  1.525]
 [ 2.889  3.347  3.242  4.056  4.07   3.471  2.785  3.801  2.218  2.816  3.361  2.248  2.359  3.25   3.326  3.347  2.865  2.781  2.684  3.322  3.314  3.275  4.07   2.884]
 [ 3.448  3.601  3.491  3.479  4.124  4.059  2.846  3.682  2.264  2.875  3.028  2.24   2.394  3.655  3.518  3.601  2.865  2.989  2.722  3.574  3.624  3.945  4.124  2.75 ]
 [ 3.321  3.687  3.589  3.825  3.693  3.929  3.233  3.609  2.86   3.186  3.736  1.889  1.548  3.68   3.065  3.687  3.476  2.574  3.447  3.305  3.674  3.738  3.987  1.896]
 [ 3.111  3.578  3.428  3.704  3.885  3.891  3.034  3.609  2.52   3.019  3.579  1.761  1.606  3.46   3.125  3.578  3.206  2.153  3.079  3.327  3.556  3.653  4.022  1.952]
 [ 3.533  3.783  3.686  3.59   3.66   3.935  3.075  3.598  2.473  3.066  3.64   1.645  1.577  3.718  3.124  3.783  3.335  2.484  3.218  3.414  3.766  3.799  3.972  1.873]
 [ 3.256  3.248  3.184  3.962  3.682  3.705  3.235  3.963  2.856  3.199  3.706  2.159  2.08   3.499  2.908  3.248  3.513  2.986  3.368  3.038  3.334  3.401  3.978  1.96 ]
 [ 3.489  3.69   3.605  3.86   3.939  4.001  3.026  3.584  2.284  3.027  3.462  1.981  1.985  3.73   3.403  3.69   3.215  3.508  3.104  3.525  3.697  3.923  4.047  2.865]
 [ 3.074  3.241  3.2    3.987  3.631  3.829  3.091  3.982  2.46   3.07   3.695  2.09   2.163  3.597  2.941  3.241  3.425  2.97   3.225  3.057  3.349  3.791  3.942  2.081]
 [ 3.324  3.696  3.58   3.624  3.787  3.921  3.059  3.583  2.5    3.041  3.604  1.713  1.584  3.593  3.155  3.696  3.242  2.256  3.129  3.391  3.684  3.734  3.999  1.929]
 [ 3.484  3.764  3.677  3.672  3.673  3.939  3.146  3.612  2.557  3.108  3.678  1.716  1.564  3.705  3.123  3.764  3.401  2.551  3.3    3.397  3.751  3.788  3.981  1.885]
 [ 3.454  3.765  3.634  3.868  3.657  3.927  3.298  3.547  2.947  3.249  3.761  1.905  1.348  3.64   3.055  3.765  3.423  2.26   3.552  3.372  3.734  3.795  3.972  1.826]
 [ 2.873  3.548  3.442  3.991  3.639  3.835  3.426  3.973  3.175  3.371  3.888  1.875  1.24   3.584  2.747  3.548  3.765  2.303  3.628  3.07   3.558  3.578  3.985  1.279]
 [ 3.314  3.652  3.358  3.342  4.149  4.069  2.751  3.41   2.316  2.774  3.377  1.67   1.84   3.346  3.428  3.652  2.832  1.753  2.674  3.523  3.578  3.689  4.149  2.874]]

Scores averaged with equal weight on each stage game
[[ 2.979  3.652  3.533  3.575  3.9    3.908  2.762  3.428  2.42   2.771  3.176  1.852  1.442  3.597  3.189  3.652  3.017  2.232  2.79   3.345  3.64   3.775  4.019  2.362]
 [ 3.202  3.756  3.641  3.351  3.587  3.915  2.869  2.897  2.652  2.869  3.491  1.348  1.173  3.666  3.065  3.756  3.047  1.9    2.972  3.362  3.743  3.775  3.963  2.017]
 [ 3.251  3.746  3.637  3.379  3.625  3.921  2.86   2.902  2.592  2.848  3.44   1.426  1.272  3.655  3.091  3.746  2.983  2.172  2.944  3.366  3.729  3.785  3.97   2.193]
 [ 2.667  2.946  2.92   3.995  3.56   3.621  3.355  3.903  3.456  3.295  3.856  1.746  1.419  3.367  2.571  2.946  3.662  3.024  3.592  2.695  3.089  3.417  3.979  1.897]
 [ 1.615  3.21   3.02   4.074  3.755  3.645  3.211  3.485  3.225  3.174  3.752  1.713  0.798  3.191  2.63   3.21   3.45   1.169  3.331  2.887  3.205  3.28   4.061  1.428]
 [ 2.482  3.561  3.427  3.874  3.575  3.873  3.234  3.101  3.379  3.176  3.744  1.737  0.85   3.573  2.785  3.561  3.357  1.609  3.503  3.09   3.57   3.598  3.974  1.532]
 [ 2.354  2.903  2.823  3.986  3.685  3.693  2.861  3.613  2.683  2.854  3.562  1.878  1.806  3.31   2.813  2.903  3.121  2.458  2.911  2.846  3.026  3.593  3.985  2.232]
 [ 2.819  3.076  2.971  3.963  3.644  3.553  3.231  3.914  3.345  3.182  3.748  1.963  1.553  3.186  2.86   3.076  3.549  2.476  3.49   2.929  3.109  3.285  3.982  2.404]
 [ 2.205  2.711  2.606  3.957  3.627  3.82   2.762  3.756  2.659  2.756  3.483  1.961  1.952  3.398  2.772  2.711  3.127  2.223  2.726  2.745  2.879  3.667  3.966  2.327]
 [ 2.318  2.902  2.812  3.99   3.713  3.667  2.861  3.575  2.677  2.855  3.563  1.857  1.777  3.27   2.811  2.902  3.098  2.441  2.903  2.84   2.987  3.555  3.996  2.226]
 [ 2.687  2.805  2.72   4.013  3.448  3.727  3.201  3.839  3.19   3.153  3.71   1.815  1.752  3.383  2.575  2.805  3.424  2.374  3.35   2.637  2.966  3.59   3.932  1.714]
 [ 2.546  3.328  3.193  3.862  3.937  3.278  2.564  3.128  2.282  2.606  3.094  2.075  2.09   3.101  3.256  3.328  2.506  2.65   2.415  3.282  3.273  3.19   3.937  3.207]
 [ 3.131  3.534  3.385  3.172  4.102  4.025  2.58   2.971  2.289  2.624  2.672  2.09   2.12   3.556  3.502  3.534  2.464  2.792  2.411  3.524  3.543  3.963  4.102  3.224]
 [ 2.905  3.65   3.516  3.726  3.628  3.9    3.08   2.962  3.116  3.033  3.628  1.672  1.16   3.614  2.981  3.65   3.237  2.033  3.316  3.234  3.639  3.711  3.979  2.067]
 [ 2.567  3.518  3.321  3.507  3.82   3.824  2.85   2.788  2.689  2.838  3.425  1.511  1.186  3.351  3.032  3.518  2.937  1.663  2.867  3.243  3.473  3.608  3.992  2.132]
 [ 3.202  3.756  3.641  3.351  3.587  3.915  2.869  2.897  2.652  2.869  3.491  1.348  1.173  3.666  3.065  3.756  3.047  1.9    2.972  3.362  3.743  3.775  3.963  2.017]
 [ 2.846  3.007  2.913  3.941  3.629  3.502  3.059  3.771  2.991  3.025  3.54   2.044  1.89   3.303  2.816  3.007  3.201  2.604  3.149  2.869  3.124  3.252  3.969  2.216]
 [ 3.159  3.604  3.498  3.792  3.996  3.997  2.768  2.919  2.394  2.787  3.193  1.749  1.667  3.635  3.408  3.604  2.825  3.425  2.772  3.491  3.615  3.938  4.062  3.182]
 [ 2.592  2.98   2.919  3.946  3.567  3.794  2.882  3.767  2.65   2.875  3.537  1.96   1.921  3.487  2.82   2.98   3.178  2.592  2.965  2.871  3.14   3.764  3.933  2.253]
 [ 2.864  3.659  3.495  3.391  3.723  3.884  2.866  2.801  2.678  2.854  3.444  1.439  1.176  3.505  3.067  3.659  2.96   1.7    2.908  3.321  3.626  3.707  3.981  2.084]
 [ 3.124  3.739  3.629  3.478  3.598  3.913  2.953  2.947  2.771  2.922  3.533  1.432  1.168  3.653  3.037  3.739  3.132  1.976  3.087  3.329  3.715  3.768  3.969  2.03 ]
 [ 3.062  3.732  3.573  3.812  3.581  3.903  3.188  2.78   3.281  3.133  3.697  1.721  0.887  3.569  2.97   3.732  3.2    1.549  3.485  3.307  3.704  3.769  3.964  1.959]
 [ 2.181  3.49   3.358  3.989  3.563  3.768  3.357  3.852  3.46   3.298  3.842  1.713  0.798  3.499  2.628  3.49   3.672  1.569  3.58   2.955  3.501  3.531  3.982  1.427]
 [ 2.868  3.589  3.255  2.982  4.061  3.968  2.582  2.312  2.408  2.604  3.167  1.322  1.288  3.238  3.308  3.589  2.54   1.415  2.498  3.444  3.51   3.634  4.063  2.922]]

Summary

Str No.	Strategy name	Average(session based)	Rank(session based)	Average(stage based)	Rank(stage based)	Notes
Datetime	2015-12-29-07-01-49
Monitoring type	private
RandomSeed	282
Repeats	1000
Average ts_length	32.856
Number of strategies	24
18	kandori.Strategy18	3.359977969	1	3.228364555	1	WSLS'
13	kandori.Strategy13	3.287460052	2	3.137979618	4	CCDDDD
22	kandori.Strategy22	3.281387516	3	3.148252349	2
14	kandori.Strategy14	3.276439015	4	3.143196829	3	WSLS'
1	kandori.Strategy1	3.26006426	5	3.12564803	6
21	kandori.Strategy21	3.259766841	6	3.110083267	8	WSLS'
3	kandori.Strategy3	3.259031214	7	3.105561009	9	WSLS'
16	kandori.Strategy16	3.239432858	8	3.084025749	13	WSLS
2	kandori.Strategy2	3.239432858	9	3.084025749	12	WSLS
17	kandori.Strategy17	3.227942348	10	3.069401327	14	TFT'
6	kandori.Strategy6	3.227373162	11	3.09021437	11	WSLS'
4	kandori.Strategy4	3.22501653	12	3.124100875	7
23	kandori.Strategy23	3.22174583	13	3.104321382	10
19	kandori.Strategy19	3.213839502	14	3.057152803	15	TFT
20	kandori.Strategy20	3.201023476	15	3.032889731	17	WSLS'
11	kandori.Strategy11	3.186472045	16	3.033648725	16	TFT'
7	kandori.Strategy7	3.16280871	17	2.995780197	19	TFT'
15	kandori.Strategy15	3.159724338	18	2.985872697	20	WSLS'
12	kandori.Strategy12	3.157521914	19	3.005357342	18
8	kandori.Strategy8	3.157403557	20	3.137817294	5	HIST
10	kandori.Strategy10	3.152953322	21	2.983149323	21	TFT'
24	kandori.Strategy24	3.146712115	22	2.940208156	23
5	kandori.Strategy5	3.103164667	23	2.938215876	24
9	kandori.Strategy9	3.06455783	24	2.949860122	22	STFT
average		3.211302164		3.067296974
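
The two averages in the summary differ only in how sessions of different lengths are weighted. The following is a minimal sketch with made-up stage payoffs (the `sessions` data and both function names are hypothetical, not part of `play`): the session-based average weights every session equally, while the stage-based average weights every stage game equally, so long sessions count for more.

```python
import numpy as np

# Hypothetical stage payoffs of one player in three sessions of different length
sessions = [
    np.array([4.0, 4.0]),                        # 2 stages
    np.array([2.0, 2.0, 2.0, 2.0, 2.0, 2.0]),    # 6 stages
    np.array([5.0, 0.0, 4.0, 4.0]),              # 4 stages
]

def session_based_average(sessions):
    # Average within each session first, then weight every session equally
    return float(np.mean([s.mean() for s in sessions]))

def stage_based_average(sessions):
    # Pool all stage games and weight every stage equally
    return float(np.concatenate(sessions).mean())

print(session_based_average(sessions))
print(stage_based_average(sessions))
```

In this toy data the long, low-payoff session pulls the stage-based average below the session-based one, which is the same direction as the gap seen in the summary table.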

Distribution of session averages by strategy



In [52]:

    
rounds = 1000 * 2
strategies = 24
max_ts = 100

# Load the results
df = pd.read_csv('./contest4/data/set_result.csv', index_col=[0, 1], header=[0, 1])

# Rows: 1000*2 sessions against each opponent, columns: players (average payoff per session)
average_matrix = np.zeros((rounds*strategies, strategies), dtype=float)

for s in range(1, strategies+1):
    for i, opponent in enumerate(df[str(s)].columns.values):
        average_matrix[i*rounds:(i+1)*rounds, s-1] = df[str(s)][str(opponent)]

averages = np.zeros(strategies, dtype=float)
stds = np.zeros(strategies, dtype=float)
ranking = np.zeros(strategies, dtype=int)
for i in range(strategies):
    averages[i] = average_matrix[:, i].mean()
    stds[i] = average_matrix[:, i].std()
ranking = np.argsort(averages)[::-1]+1

fig, ax = plt.subplots(figsize=(22, 12))
bp = ax.boxplot(average_matrix, 0, '')
plt.grid()
plt.xlabel('Strategy number')
plt.ylabel('Average payoff per session')
ax.set_xlim([0, strategies+0.5])
ax.set_ylim([-0.1, 5.8])
plt.title('Distribution of average payoffs over all sessions, by strategy')
ax.text(0.1, 5.3, "ranking\nave\nstd", ha = 'center', va = 'center', color="black", size=14)
for i in range(strategies):
    ax.text(i+1, 5.3, "{0:.0f}\n{1:.3f}\n{2:.3f}"
            .format(np.where(ranking == i+1)[0][0]+1, averages[i], stds[i]), ha = 'center', va = 'center', color="black", size=14)

plt.show()
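
The expression `np.where(ranking == i+1)[0][0]+1` in the cell above looks up the rank of strategy `i+1` one strategy at a time. As a small sketch with hypothetical `averages`, the same ranks can be obtained in one step by inverting the permutation returned by `np.argsort`:

```python
import numpy as np

averages = np.array([3.26, 3.24, 3.36, 3.10])   # hypothetical values

# Strategy numbers (1-based) ordered from best to worst, as in the cell above
ranking = np.argsort(averages)[::-1] + 1

# Rank of each strategy: invert the permutation in a single vectorized step
ranks = np.empty(len(averages), dtype=int)
ranks[ranking - 1] = np.arange(1, len(averages) + 1)

# Same result as the per-strategy lookup used in the plotting loop
lookup = [np.where(ranking == i + 1)[0][0] + 1 for i in range(len(averages))]
print(ranks, lookup)
```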

Summary statistics



In [75]:

    
# fundamental statistics
a_df = pd.DataFrame(average_matrix, columns=range(1, strategies+1))
statistics = a_df.describe()
# add ranking row
df2 = pd.DataFrame([[np.where(ranking == i+1)[0][0]+1 for i in range(strategies)]],
                   columns=range(1, strategies+1), dtype=int, index=["ranking"])
frames = [df2, statistics]
statistics = pd.concat(frames)
statistics.columns.names = ["Str No."]

display(statistics.iloc[:, :12])
display(statistics.iloc[:, 12:])

Average payoff by number of periods



In [76]:

    
rounds = 1000 * 2
strategies = 24
max_ts = 100

# Load the results
df = pd.read_csv('./contest4/data/set_result.csv', index_col=[0, 1], header=[0, 1])

# Sort by ts_length (ascending)
ordered_df = df.sort_index(level="ts_length")

# Rows: players, columns: average payoff when ts_length is 1 to 100 periods
average_matrix = np.zeros((strategies, max_ts), dtype=float)

for t in range(1, max_ts+1):
    df_t = df.iloc[df.index.get_level_values('ts_length') == t]
    for s in range(1, strategies+1):
        average = df_t[str(s)].mean().mean()
        average_matrix[s-1, t-1] = average

fig, ax = plt.subplots(figsize=(20, 10))
plt.title("average payoff trend")
plt.xlabel("ts_length")
plt.ylabel("average payoff")
t_list = [i for i in range(1, max_ts+1)]

for s in range(1, strategies+1):
    if s in [18, 13, 2, 19, 9]:
        pass
    else:
        average_list = average_matrix[s-1]
        plt.plot(t_list, average_list, color='#bbbbbb')

plt.plot(t_list, average_matrix[18-1], color='red', linewidth=2, label="18 (WSLS')")
plt.plot(t_list, average_matrix[13-1], color='orange', linewidth=2, label="13 (CCDDDD)")
plt.plot(t_list, average_matrix[2-1], color='blue', linewidth=2, label="2 (WSLS)")
plt.plot(t_list, average_matrix[19-1], color='green', linewidth=2, label="19 (TFT)")
plt.plot(t_list, average_matrix[9-1], color='purple', linewidth=2, label="9 (STFT)")
plt.legend()
plt.show()

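Trimmed mean

Starting from the session-based average, exclude the 5% of sessions with the fewest periods and the 5% with the most periods, then take the mean.
Note: if sessions tie in ts_length at an end of the interval, the weights are adjusted (e.g. with rank: ts_length of 48th: 1, 49th: 2, 50th: 2, 51st: 2, 52nd: 3, the sum of the average payoffs of the 49th to 51st sessions is multiplied by 1/3).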
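
The tie adjustment described above can be sketched as follows. This is an illustrative helper, not the notebook's `trim_mean` (which works on per-length averages); the name `trimmed_session_mean` and its per-session `payoffs` argument are assumptions. Sessions are ordered by `ts_length`, and a group of tied sessions that straddles a cut point contributes its mean payoff with a weight equal to the number of its ranks that fall inside the kept interval.

```python
import numpy as np

def trimmed_session_mean(ts_length, payoffs, width=0.9):
    """Mean payoff over the middle `width` share of sessions ordered by length.

    Sessions tied in length at a cut point share the remaining weight equally.
    """
    ts_length = np.asarray(ts_length)
    payoffs = np.asarray(payoffs, dtype=float)
    size = ts_length.size
    lower = size * (1 - width) / 2      # ranks <= lower are dropped
    upper = size - lower                # ranks > upper are dropped
    total, weight_sum, rank = 0.0, 0.0, 0
    for t in np.unique(ts_length):      # lengths in ascending order
        mask = ts_length == t
        start, rank = rank, rank + int(mask.sum())
        # Number of ranks of this tie group that fall inside (lower, upper]
        kept = min(rank, upper) - max(start, lower)
        if kept > 0:
            total += kept * payoffs[mask].mean()
            weight_sum += kept
    return total / weight_sum

# Ten sessions; the lower cut (rank 2) falls inside the three-way tie at length 2
lengths = [1, 2, 2, 2, 3, 4, 5, 6, 7, 8]
payoffs = [0, 3, 6, 9, 1, 1, 1, 1, 1, 5]
print(trimmed_session_mean(lengths, payoffs, width=0.6))
```

With `width=1.0` nothing is trimmed and the function reduces to the plain session-based mean.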



In [88]:

    
def trim_mean(ts_length, aves, width):
    size = ts_length.size
    hist = {}
    for t in ts_length:
        hist[t] = hist.get(t, 0) + 1
    lower_b = size * (1-width) / 2
    upper_b = size * (1 - (1-width)/2)
    s = 0
    total = 0
    for ts, num in sorted(hist.items()):
        old_s = s
        s += num
        if old_s <= lower_b < s:
            total += (s-lower_b) * aves[ts-1]
        elif old_s <= upper_b < s:
            total += (upper_b-old_s+1) * aves[ts-1]
        elif lower_b <= s <= upper_b:
            total += num * aves[ts-1]
        elif s > upper_b:
            break
    return total / (size * width)

rounds = 1000 * 2
strategies = 24
max_ts = ts_length.max()
    
# Load the results
df = pd.read_csv('./contest4/data/set_result.csv', index_col=[0, 1], header=[0, 1])

# Sort by ts_length (ascending)
ordered_df = df.sort_index(level="ts_length")

# Rows: players, columns: average payoff when ts_length is 1 to max periods
average_matrix = np.zeros((strategies, max_ts), dtype=float)
for t in range(1, max_ts+1):
    df_t = df.iloc[df.index.get_level_values('ts_length') == t]
    for s in range(1, strategies+1):
        average = df_t[str(s)].mean().mean()
        average_matrix[s-1, t-1] = average

for i in range(strategies):
    print(trim_mean(ts_length, average_matrix[i], 0.9))


3.25475683207
3.23456126734
3.25504114444
3.21601949933
3.09356509615
3.22307477507
3.15391179269
3.13443489517
3.0103502954
3.14344595412
3.18095405612
3.13667270937
3.28599544834
3.27353009013
3.14903729919
3.23456126734
3.22578880067
3.3599210819
3.20719064471
3.19433168715
3.25648101803
3.27722858835
3.21425669986
3.09324189293

Str No.	Average(session based)	Rank(session based)	Average(stage based)	Rank(stage based)	Average(90% trimmed)	Rank(trimmed)	Notes
18	3.35352416	1	3.219810292	1	3.354223602	1	WSLS'
13	3.326308014	2	3.182248494	2	3.327682945	2	CCDDDD
22	3.259068663	3	3.121244482	5	3.254935011	4
14	3.258886509	4	3.122727237	4	3.256160683	3	WSLS'
1	3.256299103	5	3.132024724	3	3.250285117	5
3	3.240387724	6	3.082433491	9	3.236639676	6	WSLS'
21	3.238405281	7	3.083776638	8	3.235193757	7	WSLS'
2	3.215812884	8	3.054822228	14	3.210963579	9	WSLS
16	3.215812884	9	3.054822228	15	3.210963579	10	WSLS
17	3.215547504	10	3.063675088	11	3.21302084	8	TFT'
19	3.213334955	11	3.06115156	12	3.207154339	11	TFT
6	3.197763649	12	3.056192503	13	3.193193311	12	WSLS'
12	3.197073568	13	3.045809911	16	3.178046947	16
20	3.192768288	14	3.020367533	17	3.186697798	13	WSLS'
4	3.191465329	15	3.086214152	7	3.181888575	14
23	3.188569289	16	3.06617612	10	3.180589184	15
7	3.166979223	17	3.001625671	19	3.158844309	17	TFT'
15	3.161225612	18	2.983885545	21	3.151458884	19	WSLS'
11	3.159787981	19	3.004255063	18	3.154021739	18	TFT'
24	3.158548933	20	2.940998137	23	3.101466013	21
10	3.157508886	21	2.988733446	20	3.148909213	20	TFT'
8	3.121529725	22	3.104081314	6	3.096718772	22	HIST
9	3.088360193	23	2.962527525	22	3.030839327	24	STFT
5	3.072941197	24	2.902704555	24	3.063080194	23
average	3.211302164		3.067296974		3.200348035

The trimmed averages are almost the same as the session-based averages.

Case5: imperfect private monitoring (strategies from the Oyama and Kandori seminars)

The case without matches against oneself

Raw result data (csv): contest5/data
Strategies: user_strategies
Strategy automata: contest5/automaton5.pdf



In [89]:

    
# Return, with noise, the signal each player receives about the *opponent's* action (cooperate or defect)
def private_signal(actions, random_state):
    pattern = [[0, 0], [0, 1], [1, 0], [1, 1]]
    # e.g. if the actual actions are (0, 1), the signal is most likely (1, 0)
    signal_probs = [[.9, .02, .02, .06], [.02, .06, .9, .02], [.02, .9, .06, .02], [.06, .02, .02, .9]]
    p = random_state.uniform()
    if actions[0] == 0 and actions[1] == 0:
        return [0, 0] if p < 0.9 else [0, 1] if p < 0.92 else [1, 0] if p < 0.94 else [1, 1]
    elif actions[0] == 0 and actions[1] == 1:
        return [1, 0] if p < 0.9 else [0, 0] if p < 0.92 else [1, 1] if p < 0.94 else [0, 1]
    elif actions[0] == 1 and actions[1] == 0:
        return [0, 1] if p < 0.9 else [1, 1] if p < 0.92 else [0, 0] if p < 0.94 else [1, 0]
    elif actions[0] == 1 and actions[1] == 1:
        return [1, 1] if p < 0.9 else [1, 0] if p < 0.92 else [0, 1] if p < 0.94 else [0, 0]
    else:
        raise ValueError

strategies = [Strategy1, Strategy2, Strategy3, Strategy4, Strategy5,
                    Strategy6, Strategy7, Strategy8, Strategy9, Strategy10,
                    Strategy11, Strategy12, Strategy13, Strategy14, Strategy15,
                    Strategy16, Strategy17, Strategy18, Strategy19, Strategy20, 
                    Strategy21, Strategy22, Strategy23, Strategy24, 
                    Iida_iprm, KatoStrategy, Self_Centered_private, ImPrivStrategy,
                    GrimTrigger, MyStrategy, beeleb, OyamaImperfectPrivateMonitoring, ogawa, yamagishi]
    
game = pl.RepeatedMatrixGame(payoff, strategies, signal=private_signal, ts_length=ts_length, repeat=1000)
game.play(mtype="private", random_seed=seed, record=False)
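
The chained conditionals in `private_signal` encode the same distribution that `pattern` and `signal_probs` describe. A table-driven variant might look like the sketch below; `private_signal_table`, `PATTERN`, and `SIGNAL_PROBS` are hypothetical names, and because it draws with `RandomState.choice` instead of one `uniform()` call, it would not reproduce the exact random stream of the run reported here.

```python
import numpy as np

# Signal pairs in a fixed order, and the probability of each pair
# conditional on the actual action profile (values from the cell above)
PATTERN = [(0, 0), (0, 1), (1, 0), (1, 1)]
SIGNAL_PROBS = {
    (0, 0): [.9, .02, .02, .06],
    (0, 1): [.02, .06, .9, .02],
    (1, 0): [.02, .9, .06, .02],
    (1, 1): [.06, .02, .02, .9],
}

def private_signal_table(actions, random_state):
    """Draw a signal pair for the given action profile from SIGNAL_PROBS."""
    probs = SIGNAL_PROBS[tuple(actions)]   # KeyError for an invalid profile
    idx = random_state.choice(len(PATTERN), p=probs)
    return list(PATTERN[idx])

rs = np.random.RandomState(0)
draws = [tuple(private_signal_table([0, 1], rs)) for _ in range(20000)]
share_correct = draws.count((1, 0)) / float(len(draws))
print(share_correct)   # should be close to 0.9
```

Keeping the probabilities in one table makes it easier to check that each row sums to 1 and to change the noise level in a single place.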


Start
The object has 34 strategy functions below
--------------------------------------------------
1. kandori.Strategy1
2. kandori.Strategy2
3. kandori.Strategy3
4. kandori.Strategy4
5. kandori.Strategy5
6. kandori.Strategy6
7. kandori.Strategy7
8. kandori.Strategy8
9. kandori.Strategy9
10. kandori.Strategy10
11. kandori.Strategy11
12. kandori.Strategy12
13. kandori.Strategy13
14. kandori.Strategy14
15. kandori.Strategy15
16. kandori.Strategy16
17. kandori.Strategy17
18. kandori.Strategy18
19. kandori.Strategy19
20. kandori.Strategy20
21. kandori.Strategy21
22. kandori.Strategy22
23. kandori.Strategy23
24. kandori.Strategy24
25. Iida_imperfect_private.Iida_iprm
26. kato.KatoStrategy
27. ikegami_imperfect_private.Self_Centered_private
28. mhanami_Imperfect_Private_Strategy.ImPrivStrategy
29. tsuyoshi.GrimTrigger
30. gistfile1.MyStrategy
31. beeleb_Strategy.beeleb
32. oyama.OyamaImperfectPrivateMonitoring
33. ogawa.ogawa
34. yamagishi_impd.yamagishi
--------------------------------------------------
Repeats: 1000
Total time series length: 32856

Score table:
Scores averaged with equal weight on each session
[[ 3.347  3.703  3.614 ...,  3.45   3.291  3.162]
 [ 3.533  3.783  3.686 ...,  3.388  3.225  3.218]
 [ 3.538  3.777  3.688 ...,  3.365  3.208  3.212]
 ..., 
 [ 3.249  3.207  3.149 ...,  3.558  3.407  3.415]
 [ 3.08   3.169  3.079 ...,  3.506  3.567  3.39 ]
 [ 3.074  3.241  3.2   ...,  3.525  3.48   3.225]]

Scores averaged with equal weight on each stage game
[[ 2.979  3.652  3.533 ...,  3.091  2.788  2.79 ]
 [ 3.202  3.756  3.641 ...,  3.082  2.566  2.972]
 [ 3.251  3.746  3.637 ...,  3.05   2.572  2.944]
 ..., 
 [ 2.834  2.976  2.886 ...,  3.336  2.909  3.217]
 [ 2.711  3.082  2.966 ...,  3.134  3.136  3.048]
 [ 2.592  2.98   2.919 ...,  3.292  3.008  2.965]]

Summary

Str No.	Strategy name	Average(session based)	Rank(session based)	Average(stage based)	Rank(stage based)	Notes
Datetime	2015-12-29-08-02-38
Monitoring type	private
RandomSeed	282
Repeats	1000
Average ts_length	32.856
Number of strategies	34
27	ikegami_imperfect_private.Self_Centered_private	3.366868101	1	3.218934853	2	20%
28	mhanami_Imperfect_Private_Strategy.ImPrivStrategy	3.361933787	2	3.228403533	1	2T2FT
25	Iida_imperfect_private.Iida_iprm	3.324539566	3	3.141753588	7
4	kandori.Strategy4	3.303921511	4	3.182839736	4
18	kandori.Strategy18	3.298573086	5	3.094787504	14	WSLS'
17	kandori.Strategy17	3.289954129	6	3.113827808	10	TFT'
30	gistfile1.MyStrategy	3.28804656	7	3.141913824	6	TFT'
23	kandori.Strategy23	3.28293638	8	3.132734732	8
11	kandori.Strategy11	3.276110459	9	3.116849461	9	TFT'
19	kandori.Strategy19	3.275622813	10	3.102621153	11	TFT
34	yamagishi_impd.yamagishi	3.275622813	11	3.102621153	12	TFT
29	tsuyoshi.GrimTrigger	3.273337259	12	3.096912194	13	TFT'
31	beeleb_Strategy.beeleb	3.268154747	13	3.14303055	5
14	kandori.Strategy14	3.265929641	14	3.074780414	17	WSLS'
32	oyama.OyamaImperfectPrivateMonitoring	3.264771416	15	3.094682769	15	TFT'
1	kandori.Strategy1	3.264351241	16	3.074011909	18
8	kandori.Strategy8	3.261388173	17	3.201616412	3	HIST
33	ogawa.ogawa	3.257102734	18	3.08465953	16
22	kandori.Strategy22	3.247990043	19	3.05550647	19
6	kandori.Strategy6	3.245408946	20	3.055048142	20	WSLS'
21	kandori.Strategy21	3.231870454	21	3.020929117	22	WSLS'
3	kandori.Strategy3	3.221153715	22	3.004112867	25	WSLS'
7	kandori.Strategy7	3.220640855	23	3.032698388	21	TFT'
10	kandori.Strategy10	3.209269086	24	3.017327841	23	TFT'
16	kandori.Strategy16	3.204363061	25	2.98581287	27	WSLS
2	kandori.Strategy2	3.204363061	26	2.98581287	28	WSLS
13	kandori.Strategy13	3.199255907	27	2.993635328	26	CCDDDD
5	kandori.Strategy5	3.17071256	28	2.970803077	30
26	kato.KatoStrategy	3.168234447	29	3.005790866	24
20	kandori.Strategy20	3.156422163	30	2.929961758	32	WSLS'
12	kandori.Strategy12	3.154547863	31	2.957956466	31
15	kandori.Strategy15	3.118463618	32	2.890986873	33	WSLS'
9	kandori.Strategy9	3.113619613	33	2.976513377	29	STFT
24	kandori.Strategy24	3.013624736	34	2.759430635	34
average		3.237620722		3.058509061

Distribution of session averages by strategy



In [97]:

    
rounds = 1000 * 2
strategies = 34
max_ts = 100

# Load the results
df = pd.read_csv('./contest5/data/set_result.csv', index_col=[0, 1], header=[0, 1])

# Rows: 1000*2 sessions against each opponent, columns: players (average payoff per session)
average_matrix = np.zeros((rounds*strategies, strategies), dtype=float)

for s in range(1, strategies+1):
    for i, opponent in enumerate(df[str(s)].columns.values):
        average_matrix[i*rounds:(i+1)*rounds, s-1] = df[str(s)][str(opponent)]

averages = np.zeros(strategies, dtype=float)
stds = np.zeros(strategies, dtype=float)
ranking = np.zeros(strategies, dtype=int)
for i in range(strategies):
    averages[i] = average_matrix[:, i].mean()
    stds[i] = average_matrix[:, i].std()
ranking = np.argsort(averages)[::-1]+1

fig, ax = plt.subplots(figsize=(28, 12))
bp = ax.boxplot(average_matrix, 0, '')
plt.grid()
plt.xlabel('Strategy number')
plt.ylabel('Average payoff per session')
ax.set_xlim([0, strategies+0.5])
ax.set_ylim([-0.1, 5.8])
plt.title('Distribution of average payoffs over all sessions, by strategy')
ax.text(0.1, 5.3, "ranking\nave\nstd", ha = 'center', va = 'center', color="black", size=15)
for i in range(strategies):
    ax.text(i+1, 5.3, "{0:.0f}\n{1:.3f}\n{2:.3f}"
            .format(np.where(ranking == i+1)[0][0]+1, averages[i], stds[i]), ha = 'center', va = 'center', color="black", size=14)

plt.show()

Summary statistics



In [98]:

    
# fundamental statistics
a_df = pd.DataFrame(average_matrix, columns=range(1, strategies+1))
statistics = a_df.describe()
# add ranking row
df2 = pd.DataFrame([[np.where(ranking == i+1)[0][0]+1 for i in range(strategies)]],
                   columns=range(1, strategies+1), dtype=int, index=["ranking"])
frames = [df2, statistics]
statistics = pd.concat(frames)
statistics.columns.names = ["Str No."]

display(statistics.iloc[:, :12])
display(statistics.iloc[:, 12:24])
display(statistics.iloc[:, 24:])

Average payoff by number of periods



In [99]:

    
rounds = 1000 * 2
strategies = 34
max_ts = 100

# Load the results
df = pd.read_csv('./contest5/data/set_result.csv', index_col=[0, 1], header=[0, 1])

# Sort by ts_length (ascending)
ordered_df = df.sort_index(level="ts_length")

# Rows: players, columns: average payoff when ts_length is 1 to 100 periods
average_matrix = np.zeros((strategies, max_ts), dtype=float)

for t in range(1, max_ts+1):
    df_t = df.iloc[df.index.get_level_values('ts_length') == t]

    for s in range(1, strategies+1):
        average = df_t[str(s)].mean().mean()
        average_matrix[s-1, t-1] = average

fig, ax = plt.subplots(figsize=(20, 10))
plt.title("average payoff trend")
plt.xlabel("ts_length")
plt.ylabel("average payoff")
t_list = [i for i in range(1, max_ts+1)]

for s in range(1, strategies+1):
    if s in [27, 28, 19, 18, 13, 9, 8]:
        pass
    else:
        average_list = average_matrix[s-1]
        plt.plot(t_list, average_list, color='#bbbbbb')

plt.plot(t_list, average_matrix[27-1], color='red', linewidth=2, label="27 (20%)")
plt.plot(t_list, average_matrix[28-1], color='blue', linewidth=2, label="28 (2T2FT)")
plt.plot(t_list, average_matrix[19-1], color='magenta', linewidth=2, label="19 (TFT)")
plt.plot(t_list, average_matrix[18-1], color='green', linewidth=2, label="18 (WSLS')")
plt.plot(t_list, average_matrix[13-1], color='purple', linewidth=2, label="13 (CCDDDD)")
plt.plot(t_list, average_matrix[9-1], color='brown', linewidth=2, label="9 (STFT)")
plt.plot(t_list, average_matrix[8-1], color='orange', linewidth=2, label="8 (HIST)")

plt.legend()
plt.show()

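Trimmed mean

Starting from the session-based average, exclude the 5% of sessions with the fewest periods and the 5% with the most periods, then take the mean.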



In [100]:

    
def trim_mean(ts_length, aves, width):
    size = ts_length.size
    hist = {}
    for t in ts_length:
        hist[t] = hist.get(t, 0) + 1
    lower_b = size * (1-width) / 2
    upper_b = size * (1 - (1-width)/2)
    s = 0
    total = 0
    for ts, num in sorted(hist.items()):
        old_s = s
        s += num
        if old_s <= lower_b < s:
            total += (s-lower_b) * aves[ts-1]
        elif old_s <= upper_b < s:
            total += (upper_b-old_s+1) * aves[ts-1]
        elif lower_b <= s <= upper_b:
            total += num * aves[ts-1]
        elif s > upper_b:
            break
    return total / (size * width)

rounds = 1000 * 2
strategies = 34
max_ts = ts_length.max()
    
# Load the results
df = pd.read_csv('./contest5/data/set_result.csv', index_col=[0, 1], header=[0, 1])

# Sort by ts_length (ascending)
ordered_df = df.sort_index(level="ts_length")

# Rows: players, columns: average payoff when ts_length is 1 to max periods
average_matrix = np.zeros((strategies, max_ts), dtype=float)
for t in range(1, max_ts+1):
    df_t = df.iloc[df.index.get_level_values('ts_length') == t]
    for s in range(1, strategies+1):
        average = df_t[str(s)].mean().mean()
        average_matrix[s-1, t-1] = average

for i in range(strategies):
    print(trim_mean(ts_length, average_matrix[i], 0.9))


3.26128605178
3.19971796592
3.21704923649
3.29734163396
3.16464772432
3.24332506865
3.21301979831
3.24588266352
3.06042554808
3.20107577233
3.27201102596
3.13313474672
3.19246937039
3.26435406728
3.10604335029
3.19971796592
3.28866020411
3.2977831816
3.26989546607
3.14820588064
3.22917799939
3.24365673834
3.27869565175
2.95221585188
3.32541774226
3.14646921945
3.36357056738
3.36048783939
3.27022245893
3.28289279676
3.25924746127
3.26086202884
3.25353522365
3.26989546607

Str No.	Average(session based)	Rank(session based)	Average(stage based)	Rank(stage based)	Average(90% trimmed)	Rank(trimmed)	備考
27	3.366868101	1	3.218934853	2	3.363570567	1	20%
28	3.361933787	2	3.228403533	1	3.360487839	2	2T2FT
25	3.324539566	3	3.141753588	7	3.325417742	3
4	3.303921511	4	3.182839736	4	3.297341634	5
18	3.298573086	5	3.094787504	14	3.297783182	4	WSLS'
17	3.289954129	6	3.113827808	10	3.288660204	6	TFT'
30	3.28804656	7	3.141913824	6	3.282892797	7	TFT'
23	3.28293638	8	3.132734732	8	3.278695652	8
11	3.276110459	9	3.116849461	9	3.272011026	9	TFT'
19	3.275622813	10	3.102621153	11	3.269895466	11	TFT
34	3.275622813	11	3.102621153	12	3.269895466	12	TFT
29	3.273337259	12	3.096912194	13	3.270222459	10	TFT'
31	3.268154747	13	3.14303055	5	3.259247461	16
14	3.265929641	14	3.074780414	17	3.264354067	13	WSLS'
32	3.264771416	15	3.094682769	15	3.260862029	15	TFT'
1	3.264351241	16	3.074011909	18	3.261286052	14
8	3.261388173	17	3.201616412	3	3.245882664	18	HIST
33	3.257102734	18	3.08465953	16	3.253535224	17
22	3.247990043	19	3.05550647	19	3.243656738	19
6	3.245408946	20	3.055048142	20	3.243325069	20	WSLS'
21	3.231870454	21	3.020929117	22	3.229177999	21	WSLS'
3	3.221153715	22	3.004112867	25	3.217049236	22	WSLS'
7	3.220640855	23	3.032698388	21	3.213019798	23	TFT'
10	3.209269086	24	3.017327841	23	3.201075772	24	TFT'
16	3.204363061	25	2.98581287	27	3.199717966	26	WSLS
2	3.204363061	26	2.98581287	28	3.199717966	25	WSLS
13	3.199255907	27	2.993635328	26	3.19246937	27	CCDDDD
5	3.17071256	28	2.970803077	30	3.164647724	28
26	3.168234447	29	3.005790866	24	3.146469219	30
20	3.156422163	30	2.929961758	32	3.148205881	29	WSLS'
12	3.154547863	31	2.957956466	31	3.133134747	31
15	3.118463618	32	2.890986873	33	3.10604335	32	WSLS'
9	3.113619613	33	2.976513377	29	3.060425548	33	STFT
24	3.013624736	34	2.759430635	34	2.952215852	34
average	3.237620722		3.058509061		3.228599817

The trimmed averages are almost the same as the session-based averages.

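Verification

Verification 1: Relationship among TFT, WSLS, and ALLD

Experiment 4

Aggregate table

In Experiment 4, of the 24 strategies, 6 resemble TFT, 9 resemble WSLS, and 1 resembles ALLD.
First place went to a WSLS-type strategy and second place to the ALLD-type strategy; overall, WSLS types earned high payoffs and TFT types low payoffs.

Aggregating the score table by strategy type gives the following.

Average by type (rows: own type, columns: opponent type)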
	WSLS	TFT	ALLD	Other kandori	total average
WSLS	3.485347614	3.1719659	1.597288599	3.241082345	3.246911304
TFT	3.156388186	3.201584794	2.102810448	3.289310268	3.168095626
ALLD	3.568115816	2.766558346	2.393956929	3.474086488	3.287460052
Other kandori	3.281705328	3.259717721	1.61449243	3.243862807	3.194127049
total	3.33867567	3.191729249	1.762598185	3.263774653	3.211302164

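The 8 Kandori-seminar strategies other than WSLS, TFT, and ALLD (Other kandori) do not affect the three types very much. The original experiment can therefore be approximated with the three types alone.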
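
A type-level table of this kind can be produced from a strategy-by-strategy score table by averaging blocks of cells. The sketch below is one possible way to do it with pandas; `aggregate_by_type` and the toy `score` matrix are hypothetical, and it does not reproduce the "total average" column, which weights opponent types by how many strategies each contains.

```python
import numpy as np
import pandas as pd

def aggregate_by_type(score, types):
    """Average a strategy-by-strategy score table into a type-by-type table.

    score[i, j]: average payoff of strategy i against strategy j.
    types[i]:    type label of strategy i (e.g. 'WSLS', 'TFT', 'ALLD').
    """
    df = pd.DataFrame(score, index=types, columns=types)
    # Average over opponents of the same type, then over players of the same type
    by_opponent = df.T.groupby(level=0).mean().T
    return by_opponent.groupby(level=0).mean()

# Toy example: two WSLS-type strategies and one ALLD-type strategy
score = np.array([[4.0, 4.0, 0.0],
                  [4.0, 4.0, 0.0],
                  [5.0, 5.0, 2.0]])
table = aggregate_by_type(score, ['WSLS', 'WSLS', 'ALLD'])
print(table)
```

Because every block is rectangular, averaging columns first and rows second gives the same value as averaging all cells of the block at once.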

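In general, WSLS earns high payoffs in an environment with many WSLS players and few ALLD players.
In particular, Strategy 18 does better against ALLD than plain WSLS does, which is probably why it ranked first.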
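
To see why ALLD opponents hurt plain WSLS, a noiseless simulation is enough. The sketch below assumes perfect monitoring and textbook definitions of TFT, WSLS, and ALLD (the function names are hypothetical and are not the seminar strategies); payoffs follow the payoff table at the top of the notebook, with 0 = C and 1 = D.

```python
# Payoff to the row player; 0 = C, 1 = D (values from the payoff table)
PAYOFF = [[4, 0], [5, 2]]

def tft(my_prev, opp_prev):
    # Cooperate first, then copy the opponent's last action
    return 0 if opp_prev is None else opp_prev

def wsls(my_prev, opp_prev):
    if my_prev is None:
        return 0
    # Stay after a "win" (opponent played C), shift after a "loss" (opponent played D)
    return my_prev if opp_prev == 0 else 1 - my_prev

def alld(my_prev, opp_prev):
    return 1

def average_payoff(strategy, opponent, periods=100):
    """Average stage payoff of `strategy` against `opponent` without noise."""
    a_prev = b_prev = None
    total = 0
    for _ in range(periods):
        a, b = strategy(a_prev, b_prev), opponent(b_prev, a_prev)
        total += PAYOFF[a][b]
        a_prev, b_prev = a, b
    return total / float(periods)

for name, s in [('WSLS', wsls), ('TFT', tft)]:
    print(name, 'vs ALLD:', average_payoff(s, alld))
print('ALLD vs WSLS:', average_payoff(alld, wsls))
```

Against ALLD, plain WSLS alternates between C and D and is exploited every other period, whereas TFT is exploited only once. The ordering matches the ALLD column of the aggregate table, although the noisy private-monitoring numbers differ in level.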

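Experiment 5

Aggregate table

In Experiment 5, of the 34 strategies, 11 resemble TFT, 9 are WSLS types, and 1 is an ALLD type.
First place went to the strategy "play D if at least 20% of the past signals were B, otherwise play C"; second place went to 2T2FT. Overall, TFT types earned high payoffs, while ALLD and WSLS types earned low payoffs.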
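
The winning rule is simple enough to sketch in a few lines. The class below is a hypothetical illustration, not the actual `Self_Centered_private` implementation: the `play`/`get_signal` interface, the encoding (signal 1 = B, action 1 = D), and cooperating in the first period are all assumptions.

```python
class ThresholdStrategy:
    """Hypothetical sketch: defect once the share of bad signals reaches a threshold."""

    def __init__(self, threshold=0.2):
        self.threshold = threshold
        self.signals = []          # observed signals: 0 = good, 1 = bad (B)

    def play(self):
        if not self.signals:       # no history yet: start by cooperating (assumption)
            return 0               # 0 = C
        bad_share = sum(self.signals) / float(len(self.signals))
        return 1 if bad_share >= self.threshold else 0   # 1 = D

    def get_signal(self, signal):
        self.signals.append(signal)

s = ThresholdStrategy()
for sig in [0, 0, 0, 0, 1]:        # one bad signal out of five: share is exactly 20%
    s.get_signal(sig)
print(s.play())
```

Because the rule looks at the whole history, a single noisy bad signal late in a long cooperative session does not trigger defection, while an early one does.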

Aggregating by type gives the following.

Average by type (rows: own type, columns: opponent type)
	WSLS	TFT	ALLD	Other kandori	Other oyama	total average
WSLS	3.485347614	3.294184513	1.597288599	3.241082345	2.844703858	3.216283083
TFT	3.217917263	3.418662665	1.999373767	3.280819084	3.198661754	3.258993526
ALLD	3.568115816	2.947953446	2.393956929	3.474086488	2.80950435	3.199255907
Other kandori	3.281705328	3.384906732	1.61449243	3.243862807	2.977608253	3.212434063
Other oyama	3.290522046	3.369423514	2.085169256	3.355591673	3.161811508	3.276979919
total	3.324693738	3.356684553	1.826601514	3.278285245	3.036089469	3.237620722

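Re-aggregating the score table by strategy type shows two reasons why the WSLS payoff fell below the TFT payoff in Experiment 5: matches between TFT types yielded higher payoffs than in Experiment 4, and TFT types earned higher payoffs against the remaining 5 Oyama-seminar strategies that are not WSLS, TFT, or ALLD.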

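Verification 2: Does a strategy that plays D when at least XX% of the entire past history is B earn consistently high payoffs?

Aggregate table

Average by type (rows: own type, columns: opponent type)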
	WSLS	TFT	ALLD	Prob	Other kandori	Other oyama	total average
WSLS	3.485347614	3.294184513	1.597288599	2.559611894	3.241082345	2.915976849	3.216283083
TFT	3.217917263	3.418662665	1.999373767	2.99366014	3.280819084	3.249912158	3.258993526
ALLD	3.568115816	2.947953446	2.393956929	2.536744069	3.474086488	2.87769442	3.199255907
Prob	3.553529215	3.278944542	2.286598783	3.318680869	3.513118831	3.178283059	3.366868101
Other kandori	3.281705328	3.384906732	1.61449243	2.727741801	3.243862807	3.040074867	3.212434063
Other oyama	3.224770254	3.392043256	2.034811875	2.927390992	3.316209883	3.206494414	3.254507873
total	3.324693738	3.356684553	1.826601514	2.80452035	3.278285245	3.093981748	3.237620722

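Re-aggregating the Experiment 5 score table by type once more shows that Prob earns reasonable payoffs against TFT and ALLD, and against WSLS it earns a payoff nearly as high as ALLD does (3.554 vs. 3.568).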



In [ ]:

Str No.	1	2	3	4	5	6	7	8	9	10
ranking	8.000	4.000	10.000	2.000	7.000	6.000	3.000	5.000	9.000	1.000
count	20000.000	20000.000	20000.000	20000.000	20000.000	20000.000	20000.000	20000.000	20000.000	20000.000
mean	3.620	3.726	2.964	3.738	3.656	3.720	3.736	3.726	3.493	3.807
std	0.646	0.581	0.874	0.563	0.766	0.610	0.565	0.581	0.657	0.450
min	1.801	2.000	2.003	2.000	0.829	1.418	2.000	2.000	2.000	2.000
25%	3.228	4.000	2.095	4.000	4.000	4.000	4.000	4.000	2.939	4.000
50%	4.000	4.000	2.767	4.000	4.000	4.000	4.000	4.000	3.865	4.000
75%	4.000	4.000	3.731	4.000	4.000	4.000	4.000	4.000	4.000	4.000
max	4.467	4.000	4.500	4.000	4.000	4.000	4.000	4.000	4.400	4.000

Str No.	1	2	3	4	5	6	7	8	9	10
ranking	5.000	1.000	3.000	2.000	8.000	10.000	9.000	4.000	7.000	6.000
count	20000.000	20000.000	20000.000	20000.000	20000.000	20000.000	20000.000	20000.000	20000.000	20000.000
mean	3.146	3.284	3.216	3.224	2.948	2.900	2.921	3.170	3.073	3.103
std	1.040	1.031	0.978	0.924	1.275	1.337	1.302	1.025	1.143	1.052
min	0.000	2.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000
25%	2.293	2.280	2.333	2.417	1.839	1.785	1.857	2.419	2.188	2.152
50%	3.473	3.098	3.222	3.571	3.537	3.636	3.600	3.556	3.571	3.538
75%	4.000	4.143	4.167	4.000	4.000	4.000	4.000	4.000	4.000	4.000
max	4.900	5.000	4.976	4.976	4.857	4.333	4.103	4.981	4.660	4.800

Str No.	1	2	3	4	5	6	7	8	9	10
ranking	8.000	3.000	10.000	7.000	9.000	2.000	1.000	5.000	6.000	4.000
count	20000.000	20000.000	20000.000	20000.000	20000.000	20000.000	20000.000	20000.000	20000.000	20000.000
mean	3.366	3.481	3.149	3.369	3.280	3.483	3.513	3.420	3.407	3.424
std	0.763	0.682	0.831	0.724	0.877	0.779	0.710	0.712	0.699	0.692
min	1.231	1.200	1.417	1.333	0.571	0.615	0.667	0.571	0.800	1.333
25%	2.818	2.939	2.394	2.667	2.798	3.154	3.111	2.857	2.835	2.900
50%	3.615	4.000	3.045	3.692	3.634	4.000	4.000	3.784	3.667	3.714
75%	4.000	4.000	4.000	4.000	4.000	4.000	4.000	4.000	4.000	4.000
max	4.800	4.400	4.800	4.857	4.800	4.250	4.069	4.800	4.429	4.500

Str No.	1	2	3	4	5	6	7	8	9	10	11	12
ranking	5.000	8.000	6.000	11.000	23.000	10.000	18.000	17.000	24.000	21.000	16.000	20.000
count	48000.000	48000.000	48000.000	48000.000	48000.000	48000.000	48000.000	48000.000	48000.000	48000.000	48000.000	48000.000
mean	3.255	3.239	3.252	3.231	3.098	3.232	3.155	3.159	3.059	3.141	3.186	3.144
std	0.834	0.871	0.838	0.844	0.960	0.931	0.752	0.939	0.810	0.757	0.835	0.700
min	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	1.000	0.000	0.000	0.000
25%	2.700	2.750	2.750	2.687	2.763	2.947	2.600	2.629	2.378	2.600	2.567	2.611
50%	3.551	3.579	3.571	3.515	3.333	3.565	3.176	3.467	2.889	3.150	3.421	3.167
75%	4.000	3.952	3.952	4.000	3.871	4.000	3.889	4.000	3.750	3.864	4.000	3.500
max	4.909	4.909	4.909	4.200	4.250	4.769	4.667	4.099	5.000	4.667	4.300	4.750

Str No.	13	14	15	16	17	18	19	20	21	22	23	24
ranking	4.000	3.000	19.000	9.000	12.000	1.000	14.000	15.000	7.000	2.000	13.000	22.000
count	48000.000	48000.000	48000.000	48000.000	48000.000	48000.000	48000.000	48000.000	48000.000	48000.000	48000.000	48000.000
mean	3.272	3.274	3.152	3.239	3.229	3.359	3.215	3.196	3.247	3.283	3.229	3.130
std	0.725	0.836	0.855	0.871	0.740	0.732	0.738	0.863	0.861	0.886	0.954	0.915
min	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	1.000
25%	2.655	2.923	2.700	2.750	2.671	2.949	2.620	2.727	2.800	3.000	2.947	2.538
50%	3.429	3.562	3.360	3.579	3.333	3.571	3.333	3.481	3.571	3.618	3.577	3.250
75%	3.889	4.000	3.831	3.952	4.000	4.000	4.000	3.889	3.955	3.955	4.000	3.706
max	4.875	4.923	4.917	4.909	4.400	4.923	4.500	4.900	4.889	4.909	4.143	5.000

Str No.	1	2	3	4	5	6	7	8	9	10	11	12
ranking	16.000	26.000	22.000	4.000	28.000	20.000	23.000	17.000	33.000	24.000	9.000	31.000
count	68000.000	68000.000	68000.000	68000.000	68000.000	68000.000	68000.000	68000.000	68000.000	68000.000	68000.000	68000.000
mean	3.264	3.204	3.221	3.304	3.171	3.245	3.221	3.261	3.114	3.209	3.276	3.155
std	0.821	0.895	0.864	0.835	0.964	0.935	0.759	0.900	0.846	0.766	0.810	0.752
min	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	1.000	0.000	0.000	0.000
25%	2.681	2.667	2.667	2.769	2.750	2.909	2.654	2.743	2.400	2.643	2.695	2.545
50%	3.542	3.542	3.541	3.636	3.404	3.593	3.286	3.619	3.000	3.259	3.551	3.136
75%	4.000	4.000	4.000	4.000	4.000	4.000	4.000	4.000	3.870	4.000	4.000	3.609
max	4.923	4.917	4.909	4.167	4.250	4.778	4.750	4.111	5.000	4.750	4.333	4.818

Str No.	13	14	15	16	17	18	19	20	21	22	23	24
ranking	27.000	14.000	32.000	25.000	6.000	5.000	10.000	30.000	21.000	19.000	8.000	34.000
count	68000.000	68000.000	68000.000	68000.000	68000.000	68000.000	68000.000	68000.000	68000.000	68000.000	68000.000	68000.000
mean	3.199	3.266	3.118	3.204	3.290	3.299	3.276	3.156	3.232	3.248	3.283	3.014
std	0.742	0.854	0.891	0.895	0.737	0.760	0.734	0.892	0.880	0.919	0.940	1.000
min	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	1.000
25%	2.540	2.860	2.600	2.667	2.729	2.750	2.685	2.625	2.739	2.900	2.974	2.333
50%	3.300	3.564	3.312	3.542	3.462	3.520	3.439	3.415	3.556	3.600	3.656	3.105
75%	3.800	4.000	3.846	4.000	4.000	4.000	4.000	3.910	4.000	4.000	4.000	3.667
max	4.895	4.917	4.909	4.917	4.400	4.923	4.667	4.917	4.923	4.933	4.200	5.000

Str No.	25	26	27	28	29	30	31	32	33	34
ranking	3.000	29.000	1.000	2.000	12.000	7.000	13.000	15.000	18.000	11.000
count	68000.000	68000.000	68000.000	68000.000	68000.000	68000.000	68000.000	68000.000	68000.000	68000.000
mean	3.325	3.168	3.367	3.362	3.273	3.288	3.268	3.265	3.257	3.276
std	0.745	0.767	0.695	0.698	0.852	0.875	0.854	0.790	0.775	0.734
min	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000
25%	2.888	2.537	2.812	2.902	2.875	2.789	2.650	2.705	2.697	2.685
50%	3.450	3.188	3.538	3.538	3.566	3.636	3.622	3.494	3.429	3.439
75%	4.000	3.758	4.000	4.000	4.000	4.000	4.000	4.000	4.000	4.000
max	4.900	4.800	4.923	4.400	4.923	4.250	4.100	4.900	4.500	4.667