囚人のジレンマゲームの実験3 Appendix1

各戦略が自分自身と対戦することを許す自分自身との対戦無しのケース

実験の概要: README.md

  • 実験1: 完全観測
  • 実験2: 不完全公的観測
  • 実験3: 不完全私的観測(尾山ゼミの戦略)
  • 実験4: 不完全私的観測(神取ゼミの戦略)
  • 実験5: 不完全私的観測(神取, 尾山ゼミの戦略)

利得表

<table align="center", style="text-align:center;"> 自分の行動, 相手の行動 行動0(active) 行動1(inactive) 行動0(active) 4, 4 0, 5 行動1(inactive) 5, 0 2, 2 </table>


In [21]:
#-*- encoding: utf-8 -*-
%matplotlib inline
from IPython.display import display, HTML
import matplotlib as mpl
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import numpy as np
import pandas as pd
import scipy.stats as stats
np.set_printoptions(precision=3)
np.set_printoptions(linewidth=300)
pd.set_option('display.max_columns', 30)
pd.set_option('display.width', 400)
pd.set_option('display.precision', 4)
import sys
sys.path.append('./user_strategies')
# 日本語対応
mpl.rcParams['font.family'] = 'Osaka'
plt.rcParams['font.size'] = 14
import play as pl
from Iida_perfect_monitoring import Iida_pm
from Iida_imperfect_public import Iida_ipm
from Iida_imperfect_private import Iida_iprm
from kato import KatoStrategy
from ikegami_perfect import Self_Centered_perfect
from ikegami_imperfect_public import Self_Centered_public
from ikegami_imperfect_private import Self_Centered_private
from mhanami_Public_Strategy import PubStrategy
from mhanami_Imperfect_Public_Strategy import ImPubStrategy
from mhanami_Imperfect_Private_Strategy import ImPrivStrategy
from tsuyoshi import GrimTrigger
from gistfile1 import MyStrategy
from beeleb_Strategy import beeleb
from oyama import OyamaPerfectMonitoring, OyamaImperfectPublicMonitoring, OyamaImperfectPrivateMonitoring
from ogawa import ogawa
from yamagishi_impd import yamagishi
from kandori import *

Test

各戦略のテスト


In [6]:
import unittest

class TestStrategies(unittest.TestCase):
    def setUp(self):
        self.Strategies = [Iida_pm, Iida_ipm, Iida_iprm, KatoStrategy, Self_Centered_perfect, \
                          Self_Centered_public, Self_Centered_private, PubStrategy, ImPubStrategy, ImPrivStrategy, \
                          MyStrategy, beeleb, OyamaPerfectMonitoring, \
                           OyamaImperfectPublicMonitoring, OyamaImperfectPrivateMonitoring, \
                          ogawa, yamagishi, GrimTrigger, Strategy1, Strategy2, Strategy3, Strategy4, Strategy5,
                    Strategy6, Strategy7, Strategy8, Strategy9, Strategy10,
                    Strategy11, Strategy12, Strategy13, Strategy14, Strategy15,
                    Strategy16, Strategy17, Strategy18, Strategy19, Strategy20, 
                    Strategy21, Strategy22, Strategy23, Strategy24, ] # ここに自作のclassを入れる
        self.case1 = "Signal is empty(period 1)"
        self.case2 = [0, 1]
        self.case3 = [1, 0]
        self.case4 = [0, 1, 0, 1, 0, 0, 1]
        self.seed = 222
        self.RandomState = np.random.RandomState(self.seed)


    # case1を引数に渡してテスト
    def test1(self):
        print("testcase:", self.case1)
        for Strategy in self.Strategies:
            rst = Strategy(self.RandomState).play()
            self.assertIsNotNone(rst, Strategy.__module__)
            self.assertIn(rst, (0, 1), Strategy.__module__)

    # case2を引数に渡してテスト
    def test2(self):
        print("testcase:", self.case2)
        for Strategy in self.Strategies:
            S = Strategy(self.RandomState)
            for signal in self.case2:
                rst = S.play()
                S.get_signal(signal)
                self.assertIsNotNone(rst, Strategy.__module__)
                self.assertIn(rst, (0, 1), Strategy.__module__)

    # case3を引数に渡してテスト
    def test3(self):
        print("testcase:", self.case3)
        for Strategy in self.Strategies:
            S = Strategy(self.RandomState)
            for signal in self.case3:
                rst = S.play()
                S.get_signal(signal)
            
            self.assertIsNotNone(rst, S.__module__)
            self.assertIn(rst, (0, 1), S.__module__)

    # case4を引数に渡してテスト
    def test4(self):
        print("testcase:", self.case4)
        for Strategy in self.Strategies:
            S = Strategy(self.RandomState)
            for signal in self.case4:
                rst = S.play()
                S.get_signal(signal)
                self.assertIsNotNone(rst, S.__module__)
                self.assertIn(rst, (0, 1), S.__module__)

In [7]:
suite = unittest.TestLoader().loadTestsFromTestCase(TestStrategies)
unittest.TextTestRunner().run(suite)


....
testcase: Signal is empty(period 1)
testcase: [0, 1]
testcase: [1, 0]
testcase: [0, 1, 0, 1, 0, 0, 1]
----------------------------------------------------------------------
Ran 4 tests in 0.004s

OK
Out[7]:
<unittest.runner.TextTestResult run=4 errors=0 failures=0>

Test: OK

実験のセットアップ


In [82]:
payoff = np.array([[4, 0], [5, 2]])
seed = 282
rs = np.random.RandomState(seed)
discount_v = 0.97
repeat = 1000
ts_length = rs.geometric(p=1-discount_v, size=1000)

Case1: perfect monitoring

自分自身との対戦無しのケース

結果の生データ(csv)は contest1/data
戦略はuser_strategies
戦略のオートマトンはcontest1/automaton1.pdf


In [9]:
strategies = [Iida_pm, PubStrategy, KatoStrategy, Self_Centered_perfect,
                       GrimTrigger, MyStrategy, beeleb, OyamaPerfectMonitoring, ogawa, yamagishi]
game = pl.RepeatedMatrixGame(payoff, strategies, signal=None, ts_length=ts_length, repeat=1000)
game.play(mtype="perfect", random_seed=seed, record=False)


Start
The object has 10 strategy functions below
--------------------------------------------------
1. Iida_perfect_monitoring.Iida_pm
2. mhanami_Public_Strategy.PubStrategy
3. kato.KatoStrategy
4. ikegami_perfect.Self_Centered_perfect
5. tsuyoshi.GrimTrigger
6. gistfile1.MyStrategy
7. beeleb_Strategy.beeleb
8. oyama.OyamaPerfectMonitoring
9. ogawa.ogawa
10. yamagishi_impd.yamagishi
--------------------------------------------------
Repeats: 1000
Total time series length: 32856

Score table:
各セッションを重率1で平均した得点
[[ 3.456  3.396  2.514  3.952  3.807  4.146  4.146  3.394  3.588  3.803]
 [ 3.519  4.     2.428  4.     4.     4.     4.     4.     3.315  4.   ]
 [ 2.912  2.234  2.229  2.292  3.408  3.973  3.814  2.234  3.641  2.906]
 [ 3.463  4.     2.46   4.     4.     4.     4.     4.     3.459  4.   ]
 [ 3.292  4.     1.893  4.     4.     4.     4.     4.     3.374  4.   ]
 [ 3.415  4.     2.31   4.     4.     4.     4.     4.     3.479  4.   ]
 [ 3.415  4.     2.416  4.     4.     4.     4.     4.     3.534  4.   ]
 [ 3.518  4.     2.428  4.     4.     4.     4.     4.     3.315  4.   ]
 [ 3.257  3.254  2.501  3.904  3.792  3.897  3.815  3.254  3.612  3.643]
 [ 3.784  4.     2.69   4.     4.     4.     4.     4.     3.595  4.   ]]

各ステージゲームを重率1で平均した得点
[[ 2.93   2.794  2.198  3.695  3.627  4.285  4.285  2.788  3.107  3.702]
 [ 3.055  4.     2.17   4.     4.     4.     4.     4.     2.82   4.   ]
 [ 2.491  2.037  2.061  2.066  3.418  3.595  3.211  2.037  3.044  2.524]
 [ 3.009  4.     2.194  4.     4.     4.     4.     4.     2.926  4.   ]
 [ 2.569  4.     1.402  4.     4.     4.     4.     4.     2.708  4.   ]
 [ 2.859  4.     1.93   4.     4.     4.     4.     4.     2.908  4.   ]
 [ 2.859  4.     2.186  4.     4.     4.     4.     4.     3.072  4.   ]
 [ 3.051  4.     2.17   4.     4.     4.     4.     4.     2.82   4.   ]
 [ 2.805  2.767  2.276  3.618  3.684  3.666  3.418  2.767  3.182  3.217]
 [ 3.686  4.     2.418  4.     4.     4.     4.     4.     3.157  4.   ]]

Summary

Datetime 2015-12-28-05-37-48
Monitoring type perfect
RandomSeed 282
Repeats 1000
Average ts_length 32.856
Number of strategies 10
Str No. Strategy name Average(session based) Rank(session based) Average(stage based) Rank(stage based) 備考
10 yamagishi_impd.yamagishi 3.8069881 1 3.726098734 1 TFT
4 ikegami_perfect.Self_Centered_perfect 3.738247097 2 3.612840881 2 30%
7 beeleb_Strategy.beeleb 3.736419018 3 3.611766496 3
2 mhanami_Public_Strategy.PubStrategy 3.72617118 4 3.604507548 4 TFT'
8 oyama.OyamaPerfectMonitoring 3.726026754 5 3.604087533 5 GT
6 gistfile1.MyStrategy 3.720316042 6 3.569763514 6 TFT'
5 tsuyoshi.GrimTrigger 3.655879793 7 3.467905405 7 TFT'
1 Iida_perfect_monitoring.Iida_pm 3.620227933 8 3.341106647 8
9 ogawa.ogawa 3.492903964 9 3.139997261 9
3 kato.KatoStrategy 2.964379561 10 2.648450816 10
average3.6187559443.432652484

戦略1と5のセッションベース順位が入れ替わったが、それ以外に大きな変動はなし。

戦略別セッション平均の分布

箱ひげ図。赤い線: 中央値, 青い長方形: 25%〜75%


In [41]:
rounds = 1000 * 2
strategies = 10
max_ts = 100

# 読み込み
df = pd.read_csv('./contest1/data/set_result.csv', index_col=[0, 1], header=[0, 1])

# 行: プレイヤー, 列: 1000*2セッション分の平均利得
average_matrix = np.zeros((rounds*strategies, strategies), dtype=float)
for s in range(1, strategies+1):
    for i, opponent in enumerate(df[str(s)].columns.values):
        average_matrix[i*rounds:(i+1)*rounds, s-1] = df[str(s)][str(opponent)]

# boxplot
averages = np.zeros(strategies, dtype=float)
stds = np.zeros(strategies, dtype=float)
ranking = np.zeros(strategies, dtype=int)
for i in range(strategies):
    averages[i] = average_matrix[:, i].mean()
    stds[i] = average_matrix[:, i].std()
ranking = np.argsort(averages)[::-1]+1

fig, ax = plt.subplots(figsize=(20, 8))
bp = ax.boxplot(average_matrix, 0, '')
plt.grid()
plt.xlabel('戦略番号')
plt.ylabel('1セッションの平均利得')
ax.set_xlim([0, strategies+0.5])
ax.set_ylim([-0.1, 5.8])
plt.title('戦略別, 全セッションの平均利得の分布')
ax.text(0.4, 5.3, "ranking\nave\nstd", ha = 'center', va = 'center', color="black", size=14)
for i in range(strategies):
    ax.text(i+1, 5.3, "{0:.0f}\n{1:.3f}\n{2:.3f}"
            .format(np.where(ranking == i+1)[0][0]+1, averages[i], stds[i]), ha = 'center', va = 'center', color="black", size=14)
plt.show()


基本統計量


In [39]:
# fundamental statistics
a_df = pd.DataFrame(average_matrix, columns=range(1, strategies+1))
statistics = a_df.describe()
# add ranking row
df2 = pd.DataFrame([[np.where(ranking == i+1)[0][0]+1 for i in range(strategies)]],
                   columns=range(1, strategies+1), dtype=int, index=["ranking"])
frames = [df2, statistics]
statistics = pd.concat(frames)
statistics.columns.names = ["Str No."]
display(statistics)


Str No. 1 2 3 4 5 6 7 8 9 10
ranking 8.000 4.000 10.000 2.000 7.000 6.000 3.000 5.000 9.000 1.000
count 20000.000 20000.000 20000.000 20000.000 20000.000 20000.000 20000.000 20000.000 20000.000 20000.000
mean 3.620 3.726 2.964 3.738 3.656 3.720 3.736 3.726 3.493 3.807
std 0.646 0.581 0.874 0.563 0.766 0.610 0.565 0.581 0.657 0.450
min 1.801 2.000 2.003 2.000 0.829 1.418 2.000 2.000 2.000 2.000
25% 3.228 4.000 2.095 4.000 4.000 4.000 4.000 4.000 2.939 4.000
50% 4.000 4.000 2.767 4.000 4.000 4.000 4.000 4.000 3.865 4.000
75% 4.000 4.000 3.731 4.000 4.000 4.000 4.000 4.000 4.000 4.000
max 4.467 4.000 4.500 4.000 4.000 4.000 4.000 4.000 4.400 4.000

大きくは変わらず。

期数による平均利得の変化


In [40]:
rounds = 1000 * 2
strategies = 10
max_ts = 100

# 読み込み
df = pd.read_csv('./contest1/data/set_result.csv', index_col=[0, 1], header=[0, 1])

# ts_lengthの長い順に並び替え
ordered_df = df.sortlevel(level="ts_length")

# 行: プレイヤー, 列: ts_lengthが1〜100期の時の平均利得
average_matrix = np.zeros((strategies, max_ts), dtype=float)

for t in range(1, max_ts+1):
    df_t = df.iloc[df.index.get_level_values('ts_length') == t]
    for s in range(1, strategies+1):
        average = df_t[str(s)].mean().mean()
        average_matrix[s-1, t-1] = average

fig, ax = plt.subplots(figsize=(20, 10))
plt.title("average payoff trend")
plt.xlabel("ts_length")
plt.ylabel("average payoff")
t_list = [i for i in range(1, max_ts+1)]

for s in range(1, strategies+1):
    if s in [10, 8, 4]:
        pass
    else:
        average_list = average_matrix[s-1]
        plt.plot(t_list, average_list, color='#bbbbbb')

plt.plot(t_list, average_matrix[10-1], color='red', linewidth=2, label="10 (TFT)")
plt.plot(t_list, average_matrix[8-1], color='blue', linewidth=2, label="8 (GrimTrigger)")
plt.plot(t_list, average_matrix[4-1], color='green', linewidth=2, label="4 (30%)")
plt.legend()
plt.show()


大きく変わらず。

Case2: imperfect public monitoring

自分自身との対戦無しのケース

結果の生データ(csv)は contest2/data
戦略はuser_strategies
戦略のオートマトンはcontest2/automaton2.pdf


In [32]:
# プロジェクトが成功か失敗かを返す
def public_signal(actions, random_state):
    prob = random_state.uniform()
    if actions[0] == 0 and actions[1] == 0:
        return 0 if prob < 0.9 else 1
    elif (actions[0] == 0 and actions[1] == 1) or (actions[0] == 1 and actions[1] == 0):
        return 0 if prob < 0.5 else 1
    elif actions[0] == 1 and actions[1] == 1:
        return 0 if prob < 0.2 else 1
    else:
        raise ValueError

strategies = [Iida_ipm, ImPubStrategy, KatoStrategy, Self_Centered_public, GrimTrigger,
              MyStrategy, beeleb, OyamaImperfectPublicMonitoring, ogawa, yamagishi]
game = pl.RepeatedMatrixGame(payoff, strategies, signal=public_signal, ts_length=ts_length, repeat=1000)
game.play(mtype="public", random_seed=seed, record=False)


Start
The object has 10 strategy functions below
--------------------------------------------------
1. Iida_imperfect_public.Iida_ipm
2. mhanami_Imperfect_Public_Strategy.ImPubStrategy
3. kato.KatoStrategy
4. ikegami_imperfect_public.Self_Centered_public
5. tsuyoshi.GrimTrigger
6. gistfile1.MyStrategy
7. beeleb_Strategy.beeleb
8. oyama.OyamaImperfectPublicMonitoring
9. ogawa.ogawa
10. yamagishi_impd.yamagishi
--------------------------------------------------
Repeats: 1000
Total time series length: 32856

Score table:
各セッションを重率1で平均した得点
[[ 3.078  1.283  2.175  3.073  3.767  4.099  4.08   3.118  3.46   3.329]
 [ 3.076  2.     2.473  2.593  4.045  4.412  4.283  3.088  3.678  3.19 ]
 [ 3.005  1.684  2.452  2.824  3.821  4.251  4.185  3.113  3.621  3.202]
 [ 3.117  1.604  2.296  3.396  3.749  4.046  4.014  3.225  3.423  3.369]
 [ 2.743  0.636  1.742  3.017  3.613  3.999  4.062  2.927  3.319  3.42 ]
 [ 2.636  0.392  1.661  3.114  3.576  3.967  3.99   2.912  3.296  3.455]
 [ 2.621  0.478  1.753  3.128  3.539  3.969  3.995  2.954  3.342  3.43 ]
 [ 3.107  1.275  2.253  3.175  3.745  4.071  3.997  3.226  3.511  3.344]
 [ 2.892  0.881  2.005  3.175  3.668  4.049  4.002  3.126  3.542  3.392]
 [ 3.002  1.207  2.043  3.103  3.737  3.959  3.998  3.133  3.362  3.49 ]]

各ステージゲームを重率1で平均した得点
[[ 2.677  1.493  1.927  2.67   3.716  4.148  4.027  2.523  2.939  3.105]
 [ 2.761  2.     2.166  2.2    3.898  4.285  3.929  2.488  3.087  2.96 ]
 [ 2.725  1.89   2.229  2.427  3.791  4.238  3.992  2.568  3.125  2.99 ]
 [ 2.759  1.867  2.126  3.291  3.673  4.053  3.943  2.714  2.892  3.194]
 [ 2.177  0.735  1.311  2.705  3.514  3.99   4.053  2.206  2.618  3.303]
 [ 2.009  0.477  1.185  2.87   3.463  3.963  3.983  2.179  2.533  3.352]
 [ 2.058  0.714  1.396  3.039  3.435  3.963  3.993  2.304  2.652  3.346]
 [ 2.817  1.675  2.14   2.97   3.746  4.126  3.923  2.769  3.064  3.155]
 [ 2.571  1.275  1.866  2.931  3.656  4.111  3.942  2.646  3.11   3.2  ]
 [ 2.577  1.36   1.749  2.841  3.684  3.936  3.939  2.544  2.789  3.384]]

Summary

Datetime 2015-12-28-05-40-50
Monitoring type public
RandomSeed 282
Repeats 1000
Average ts_length 32.856
Number of strategies 10
Str No. Strategy name Average(session based) Rank(session based) Average(stage based) Rank(stage based) 備考
2 mhanami_Imperfect_Public_Strategy.ImPubStrategy 3.283855312 1 2.977218773 4 ALLD
4 ikegami_imperfect_public.Self_Centered_public 3.223912647 2 3.051190041 1 25%
3 kato.KatoStrategy 3.215843985 3 2.997391648 3
8 oyama.OyamaImperfectPublicMonitoring 3.17047641 4 3.038606343 2 GT'
1 Iida_imperfect_public.Iida_ipm 3.146127068 5 2.922405345 6
10 yamagishi_impd.yamagishi 3.103362463 6 2.8803841 7 TFT
9 ogawa.ogawa 3.073304517 7 2.930679328 5
5 tsuyoshi.GrimTrigger 2.947823872 8 2.661101473 9 TFT'
7 beeleb_Strategy.beeleb 2.920922764 9 2.689937911 8
6 gistfile1.MyStrategy 2.900052583 10 2.601404614 10 TFT'
average3.0985681622.875031958

戦略2(ALLD)と戦略3(定期的にDを出す戦略)が上位となった。

戦略別セッション平均の分布

箱ひげ図。赤い線: 中央値, 青い長方形: 25%〜75%


In [44]:
rounds = 1000 * 2
strategies = 10
max_ts = 100

# 読み込み
df = pd.read_csv('./contest2/data/set_result.csv', index_col=[0, 1], header=[0, 1])

# 行: プレイヤー, 列: 1000*2セッション分の平均利得
average_matrix = np.zeros((rounds*strategies, strategies), dtype=float)

for s in range(1, strategies+1):
    for i, opponent in enumerate(df[str(s)].columns.values):
        average_matrix[i*rounds:(i+1)*rounds, s-1] = df[str(s)][str(opponent)]

averages = np.zeros(strategies, dtype=float)
stds = np.zeros(strategies, dtype=float)
ranking = np.zeros(strategies, dtype=int)
for i in range(strategies):
    averages[i] = average_matrix[:, i].mean()
    stds[i] = average_matrix[:, i].std()
ranking = np.argsort(averages)[::-1]+1

fig, ax = plt.subplots(figsize=(20, 8))
bp = ax.boxplot(average_matrix, 0, '')
plt.grid()
plt.xlabel('戦略番号')
plt.ylabel('1セッションの平均利得')
ax.set_xlim([0, strategies+0.5])
ax.set_ylim([-0.1, 5.8])
plt.title('戦略別, 全セッションの平均利得の分布')
ax.text(0.4, 5.3, "ranking\nave\nstd", ha = 'center', va = 'center', color="black", size=14)
for i in range(strategies):
    ax.text(i+1, 5.3, "{0:.0f}\n{1:.3f}\n{2:.3f}"
            .format(np.where(ranking == i+1)[0][0]+1, averages[i], stds[i]), ha = 'center', va = 'center', color="black", size=14)

plt.show()


基本統計量


In [45]:
# fundamental statistics
a_df = pd.DataFrame(average_matrix, columns=range(1, strategies+1))
statistics = a_df.describe()
# add ranking row
df2 = pd.DataFrame([[np.where(ranking == i+1)[0][0]+1 for i in range(strategies)]],
                   columns=range(1, strategies+1), dtype=int, index=["ranking"])
frames = [df2, statistics]
statistics = pd.concat(frames)
statistics.columns.names = ["Str No."]
display(statistics)


Str No. 1 2 3 4 5 6 7 8 9 10
ranking 5.000 1.000 3.000 2.000 8.000 10.000 9.000 4.000 7.000 6.000
count 20000.000 20000.000 20000.000 20000.000 20000.000 20000.000 20000.000 20000.000 20000.000 20000.000
mean 3.146 3.284 3.216 3.224 2.948 2.900 2.921 3.170 3.073 3.103
std 1.040 1.031 0.978 0.924 1.275 1.337 1.302 1.025 1.143 1.052
min 0.000 2.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
25% 2.293 2.280 2.333 2.417 1.839 1.785 1.857 2.419 2.188 2.152
50% 3.473 3.098 3.222 3.571 3.537 3.636 3.600 3.556 3.571 3.538
75% 4.000 4.143 4.167 4.000 4.000 4.000 4.000 4.000 4.000 4.000
max 4.900 5.000 4.976 4.976 4.857 4.333 4.103 4.981 4.660 4.800

実験1とは異なり, 分散と順位の間に明確な関係は見られない。

期数による平均利得の変化


In [46]:
rounds = 1000 * 2
strategies = 10
max_ts = 100

# 読み込み
df = pd.read_csv('./contest2/data/set_result.csv', index_col=[0, 1], header=[0, 1])

# ts_lengthの長い順に並び替え
ordered_df = df.sortlevel(level="ts_length")

# 行: プレイヤー, 列: ts_lengthが1〜100期の時の平均利得
average_matrix = np.zeros((strategies, max_ts), dtype=float)

for t in range(1, max_ts+1):
    df_t = df.iloc[df.index.get_level_values('ts_length') == t]
    for s in range(1, strategies+1):
        average = df_t[str(s)].mean().mean()
        average_matrix[s-1, t-1] = average

fig, ax = plt.subplots(figsize=(20, 10))
plt.title("average payoff trend")
plt.xlabel("ts_length")
plt.ylabel("average payoff")
t_list = [i for i in range(1, max_ts+1)]

for s in range(1, strategies+1):
    if s in [2, 8, 4, 10]:
        pass
    else:
        average_list = average_matrix[s-1]
        plt.plot(t_list, average_list, color='#bbbbbb')

plt.plot(t_list, average_matrix[2-1], color='red', linewidth=2, label="2 (ALLD)")
plt.plot(t_list, average_matrix[4-1], color='green', linewidth=2, label="4 (25%)")
plt.plot(t_list, average_matrix[8-1], color='blue', linewidth=2, label="8 (GT’)")
plt.plot(t_list, average_matrix[10-1], color='orange', linewidth=2, label="10 (TFT)")
plt.legend()
plt.show()


上位の戦略は、期数の短長にかかわらず、安定した平均利得をえている。
ALLDは特に短い期数のセッションでの平均利得が大きく、1位になった要因だと考えられる。

Case3: imperfect private monitoring(尾山ゼミの戦略のみ)

自分自身との対戦無しのケース

結果の生データ(csv)は contest3/data
戦略はuser_strategies
戦略のオートマトンはcontest3/automaton3.pdf


In [47]:
# 「相手の」シグナルが協調か攻撃かを(ノイズ付きで)返す
def private_signal(actions, random_state):
    pattern = [[0, 0], [0, 1], [1, 0], [1, 1]]
    # 例えば実際の行動が(0, 1)なら、シグナルは(1, 0)である可能性が最も高い
    signal_probs = [[.9, .02, .02, .06], [.02, .06, .9, .02], [.02, .9, .06, .02], [.06, .02, .02, .9]]
    p = random_state.uniform()
    if actions[0] == 0 and actions[1] == 0:
        return [0, 0] if p < 0.9 else [0, 1] if p < 0.92 else [1, 0] if p < 0.94 else [1, 1]
    elif actions[0] == 0 and actions[1] == 1:
        return [1, 0] if p < 0.9 else [0, 0] if p < 0.92 else [1, 1] if p < 0.94 else [0, 1]
    elif actions[0] == 1 and actions[1] == 0:
        return [0, 1] if p < 0.9 else [1, 1] if p < 0.92 else [0, 0] if p < 0.94 else [1, 0]
    elif actions[0] == 1 and actions[1] == 1:
        return [1, 1] if p < 0.9 else [1, 0] if p < 0.92 else [0, 1] if p < 0.94 else [0, 0]
    else:
        raise ValueError

strategies = [Iida_iprm, ImPrivStrategy, KatoStrategy, Self_Centered_private, GrimTrigger,
              MyStrategy, beeleb, OyamaImperfectPrivateMonitoring, ogawa, yamagishi]
game = pl.RepeatedMatrixGame(payoff, strategies, signal=private_signal, ts_length=ts_length, repeat=1000)
game.play(mtype="private", random_seed=seed, record=False)


Start
The object has 10 strategy functions below
--------------------------------------------------
1. Iida_imperfect_private.Iida_iprm
2. mhanami_Imperfect_Private_Strategy.ImPrivStrategy
3. kato.KatoStrategy
4. ikegami_imperfect_private.Self_Centered_private
5. tsuyoshi.GrimTrigger
6. gistfile1.MyStrategy
7. beeleb_Strategy.beeleb
8. oyama.OyamaImperfectPrivateMonitoring
9. ogawa.ogawa
10. yamagishi_impd.yamagishi
--------------------------------------------------
Repeats: 1000
Total time series length: 32856

Score table:
各セッションを重率1で平均した得点
[[ 3.155  3.476  2.318  2.954  3.56   4.004  3.984  3.414  3.465  3.333]
 [ 3.241  3.803  2.406  3.338  3.626  3.974  3.977  3.582  3.507  3.36 ]
 [ 2.799  3.373  2.318  2.264  3.404  3.963  3.823  3.155  3.538  2.849]
 [ 3.288  3.475  2.398  3.319  3.68   3.84   3.76   3.472  3.267  3.19 ]
 [ 2.922  3.555  1.92   2.801  3.597  3.904  3.823  3.516  3.283  3.479]
 [ 3.11   3.934  2.205  3.255  3.567  3.986  3.998  3.687  3.429  3.656]
 [ 3.126  3.935  2.336  3.402  3.487  3.985  3.999  3.674  3.503  3.679]
 [ 3.227  3.634  2.375  3.249  3.524  3.924  3.89   3.558  3.407  3.415]
 [ 3.122  3.688  2.42   3.09   3.543  3.916  3.832  3.506  3.567  3.39 ]
 [ 3.384  3.422  2.478  3.086  3.656  3.96   4.022  3.525  3.48   3.225]]

各ステージゲームを重率1で平均した得点
[[ 2.759  3.172  1.946  2.491  3.394  4.003  3.886  3.075  2.939  3.125]
 [ 2.899  3.625  2.21   3.203  3.526  3.959  3.964  3.363  3.047  3.141]
 [ 2.63   2.803  2.123  2.049  3.392  3.699  3.253  2.733  2.991  2.542]
 [ 2.989  3.268  2.164  3.172  3.554  3.772  3.596  3.183  2.792  2.924]
 [ 2.417  3.345  1.457  2.285  3.479  3.862  3.71   3.313  2.621  3.345]
 [ 2.684  3.902  1.785  3.092  3.439  3.982  3.999  3.609  2.814  3.597]
 [ 2.759  3.911  2.113  3.365  3.336  3.982  3.999  3.535  3.017  3.629]
 [ 2.895  3.399  2.115  3.043  3.365  3.891  3.77   3.336  2.909  3.217]
 [ 2.817  3.267  2.2    2.719  3.452  3.752  3.471  3.134  3.136  3.048]
 [ 3.193  3.172  2.204  2.795  3.543  3.948  3.996  3.292  3.008  2.965]]

Summary

Datetime 2015-12-28-05-44-20
Monitoring type private
RandomSeed 282
Repeats 1000
Average ts_length 32.856
Number of strategies 10
Str No. Strategy name Average(session based) Rank(session based) Average(stage based) Rank(stage based) 備考
7 beeleb_Strategy.beeleb 3.512547784 1 3.364516374 1
6 gistfile1.MyStrategy 3.482755316 2 3.29020879 3 TFT'
2 mhanami_Imperfect_Private_Strategy.ImPrivStrategy 3.481391663 3 3.293705868 2 2T2FT
10 yamagishi_impd.yamagishi 3.42390276 4 3.211745191 4 TFT
8 oyama.OyamaImperfectPrivateMonitoring 3.420404393 5 3.193991965 5 TFT'
9 ogawa.ogawa 3.407284046 6 3.099491721 7
4 ikegami_imperfect_private.Self_Centered_private 3.368936793 7 3.14127861 6 20%
1 Iida_imperfect_private.Iida_iprm 3.366075611 8 3.079022096 8
5 tsuyoshi.GrimTrigger 3.28006289 9 2.983293767 9 TFT'
3 kato.KatoStrategy 3.148614675 10 2.821551619 10
average3.3891975933.1478806

戦略別, セッション平均利得の分布

箱ひげ図。赤い線: 中央値, 青い長方形: 25%〜75%


In [48]:
rounds = 1000 * 2
strategies = 10
max_ts = 100

# 読み込み
df = pd.read_csv('./contest3/data/set_result.csv', index_col=[0, 1], header=[0, 1])

# 行: プレイヤー, 列: 1000*2セッション分の平均利得
average_matrix = np.zeros((rounds*strategies, strategies), dtype=float)

for s in range(1, strategies+1):
    for i, opponent in enumerate(df[str(s)].columns.values):
        average_matrix[i*rounds:(i+1)*rounds, s-1] = df[str(s)][str(opponent)]

averages = np.zeros(strategies, dtype=float)
stds = np.zeros(strategies, dtype=float)
ranking = np.zeros(strategies, dtype=int)
for i in range(strategies):
    averages[i] = average_matrix[:, i].mean()
    stds[i] = average_matrix[:, i].std()
ranking = np.argsort(averages)[::-1]+1

fig, ax = plt.subplots(figsize=(20, 8))
bp = ax.boxplot(average_matrix, 0, '')
plt.grid()
plt.xlabel('戦略番号')
plt.ylabel('1セッションの平均利得')
ax.set_xlim([0, strategies+0.5])
ax.set_ylim([-0.1, 5.8])
plt.title('戦略別, 全セッションの平均利得の分布')
ax.text(0.4, 5.3, "ranking\nave\nstd", ha = 'center', va = 'center', color="black", size=14)
for i in range(strategies):
    ax.text(i+1, 5.3, "{0:.0f}\n{1:.3f}\n{2:.3f}"
            .format(np.where(ranking == i+1)[0][0]+1, averages[i], stds[i]), ha = 'center', va = 'center', color="black", size=14)

plt.show()


基本統計量


In [49]:
# fundamental statistics
a_df = pd.DataFrame(average_matrix, columns=range(1, strategies+1))
statistics = a_df.describe()
# add ranking row
df2 = pd.DataFrame([[np.where(ranking == i+1)[0][0]+1 for i in range(strategies)]],
                   columns=range(1, strategies+1), dtype=int, index=["ranking"])
frames = [df2, statistics]
statistics = pd.concat(frames)
statistics.columns.names = ["Str No."]
display(statistics)


Str No. 1 2 3 4 5 6 7 8 9 10
ranking 8.000 3.000 10.000 7.000 9.000 2.000 1.000 5.000 6.000 4.000
count 20000.000 20000.000 20000.000 20000.000 20000.000 20000.000 20000.000 20000.000 20000.000 20000.000
mean 3.366 3.481 3.149 3.369 3.280 3.483 3.513 3.420 3.407 3.424
std 0.763 0.682 0.831 0.724 0.877 0.779 0.710 0.712 0.699 0.692
min 1.231 1.200 1.417 1.333 0.571 0.615 0.667 0.571 0.800 1.333
25% 2.818 2.939 2.394 2.667 2.798 3.154 3.111 2.857 2.835 2.900
50% 3.615 4.000 3.045 3.692 3.634 4.000 4.000 3.784 3.667 3.714
75% 4.000 4.000 4.000 4.000 4.000 4.000 4.000 4.000 4.000 4.000
max 4.800 4.400 4.800 4.857 4.800 4.250 4.069 4.800 4.429 4.500

期数による平均利得の変化


In [50]:
rounds = 1000 * 2
strategies = 10
max_ts = 100

# 読み込み
df = pd.read_csv('./contest3/data/set_result.csv', index_col=[0, 1], header=[0, 1])

# ts_lengthの長い順に並び替え
ordered_df = df.sortlevel(level="ts_length")

# 行: プレイヤー, 列: ts_lengthが1〜100期の時の平均利得
average_matrix = np.zeros((strategies, max_ts), dtype=float)

for t in range(1, max_ts+1):
    df_t = df.iloc[df.index.get_level_values('ts_length') == t]
    for s in range(1, strategies+1):
        average = df_t[str(s)].mean().mean()
        average_matrix[s-1, t-1] = average

fig, ax = plt.subplots(figsize=(20, 10))
plt.title("average payoff trend")
plt.xlabel("ts_length")
plt.ylabel("average payoff")
t_list = [i for i in range(1, max_ts+1)]

for s in range(1, strategies+1):
    if s in [2, 7, 4, 10]:
        pass
    else:
        average_list = average_matrix[s-1]
        plt.plot(t_list, average_list, color='#bbbbbb')

plt.plot(t_list, average_matrix[7-1], color='red', linewidth=2, label="7")
plt.plot(t_list, average_matrix[10-1], color='orange', linewidth=2, label="10 (TFT)")
plt.plot(t_list, average_matrix[2-1], color='blue', linewidth=2, label="2 (2T2FT)")
plt.plot(t_list, average_matrix[4-1], color='green', linewidth=2, label="4 (20%)")
plt.legend()
plt.show()


期数が長くなるに従って、協調がしづらくなっていることがわかる。TFT同士の対戦では、このようなことが一般に起こる(後述)

Case4: imperfect private monitoring(神取ゼミの戦略のみ)

自分自身との対戦無しのケース

結果の生データ(csv)は contest4/data
戦略は user_strategies
戦略のオートマトンは contest4/automaton4.pdf


In [51]:
# 「相手の」シグナルが協調か攻撃かを(ノイズ付きで)返す
def private_signal(actions, random_state):
    pattern = [[0, 0], [0, 1], [1, 0], [1, 1]]
    # 例えば実際の行動が(0, 1)なら、シグナルは(1, 0)である可能性が最も高い
    signal_probs = [[.9, .02, .02, .06], [.02, .06, .9, .02], [.02, .9, .06, .02], [.06, .02, .02, .9]]
    p = random_state.uniform()
    if actions[0] == 0 and actions[1] == 0:
        return [0, 0] if p < 0.9 else [0, 1] if p < 0.92 else [1, 0] if p < 0.94 else [1, 1]
    elif actions[0] == 0 and actions[1] == 1:
        return [1, 0] if p < 0.9 else [0, 0] if p < 0.92 else [1, 1] if p < 0.94 else [0, 1]
    elif actions[0] == 1 and actions[1] == 0:
        return [0, 1] if p < 0.9 else [1, 1] if p < 0.92 else [0, 0] if p < 0.94 else [1, 0]
    elif actions[0] == 1 and actions[1] == 1:
        return [1, 1] if p < 0.9 else [1, 0] if p < 0.92 else [0, 1] if p < 0.94 else [0, 0]
    else:
        raise ValueError

strategies = [Strategy1, Strategy2, Strategy3, Strategy4, Strategy5,
                    Strategy6, Strategy7, Strategy8, Strategy9, Strategy10,
                    Strategy11, Strategy12, Strategy13, Strategy14, Strategy15,
                    Strategy16, Strategy17, Strategy18, Strategy19, Strategy20, 
                    Strategy21, Strategy22, Strategy23, Strategy24]
    
game = pl.RepeatedMatrixGame(payoff, strategies, signal=private_signal, ts_length=ts_length, repeat=1000)
game.play(mtype="private", random_seed=seed, record=False)


Start
The object has 24 strategy functions below
--------------------------------------------------
1. kandori.Strategy1
2. kandori.Strategy2
3. kandori.Strategy3
4. kandori.Strategy4
5. kandori.Strategy5
6. kandori.Strategy6
7. kandori.Strategy7
8. kandori.Strategy8
9. kandori.Strategy9
10. kandori.Strategy10
11. kandori.Strategy11
12. kandori.Strategy12
13. kandori.Strategy13
14. kandori.Strategy14
15. kandori.Strategy15
16. kandori.Strategy16
17. kandori.Strategy17
18. kandori.Strategy18
19. kandori.Strategy19
20. kandori.Strategy20
21. kandori.Strategy21
22. kandori.Strategy22
23. kandori.Strategy23
24. kandori.Strategy24
--------------------------------------------------
Repeats: 1000
Total time series length: 32856

Score table:
各セッションを重率1で平均した得点
[[ 3.347  3.703  3.614  3.793  3.834  3.909  3.041  3.817  2.363  3.03   3.513  1.965  1.682  3.667  3.171  3.703  3.388  2.67   3.162  3.396  3.69   3.774  4.     2.011]
 [ 3.533  3.783  3.686  3.59   3.66   3.935  3.075  3.598  2.473  3.066  3.64   1.645  1.577  3.718  3.124  3.783  3.335  2.484  3.218  3.414  3.766  3.799  3.972  1.873]
 [ 3.538  3.777  3.688  3.617  3.688  3.946  3.075  3.596  2.419  3.059  3.613  1.707  1.651  3.719  3.17   3.777  3.29   2.652  3.212  3.436  3.767  3.81   3.981  2.027]
 [ 3.156  3.161  3.161  3.997  3.636  3.748  3.43   3.981  3.177  3.379  3.895  1.902  1.671  3.501  2.671  3.161  3.768  3.17   3.654  2.874  3.272  3.474  3.983  1.577]
 [ 2.397  3.332  3.203  4.065  3.798  3.754  3.314  3.91   3.007  3.278  3.836  1.875  1.241  3.365  2.747  3.332  3.615  1.892  3.434  3.025  3.341  3.382  4.053  1.28 ]
 [ 3.076  3.599  3.498  3.921  3.649  3.908  3.346  3.678  3.113  3.292  3.825  1.911  1.284  3.633  2.869  3.599  3.596  2.328  3.575  3.174  3.609  3.627  3.981  1.367]
 [ 2.855  3.146  3.079  4.021  3.748  3.782  3.067  3.957  2.502  3.05   3.715  2.044  2.073  3.458  2.948  3.146  3.379  2.808  3.148  3.038  3.239  3.644  3.998  2.063]
 [ 3.17   3.096  3.028  3.991  3.655  3.627  3.416  3.983  3.187  3.363  3.871  2.018  1.535  3.278  2.69   3.096  3.755  2.682  3.656  2.858  3.146  3.199  3.982  1.496]
 [ 2.515  2.761  2.66   4.073  3.802  3.938  2.87   4.078  2.517  2.874  3.702  2.125  2.278  3.435  2.886  2.761  3.426  2.341  2.798  2.82   2.903  3.554  4.064  2.369]
 [ 2.839  3.135  3.062  4.027  3.773  3.764  3.067  3.948  2.503  3.048  3.712  2.03   2.052  3.427  2.95   3.135  3.364  2.783  3.133  3.034  3.21   3.611  4.006  2.057]
 [ 3.131  3.1    3.048  4.009  3.564  3.819  3.334  3.978  2.997  3.285  3.8    1.96   1.972  3.539  2.716  3.1    3.626  2.826  3.497  2.862  3.211  3.626  3.952  1.525]
 [ 2.889  3.347  3.242  4.056  4.07   3.471  2.785  3.801  2.218  2.816  3.361  2.248  2.359  3.25   3.326  3.347  2.865  2.781  2.684  3.322  3.314  3.275  4.07   2.884]
 [ 3.448  3.601  3.491  3.479  4.124  4.059  2.846  3.682  2.264  2.875  3.028  2.24   2.394  3.655  3.518  3.601  2.865  2.989  2.722  3.574  3.624  3.945  4.124  2.75 ]
 [ 3.321  3.687  3.589  3.825  3.693  3.929  3.233  3.609  2.86   3.186  3.736  1.889  1.548  3.68   3.065  3.687  3.476  2.574  3.447  3.305  3.674  3.738  3.987  1.896]
 [ 3.111  3.578  3.428  3.704  3.885  3.891  3.034  3.609  2.52   3.019  3.579  1.761  1.606  3.46   3.125  3.578  3.206  2.153  3.079  3.327  3.556  3.653  4.022  1.952]
 [ 3.533  3.783  3.686  3.59   3.66   3.935  3.075  3.598  2.473  3.066  3.64   1.645  1.577  3.718  3.124  3.783  3.335  2.484  3.218  3.414  3.766  3.799  3.972  1.873]
 [ 3.256  3.248  3.184  3.962  3.682  3.705  3.235  3.963  2.856  3.199  3.706  2.159  2.08   3.499  2.908  3.248  3.513  2.986  3.368  3.038  3.334  3.401  3.978  1.96 ]
 [ 3.489  3.69   3.605  3.86   3.939  4.001  3.026  3.584  2.284  3.027  3.462  1.981  1.985  3.73   3.403  3.69   3.215  3.508  3.104  3.525  3.697  3.923  4.047  2.865]
 [ 3.074  3.241  3.2    3.987  3.631  3.829  3.091  3.982  2.46   3.07   3.695  2.09   2.163  3.597  2.941  3.241  3.425  2.97   3.225  3.057  3.349  3.791  3.942  2.081]
 [ 3.324  3.696  3.58   3.624  3.787  3.921  3.059  3.583  2.5    3.041  3.604  1.713  1.584  3.593  3.155  3.696  3.242  2.256  3.129  3.391  3.684  3.734  3.999  1.929]
 [ 3.484  3.764  3.677  3.672  3.673  3.939  3.146  3.612  2.557  3.108  3.678  1.716  1.564  3.705  3.123  3.764  3.401  2.551  3.3    3.397  3.751  3.788  3.981  1.885]
 [ 3.454  3.765  3.634  3.868  3.657  3.927  3.298  3.547  2.947  3.249  3.761  1.905  1.348  3.64   3.055  3.765  3.423  2.26   3.552  3.372  3.734  3.795  3.972  1.826]
 [ 2.873  3.548  3.442  3.991  3.639  3.835  3.426  3.973  3.175  3.371  3.888  1.875  1.24   3.584  2.747  3.548  3.765  2.303  3.628  3.07   3.558  3.578  3.985  1.279]
 [ 3.314  3.652  3.358  3.342  4.149  4.069  2.751  3.41   2.316  2.774  3.377  1.67   1.84   3.346  3.428  3.652  2.832  1.753  2.674  3.523  3.578  3.689  4.149  2.874]]

各ステージゲームを重率1で平均した得点
[[ 2.979  3.652  3.533  3.575  3.9    3.908  2.762  3.428  2.42   2.771  3.176  1.852  1.442  3.597  3.189  3.652  3.017  2.232  2.79   3.345  3.64   3.775  4.019  2.362]
 [ 3.202  3.756  3.641  3.351  3.587  3.915  2.869  2.897  2.652  2.869  3.491  1.348  1.173  3.666  3.065  3.756  3.047  1.9    2.972  3.362  3.743  3.775  3.963  2.017]
 [ 3.251  3.746  3.637  3.379  3.625  3.921  2.86   2.902  2.592  2.848  3.44   1.426  1.272  3.655  3.091  3.746  2.983  2.172  2.944  3.366  3.729  3.785  3.97   2.193]
 [ 2.667  2.946  2.92   3.995  3.56   3.621  3.355  3.903  3.456  3.295  3.856  1.746  1.419  3.367  2.571  2.946  3.662  3.024  3.592  2.695  3.089  3.417  3.979  1.897]
 [ 1.615  3.21   3.02   4.074  3.755  3.645  3.211  3.485  3.225  3.174  3.752  1.713  0.798  3.191  2.63   3.21   3.45   1.169  3.331  2.887  3.205  3.28   4.061  1.428]
 [ 2.482  3.561  3.427  3.874  3.575  3.873  3.234  3.101  3.379  3.176  3.744  1.737  0.85   3.573  2.785  3.561  3.357  1.609  3.503  3.09   3.57   3.598  3.974  1.532]
 [ 2.354  2.903  2.823  3.986  3.685  3.693  2.861  3.613  2.683  2.854  3.562  1.878  1.806  3.31   2.813  2.903  3.121  2.458  2.911  2.846  3.026  3.593  3.985  2.232]
 [ 2.819  3.076  2.971  3.963  3.644  3.553  3.231  3.914  3.345  3.182  3.748  1.963  1.553  3.186  2.86   3.076  3.549  2.476  3.49   2.929  3.109  3.285  3.982  2.404]
 [ 2.205  2.711  2.606  3.957  3.627  3.82   2.762  3.756  2.659  2.756  3.483  1.961  1.952  3.398  2.772  2.711  3.127  2.223  2.726  2.745  2.879  3.667  3.966  2.327]
 [ 2.318  2.902  2.812  3.99   3.713  3.667  2.861  3.575  2.677  2.855  3.563  1.857  1.777  3.27   2.811  2.902  3.098  2.441  2.903  2.84   2.987  3.555  3.996  2.226]
 [ 2.687  2.805  2.72   4.013  3.448  3.727  3.201  3.839  3.19   3.153  3.71   1.815  1.752  3.383  2.575  2.805  3.424  2.374  3.35   2.637  2.966  3.59   3.932  1.714]
 [ 2.546  3.328  3.193  3.862  3.937  3.278  2.564  3.128  2.282  2.606  3.094  2.075  2.09   3.101  3.256  3.328  2.506  2.65   2.415  3.282  3.273  3.19   3.937  3.207]
 [ 3.131  3.534  3.385  3.172  4.102  4.025  2.58   2.971  2.289  2.624  2.672  2.09   2.12   3.556  3.502  3.534  2.464  2.792  2.411  3.524  3.543  3.963  4.102  3.224]
 [ 2.905  3.65   3.516  3.726  3.628  3.9    3.08   2.962  3.116  3.033  3.628  1.672  1.16   3.614  2.981  3.65   3.237  2.033  3.316  3.234  3.639  3.711  3.979  2.067]
 [ 2.567  3.518  3.321  3.507  3.82   3.824  2.85   2.788  2.689  2.838  3.425  1.511  1.186  3.351  3.032  3.518  2.937  1.663  2.867  3.243  3.473  3.608  3.992  2.132]
 [ 3.202  3.756  3.641  3.351  3.587  3.915  2.869  2.897  2.652  2.869  3.491  1.348  1.173  3.666  3.065  3.756  3.047  1.9    2.972  3.362  3.743  3.775  3.963  2.017]
 [ 2.846  3.007  2.913  3.941  3.629  3.502  3.059  3.771  2.991  3.025  3.54   2.044  1.89   3.303  2.816  3.007  3.201  2.604  3.149  2.869  3.124  3.252  3.969  2.216]
 [ 3.159  3.604  3.498  3.792  3.996  3.997  2.768  2.919  2.394  2.787  3.193  1.749  1.667  3.635  3.408  3.604  2.825  3.425  2.772  3.491  3.615  3.938  4.062  3.182]
 [ 2.592  2.98   2.919  3.946  3.567  3.794  2.882  3.767  2.65   2.875  3.537  1.96   1.921  3.487  2.82   2.98   3.178  2.592  2.965  2.871  3.14   3.764  3.933  2.253]
 [ 2.864  3.659  3.495  3.391  3.723  3.884  2.866  2.801  2.678  2.854  3.444  1.439  1.176  3.505  3.067  3.659  2.96   1.7    2.908  3.321  3.626  3.707  3.981  2.084]
 [ 3.124  3.739  3.629  3.478  3.598  3.913  2.953  2.947  2.771  2.922  3.533  1.432  1.168  3.653  3.037  3.739  3.132  1.976  3.087  3.329  3.715  3.768  3.969  2.03 ]
 [ 3.062  3.732  3.573  3.812  3.581  3.903  3.188  2.78   3.281  3.133  3.697  1.721  0.887  3.569  2.97   3.732  3.2    1.549  3.485  3.307  3.704  3.769  3.964  1.959]
 [ 2.181  3.49   3.358  3.989  3.563  3.768  3.357  3.852  3.46   3.298  3.842  1.713  0.798  3.499  2.628  3.49   3.672  1.569  3.58   2.955  3.501  3.531  3.982  1.427]
 [ 2.868  3.589  3.255  2.982  4.061  3.968  2.582  2.312  2.408  2.604  3.167  1.322  1.288  3.238  3.308  3.589  2.54   1.415  2.498  3.444  3.51   3.634  4.063  2.922]]

Summary

Datetime 2015-12-29-07-01-49
Monitoring type private
RandomSeed 282
Repeats 1000
Average ts_length 32.856
Number of strategies 24
Str No. Strategy name Average(session based) Rank(session based) Average(stage based) Rank(stage based) 備考
18 kandori.Strategy18 3.359977969 1 3.228364555 1 WSLS'
13 kandori.Strategy13 3.287460052 2 3.137979618 4 CCDDDD
22 kandori.Strategy22 3.281387516 3 3.148252349 2
14 kandori.Strategy14 3.276439015 4 3.143196829 3 WSLS'
1 kandori.Strategy1 3.26006426 5 3.12564803 6
21 kandori.Strategy21 3.259766841 6 3.110083267 8 WSLS'
3 kandori.Strategy3 3.259031214 7 3.105561009 9 WSLS'
16 kandori.Strategy16 3.239432858 8 3.084025749 13 WSLS
2 kandori.Strategy2 3.239432858 9 3.084025749 12 WSLS
17 kandori.Strategy17 3.227942348 10 3.069401327 14 TFT'
6 kandori.Strategy6 3.227373162 11 3.09021437 11 WSLS'
4 kandori.Strategy4 3.22501653 12 3.124100875 7
23 kandori.Strategy23 3.22174583 13 3.104321382 10
19 kandori.Strategy19 3.213839502 14 3.057152803 15 TFT
20 kandori.Strategy20 3.201023476 15 3.032889731 17 WSLS'
11 kandori.Strategy11 3.186472045 16 3.033648725 16 TFT'
7 kandori.Strategy7 3.16280871 17 2.995780197 19 TFT'
15 kandori.Strategy15 3.159724338 18 2.985872697 20 WSLS'
12 kandori.Strategy12 3.157521914 19 3.005357342 18
8 kandori.Strategy8 3.157403557 20 3.137817294 5 HIST
10 kandori.Strategy10 3.152953322 21 2.983149323 21 TFT'
24 kandori.Strategy24 3.146712115 22 2.940208156 23
5 kandori.Strategy5 3.103164667 23 2.938215876 24
9 kandori.Strategy9 3.06455783 24 2.949860122 22 STFT
average3.2113021643.067296974

戦略別, セッション平均の分布


In [52]:
rounds = 1000 * 2
strategies = 24
max_ts = 100

# 読み込み
df = pd.read_csv('./contest4/data/set_result.csv', index_col=[0, 1], header=[0, 1])

# 行: プレイヤー, 列: 1000*2セッション分の平均利得
average_matrix = np.zeros((rounds*strategies, strategies), dtype=float)

for s in range(1, strategies+1):
    for i, opponent in enumerate(df[str(s)].columns.values):
        average_matrix[i*rounds:(i+1)*rounds, s-1] = df[str(s)][str(opponent)]

averages = np.zeros(strategies, dtype=float)
stds = np.zeros(strategies, dtype=float)
ranking = np.zeros(strategies, dtype=int)
for i in range(strategies):
    averages[i] = average_matrix[:, i].mean()
    stds[i] = average_matrix[:, i].std()
ranking = np.argsort(averages)[::-1]+1

fig, ax = plt.subplots(figsize=(22, 12))
bp = ax.boxplot(average_matrix, 0, '')
plt.grid()
plt.xlabel('戦略番号')
plt.ylabel('1セッションの平均利得')
ax.set_xlim([0, strategies+0.5])
ax.set_ylim([-0.1, 5.8])
plt.title('戦略別, 全セッションの平均利得の分布')
ax.text(0.1, 5.3, "ranking\nave\nstd", ha = 'center', va = 'center', color="black", size=14)
for i in range(strategies):
    ax.text(i+1, 5.3, "{0:.0f}\n{1:.3f}\n{2:.3f}"
            .format(np.where(ranking == i+1)[0][0]+1, averages[i], stds[i]), ha = 'center', va = 'center', color="black", size=14)

plt.show()


基本統計量


In [75]:
# fundamental statistics
a_df = pd.DataFrame(average_matrix, columns=range(1, strategies+1))
statistics = a_df.describe()
# add ranking row
df2 = pd.DataFrame([[np.where(ranking == i+1)[0][0]+1 for i in range(strategies)]],
                   columns=range(1, strategies+1), dtype=int, index=["ranking"])
frames = [df2, statistics]
statistics = pd.concat(frames)
statistics.columns.names = ["Str No."]

display(statistics.iloc[:, :12])
display(statistics.iloc[:, 12:])


Str No. 1 2 3 4 5 6 7 8 9 10 11 12
ranking 5.000 8.000 6.000 11.000 23.000 10.000 18.000 17.000 24.000 21.000 16.000 20.000
count 48000.000 48000.000 48000.000 48000.000 48000.000 48000.000 48000.000 48000.000 48000.000 48000.000 48000.000 48000.000
mean 3.255 3.239 3.252 3.231 3.098 3.232 3.155 3.159 3.059 3.141 3.186 3.144
std 0.834 0.871 0.838 0.844 0.960 0.931 0.752 0.939 0.810 0.757 0.835 0.700
min 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.000 0.000 0.000 0.000
25% 2.700 2.750 2.750 2.687 2.763 2.947 2.600 2.629 2.378 2.600 2.567 2.611
50% 3.551 3.579 3.571 3.515 3.333 3.565 3.176 3.467 2.889 3.150 3.421 3.167
75% 4.000 3.952 3.952 4.000 3.871 4.000 3.889 4.000 3.750 3.864 4.000 3.500
max 4.909 4.909 4.909 4.200 4.250 4.769 4.667 4.099 5.000 4.667 4.300 4.750
Str No. 13 14 15 16 17 18 19 20 21 22 23 24
ranking 4.000 3.000 19.000 9.000 12.000 1.000 14.000 15.000 7.000 2.000 13.000 22.000
count 48000.000 48000.000 48000.000 48000.000 48000.000 48000.000 48000.000 48000.000 48000.000 48000.000 48000.000 48000.000
mean 3.272 3.274 3.152 3.239 3.229 3.359 3.215 3.196 3.247 3.283 3.229 3.130
std 0.725 0.836 0.855 0.871 0.740 0.732 0.738 0.863 0.861 0.886 0.954 0.915
min 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.000
25% 2.655 2.923 2.700 2.750 2.671 2.949 2.620 2.727 2.800 3.000 2.947 2.538
50% 3.429 3.562 3.360 3.579 3.333 3.571 3.333 3.481 3.571 3.618 3.577 3.250
75% 3.889 4.000 3.831 3.952 4.000 4.000 4.000 3.889 3.955 3.955 4.000 3.706
max 4.875 4.923 4.917 4.909 4.400 4.923 4.500 4.900 4.889 4.909 4.143 5.000

期数による平均利得の変化


In [76]:
rounds = 1000 * 2
strategies = 24
max_ts = 100

# 読み込み
df = pd.read_csv('./contest4/data/set_result.csv', index_col=[0, 1], header=[0, 1])

# ts_lengthの長い順に並び替え
ordered_df = df.sortlevel(level="ts_length")

# 行: プレイヤー, 列: ts_lengthが1〜100期の時の平均利得
average_matrix = np.zeros((strategies, max_ts), dtype=float)

for t in range(1, max_ts+1):
    df_t = df.iloc[df.index.get_level_values('ts_length') == t]
    for s in range(1, strategies+1):
        average = df_t[str(s)].mean().mean()
        average_matrix[s-1, t-1] = average

fig, ax = plt.subplots(figsize=(20, 10))
plt.title("average payoff trend")
plt.xlabel("ts_length")
plt.ylabel("average payoff")
t_list = [i for i in range(1, max_ts+1)]

for s in range(1, strategies+1):
    if s in [18, 13, 2, 19, 9]:
        pass
    else:
        average_list = average_matrix[s-1]
        plt.plot(t_list, average_list, color='#bbbbbb')

plt.plot(t_list, average_matrix[18-1], color='red', linewidth=2, label="18 (WSLS’)")
plt.plot(t_list, average_matrix[13-1], color='orange', linewidth=2, label="13 (CCDDDD)")
plt.plot(t_list, average_matrix[2-1], color='blue', linewidth=2, label="2 (WSLS)")
plt.plot(t_list, average_matrix[19-1], color='green', linewidth=2, label="19 (TFT)")
plt.plot(t_list, average_matrix[9-1], color='purple', linewidth=2, label="9 (STFT)")
plt.legend()
plt.show()


トリム平均

セッション・ベース平均から、期数の短いセッション・長いセッションそれぞれ5%ずつを除外して平均を求める
※区間の端にタイがある場合は、重みを調整する(例: 48位: 1, 49位: 2, 50位: 2, 51位: 2, 52位: 3なら、49位〜51位の平均利得の和を1/3倍して計算する)


In [88]:
def trim_mean(ts_length, aves, width):
    size = ts_length.size
    hist = {}
    for t in ts_length:
        hist[t] = hist.get(t, 0) + 1
    lower_b = size * (1-width) / 2
    upper_b = size * (1 - (1-width)/2)
    s = 0
    total = 0
    for ts, num in sorted(hist.items()):
        old_s = s
        s += num
        if old_s <= lower_b < s:
            total += (s-lower_b) * aves[ts-1]
        elif old_s <= upper_b < s:
            total += (upper_b-old_s+1) * aves[ts-1]
        elif lower_b <= s <= upper_b:
            total += num * aves[ts-1]
        elif s > upper_b:
            break
    return total / (size * width)

rounds = 1000 * 2
strategies = 24
max_ts = ts_length.max()
    
# 読み込み
df = pd.read_csv('./contest4/data/set_result.csv', index_col=[0, 1], header=[0, 1])

# ts_lengthの長い順に並び替え
ordered_df = df.sortlevel(level="ts_length")

# 行: プレイヤー, 列: ts_lengthが1〜max期の時の平均利得
average_matrix = np.zeros((strategies, max_ts), dtype=float)
for t in range(1, max_ts+1):
    df_t = df.iloc[df.index.get_level_values('ts_length') == t]
    for s in range(1, strategies+1):
        average = df_t[str(s)].mean().mean()
        average_matrix[s-1, t-1] = average

for i in range(strategies):
    print(trim_mean(ts_length, average_matrix[i], 0.9))


3.25475683207
3.23456126734
3.25504114444
3.21601949933
3.09356509615
3.22307477507
3.15391179269
3.13443489517
3.0103502954
3.14344595412
3.18095405612
3.13667270937
3.28599544834
3.27353009013
3.14903729919
3.23456126734
3.22578880067
3.3599210819
3.20719064471
3.19433168715
3.25648101803
3.27722858835
3.21425669986
3.09324189293
Str No. Average(session based) Rank(session based) Average(stage based) Rank(stage based) Average(90% trimmed) Rank(trimmed) 備考
18 3.35352416 1 3.219810292 1 3.354223602 1 WSLS'
13 3.326308014 2 3.182248494 2 3.327682945 2 CCDDDD
22 3.259068663 3 3.121244482 5 3.254935011 4
14 3.258886509 4 3.122727237 4 3.256160683 3 WSLS'
1 3.256299103 5 3.132024724 3 3.250285117 5
3 3.240387724 6 3.082433491 9 3.236639676 6 WSLS'
21 3.238405281 7 3.083776638 8 3.235193757 7 WSLS'
2 3.215812884 8 3.054822228 14 3.210963579 9 WSLS
16 3.215812884 9 3.054822228 15 3.210963579 10 WSLS
17 3.215547504 10 3.063675088 11 3.21302084 8 TFT'
19 3.213334955 11 3.06115156 12 3.207154339 11 TFT
6 3.197763649 12 3.056192503 13 3.193193311 12 WSLS'
12 3.197073568 13 3.045809911 16 3.178046947 16
20 3.192768288 14 3.020367533 17 3.186697798 13 WSLS'
4 3.191465329 15 3.086214152 7 3.181888575 14
23 3.188569289 16 3.06617612 10 3.180589184 15
7 3.166979223 17 3.001625671 19 3.158844309 17 TFT'
15 3.161225612 18 2.983885545 21 3.151458884 19 WSLS'
11 3.159787981 19 3.004255063 18 3.154021739 18 TFT'
24 3.158548933 20 2.940998137 23 3.101466013 21
10 3.157508886 21 2.988733446 20 3.148909213 20 TFT'
8 3.121529725 22 3.104081314 6 3.096718772 22 HIST
9 3.088360193 23 2.962527525 22 3.030839327 24 STFT
5 3.072941197 24 2.902704555 24 3.063080194 23
average3.2113021643.0672969743.200348035

ほぼセッションベース平均と同じ。

Case5: imperfect private monitoring(尾山ゼミ+神取ゼミの戦略)

自分自身との対戦無しのケース

結果の生データ(csv)は contest5/data
戦略は user_strategies
戦略のオートマトンは contest5/automaton5.pdf


In [89]:
# 「相手の」シグナルが協調か攻撃かを(ノイズ付きで)返す
def private_signal(actions, random_state):
    pattern = [[0, 0], [0, 1], [1, 0], [1, 1]]
    # 例えば実際の行動が(0, 1)なら、シグナルは(1, 0)である可能性が最も高い
    signal_probs = [[.9, .02, .02, .06], [.02, .06, .9, .02], [.02, .9, .06, .02], [.06, .02, .02, .9]]
    p = random_state.uniform()
    if actions[0] == 0 and actions[1] == 0:
        return [0, 0] if p < 0.9 else [0, 1] if p < 0.92 else [1, 0] if p < 0.94 else [1, 1]
    elif actions[0] == 0 and actions[1] == 1:
        return [1, 0] if p < 0.9 else [0, 0] if p < 0.92 else [1, 1] if p < 0.94 else [0, 1]
    elif actions[0] == 1 and actions[1] == 0:
        return [0, 1] if p < 0.9 else [1, 1] if p < 0.92 else [0, 0] if p < 0.94 else [1, 0]
    elif actions[0] == 1 and actions[1] == 1:
        return [1, 1] if p < 0.9 else [1, 0] if p < 0.92 else [0, 1] if p < 0.94 else [0, 0]
    else:
        raise ValueError

strategies = [Strategy1, Strategy2, Strategy3, Strategy4, Strategy5,
                    Strategy6, Strategy7, Strategy8, Strategy9, Strategy10,
                    Strategy11, Strategy12, Strategy13, Strategy14, Strategy15,
                    Strategy16, Strategy17, Strategy18, Strategy19, Strategy20, 
                    Strategy21, Strategy22, Strategy23, Strategy24, 
                    Iida_iprm, KatoStrategy, Self_Centered_private, ImPrivStrategy,
                    GrimTrigger, MyStrategy, beeleb, OyamaImperfectPrivateMonitoring, ogawa, yamagishi]
    
game = pl.RepeatedMatrixGame(payoff, strategies, signal=private_signal, ts_length=ts_length, repeat=1000)
game.play(mtype="private", random_seed=seed, record=False)


Start
The object has 34 strategy functions below
--------------------------------------------------
1. kandori.Strategy1
2. kandori.Strategy2
3. kandori.Strategy3
4. kandori.Strategy4
5. kandori.Strategy5
6. kandori.Strategy6
7. kandori.Strategy7
8. kandori.Strategy8
9. kandori.Strategy9
10. kandori.Strategy10
11. kandori.Strategy11
12. kandori.Strategy12
13. kandori.Strategy13
14. kandori.Strategy14
15. kandori.Strategy15
16. kandori.Strategy16
17. kandori.Strategy17
18. kandori.Strategy18
19. kandori.Strategy19
20. kandori.Strategy20
21. kandori.Strategy21
22. kandori.Strategy22
23. kandori.Strategy23
24. kandori.Strategy24
25. Iida_imperfect_private.Iida_iprm
26. kato.KatoStrategy
27. ikegami_imperfect_private.Self_Centered_private
28. mhanami_Imperfect_Private_Strategy.ImPrivStrategy
29. tsuyoshi.GrimTrigger
30. gistfile1.MyStrategy
31. beeleb_Strategy.beeleb
32. oyama.OyamaImperfectPrivateMonitoring
33. ogawa.ogawa
34. yamagishi_impd.yamagishi
--------------------------------------------------
Repeats: 1000
Total time series length: 32856

Score table:
各セッションを重率1で平均した得点
[[ 3.347  3.703  3.614 ...,  3.45   3.291  3.162]
 [ 3.533  3.783  3.686 ...,  3.388  3.225  3.218]
 [ 3.538  3.777  3.688 ...,  3.365  3.208  3.212]
 ..., 
 [ 3.249  3.207  3.149 ...,  3.558  3.407  3.415]
 [ 3.08   3.169  3.079 ...,  3.506  3.567  3.39 ]
 [ 3.074  3.241  3.2   ...,  3.525  3.48   3.225]]

各ステージゲームを重率1で平均した得点
[[ 2.979  3.652  3.533 ...,  3.091  2.788  2.79 ]
 [ 3.202  3.756  3.641 ...,  3.082  2.566  2.972]
 [ 3.251  3.746  3.637 ...,  3.05   2.572  2.944]
 ..., 
 [ 2.834  2.976  2.886 ...,  3.336  2.909  3.217]
 [ 2.711  3.082  2.966 ...,  3.134  3.136  3.048]
 [ 2.592  2.98   2.919 ...,  3.292  3.008  2.965]]

Summary

Datetime 2015-12-29-08-02-38
Monitoring type private
RandomSeed 282
Repeats 1000
Average ts_length 32.856
Number of strategies 34
Str No. Strategy name Average(session based) Rank(session based) Average(stage based) Rank(stage based) 備考
27 ikegami_imperfect_private.Self_Centered_private 3.366868101 1 3.218934853 2 20%
28 mhanami_Imperfect_Private_Strategy.ImPrivStrategy 3.361933787 2 3.228403533 1 2T2FT
25 Iida_imperfect_private.Iida_iprm 3.324539566 3 3.141753588 7
4 kandori.Strategy4 3.303921511 4 3.182839736 4
18 kandori.Strategy18 3.298573086 5 3.094787504 14 WSLS'
17 kandori.Strategy17 3.289954129 6 3.113827808 10 TFT'
30 gistfile1.MyStrategy 3.28804656 7 3.141913824 6 TFT'
23 kandori.Strategy23 3.28293638 8 3.132734732 8
11 kandori.Strategy11 3.276110459 9 3.116849461 9 TFT'
19 kandori.Strategy19 3.275622813 10 3.102621153 11 TFT
34 yamagishi_impd.yamagishi 3.275622813 11 3.102621153 12 TFT
29 tsuyoshi.GrimTrigger 3.273337259 12 3.096912194 13 TFT'
31 beeleb_Strategy.beeleb 3.268154747 13 3.14303055 5
14 kandori.Strategy14 3.265929641 14 3.074780414 17 WSLS'
32 oyama.OyamaImperfectPrivateMonitoring 3.264771416 15 3.094682769 15 TFT'
1 kandori.Strategy1 3.264351241 16 3.074011909 18
8 kandori.Strategy8 3.261388173 17 3.201616412 3 HIST
33 ogawa.ogawa 3.257102734 18 3.08465953 16
22 kandori.Strategy22 3.247990043 19 3.05550647 19
6 kandori.Strategy6 3.245408946 20 3.055048142 20 WSLS'
21 kandori.Strategy21 3.231870454 21 3.020929117 22 WSLS'
3 kandori.Strategy3 3.221153715 22 3.004112867 25 WSLS'
7 kandori.Strategy7 3.220640855 23 3.032698388 21 TFT'
10 kandori.Strategy10 3.209269086 24 3.017327841 23 TFT'
16 kandori.Strategy16 3.204363061 25 2.98581287 27 WSLS
2 kandori.Strategy2 3.204363061 26 2.98581287 28 WSLS
13 kandori.Strategy13 3.199255907 27 2.993635328 26 CCDDDD
5 kandori.Strategy5 3.17071256 28 2.970803077 30
26 kato.KatoStrategy 3.168234447 29 3.005790866 24
20 kandori.Strategy20 3.156422163 30 2.929961758 32 WSLS'
12 kandori.Strategy12 3.154547863 31 2.957956466 31
15 kandori.Strategy15 3.118463618 32 2.890986873 33 WSLS'
9 kandori.Strategy9 3.113619613 33 2.976513377 29 STFT
24 kandori.Strategy24 3.013624736 34 2.759430635 34
average3.2376207223.058509061

戦略別, セッション平均の分布


In [97]:
rounds = 1000 * 2
strategies = 34
max_ts = 100

# 読み込み
df = pd.read_csv('./contest5/data/set_result.csv', index_col=[0, 1], header=[0, 1])

# 行: プレイヤー, 列: 1000*2セッション分の平均利得
average_matrix = np.zeros((rounds*strategies, strategies), dtype=float)

for s in range(1, strategies+1):
    for i, opponent in enumerate(df[str(s)].columns.values):
        average_matrix[i*rounds:(i+1)*rounds, s-1] = df[str(s)][str(opponent)]

averages = np.zeros(strategies, dtype=float)
stds = np.zeros(strategies, dtype=float)
ranking = np.zeros(strategies, dtype=int)
for i in range(strategies):
    averages[i] = average_matrix[:, i].mean()
    stds[i] = average_matrix[:, i].std()
ranking = np.argsort(averages)[::-1]+1

fig, ax = plt.subplots(figsize=(28, 12))
bp = ax.boxplot(average_matrix, 0, '')
plt.grid()
plt.xlabel('戦略番号')
plt.ylabel('1セッションの平均利得')
ax.set_xlim([0, strategies+0.5])
ax.set_ylim([-0.1, 5.8])
plt.title('戦略別, 全セッションの平均利得の分布')
ax.text(0.1, 5.3, "ranking\nave\nstd", ha = 'center', va = 'center', color="black", size=15)
for i in range(strategies):
    ax.text(i+1, 5.3, "{0:.0f}\n{1:.3f}\n{2:.3f}"
            .format(np.where(ranking == i+1)[0][0]+1, averages[i], stds[i]), ha = 'center', va = 'center', color="black", size=14)

plt.show()


基本統計量


In [98]:
# fundamental statistics
a_df = pd.DataFrame(average_matrix, columns=range(1, strategies+1))
statistics = a_df.describe()
# add ranking row
df2 = pd.DataFrame([[np.where(ranking == i+1)[0][0]+1 for i in range(strategies)]],
                   columns=range(1, strategies+1), dtype=int, index=["ranking"])
frames = [df2, statistics]
statistics = pd.concat(frames)
statistics.columns.names = ["Str No."]

display(statistics.iloc[:, :12])
display(statistics.iloc[:, 12:24])
display(statistics.iloc[:, 24:])


Str No. 1 2 3 4 5 6 7 8 9 10 11 12
ranking 16.000 26.000 22.000 4.000 28.000 20.000 23.000 17.000 33.000 24.000 9.000 31.000
count 68000.000 68000.000 68000.000 68000.000 68000.000 68000.000 68000.000 68000.000 68000.000 68000.000 68000.000 68000.000
mean 3.264 3.204 3.221 3.304 3.171 3.245 3.221 3.261 3.114 3.209 3.276 3.155
std 0.821 0.895 0.864 0.835 0.964 0.935 0.759 0.900 0.846 0.766 0.810 0.752
min 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.000 0.000 0.000 0.000
25% 2.681 2.667 2.667 2.769 2.750 2.909 2.654 2.743 2.400 2.643 2.695 2.545
50% 3.542 3.542 3.541 3.636 3.404 3.593 3.286 3.619 3.000 3.259 3.551 3.136
75% 4.000 4.000 4.000 4.000 4.000 4.000 4.000 4.000 3.870 4.000 4.000 3.609
max 4.923 4.917 4.909 4.167 4.250 4.778 4.750 4.111 5.000 4.750 4.333 4.818
Str No. 13 14 15 16 17 18 19 20 21 22 23 24
ranking 27.000 14.000 32.000 25.000 6.000 5.000 10.000 30.000 21.000 19.000 8.000 34.000
count 68000.000 68000.000 68000.000 68000.000 68000.000 68000.000 68000.000 68000.000 68000.000 68000.000 68000.000 68000.000
mean 3.199 3.266 3.118 3.204 3.290 3.299 3.276 3.156 3.232 3.248 3.283 3.014
std 0.742 0.854 0.891 0.895 0.737 0.760 0.734 0.892 0.880 0.919 0.940 1.000
min 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.000
25% 2.540 2.860 2.600 2.667 2.729 2.750 2.685 2.625 2.739 2.900 2.974 2.333
50% 3.300 3.564 3.312 3.542 3.462 3.520 3.439 3.415 3.556 3.600 3.656 3.105
75% 3.800 4.000 3.846 4.000 4.000 4.000 4.000 3.910 4.000 4.000 4.000 3.667
max 4.895 4.917 4.909 4.917 4.400 4.923 4.667 4.917 4.923 4.933 4.200 5.000
Str No. 25 26 27 28 29 30 31 32 33 34
ranking 3.000 29.000 1.000 2.000 12.000 7.000 13.000 15.000 18.000 11.000
count 68000.000 68000.000 68000.000 68000.000 68000.000 68000.000 68000.000 68000.000 68000.000 68000.000
mean 3.325 3.168 3.367 3.362 3.273 3.288 3.268 3.265 3.257 3.276
std 0.745 0.767 0.695 0.698 0.852 0.875 0.854 0.790 0.775 0.734
min 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
25% 2.888 2.537 2.812 2.902 2.875 2.789 2.650 2.705 2.697 2.685
50% 3.450 3.188 3.538 3.538 3.566 3.636 3.622 3.494 3.429 3.439
75% 4.000 3.758 4.000 4.000 4.000 4.000 4.000 4.000 4.000 4.000
max 4.900 4.800 4.923 4.400 4.923 4.250 4.100 4.900 4.500 4.667

期数による平均利得の変化


In [99]:
rounds = 1000 * 2
strategies = 34
max_ts = 100

# 読み込み
df = pd.read_csv('./contest5/data/set_result.csv', index_col=[0, 1], header=[0, 1])

# ts_lengthの長い順に並び替え
ordered_df = df.sortlevel(level="ts_length")

# 行: プレイヤー, 列: ts_lengthが1〜100期の時の平均利得
average_matrix = np.zeros((strategies, max_ts), dtype=float)

for t in range(1, max_ts+1):
    df_t = df.iloc[df.index.get_level_values('ts_length') == t]

    for s in range(1, strategies+1):
        average = df_t[str(s)].mean().mean()
        average_matrix[s-1, t-1] = average

fig, ax = plt.subplots(figsize=(20, 10))
plt.title("average payoff trend")
plt.xlabel("ts_length")
plt.ylabel("average payoff")
t_list = [i for i in range(1, max_ts+1)]

for s in range(1, strategies+1):
    if s in [27, 28, 18, 13, 9, 8]:
        pass
    else:
        average_list = average_matrix[s-1]
        plt.plot(t_list, average_list, color='#bbbbbb')

plt.plot(t_list, average_matrix[27-1], color='red', linewidth=2, label="27 (20%)")
plt.plot(t_list, average_matrix[28-1], color='blue', linewidth=2, label="28 (2T2FT)")
plt.plot(t_list, average_matrix[19-1], color='magenta', linewidth=2, label="19 (TFT)")
plt.plot(t_list, average_matrix[18-1], color='green', linewidth=2, label="18 (WSLS’)")
plt.plot(t_list, average_matrix[13-1], color='purple', linewidth=2, label="13 (CCDDDD)")
plt.plot(t_list, average_matrix[9-1], color='brown', linewidth=2, label="9 (STFT)")
plt.plot(t_list, average_matrix[8-1], color='orange', linewidth=2, label="8 (HIST)")

plt.legend()
plt.show()


トリム平均

セッション・ベース平均から、期数の短いセッション・長いセッションそれぞれ5%ずつを除外して平均を求める


In [100]:
def trim_mean(ts_length, aves, width):
    size = ts_length.size
    hist = {}
    for t in ts_length:
        hist[t] = hist.get(t, 0) + 1
    lower_b = size * (1-width) / 2
    upper_b = size * (1 - (1-width)/2)
    s = 0
    total = 0
    for ts, num in sorted(hist.items()):
        old_s = s
        s += num
        if old_s <= lower_b < s:
            total += (s-lower_b) * aves[ts-1]
        elif old_s <= upper_b < s:
            total += (upper_b-old_s+1) * aves[ts-1]
        elif lower_b <= s <= upper_b:
            total += num * aves[ts-1]
        elif s > upper_b:
            break
    return total / (size * width)

rounds = 1000 * 2
strategies = 34
max_ts = ts_length.max()
    
# 読み込み
df = pd.read_csv('./contest5/data/set_result.csv', index_col=[0, 1], header=[0, 1])

# ts_lengthの長い順に並び替え
ordered_df = df.sortlevel(level="ts_length")

# 行: プレイヤー, 列: ts_lengthが1〜max期の時の平均利得
average_matrix = np.zeros((strategies, max_ts), dtype=float)
for t in range(1, max_ts+1):
    df_t = df.iloc[df.index.get_level_values('ts_length') == t]
    for s in range(1, strategies+1):
        average = df_t[str(s)].mean().mean()
        average_matrix[s-1, t-1] = average

for i in range(strategies):
    print(trim_mean(ts_length, average_matrix[i], 0.9))


3.26128605178
3.19971796592
3.21704923649
3.29734163396
3.16464772432
3.24332506865
3.21301979831
3.24588266352
3.06042554808
3.20107577233
3.27201102596
3.13313474672
3.19246937039
3.26435406728
3.10604335029
3.19971796592
3.28866020411
3.2977831816
3.26989546607
3.14820588064
3.22917799939
3.24365673834
3.27869565175
2.95221585188
3.32541774226
3.14646921945
3.36357056738
3.36048783939
3.27022245893
3.28289279676
3.25924746127
3.26086202884
3.25353522365
3.26989546607
Str No. Average(session based) Rank(session based) Average(stage based) Rank(stage based) Average(90% trimmed) Rank(trimmed) 備考
27 3.366868101 1 3.218934853 2 3.363570567 1 20%
28 3.361933787 2 3.228403533 1 3.360487839 2 2T2FT
25 3.324539566 3 3.141753588 7 3.325417742 3
4 3.303921511 4 3.182839736 4 3.297341634 5
18 3.298573086 5 3.094787504 14 3.297783182 4 WSLS'
17 3.289954129 6 3.113827808 10 3.288660204 6 TFT'
30 3.28804656 7 3.141913824 6 3.282892797 7 TFT'
23 3.28293638 8 3.132734732 8 3.278695652 8
11 3.276110459 9 3.116849461 9 3.272011026 9 TFT'
19 3.275622813 10 3.102621153 11 3.269895466 11 TFT
34 3.275622813 11 3.102621153 12 3.269895466 12 TFT
29 3.273337259 12 3.096912194 13 3.270222459 10 TFT'
31 3.268154747 13 3.14303055 5 3.259247461 16
14 3.265929641 14 3.074780414 17 3.264354067 13 WSLS'
32 3.264771416 15 3.094682769 15 3.260862029 15 TFT'
1 3.264351241 16 3.074011909 18 3.261286052 14
8 3.261388173 17 3.201616412 3 3.245882664 18 HIST
33 3.257102734 18 3.08465953 16 3.253535224 17
22 3.247990043 19 3.05550647 19 3.243656738 19
6 3.245408946 20 3.055048142 20 3.243325069 20 WSLS'
21 3.231870454 21 3.020929117 22 3.229177999 21 WSLS'
3 3.221153715 22 3.004112867 25 3.217049236 22 WSLS'
7 3.220640855 23 3.032698388 21 3.213019798 23 TFT'
10 3.209269086 24 3.017327841 23 3.201075772 24 TFT'
16 3.204363061 25 2.98581287 27 3.199717966 26 WSLS
2 3.204363061 26 2.98581287 28 3.199717966 25 WSLS
13 3.199255907 27 2.993635328 26 3.19246937 27 CCDDDD
5 3.17071256 28 2.970803077 30 3.164647724 28
26 3.168234447 29 3.005790866 24 3.146469219 30
20 3.156422163 30 2.929961758 32 3.148205881 29 WSLS'
12 3.154547863 31 2.957956466 31 3.133134747 31
15 3.118463618 32 2.890986873 33 3.10604335 32 WSLS'
9 3.113619613 33 2.976513377 29 3.060425548 33 STFT
24 3.013624736 34 2.759430635 34 2.952215852 34
average3.2376207223.0585090613.228599817

セッション平均とほぼ同じ。

検証

検証1 TFT, WSLS, ALLDの関係

実験4

集計表

実験4では、24戦略の内、TFTに類似した戦略が6、WSLSに類似した戦略が9、ALLDに類似した戦略が1つあった。
結果は1位がWSLSタイプ、2位がALLDタイプの戦略で、全体的にWSLSは高利得、TFTは低利得となった。

スコアテーブルを戦略のタイプ別に集計すると、

タイプ別平均
WSLS TFT ALLD Other kandori total average
WSLS 3.485347614 3.1719659 1.597288599 3.241082345 3.246911304
TFT 3.156388186 3.201584794 2.102810448 3.289310268 3.168095626
ALLD 3.568115816 2.766558346 2.393956929 3.474086488 3.287460052
Other kandori 3.281705328 3.259717721 1.61449243 3.243862807 3.194127049
total 3.33867567 3.191729249 1.762598185 3.263774653 3.211302164

となった。神取ゼミのWSLS, TFT, ALLD以外の8戦略(Other kandori)は3戦略にそれほど大きな影響を与えていないことがわかる。したがって、3タイプだけで元実験を近似できている。

一般に、WSLSが多く、ALLDが少ない環境では、WSLSは高い利得を得られる。
特に戦略18は、通常のWSLSに比べてALLDに強く、1位になった要因だと考えられる。

実験5

集計表

実験5では、34戦略の内、TFTに類似した戦略が11, WSLSが9, ALLDが1つであった。
1位は「過去のシグナルのうち20%以上がBならD, それ以外ならC」という戦略、2位は2T2FTであった。全体的にTFTが高利得、ALLDとWSLSは低利得となった。

タイプ別に集計すると、

タイプ別平均
WSLS TFT ALLD Other kandori Other oyama total average
WSLS 3.485347614 3.294184513 1.597288599 3.241082345 2.844703858 3.216283083
TFT 3.217917263 3.418662665 1.999373767 3.280819084 3.198661754 3.258993526
ALLD 3.568115816 2.947953446 2.393956929 3.474086488 2.80950435 3.199255907
Other kandori 3.281705328 3.384906732 1.61449243 3.243862807 2.977608253 3.212434063
Other oyama 3.290522046 3.369423514 2.085169256 3.355591673 3.161811508 3.276979919
total 3.324693738 3.356684553 1.826601514 3.278285245 3.036089469 3.237620722

スコアテーブルを戦略のタイプごとに集計し直すと、実験5で「WSLSの利得 < TFTの利得」となった要因は、TFT同士の対戦の利得が実験4の場合よりも高くなったこと、及び尾山ゼミのWSLS, TFT, ALLDでない残りの5戦略との対戦でTFTがより高い利得を得たことにある。

検証2 「過去全ての履歴の内◯◯%以上BならDを出す戦略」は安定して高い利得を得られるか

タイプ別平均
WSLS TFT ALLD Prob Other kandori Other oyama total average
WSLS 3.485347614 3.294184513 1.597288599 2.559611894 3.241082345 2.915976849 3.216283083
TFT 3.217917263 3.418662665 1.999373767 2.99366014 3.280819084 3.249912158 3.258993526
ALLD 3.568115816 2.947953446 2.393956929 2.536744069 3.474086488 2.87769442 3.199255907
Prob 3.553529215 3.278944542 2.286598783 3.318680869 3.513118831 3.178283059 3.366868101
Other kandori 3.281705328 3.384906732 1.61449243 2.727741801 3.243862807 3.040074867 3.212434063
Other oyama 3.224770254 3.392043256 2.034811875 2.927390992 3.316209883 3.206494414 3.254507873
total 3.324693738 3.356684553 1.826601514 2.80452035 3.278285245 3.093981748 3.237620722

実験5のスコアテーブルを再度タイプ別に集計すると、ProbはTFT, ALLDとの対戦でそれなりの利得を得、WSLSに対してはALLDなみに高い利得をえていることがわかる。


In [ ]: