Background

This notebook uses a repository of simulated games to quantify the value of leaving a given number of tiles in the bag during the pre-endgame. We will then implement these values as a pre-endgame heuristic in the Macondo speedy player to improve simulation quality.

Initial questions:

1. What is the probability that you will go out first if you make a play leaving N tiles in the bag?
2. (slightly harder) What is the expected improvement in end-of-game spread after making a play that leaves N tiles in the bag?
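
Both questions reduce to grouped averages once each pre-endgame play is a row labeled with the number of tiles it leaves in the bag. The following is a minimal sketch on invented data, not the analysis itself; the column names `tiles_left`, `went_out_first`, and `spread_gain` are hypothetical and are not taken from the Macondo log format.

```python
import pandas as pd

# Toy data: one row per pre-endgame play. All values are invented for illustration.
plays = pd.DataFrame({
    "tiles_left":     [1, 1, 2, 2, 2, 3],
    # 1 if the player who made the play later went out first (hypothetical column)
    "went_out_first": [1, 0, 1, 1, 0, 0],
    # final spread minus spread before the play, from the mover's side (hypothetical column)
    "spread_gain":    [4, -6, 10, 2, -3, -8],
})

# Question 1: share of plays after which the mover went out first, per N.
# Question 2: average change in spread, per N.
summary = plays.groupby("tiles_left").agg(
    p_out_first=("went_out_first", "mean"),
    mean_spread_gain=("spread_gain", "mean"),
    n_plays=("went_out_first", "size"),
)
print(summary)
```

Keeping `n_plays` next to the averages makes it easier to spot values of N that are backed by too few plays to trust.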

Implementation details

We'll need to make several passes through the log file to obtain the following:

• Final spread of each simulated game
• Delta between pre-endgame and final spread
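
As a rough illustration of these two quantities, the sketch below computes them on a toy log. It assumes a hypothetical layout (`game_id`, `turn`, `tiles_in_bag`, `spread`) and a hypothetical definition of the pre-endgame spread as the spread after the last turn that still leaves tiles in the bag; neither is confirmed by the actual log file.

```python
import pandas as pd

# Toy log with hypothetical columns. `spread` is player 1's score minus
# player 2's score after the turn; all values are invented.
log = pd.DataFrame({
    "game_id":      ["a", "a", "a", "b", "b", "b"],
    "turn":         [1, 2, 3, 1, 2, 3],
    "tiles_in_bag": [5, 1, 0, 6, 2, 0],
    "spread":       [10, 25, 40, -5, -20, -12],
})
log = log.sort_values(["game_id", "turn"])

# Final spread: the spread on the last logged turn of each game.
final_spread = log.groupby("game_id")["spread"].last()

# Assumed pre-endgame spread: the spread after the last turn with a non-empty bag.
pre_endgame_spread = (
    log[log["tiles_in_bag"] > 0].groupby("game_id")["spread"].last()
)

# Delta between pre-endgame and final spread, aligned on game_id.
delta = final_spread - pre_endgame_spread
print(delta)
```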

Assumptions

• We're only analyzing complete games
• The last two rows in the log file for a given game ID are also the last two turns of the game
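
The second assumption can be partly checked before relying on it. The sketch below, on invented data with hypothetical column names, flags games that have fewer than two rows and keeps the last two rows of every other game. It assumes rows appear in chronological order, and a row count alone cannot prove that a game is complete.

```python
import pandas as pd

# Toy log: game "b" has a single row and therefore no second-to-last turn.
log = pd.DataFrame({
    "game_id": ["a", "a", "a", "b", "c", "c"],
    "player":  ["p1", "p2", "p1", "p1", "p1", "p2"],
})

# Games with fewer than two rows cannot supply both of the last two turns.
rows_per_game = log.groupby("game_id").size()
too_short = rows_per_game[rows_per_game < 2].index.tolist()

# Keep the last two rows of each remaining game, in file order.
last_two = log[~log["game_id"].isin(too_short)].groupby("game_id").tail(2)
print(too_short)
print(last_two)
```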

Next steps

• Standardize the sign convention for spread
• Start figuring out how to calculate pre-endgame spread
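
One possible sign convention is to express every spread from player 1's side. The helper below is only a sketch of that idea: the function name and the player labels `"p1"` and `"p2"` are hypothetical and may not match the values stored in the log.

```python
def to_player1_spread(spread_for_mover, mover):
    """Convert a spread measured from the mover's side to player 1's side.

    `mover` is assumed to be "p1" or "p2" (hypothetical labels).
    """
    if mover not in ("p1", "p2"):
        raise ValueError("unknown player label: {!r}".format(mover))
    # Player 2's spread is the negative of player 1's spread.
    return spread_for_mover if mover == "p1" else -spread_for_mover


print(to_player1_spread(12, "p1"))
print(to_player1_spread(12, "p2"))
```

Applying a single conversion like this at load time would let every later average be read as "player 1 minus player 2" without per-cell sign handling.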
``````

In [1]:

import csv
import pandas as pd

log_folder = '../logs/'
log_file = log_folder + 'log_20200411_short.csv'

``````
``````

In [2]:

second_to_last_move_dict = {}
last_move_dict = {}

``````
``````

In [3]:

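n = 10000000  # maximum number of log rows to read

with open(log_file, 'r') as f:
    reader = csv.reader(f)
    for i, row in enumerate(reader):
        if i == n:
            break
        # row[1] is the game ID; keep the two most recent rows seen for each game
        if row[1] in last_move_dict:
            second_to_last_move_dict[row[1]] = last_move_dict[row[1]]
        last_move_dict[row[1]] = row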

``````
``````

In [4]:

# whoever made the final move went out first
went_out_first_dict = {game_id:last_move_dict[game_id][0] for game_id in last_move_dict.keys()}

# good sanity check - player 1 should go out a bit more often
print('Analyzing {} games, player 1 went out first {}% of the time'.format(
    len(went_out_first_dict),
    100*pd.Series(went_out_first_dict).value_counts(normalize=True)[0]))

``````
``````

Analyzing 438887 games, player 1 went out first 50.09307634994885% of the time

``````
``````

In [5]:

for game_id in last_move_dict.keys():
    -(int(last_move_dict[game_id][0])-int(second_to_last_move_dict[game_id][0]))

``````
``````

In [6]:

print('The person who went out first won by an average of {} points'.format(pd.Series(spread_dict).mean()))
print('Player 1 won by an average of {} points'.format(pd.Series(p1_minus_p2_spread_dict).mean()))

``````
``````

The person who went out first won by an average of 17.206246710428882 points
Player 1 won by an average of 13.565596611428454 points

``````