Demo of the two thirds library

This notebook gives a demo of the two thirds library, which can be used to analyse plays of the two thirds of the average game.

To install the library you can run pip install twothirds or get the git repository here.

A basic single game



In [8]:

    
import twothirds 
import random

Let us assume we have the following list of random guesses:



In [9]:

    
N = 2000
guesses = [int(round(random.triangular(0, 100, 44), 0)) for k in range(N)]
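As a rough sanity check on this choice of distribution: the mean of a triangular distribution is the average of its lower limit, upper limit and mode. The sketch below is plain Python and independent of the twothirds library (`triangular_mean` is a hypothetical helper name). It works out roughly what two thirds of the average should be for a large number of guesses, ignoring the small effect of rounding each guess to an integer.

```python
def triangular_mean(low, high, mode):
    """Mean of a triangular distribution: (low + high + mode) / 3."""
    return (low + high + mode) / 3.0

# Note: random.triangular takes its arguments in the order (low, high, mode),
# so random.triangular(0, 100, 44) has limits 0 and 100 and mode 44.
expected_mean = triangular_mean(0, 100, 44)
expected_target = 2.0 * expected_mean / 3

print(expected_mean)    # 48.0
print(expected_target)  # 32.0
```

With N = 2000 guesses, the value computed from the sample should land close to 32.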

Now we create a single game instance



In [10]:

    
g = twothirds.TwoThirdsGame(guesses)

Let's find the two thirds of the average:



In [11]:

    
g.two_thirds_of_the_average()









    Out[11]:





32.324333333333335
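For reference, the quantity itself is easy to compute by hand. The following is a minimal sketch in plain Python, not the library's implementation; the function name `two_thirds_of_average` is an assumption made for illustration.

```python
def two_thirds_of_average(guesses):
    """Return two thirds of the arithmetic mean of a non-empty list of numbers."""
    if not guesses:
        raise ValueError("at least one guess is needed")
    return 2.0 * sum(guesses) / (3 * len(guesses))

sample = [30, 60, 90]                 # the mean is 60
print(two_thirds_of_average(sample))  # 40.0
```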

We can identify the winning guess:



In [12]:

    
g.find_winner()









    Out[12]:





('Anonymous', 32)
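The winning guess is the one closest to two thirds of the average. A minimal sketch of that rule in plain Python is given below. It is an illustration only: `closest_guess` is a hypothetical name, and how the library itself breaks ties between two equally close guesses is not shown in this notebook (the sketch simply keeps the first one it meets).

```python
def closest_guess(guesses):
    """Return the guess nearest to two thirds of the average of all guesses."""
    target = 2.0 * sum(guesses) / (3 * len(guesses))
    # min() keeps the first of several equally close guesses (an assumption,
    # not necessarily the library's tie-breaking rule).
    return min(guesses, key=lambda guess: abs(guess - target))

# The average is 40.6, so the target is about 27.07 and 33 is the closest guess.
print(closest_guess([10, 20, 33, 50, 90]))  # 33
```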

Note that the data could also be in the form of a dictionary that maps names of players to guesses:



In [13]:

    
import string

def randomword(length):
    """A function to generate a random name: http://stackoverflow.com/questions/2030053/random-strings-in-python"""
    # string.ascii_lowercase works in both Python 2 and 3 (string.lowercase is Python 2 only)
    return ''.join(random.choice(string.ascii_lowercase) for i in range(length))

guesses = {randomword(8):guess for guess in guesses}



In [14]:

    
g = twothirds.TwoThirdsGame(guesses)



In [15]:

    
g.two_thirds_of_the_average()









    Out[15]:





32.324333333333335

We see that quite a few people won.



In [16]:

    
g.find_winner()









    Out[16]:





('bftnvtnk',
 'bhdbctta',
 'cpkarohu',
 'dwhtspmf',
 'eghicvyv',
 'erfhbiao',
 'etycgdnd',
 'fzvjrjmk',
 'ienqvajd',
 'irawscer',
 'jkwhrrkd',
 'krwaztxb',
 'lvrqqzij',
 'mpcsqhzw',
 'nstdddev',
 'nvfkkcme',
 'nxjjvdat',
 'oohwlduc',
 'oznasprl',
 'phudmnto',
 'pmlsixjf',
 'rzqlceqp',
 'sbrlodcz',
 'tbaloqzd',
 'tdclrlpt',
 'tuyqawzj',
 'ueaqnlon',
 'uenmsrfe',
 'vnlrvqgl',
 'vwqnkxfz',
 'wdfbbddk',
 'yyaeuhsa',
 'zaqnavvx',
 'zjonvgwe',
 32)
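The output above lists the names of everyone who made the winning guess, followed by the guess itself. A plain Python sketch that produces the same shape of result from a dictionary is shown below. It is not the library's code: `find_winners` is a hypothetical name, and the sorting of names and the tie-breaking between equally close guesses are assumptions modelled on the output shown here.

```python
def find_winners(guesses_by_name):
    """Return a tuple of the winners' names (sorted) followed by the winning guess."""
    values = list(guesses_by_name.values())
    target = 2.0 * sum(values) / (3 * len(values))
    # Assumption: if two different guesses are equally close, keep the first.
    winning_guess = min(values, key=lambda value: abs(value - target))
    winners = sorted(name for name, value in guesses_by_name.items()
                     if value == winning_guess)
    return tuple(winners) + (winning_guess,)

# The average is 45, so the target is 30 and the closest guess is 20.
example = {'ann': 20, 'bob': 20, 'cat': 50, 'dan': 90}
print(find_winners(example))  # ('ann', 'bob', 20)
```

Because guesses here are integers, many players can share the winning guess, which is why the tuple of winners can be long.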

Handling data

Note that it might be much easier to collect the data in a spreadsheet. This library allows for that. Let's first write the data file we will be using (in effect doing things backwards).



In [17]:

    
import pandas
# list(...) keeps this working on Python 3, where items() returns a view
df = pandas.DataFrame(list(guesses.items()))
df.to_csv('demo.csv', index=False)

We can now read in this data:



In [18]:

    
data = twothirds.Data('demo.csv')
data.read()

The data file has a dataframe attribute:



In [19]:

    
data.df.head()

We can get the data in a nicer format, ready for use. The format is a list of objects, one for every play of the game (so, for example, we could have a file with multiple columns, one for each game).



In [20]:

    
guesses = data.out()[0]
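To make the "one object per play of the game" idea concrete, here is a sketch of the same kind of conversion using only the standard library. It assumes Python 3 and a file laid out like demo.csv (a header row, names in the first column, one column of guesses per game). `games_from_csv` is a hypothetical helper, and whether `data.out()` works this way internally is not something this notebook shows.

```python
import csv
import io

def games_from_csv(handle):
    """Return one {name: guess} dictionary per guess column in the CSV."""
    reader = csv.reader(handle)
    header = next(reader)                 # skip the header row
    games = [{} for _ in header[1:]]      # one dictionary per guess column
    for row in reader:
        name = row[0]
        for game, cell in zip(games, row[1:]):
            game[name] = int(cell)
    return games

# An in-memory stand-in for a small file with two games.
example = io.StringIO("0,1,2\nann,33,8\nbob,40,19\n")
print(games_from_csv(example))
# [{'ann': 33, 'bob': 40}, {'ann': 8, 'bob': 19}]
```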

Here we create the game (as above):



In [21]:

    
g = twothirds.TwoThirdsGame(guesses)



In [22]:

    
g.find_winner()









    Out[22]:





('bftnvtnk',
 'bhdbctta',
 'cpkarohu',
 'dwhtspmf',
 'eghicvyv',
 'erfhbiao',
 'etycgdnd',
 'fzvjrjmk',
 'ienqvajd',
 'irawscer',
 'jkwhrrkd',
 'krwaztxb',
 'lvrqqzij',
 'mpcsqhzw',
 'nstdddev',
 'nvfkkcme',
 'nxjjvdat',
 'oohwlduc',
 'oznasprl',
 'phudmnto',
 'pmlsixjf',
 'rzqlceqp',
 'sbrlodcz',
 'tbaloqzd',
 'tdclrlpt',
 'tuyqawzj',
 'ueaqnlon',
 'uenmsrfe',
 'vnlrvqgl',
 'vwqnkxfz',
 'wdfbbddk',
 'yyaeuhsa',
 'zaqnavvx',
 'zjonvgwe',
 32)

Managing an activity

If we have a spreadsheet with multiple columns for guesses we can read it in and create an activity that will contain all the data and analysis we need. Let's tweak our guesses to have a second guess that should be lower than the first.



In [23]:

    
guesses = [[key, guesses[key], int(random.triangular(0, guesses[key], 1.0 * guesses[key] / 3))] for key in guesses]
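The line above draws each second guess from a triangular distribution between 0 and the player's first guess, with the mode at one third of the first guess, and then truncates it to an integer. The sketch below isolates that step to check that a second guess never exceeds the first. `second_guess` is a hypothetical helper, and the explicit handling of a first guess of 0 is an addition made here to avoid relying on how `random.triangular` treats equal limits.

```python
import random

def second_guess(first, rng=random):
    """Draw an integer second guess between 0 and the first guess."""
    if first <= 0:
        return 0  # nothing lower to guess; avoids a degenerate distribution
    return int(rng.triangular(0, first, first / 3.0))

rng = random.Random(0)  # seeded so the check is repeatable
firsts = [rng.randint(0, 100) for _ in range(1000)]
pairs = [(first, second_guess(first, rng)) for first in firsts]
print(all(0 <= second <= first for first, second in pairs))  # True
```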

Here we write the data to file again:



In [24]:

    
df = pandas.DataFrame(guesses)
df.to_csv('demo.csv', index=False)



In [25]:

    
activity = twothirds.Activity('demo.csv')

We still have access to the raw data:



In [26]:

    
activity.raw_data.df.head()

We also have an instance for each game:



In [27]:

    
activity.games









    Out[27]:





[<twothirds.single_game.TwoThirdsGame instance at 0x112f0fe60>,
 <twothirds.single_game.TwoThirdsGame instance at 0x112f0fef0>]

The winning guess for each game can be found below:



In [28]:

    
[g.find_winner()[-1] for g in activity.games]









    Out[28]:





[32, 14]

The winners of the first game:



In [29]:

    
activity.games[0].find_winner()[:-1]









    Out[29]:





('bftnvtnk',
 'bhdbctta',
 'cpkarohu',
 'dwhtspmf',
 'eghicvyv',
 'erfhbiao',
 'etycgdnd',
 'fzvjrjmk',
 'ienqvajd',
 'irawscer',
 'jkwhrrkd',
 'krwaztxb',
 'lvrqqzij',
 'mpcsqhzw',
 'nstdddev',
 'nvfkkcme',
 'nxjjvdat',
 'oohwlduc',
 'oznasprl',
 'phudmnto',
 'pmlsixjf',
 'rzqlceqp',
 'sbrlodcz',
 'tbaloqzd',
 'tdclrlpt',
 'tuyqawzj',
 'ueaqnlon',
 'uenmsrfe',
 'vnlrvqgl',
 'vwqnkxfz',
 'wdfbbddk',
 'yyaeuhsa',
 'zaqnavvx',
 'zjonvgwe')

The winners of the second game (there are more of them):



In [30]:

    
activity.games[1].find_winner()[:-1]









    Out[30]:





('agtdjyfm',
 'bbcizfms',
 'bkxjbaws',
 'bsszepzl',
 'cbmlchiw',
 'crynfarr',
 'dgdrwnig',
 'dnzdbeyq',
 'ejipkquj',
 'ggipjwve',
 'gzydmpih',
 'hrhwuuqk',
 'hvpwrqtq',
 'ieraiisi',
 'ievulewi',
 'iezxzvsw',
 'imigsfjz',
 'ipmbtfee',
 'iygdnjvu',
 'izuyxxek',
 'jdlqeqxk',
 'jhzjshdg',
 'jvqrxfjr',
 'kaaratxb',
 'keyvnojc',
 'kiprznwn',
 'kodvvlmv',
 'kyoiqcrx',
 'laumrmvj',
 'lbhjwxui',
 'lmiidziu',
 'ltrdjtpj',
 'mpcsqhzw',
 'nimnptog',
 'nivyzvon',
 'noflqaml',
 'npnznkgy',
 'nydwuujr',
 'ovzdrmhe',
 'qeaskwnt',
 'qsuykuxz',
 'qwywxfvz',
 'qyxrjgpe',
 'qzhxfjwl',
 'rbesbodv',
 'smziwarh',
 'srhmnhug',
 'toyjfloq',
 'tpznifmq',
 'tpzrtkox',
 'tsgkmfyd',
 'tuvszott',
 'ubobpqek',
 'ueaqnlon',
 'vfjustmg',
 'vkvlvvvv',
 'vnvebajo',
 'wakfmiyc',
 'wkbzeaof',
 'wqlalxqt',
 'wrbzfmgb',
 'xfstaxxd',
 'xjbzoopn',
 'xojsvfgh',
 'xxjsiulo',
 'yetrogwb',
 'yibzimez',
 'ynmohdpy',
 'yxekztfu')

The library has some inbuilt plots:



In [31]:

    
%matplotlib inline
activity.analyse()
activity.distplot();
activity.pairplot();









    












    





[Figure output: distribution plot and pair plot of the guesses]

Finally you can see a summary of everything here:



In [32]:

    
activity









    Out[32]:





=====================
Game 0
---------------------
2/3rds of the average: 32.32
Winning guess: 32
Winner(s): ('bftnvtnk', 'bhdbctta', 'cpkarohu', 'dwhtspmf', 'eghicvyv', 'erfhbiao', 'etycgdnd', 'fzvjrjmk', 'ienqvajd', 'irawscer', 'jkwhrrkd', 'krwaztxb', 'lvrqqzij', 'mpcsqhzw', 'nstdddev', 'nvfkkcme', 'nxjjvdat', 'oohwlduc', 'oznasprl', 'phudmnto', 'pmlsixjf', 'rzqlceqp', 'sbrlodcz', 'tbaloqzd', 'tdclrlpt', 'tuyqawzj', 'ueaqnlon', 'uenmsrfe', 'vnlrvqgl', 'vwqnkxfz', 'wdfbbddk', 'yyaeuhsa', 'zaqnavvx', 'zjonvgwe')
=====================
Game 1
---------------------
2/3rds of the average: 14.08
Winning guess: 14
Winner(s): ('agtdjyfm', 'bbcizfms', 'bkxjbaws', 'bsszepzl', 'cbmlchiw', 'crynfarr', 'dgdrwnig', 'dnzdbeyq', 'ejipkquj', 'ggipjwve', 'gzydmpih', 'hrhwuuqk', 'hvpwrqtq', 'ieraiisi', 'ievulewi', 'iezxzvsw', 'imigsfjz', 'ipmbtfee', 'iygdnjvu', 'izuyxxek', 'jdlqeqxk', 'jhzjshdg', 'jvqrxfjr', 'kaaratxb', 'keyvnojc', 'kiprznwn', 'kodvvlmv', 'kyoiqcrx', 'laumrmvj', 'lbhjwxui', 'lmiidziu', 'ltrdjtpj', 'mpcsqhzw', 'nimnptog', 'nivyzvon', 'noflqaml', 'npnznkgy', 'nydwuujr', 'ovzdrmhe', 'qeaskwnt', 'qsuykuxz', 'qwywxfvz', 'qyxrjgpe', 'qzhxfjwl', 'rbesbodv', 'smziwarh', 'srhmnhug', 'toyjfloq', 'tpznifmq', 'tpzrtkox', 'tsgkmfyd', 'tuvszott', 'ubobpqek', 'ueaqnlon', 'vfjustmg', 'vkvlvvvv', 'vnvebajo', 'wakfmiyc', 'wkbzeaof', 'wqlalxqt', 'wrbzfmgb', 'xfstaxxd', 'xjbzoopn', 'xojsvfgh', 'xxjsiulo', 'yetrogwb', 'yibzimez', 'ynmohdpy', 'yxekztfu')

Here is a larger example:



In [33]:

    
import twothirds
activity = twothirds.Activity('data.csv')



In [34]:

    
activity.analyse()



In [35]:

    
%matplotlib inline
activity.distplot();
activity.pairplot();









    












    





[Figure output: distribution plot and pair plot of the guesses]



Output of In [19] (data.df.head()):

          0   1
0  ggxeovzs  33
1  gmsfcvsy  40
2  herxpukf  44
3  hafbntkw  62
4  gasaldxu  36

Output of In [26] (activity.raw_data.df.head()):

          0   1   2
0  ggxeovzs  33   8
1  gmsfcvsy  40  19
2  herxpukf  44  17
3  hafbntkw  62  36
4  gasaldxu  36  11