This notebook gives a demo of the twothirds library, which can be used to analyse runs of the two thirds of the average game.
To install the library you can run pip install twothirds
or get the git repository here.
In [8]:
import twothirds
import random
Let us assume we have the following list of random guesses:
In [9]:
N = 2000
guesses = [int(round(random.triangular(0, 100, 44), 0)) for k in range(N)]
Now we create a single game instance
In [10]:
g = twothirds.TwoThirdsGame(guesses)
Let's find the two thirds of the average:
In [11]:
g.two_thirds_of_the_average()
Out[11]:
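For reference, the quantity itself is simple to compute by hand. The following is a minimal sketch in plain Python, independent of the library; the helper name `two_thirds_of_mean` is hypothetical and is not part of `twothirds`.

```python
def two_thirds_of_mean(values):
    """Return two thirds of the arithmetic mean of a non-empty collection of numbers."""
    values = list(values)
    return 2.0 * sum(values) / (3 * len(values))

# Small hand-checkable sample: the mean is 45, so two thirds of it is 30.
sample = [0, 30, 60, 90]
target = two_thirds_of_mean(sample)
```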
We can identify the winning guess:
In [12]:
g.find_winner()
Out[12]:
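The winning guess is the one closest to two thirds of the average. As a rough illustration only, this could be sketched in plain Python as follows; `closest_guesses` is a hypothetical helper, and how the library itself handles ties or formats its result is not assumed here.

```python
def closest_guesses(values):
    """Return (target, winners): the distinct guesses closest to 2/3 of the mean."""
    values = list(values)
    target = 2.0 * sum(values) / (3 * len(values))
    # Smallest distance between any guess and the target
    best = min(abs(v - target) for v in values)
    # Keep every distinct guess that achieves that distance (there may be ties)
    winners = sorted({v for v in values if abs(v - target) == best})
    return target, winners

# The mean of this sample is 30, so the target is 20 and the guess 20 wins.
target, winners = closest_guesses([10, 20, 30, 40, 50])
```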
Note that the data could also be in the form of a dictionary that maps names of players to guesses:
In [13]:
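import string
def randomword(length):
    """A function to generate a random name: http://stackoverflow.com/questions/2030053/random-strings-in-python"""
    # string.ascii_lowercase exists in both Python 2 and 3 (string.lowercase is Python 2 only)
    return ''.join(random.choice(string.ascii_lowercase) for i in range(length))
# Map a random 8-letter name to each guess
guesses = {randomword(8): guess for guess in guesses}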
In [14]:
g = twothirds.TwoThirdsGame(guesses)
In [15]:
g.two_thirds_of_the_average()
Out[15]:
We see that quite a few people won.
In [16]:
g.find_winner()
Out[16]:
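Because guesses are integers, many players can share the closest guess, which is why several people can win at once. A minimal sketch of that idea in plain Python, assuming a dictionary of names to guesses; `winners_by_name` and the player names are hypothetical and do not reflect the library's internals.

```python
def winners_by_name(guesses_by_player):
    """Return the sorted names of all players whose guess is closest to 2/3 of the mean."""
    values = list(guesses_by_player.values())
    target = 2.0 * sum(values) / (3 * len(values))
    best = min(abs(v - target) for v in values)
    return sorted(
        name for name, v in guesses_by_player.items() if abs(v - target) == best
    )

# The mean is 30, so the target is 20: both players who guessed 20 win.
players = {"alice": 20, "bob": 20, "carol": 50, "dave": 30}
names = winners_by_name(players)
```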
Note that it might be much easier to collect the data in a spreadsheet. This library allows for that. Let's first write the data file we will be using (in effect doing things backwards).
In [17]:
import pandas
df = pandas.DataFrame(list(guesses.items()))  # list() keeps this working where items() returns a view (Python 3)
df.to_csv('demo.csv', index=False)
We can now read in this data:
In [18]:
data = twothirds.Data('demo.csv')
data.read()
The data file has a dataframe attribute:
In [19]:
data.df.head()
Out[19]:
We can get the data in a nicer format, ready for use. The format is a list of objects representing every play of the game (so, for example, we could have a file with multiple columns, one for each game).
In [20]:
guesses = data.out()[0]
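To give a rough picture of what going from a file to a dictionary of guesses involves, here is a sketch using only the standard library. It assumes a layout like the file written above (a header row, then one name and one guess per row); `read_guesses` is a hypothetical helper and is not how `twothirds.Data` is necessarily implemented.

```python
import csv
import io

# Stand-in for a file such as demo.csv: a header row, then (name, guess) rows.
csv_text = "0,1\nalice,20\nbob,35\ncarol,50\n"

def read_guesses(handle):
    """Read (name, guess) rows from an open CSV handle into a dictionary."""
    reader = csv.reader(handle)
    next(reader)  # skip the header row
    return {row[0]: int(row[1]) for row in reader if row}

guesses_by_player = read_guesses(io.StringIO(csv_text))
```

With a real file, the same function could be called on `open('demo.csv')` instead of the in-memory text.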
Here we create the game (as above):
In [21]:
g = twothirds.TwoThirdsGame(guesses)
In [22]:
g.find_winner()
Out[22]:
If we have a spreadsheet with multiple columns for guesses we can read it in and create an activity that will contain all the data and analysis we need. Let's tweak our guesses to have a second guess that should be lower than the first.
In [23]:
guesses = [[key, guesses[key], int(random.triangular(0, guesses[key], 1.0 * guesses[key] / 3))] for key in guesses]
Here we write the data to file again:
In [24]:
df = pandas.DataFrame(guesses)
df.to_csv('demo.csv', index=False)
In [25]:
activity = twothirds.Activity('demo.csv')
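Conceptually, a file with one name column and several guess columns describes one game per guess column. The sketch below shows that split in plain Python under that assumption; `split_into_games` and the sample rows are hypothetical and only illustrate the data layout, not the internals of `twothirds.Activity`.

```python
def split_into_games(rows):
    """Turn rows of [name, guess_1, guess_2, ...] into one {name: guess} dict per game."""
    number_of_games = len(rows[0]) - 1
    return [
        {row[0]: row[k + 1] for row in rows}
        for k in range(number_of_games)
    ]

# Three players, two rounds of guessing each.
rows = [["alice", 40, 12], ["bob", 25, 9], ["carol", 60, 20]]
games = split_into_games(rows)
```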
We still have access to the raw data:
In [26]:
activity.raw_data.df.head()
Out[26]:
We also have an instance for each game:
In [27]:
activity.games
Out[27]:
The winning guess for each game can be found below:
In [28]:
[g.find_winner()[-1] for g in activity.games]
Out[28]:
The winners of the first game:
In [29]:
activity.games[0].find_winner()[:-1]
Out[29]:
The winners of the second game (there are more of them):
In [30]:
activity.games[1].find_winner()[:-1]
Out[30]:
The library has some inbuilt plots:
In [31]:
%matplotlib inline
activity.analyse()
activity.distplot();
activity.pairplot();
Finally you can see a summary of everything here:
In [32]:
activity
Out[32]:
Here is a larger example:
In [33]:
import twothirds
activity = twothirds.Activity('data.csv')
In [34]:
activity.analyse()
In [35]:
%matplotlib inline
activity.distplot();
activity.pairplot();