In [1]:
def first_digit(number):
    return int(str(number)[0])
That was exciting
In [2]:
first_digit(100)
Out[2]:
1
In [3]:
first_digit(399)
Out[3]:
3
Now we're going to simulate picking numbers out of a hat, doing it 'runs' times. The bucket_size is the number of numbers in the hat, so each draw is a random integer from 1 to bucket_size.
In [4]:
import random
def do_drawing(bucket_size, runs):
    digits = [first_digit(random.randint(1, bucket_size)) for x in range(runs)]
    return digits
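Since do_drawing uses the module-level random generator, every run of the notebook gives different digits. Here is a minimal sketch of a reproducible variant. The name do_drawing_seeded and the seed parameter are made up for illustration and are not part of the original notebook.

```python
import random

def first_digit(number):
    # Leading decimal digit of a positive integer
    return int(str(number)[0])

def do_drawing_seeded(bucket_size, runs, seed=0):
    # Hypothetical helper: a private generator keeps the draws repeatable
    # without touching the global random state.
    rng = random.Random(seed)
    return [first_digit(rng.randint(1, bucket_size)) for _ in range(runs)]
```

Calling it twice with the same seed should return the same list, which makes the plots further down easier to compare between runs.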
Now we'll pick a hundred times for every hat size from 1 to 9999.
In [5]:
import collections
counters=[(top_end, collections.Counter(do_drawing(top_end, 100))) for top_end in range(1,10000)]
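With only 100 draws per hat, the counts are noisy. For comparison, the exact chance of a given first digit can be counted directly, since the numbers 1 to bucket_size starting with digit d fall into the ranges d, d0 to d9, d00 to d99, and so on. This is a sketch only; both function names below are hypothetical and not part of the original notebook.

```python
def count_with_first_digit(bucket_size, digit):
    """Count the integers in 1..bucket_size whose leading digit is `digit`."""
    count = 0
    power = 1
    while digit * power <= bucket_size:
        low = digit * power                               # e.g. 10 for digit 1, power 10
        high = min((digit + 1) * power - 1, bucket_size)  # e.g. 19, clipped to the hat size
        count += high - low + 1
        power *= 10
    return count

def exact_first_digit_probability(bucket_size, digit):
    # Every number in the hat is equally likely to be drawn
    return count_with_first_digit(bucket_size, digit) / bucket_size
```

For a hat of 20 numbers, eleven of them (1 and 10 to 19) start with 1, so the exact chance is 11/20.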
Now, we'll import the plotting stuff - I don't know why this warning appears for me.
In [6]:
import matplotlib.pyplot as plt
In [7]:
%matplotlib inline
Now we'll grab and plot tuples of (number of items in the hat, number of the 100 draws that started with 1) for all hat sizes. With 100 draws per hat, the count is also roughly the percentage chance.
In [8]:
one_odds = [counter[1][1] for counter in counters]
In [9]:
# x axis: the actual hat sizes (1..9999) rather than list indices starting at 0
plt.semilogx([size for size, _ in counters], one_odds)
Out[9]:
And we'll do the same for numbers starting with 2
In [10]:
two_odds = [counter[1][2] for counter in counters]
In [11]:
# x axis: the actual hat sizes (1..9999) rather than list indices starting at 0
plt.semilogx([size for size, _ in counters], two_odds)
Out[11]:
And we'll plot them on top of each other, 1 in blue and 2 in red. You'll see the stochastic approach shows the expected results according to Benford's law.
In [12]:
hat_sizes = [size for size, _ in counters]  # actual hat sizes, 1..9999
plt.semilogx(hat_sizes, one_odds, 'b', hat_sizes, two_odds, 'r')
Out[12]:
We start to see experimental confirmation of the Benford's law distribution graph. The rest is an exercise for the reader/contributor ;)
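As a possible starting point for that exercise, Benford's law gives the expected share of leading digit d as log10(1 + 1/d). The sketch below just tabulates those reference values; the names benford_probability and benford are made up for illustration. Note that a single hat size is a uniform draw, so an individual point on the plots above need not match these values.

```python
import math

def benford_probability(digit):
    # Benford's law: P(d) = log10(1 + 1/d) for leading digits d = 1..9
    return math.log10(1 + 1 / digit)

# Reference table, digit -> expected share of numbers starting with that digit
benford = {d: benford_probability(d) for d in range(1, 10)}

for d, p in benford.items():
    print(d, round(p, 3))
```

The nine values sum to 1, and digit 1 gets the largest share, a little over 30%.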