


In [1]:

    
def first_digit(number):
    return int(str(number)[0])

That was exciting



In [2]:

    
first_digit(100)









    Out[2]:





1



In [3]:

    
first_digit(399)









    Out[3]:





3

Now we're going to simulate picking numbers out of a hat, doing it `runs` times. The `bucket_size` is the number of numbers in the hat, so each draw is a random integer from 1 to `bucket_size`.



In [4]:

    
import random
def do_drawing(bucket_size, runs):
    digits = [first_digit(random.randint(1,bucket_size)) for x in range(runs)]
    return digits
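As a quick sanity check of the drawing function, here is a minimal sketch that draws from a small hat and tallies the first digits. The fixed seed, the hat size of 20 and the 1000 draws are assumptions chosen only for illustration; they are not part of the original notebook.

```python
import collections
import random


def first_digit(number):
    # Leading decimal digit of a positive integer.
    return int(str(number)[0])


def do_drawing(bucket_size, runs):
    # Draw `runs` numbers uniformly from 1..bucket_size and keep their first digits.
    return [first_digit(random.randint(1, bucket_size)) for _ in range(runs)]


random.seed(0)  # assumption: fixed seed, only so the sketch is repeatable
sample = do_drawing(20, 1000)
tally = collections.Counter(sample)
print(sorted(tally.items()))
```

With a hat of 20, eleven of the twenty numbers (1 and 10 to 19) start with 1, so digit 1 should dominate the tally.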

Now we'll draw a hundred times for every hat size from 1 to 9999.



In [5]:

    
import collections
counters=[(top_end, collections.Counter(do_drawing(top_end, 100))) for top_end in range(1,10000)]

Now, we'll import the plotting stuff - I don't know why this warning appears for me.



In [6]:

    
import matplotlib.pyplot as plt



In [7]:

    
%matplotlib inline

Now we'll grab and plot, for every hat size, how many of the 100 draws started with 1. Because each hat gets exactly 100 draws, that count is roughly the percentage chance of pulling a number starting with 1.



In [8]:

    
one_odds = [counter[1][1] for counter in counters]
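The values collected here are raw counts out of 100 draws. If fractions between 0 and 1 are preferred, a small helper can normalize a counter. This is only a sketch: `to_fractions` is a hypothetical helper name, not something the notebook defines.

```python
import collections


def to_fractions(counter, runs):
    # Hypothetical helper: map each first digit 1..9 to its share of the draws.
    # A Counter returns 0 for digits that never came up, so no KeyError occurs.
    return {digit: counter[digit] / runs for digit in range(1, 10)}


example = collections.Counter([1, 1, 2, 9])  # four made-up draws
fractions = to_fractions(example, 4)
print(fractions[1], fractions[2], fractions[3])
```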



In [9]:

    
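# Use the actual hat sizes (1..9999) on the x axis; range(len(counters)) starts at 0,
# which shifts every point by one and cannot be shown on a log axis.
plt.semilogx([size for size, _ in counters], one_odds)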









    Out[9]:





[<matplotlib.lines.Line2D at 0x106745470>]

And we'll do the same for numbers starting with 2



In [10]:

    
two_odds = [counter[1][2] for counter in counters]



In [11]:

    
# Use the actual hat sizes (1..9999) on the x axis rather than 0-based indices.
plt.semilogx([size for size, _ in counters], two_odds)









    Out[11]:





[<matplotlib.lines.Line2D at 0x106a37da0>]

And we'll plot them on top of each other. You'll see the stochastic approach shows expected results according to



In [12]:

    
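# Blue: draws starting with 1; red: draws starting with 2. The x axis is the hat size.
hat_sizes = [size for size, _ in counters]
plt.semilogx(hat_sizes, one_odds, 'b', hat_sizes, two_odds, 'r')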









    Out[12]:





[<matplotlib.lines.Line2D at 0x107107a90>,
 <matplotlib.lines.Line2D at 0x10717f2b0>]

We start to see experimental confirmation of the Benford's law distribution graph. The rest is an exercise for the reader/contributor ;)
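One possible starting point for that exercise is to compare the noisy simulated counts against the exact share of numbers in a hat that begin with a given digit. The sketch below counts those numbers directly. `exact_fraction` and `benford` are hypothetical helper names. The Benford value log10(1 + 1/d) is included only as a reference figure, since the exact share for a single hat size keeps oscillating as the hat grows rather than settling on it.

```python
import math


def exact_fraction(digit, bucket_size):
    # Share of the integers 1..bucket_size whose first digit is `digit`.
    # Numbers with that first digit form the blocks [d, d], [10d, 10d + 9],
    # [100d, 100d + 99], ... so we add up each block, clipped at bucket_size.
    count = 0
    low = digit
    width = 1
    while low <= bucket_size:
        high = low + width - 1
        count += min(high, bucket_size) - low + 1
        low *= 10
        width *= 10
    return count / bucket_size


def benford(digit):
    # Benford's law probability for a leading digit, shown for reference only.
    return math.log10(1 + 1 / digit)


for size in (9, 19, 99, 199):
    print(size, round(exact_fraction(1, size), 3))
print("Benford P(1):", round(benford(1), 3))
```

For digit 1 the exact share drops to 1/9 at hat sizes 9 and 99 and climbs above one half at 19 and 199, which matches the sawtooth shape of the blue curve on the log axis.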