In [1]:
def first_digit(number):
    return int(str(number)[0])

That was exciting


In [2]:
first_digit(100)


Out[2]:
1

In [3]:
first_digit(399)


Out[3]:
3

Now, we're going to simulate picking numbers out of a hat, doing it 'runs' times. The bucket_size is how many numbers are in the hat (1 through bucket_size).


In [4]:
import random
def do_drawing(bucket_size, runs):
    digits = [first_digit(random.randint(1,bucket_size)) for x in range(runs)]
    return digits
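
Because the draws are random, every run of the notebook gives slightly different counts. If repeatable results are wanted, one option is to draw from a seeded generator. This is only a sketch: do_drawing_seeded and its seed parameter are hypothetical names, not part of the original notebook, and first_digit is repeated here so the block runs on its own.

```python
import random


def first_digit(number):
    # Same helper as above, repeated so this block is self-contained.
    return int(str(number)[0])


def do_drawing_seeded(bucket_size, runs, seed=None):
    # Hypothetical variant of do_drawing: a private generator with a fixed
    # seed returns the same sequence of draws every time it is called.
    rng = random.Random(seed)
    return [first_digit(rng.randint(1, bucket_size)) for _ in range(runs)]
```

Calling do_drawing_seeded(1000, 100, seed=42) twice returns the same list both times; leaving seed as None behaves like the unseeded version.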

Now, we'll draw a hundred times from each hat, for all hat sizes from 1 to 9999.


In [5]:
import collections
counters=[(top_end, collections.Counter(do_drawing(top_end, 100))) for top_end in range(1,10000)]
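
Each hat is sampled uniformly from 1 to its size, so the share of draws starting with a given digit can also be computed exactly, which gives something to check the noisy sampled counts against. A minimal sketch; exact_fraction is a hypothetical helper, not part of the original notebook.

```python
def exact_fraction(bucket_size, digit):
    """Exact share of the integers 1..bucket_size whose first digit is `digit`."""
    count = 0
    power = 1
    # Numbers starting with `digit` form the blocks
    # [digit * 10**k, (digit + 1) * 10**k - 1] for k = 0, 1, 2, ...
    while digit * power <= bucket_size:
        low = digit * power
        high = min((digit + 1) * power - 1, bucket_size)
        count += high - low + 1
        power *= 10
    return count / bucket_size
```

For example, exact_fraction(100, 1) is 0.12, since 1, 10 through 19, and 100 are the twelve numbers in the hat that start with 1. With 100 draws per hat, the sampled count for a hat of size 100 should hover around 12.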

Now, we'll import the plotting stuff - I don't know why this warning appears for me.


In [6]:
import matplotlib.pyplot as plt

In [7]:
%matplotlib inline

Now, for every hat size, we'll grab how many of the 100 draws started with 1 (so each count also reads as a percentage) and plot it against the number of items in the hat.


In [8]:
one_odds = [counter[1][1] for counter in counters]

In [9]:
# Hat sizes run from 1 to len(counters); starting at 0 would shift every point by one.
plt.semilogx(range(1, len(counters) + 1), one_odds)


Out[9]:
[<matplotlib.lines.Line2D at 0x106745470>]

And we'll do the same for numbers starting with 2


In [10]:
two_odds = [counter[1][2] for counter in counters]

In [11]:
# Hat sizes run from 1 to len(counters); starting at 0 would shift every point by one.
plt.semilogx(range(1, len(counters) + 1), two_odds)


Out[11]:
[<matplotlib.lines.Line2D at 0x106a37da0>]

And we'll plot them on top of each other (1 in blue, 2 in red). You'll see the stochastic approach shows the expected results according to Benford's law.


In [12]:
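# Hat sizes run from 1 to len(counters); starting at 0 would shift every point by one.
plt.semilogx(range(1, len(counters) + 1), one_odds, 'b', range(1, len(counters) + 1), two_odds, 'r')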


Out[12]:
[<matplotlib.lines.Line2D at 0x107107a90>,
 <matplotlib.lines.Line2D at 0x10717f2b0>]

We start to see experimental confirmation of the Benford's law distribution graph. The rest is an exercise for the reader/contributor ;)
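
For reference when taking up that exercise, Benford's law gives the probability of leading digit d as log10(1 + 1/d). The sketch below just tabulates those values; the name benford is an assumption for illustration and nothing here comes from the notebook's own runs.

```python
import math

# Benford's law: P(d) = log10(1 + 1/d) for a leading digit d in 1..9.
benford = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

for d, p in benford.items():
    print(d, round(p, 3))
```

The table starts at about 0.301 for a leading 1 and 0.176 for a leading 2, and the nine probabilities sum to 1.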