Demonstration of the validity of the weak law of large numbers
First, the convergence of the relative frequencies of a sequence to its expected value is shown
In [1]:
# importing
import numpy as np
from scipy import stats, special
from decimal import Decimal
import matplotlib.pyplot as plt
import matplotlib
# showing figures inline
%matplotlib inline
In [2]:
# global plotting options: larger font, LaTeX text rendering, wide figures
# NOTE: usetex=True requires a working LaTeX installation on the system
plt.rc('font', size=20)
plt.rc('text', usetex=True)
plt.rc('figure', figsize=(18, 6))
In [3]:
# number of points to be sampled per sequence and number of sequences
len_sequence = 1000
N_sequences = 1000

# vector of lengths from 1 to len_sequence in order to perform averaging
lengths = np.arange( 1, len_sequence + 1 )

# sample all sequences at once as fair coin flips (0/1 with probability 1/2 each);
# one row per sequence — a single vectorized draw replaces the per-sequence loop
sequences = np.random.choice( [ 0, 1 ], size = ( N_sequences, len_sequence ), p = [ .5, .5 ] )

# cumulative sums along each row, normalized by the number of observations so far
# NOTE: By doing so, occurrences are always normalized to the length of the observation
results = np.cumsum( sequences, axis = 1 ) / lengths
In [4]:
# plot the running relative frequency of each sampled sequence
x_axis = np.arange( 1, len_sequence + 1 )
for row in results:
    plt.plot( x_axis, row, linewidth = 2.0 )

plt.grid( True )
plt.xlabel('$N$')
# 'Kopf' is German for 'heads'
plt.ylabel('$H_N($'+'Kopf'+'$)$')
plt.margins(.1)
In [5]:
# number of samples
N = 100
# probability for sampling 1
p = 0.9
print( 'Model:' )
print( '------\n' )
print( '{} times indenpendently sampling a bit'.format( N ) )
print( 'P( 1 ) = {}\n\n'.format( p ) )
print( 'Results:' )
print( '--------\n' )
print( 'P( 11...11 ) \t\t\t = {}\n'.format( p**N ) )
print( 'P( 10 x 0, 90 x 1 ) \t\t = {}'.format( (1-p)**10 * p**(N-10) ) )
print( 'Binomial coefficient (N, 10) \t = {:.2e}'.format( Decimal( special.binom( N, 10 ) ) ) )
print( '''P( 10 x 0 'somewhere' ) \t = {}'''.format( special.binom( N, 10 ) * (1-p)**10 * p**(N-10) ) )
In [6]:
# shorter sequences, more of them, to study the distribution of the end points
len_sequence = 10
N_sequences = 2000

# vector of lengths in order to perform averaging
lengths = np.arange( 1, len_sequence + 1 )

# sample all sequences at once as fair coin flips (0/1 with probability 1/2 each);
# one row per sequence — a single vectorized draw replaces the per-sequence loop
sequences = np.random.choice( [ 0, 1 ], size = ( N_sequences, len_sequence ), p = [ .5, .5 ] )

# cumulative sums along each row, normalized by the number of observations so far
results = np.cumsum( sequences, axis = 1 ) / lengths
In [7]:
# plot the running relative frequency of each sampled sequence
for n in np.arange( N_sequences ):
    plt.plot( range(1, len_sequence+1), results[n, :], linewidth = 2.0 )

plt.grid( True )
plt.xlabel('$N$')
# 'Kopf' is German for 'heads'
plt.ylabel('$H_N($'+'Kopf'+'$)$')
plt.margins(.1)

# now determine histogram for the occurrence of end points,
# showing that all points may be observed but likelihood is very different

# extract end-points
results_end = results[:, -1]

# get histogram
num_bins = 20
# num_bins bins require num_bins + 1 edges (was num_bins edges -> only 19 bins)
bins = np.linspace(0, 1, num_bins + 1, endpoint=True)
# bar height equals the bin spacing so adjacent horizontal bars do not overlap
# (was 2/num_bins = 0.1, which is larger than the bin spacing)
width = bins[1] - bins[0]
r_hist = np.histogram( results_end, bins = bins, density = True )
# with uniform bins, density / sum(density) equals the relative frequency per bin;
# scaled by 5 and shifted right of the curves for visibility
plt.barh( r_hist[1][:-1], 0 + r_hist[0] / np.sum(r_hist[0]) * 5 , width, left=len_sequence+.1, color = '#ff7f0e' )
Out[7]:
In [ ]: