In [1]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
Many processes in physics and chemistry happen randomly, or stochastically. This is in contrast to deterministic problems, where we can exactly predict the future if we know the current state of the system. For example, we can predict the velocity of an object if we know its initial velocity and the forces acting on it. However, in many cases we only know the probabilities of different outcomes. For example, when we roll a die, we don't know which face of the die will end up facing the ground; we only know the *probability* of seeing any of the faces.
In the case of the die, there are six different outcomes and they all have equal probability (if it is an unbiased die). In general, if there are $N$ outcomes, each with probability $p_i$, then the probabilities must be normalized, $$\sum\limits_{i=1}^N p_i = 1.$$ In other words, the probability that something happens is $1$.
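As a quick check of this normalization (an added snippet, not part of the original), consider a fair die where each $p_i = 1/6$:

In [ ]:
p_die = np.ones(6) / 6.                             # six equally likely outcomes
print("sum of probabilities: %.2f" % p_die.sum())   # prints 1.00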
As a first example, consider the trajectory of a speck of dust floating in a cup of water. For the moment, let's imagine that the dust can only move in one dimension. Since the dust is bombarded by water molecules in a random way, we cannot know its trajectory exactly. The trajectory is random, or stochastic: the dust takes a step to the right ($+1$) with probability $p$ and to the left ($-1$) with probability $q$.
In [3]:
p = 0.5
q = 1. - p
Let's take a look at the trajectory of the dust speck for a large number of steps.
In [4]:
Nsteps = 10000
x = np.zeros(Nsteps)
for t in range(1, Nsteps):
    if np.random.rand() <= p:
        x[t] = x[t - 1] + 1
    else:
        x[t] = x[t - 1] - 1
print("mean: %.2f" % np.mean(x))
print("std. dev.: %.2f" % np.std(x))
plt.plot(x)
plt.title("Random Walk")
plt.xlabel("time $t$")
plt.ylabel("position $x(t)$")
plt.show()
Notice that if you rerun the previous cell you will see a different trajectory each time. If we were observing the random walks of many independent specks of dust, what would the properties of the set of random walks be?
We can derive the probability distribution of outcomes from the probability to step in either direction. After $t$ steps, the walker has taken some number $n_+$ of steps to the right and $n_- = t - n_+$ steps to the left, so its position is $x = n_+ - n_- = 2n_+ - t$. By choosing which $n_+$ of the $t$ steps go to the right, we find that the number of right steps follows the binomial distribution, $$P(n_+; t) = \binom{t}{n_+} p^{n_+} q^{t - n_+}.$$
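To make the formula concrete, here is a small added function (not part of the original notebook; it assumes Python 3.8+ for math.comb) that evaluates the theoretical $p(x,t)$:

In [ ]:
from math import comb

def p_x_t(x, t, p=0.5):
    """Theoretical probability of being at position x after t unit steps."""
    if (x + t) % 2 != 0 or abs(x) > t:
        return 0.0                          # x must have the same parity as t and |x| <= t
    n_plus = (x + t) // 2                   # number of steps taken to the right
    return comb(t, n_plus) * p**n_plus * (1. - p)**(t - n_plus)

print("P(x=0, t=4) = %.4f" % p_x_t(0, 4))   # 6/16 = 0.3750 for a fair walk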
Let's generate some trajectories and see how they relate to the formula we have derived.
In [5]:
Nsteps = 10000
Nwalkers = 1000
xmany = np.zeros((Nwalkers, Nsteps))
for n in range(Nwalkers):
    for t in range(1, Nsteps):
        if np.random.rand() <= p:
            xmany[n, t] = xmany[n, t - 1] + 1
        else:
            xmany[n, t] = xmany[n, t - 1] - 1
#print("mean: %.2f" % np.mean(x))
#print("std. dev.: %.2f" % np.std(x))
We see that the trajectories are initially clustered around the origin and spread out as time proceeds. By plotting them with some transparency, we can qualitatively tell that the density of walkers around the origin decreases with time.
In [6]:
plt.plot(xmany.T,alpha=0.01,lw=2,color='k')
plt.title("Many random walkers")
plt.xlabel("time $t$")
plt.ylabel("position $x(t)$")
plt.show()
Even though the individual trajectories are random, we can look at properties of the distribution of outcomes as a function of time, $p(x,t)$.
In [17]:
# What is a nicer way to show this? Analytic solution of the diffusion equation (Gaussian spreading out)?
for i in range(0, Nsteps, Nsteps // 10):
    plt.hist(xmany[:, i], histtype="stepfilled", alpha=0.1)
plt.xlabel("$x$")
plt.ylabel("$p(x)$")
plt.title("Evolution of $p(x,t)$ for the random walk")
Out[17]:
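One way to address the question in the comment above (this is an added sketch, not the notebook's original code; it assumes a matplotlib version where plt.hist accepts density=True) is to overlay the long-time limit of the binomial distribution, a Gaussian with variance $t$, $$p(x,t) \approx \frac{1}{\sqrt{2\pi t}}\, e^{-x^2/2t},$$ on the empirical histogram at a fixed time:

In [ ]:
t = Nsteps - 1                                                  # compare at the final time step
xgrid = np.linspace(-4 * np.sqrt(t), 4 * np.sqrt(t), 200)
gauss = np.exp(-xgrid**2 / (2. * t)) / np.sqrt(2. * np.pi * t)  # Gaussian with variance t
plt.hist(xmany[:, t], 40, density=True, histtype="step", lw=2, label="simulation")
plt.plot(xgrid, gauss, label="Gaussian with variance $t$")
plt.xlabel("$x$")
plt.ylabel("$p(x,t)$")
plt.legend()
plt.show()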
For example, we can see that the variance of the distribution $p(x,t)$ grows linearly with time (for our unit step length and $p = q = 1/2$): $$\langle x^2 \rangle = t$$
In [8]:
plt.plot(np.var(xmany,axis=0))
Out[8]:
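As an added check (not part of the original notebook), we can fit a straight line to the measured variance; with $p = q = 1/2$ and unit steps, the slope should be close to $1$:

In [ ]:
tgrid = np.arange(Nsteps)
var = np.var(xmany, axis=0)
slope, intercept = np.polyfit(tgrid, var, 1)   # least-squares fit of var(t) ~ slope*t + intercept
print("fitted slope: %.3f (expected 1 for a fair walk with unit steps)" % slope)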
This observation was pointed out by Einstein in his work on Brownian motion, where he derived the relation $$\langle x^2\rangle = 2Dt,$$ where $D$ is the diffusion coefficient, which depends on the temperature $T$ and the viscosity $\eta$ of the medium: $$D \propto \frac{T}{\eta}$$
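For a spherical particle of radius $a$, the proportionality constant is given by the Stokes-Einstein relation (added here for context), $$D = \frac{k_B T}{6\pi\eta a},$$ where $k_B$ is Boltzmann's constant.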
In [9]:
plt.plot(np.mean(xmany,axis=0))
Out[9]:
Since both directions are equally likely, we should get $\langle x\rangle\rightarrow 0$ as time goes to infinity. However, at any finite time we may get $\langle x \rangle\ne 0 $. Let's look at the distribution of $\langle x\rangle$ for each of the random walks we calculated above.
In [16]:
plt.hist(np.mean(xmany,axis=1),40,histtype="step",lw=2)
plt.xlabel("Average position over random walk $\\langle x\\rangle$")
plt.ylabel("counts")
plt.title("Probability of getting average position")
Out[16]:
Our random walk model applies equally well to a series of fair coin flips, where the $x$ position represents the difference between heads and tails, $$x = n_{heads} - n_{tails}$$
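To make the mapping explicit, here is a small added simulation (not in the original notebook) of a series of fair coin flips tracked as a running heads-minus-tails count:

In [ ]:
Nflips = 1000
flips = np.random.rand(Nflips) <= 0.5          # True = heads, False = tails
x_coin = np.cumsum(np.where(flips, 1, -1))     # running difference n_heads - n_tails
print("heads: %d, tails: %d, final x: %d" % (flips.sum(), Nflips - flips.sum(), x_coin[-1]))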
By looking at the distribution of $\langle x\rangle$ for many random walks (now many series of coin flips), we can see the probability of getting a given heads-tails deficit in a series of coin flips. A gambler may be inclined to bet on tails after observing many heads in a row, thinking that the outcomes of a fair coin must "even out" to give equal numbers of heads and tails. However, we can see from the plotted distribution why this logic is wrong: since the variance of the distribution grows with time, there is nothing unusual about observing a large deficit in a string of coin tosses. The mistaken idea that short strings of observations should match the long-run probabilities is commonly called the "gambler's fallacy".
In [ ]: