# Brownian motion of a trait through time

We take small discrete steps dt (e.g. a generation). At each step, we measure a trait. To model such a situation, we assume that the trait value V' at time t+dt is Normally distributed of mean V and variance 0.01. We take 100 steps.

``````

In [1]:

import pandas as pd
import numpy.random as rd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from ete3 import Tree, TreeStyle, NodeStyle

%matplotlib inline

``````
``````

In [2]:

def simulate(num):
times = range(num)
values=list()
values.append(0)
for i in times[1:(len(times))]:
values.append(rd.normal(loc=values[i-1], scale=0.01))
simu = pd.DataFrame()
simu['times']=times
simu['values']=values
return simu

``````
``````

In [3]:

simu = simulate(100)
sns.set_style("darkgrid")
plt.plot(simu["times"], simu["values"], 'r-')
plt.show()

``````
``````

``````

### What is the distribution of the trait after these 100 generations? Let's simulate lots of times and store the end result.

``````

In [4]:

end_values = list()
for i in range(2000):
end_values.append(simulate(100)['values'][99])

``````
``````

In [5]:

sns.distplot(end_values)

``````
``````

Out[5]:

<matplotlib.axes._subplots.AxesSubplot at 0x7fd42d8a9748>

``````

### Does it look like a Normal distribution? We can fit a Normal distribution to our data.

``````

In [6]:

from scipy.stats import norm
sns.distplot(end_values, fit=norm)

``````
``````

Out[6]:

<matplotlib.axes._subplots.AxesSubplot at 0x7fd4258d7358>

``````

### Looks quite close. What are the mean and standard deviation?

``````

In [7]:

print("Mean: "+str(np.mean(end_values)))
print("SD: "+ str(np.std(end_values)))

``````
``````

Mean: 0.00129882759788
SD: 0.101707400925

``````

## Simulating along a tree

``````

In [8]:

t = Tree("((((a:1,b:1):1,(c:1,d:1):1):1,((e:1,f:1):1,(g:1,h:1):1):1)30, (((i:1,j:1):1,(k:1,l:1):1):1,((m:1,n:1):1,(o:1,p:1):1):1):30);")

``````
``````

In [9]:

# Calculate the midpoint node
R = t.get_midpoint_outgroup()
# and set it as tree outgroup
t.set_outgroup(R)

ts = TreeStyle()
ts.min_leaf_separation= 0
#ts.scale = 2020
nstyle = NodeStyle()
nstyle["size"] = 0
for n in t.traverse():
n.set_style(nstyle)
t.render("%%inline", tree_style=ts)

``````
``````

Out[9]:

``````
``````

In [10]:

def simulate_on_tree():
root_value = 0.0
for n in t.traverse("preorder"):
if n != t.get_tree_root():

values_at_leaves=list()
names=list()
for leaf in t:
values_at_leaves.append(leaf.value)
names.append(leaf.name)
simu_tree = pd.DataFrame()
simu_tree['names']=names
simu_tree['values']=values_at_leaves
return simu_tree

``````
``````

In [11]:

simu_tree = simulate_on_tree()

sns.stripplot("names", "values", data=simu_tree)

``````
``````

Out[11]:

<matplotlib.axes._subplots.AxesSubplot at 0x7fd425863eb8>

``````

## What if we were to simulate 2 traits on the phylogeny?

``````

In [12]:

simu_tree_1 = simulate_on_tree()
simu_tree_2 = simulate_on_tree()
simu_tree_1_and_2 = simu_tree_1
simu_tree_1_and_2['values2']=simu_tree_2['values']
sns.stripplot("names", "values", data=simu_tree_1_and_2)

``````
``````

Out[12]:

<matplotlib.axes._subplots.AxesSubplot at 0x7fd424050ac8>

``````
``````

In [13]:

sns.stripplot("names", "values2", data=simu_tree_1_and_2)

``````
``````

Out[13]:

<matplotlib.axes._subplots.AxesSubplot at 0x7fd41db95080>

``````

### Are those two traits correlated?

``````

In [14]:

from scipy.stats import pearsonr

sns.lmplot("values", "values2", data=simu_tree_1_and_2)
print("Pearson correlation coefficient and p-value: "+ str(pearsonr(simu_tree_1_and_2['values'], simu_tree_1_and_2['values2'])))

``````
``````

Pearson correlation coefficient and p-value: (-0.59890110064598667, 0.01422995400342641)

``````
``````

In [ ]:

``````
``````

In [ ]:

``````