In [333]:
import numpy as np
from scipy import linalg
from scipy import optimize
import matplotlib.pyplot as plt
%matplotlib inline

Parameter Estimation in Poisson Processes

Let $Y(t)$ be a non-homogeneous Poisson process on $[0,T]$ with conditional intensity $\lambda(t)$ and cumulative intensity $\Lambda(t) = \int_0^t\lambda(s)\,ds$. Then the number of events occurring before time $t$ follows a Poisson distribution and $$P(Y(t)=N) = e^{-\Lambda(t) }\Lambda(t)^N/N! $$
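As a sanity check on this count distribution, we can simulate the homogeneous special case (constant $\lambda$, so $\Lambda(t) = \lambda t$) and compare the empirical frequency of a given count to the Poisson pmf. This is a minimal sketch; the rate, horizon, and bin count below are arbitrary choices:


In [ ]:
from scipy import stats

lam, T, n_bins, n_trials = 2.0, 5.0, 1000, 2000
dt_check = T/n_bins
# one Bernoulli draw per small bin approximates the Poisson process
counts = (np.random.uniform(size = (n_trials, n_bins)) < lam*dt_check).sum(axis = 1)
print(counts.mean(), lam*T)  # empirical mean count vs Lambda(T)
print(np.mean(counts == 10), stats.poisson(lam*T).pmf(10))  # P(Y(T) = 10)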

The conditional intensity function can be written as a function of a set of covariates and parameters, for example: $$ \lambda(t) = \exp(X(t)\beta)$$

The exponential above can be substituted with a more general 'inverse-link' function: $$\lambda_{\beta}(t) = f(X(t)\beta)$$
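For instance, a softplus link $f(x) = \log(1+e^x)$ keeps the intensity positive while growing only linearly for large inputs. A minimal sketch (these helper names are ours, and are reused further below):


In [ ]:
def softplus(x):
    # numerically stable log(1 + exp(x))
    return np.logaddexp(0.0, x)

def softplus_prime(x):
    # derivative of the softplus is the logistic sigmoid
    return 1.0/(1.0 + np.exp(-x))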

We can simulate from this process:


In [334]:
N = 10000 # number of observations
d = 5 # number of covariates

1) Create a vector of $d$ parameters:


In [335]:
theta = np.random.normal(size = (d,))

2) Create an array of covariates:


In [336]:
X = 0.1*np.random.normal(size = (d,N))

3) Construct the conditional intensity $\lambda(t)$


In [337]:
l = np.exp(np.dot(X.T,theta))

In [338]:
dt = 0.001 # discretization step

Generate the spike train: draw a uniform random variable for each bin and record a spike when $\lambda(t)\,dt > u$, i.e. with probability $\lambda(t)\,dt$:


In [339]:
u = np.random.uniform(size = len(l))
y = 1*(l*dt>u) # spike with probability lambda(t)*dt in each bin
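This Bernoulli thinning is only a valid approximation when every bin probability $\lambda(\tau_i)\,dt$ is well below one, which is worth checking:


In [ ]:
print((l*dt).max())  # should be well below 1 for the approximation to hold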

In [340]:
fig, ax = plt.subplots(figsize=(12, 2))
im = ax.matshow(np.reshape(y,(1,len(y))),cmap = 'Greys',aspect = 3)


Given events at times $t_1,...,t_N$, the conditional likelihood is $$P_{\beta}(t_1,...,t_N) = e^{-\Lambda(T)} \prod_{i=1}^N\lambda_{\beta}(t_i)$$

The maximum likelihood estimate of $\beta$ maximizes $P_{\beta}$ with respect to $\beta$; in practice it is easier to minimize the negative log of this function.

The corresponding negative log-likelihood is $$ L(\beta,t_1,...,t_N) = \Lambda(T) -\sum_{i=1}^N \log(\lambda_{\beta}(t_i))$$

Now let's discretize the domain $[0,T]$ into $n$ bins $\tau_1,...,\tau_n$, small enough that each bin contains at most one spike, and let $y_i = 1$ if there is a spike in bin $i$ and $y_i = 0$ otherwise.

$$ L(\beta,y) = \sum_{i=1}^n \lambda(\tau_i)\Delta \tau - \sum_{i=1}^n y_i \log\lambda(\tau_i)$$

In [341]:
def negLogL(theta,X,y,dt):
    # conditional intensity of the Poisson process, lambda = exp(X beta)
    intensity = np.exp(np.dot(X.T,theta))
    # discretized negative log-likelihood: Lambda(T) - sum_i y_i*log(lambda(tau_i))
    l = np.sum(intensity)*dt - np.sum(y*np.log(intensity))
    return l
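As a baseline, evaluate the negative log-likelihood at the true parameters:


In [ ]:
print(negLogL(theta, X, y, dt))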

The gradient is

$$\frac{dL}{d\beta} = \sum_{i=1}^n\frac{d\lambda(\tau_i)}{d\beta} \Delta \tau - \sum_{i=1}^n \frac{y_i}{\lambda(\tau_i)} \frac{d\lambda(\tau_i)}{ d\beta } $$

For a general link function the gradient becomes

$$ \frac{dL}{d\beta} = \sum_{i=1}^n f'(X(\tau_i)\beta)X(\tau_i)\Delta \tau - \sum_{i=1}^n \frac{y_i}{f(X(\tau_i)\beta)} f'(X(\tau_i)\beta) X(\tau_i) $$

For the exponential link $f' = f$, so the second sum collapses to $\sum_i y_i X(\tau_i)$ and the first to $\sum_i \lambda(\tau_i) X(\tau_i)\Delta\tau$, which is what the implementation below computes.

In [342]:
def grad(theta,X,y,dt):
    # gradient of the negative log-likelihood for the exponential link (f' = f)
    intensity = np.exp(np.dot(X.T,theta))
    g = np.dot(X,intensity)*dt - np.dot(X,y)
    return g
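For a general link the same pattern applies with $f'$ in place of the $f' = f$ shortcut. A sketch using the softplus helpers defined above (any differentiable positive link would do):


In [ ]:
def negLogL_general(theta, X, y, dt, f = softplus):
    intensity = f(np.dot(X.T, theta))
    return np.sum(intensity)*dt - np.sum(y*np.log(intensity))

def grad_general(theta, X, y, dt, f = softplus, fprime = softplus_prime):
    eta = np.dot(X.T, theta)  # linear predictor X(tau_i) beta
    intensity, slope = f(eta), fprime(eta)
    # sum_i f'(X beta) X dt  -  sum_i (y_i / f(X beta)) f'(X beta) X
    return np.dot(X, slope)*dt - np.dot(X, y*slope/intensity)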

Set an initial guess:


In [343]:
theta_0 = np.random.normal(size = (d,))
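Before optimizing, it is worth checking the analytic gradient against a finite-difference approximation; `optimize.check_grad` returns the norm of the difference between the two, which should be small:


In [ ]:
print(optimize.check_grad(negLogL, grad, theta_0, X, y, dt))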

Minimize the negative log-likelihood, first using the analytic gradient:


In [344]:
opts = {'disp':True,'maxiter': 1000}

In [345]:
theta_MLE = optimize.minimize(negLogL, theta_0,jac = grad, args = (X,y,dt),options = opts).x


Optimization terminated successfully.
         Current function value: 8.793364
         Iterations: 15
         Function evaluations: 16
         Gradient evaluations: 16

In [346]:
print ('Error: ' + str(np.sqrt(sum((theta_MLE - theta)**2))))


Error: 3.87380518036

For comparison, run the optimizer without the analytic gradient, so the gradient is approximated by finite differences (note the much larger number of function evaluations below):


In [347]:
theta_MLE = optimize.minimize(negLogL, theta_0, args = (X,y,dt),options = opts).x


Optimization terminated successfully.
         Current function value: 8.793364
         Iterations: 14
         Function evaluations: 105
         Gradient evaluations: 15

In [348]:
print ('Error: ' + str(np.sqrt(sum((theta_MLE - theta)**2))))


Error: 3.87386442791

In [349]:
plt1, = plt.plot(theta,'ro',label = 'theta true')
plt2, = plt.plot(theta_MLE,'o',label = 'theta MLE')
plt.legend(handles = [plt1,plt2])


Out[349]:
<matplotlib.legend.Legend at 0x10c1d7860>

Simulate a predicted spike train from the estimated intensity:


In [350]:
l_MLE = np.exp(np.dot(X.T,theta_MLE))
u = np.random.uniform(size = len(l_MLE))
y_predicted = 1*(l_MLE*dt>u)
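Bin-for-bin agreement is not expected from a stochastic model, but the total spike counts should be comparable:


In [ ]:
print(np.sum(y), np.sum(y_predicted))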


In [351]:
fig, ax = plt.subplots(figsize=(12, 2))
fig = ax.matshow(np.reshape(y,(1,len(y))),cmap = 'Greys',aspect = 3)
fig, ax = plt.subplots(figsize=(12, 2))
fig = ax.matshow(np.reshape(y_predicted,(1,len(y))),cmap = 'Greys',aspect = 3)



In [352]:
mismatch_error = sum(np.abs(y-y_predicted))/sum(y+y_predicted)
print(mismatch_error)


1.0

A mismatch error of 1 means no predicted spike landed in the same bin as a true spike. This is expected: spikes are rare ($\lambda(t)\,dt \ll 1$), so two independent draws from the process almost never coincide bin-for-bin, even when the intensity estimate is accurate.
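A more meaningful comparison is between the intensity functions themselves, for example the relative $\ell_2$ error:


In [ ]:
print(np.linalg.norm(l - l_MLE)/np.linalg.norm(l))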

In [353]:
fig = plt.figure(figsize = (12,2))
plt.plot(l)
plt.title('True Intensity')


Out[353]:
<matplotlib.text.Text at 0x10c87c550>

In [354]:
fig = plt.figure(figsize = (12,2))
plt.plot(l_MLE)
plt.title('Estimated Intensity')


Out[354]:
<matplotlib.text.Text at 0x10c2322b0>
