Observed data: Count vector $y : (y_1,y_2,y_3,y_4)^T$ postulated to be arising from a multinomial distribution with four cells and corresponding probabilities: $(p_1,p_2,p_3,p_4) = (\frac{1}{2}+\frac{1}{4}p, \frac{1}{4}(1-p), \frac{1}{4}(1-p), \frac{1}{4}p)$ such that $0 \leq p \leq 1$ and we need to estimate $p$
Probability density function $f(y;p) = \frac{n!}{y_1!y_2!y_3!y_4!}p_1^{y_1}p_2^{y_2}p_3^{y_3}p_4^{y_4}$
The log likelihood function is given by $\log(L(p|y)) = y_1\log(p_1) + y_2\log(p_2)+y_3\log(p_3)+y_4\log(p_4) + \log(n!) - \log(y_1!y_2!y_3!y_4!)$
In [15]:
import numpy as np
import scipy as sp
y1,y2,y3,y4 = 1997,906,904,32
n = 3839
b = y1-y4-2*(y2+y3)
a = -(y1+y2+y3+y4)
c = 2*y4
p = (-b-np.sqrt(b**2-4*a*c))/(2*a)
In [16]:
p
Out[16]:
Thus we choose starting point to be $p_0 = \frac{y_1+y_4-(y_2+y_3)}{n}$
NOTE: In order to reduce computational complexity we make this following inherent assumption: $$ E[\frac{\partial^2 S}{\partial p^2}] = E[\frac{\partial S}{\partial p} \times \frac{\partial S}{\partial p}] $$
In [101]:
pe = (y1+y4-(y2+y3))/n
pe
Out[101]:
In [134]:
pold = pe
pnew = pe
s = y1/(2+pe)-(y2+y3)/(1-pe)+y4/pe
print(s)
In [135]:
def I(px):
return y1/((2+px)**2)+(y2+y3)/((1-px)**2) + y4/(px**2)
def S(px):
return y1/(2+px)-(y2+y3)/(1-px)+y4/px
iters = 0
while np.abs(s)>10e-9 and pnew>0 and pold>0:
i = I(pnew)
s = S(pnew) - i *(pnew-pold)
pold = pnew
pnew = pold + 1/(I(pold))*S(pold)
print(pold, pnew, S(pold))
iters+=1
In [ ]: