Langevin equation $dX=A(X,t)dt+B(X,t)dW(t)$ ...Eqn (10.1), where $X\in \mathbb{R}^n$, $A$ is an $n$-vector of drift coefficients, $B$ are diffusion coefficients ($n\times n$ matrix of functions), $W(t)$ is a vector of independent Wiener processes.
Solving numerically for scalar $X$, for a small time step $h$, $X(n+1)=X(n)+A(X(n),t)h+B(X(n),t)\sqrt{h}\hat{N}(0,1)$, where $\hat{N}(0,1)$ is a vector of normally ditributed independent random numbers with zero mean and unit variance.
Wiener process $W(t)$ is the limiting case of a random walk as the steps and the time between steps get smaller such that $(\Delta x)^2/\Delta t \to 1$. Can simulate a standard Wiener process by the iteration $W(t+h)=W(t)+\sqrt{h}\hat{N}(0,1)$ ...Eqn (10.3).
$I=\int_{t_0}^t G(s)dW(s)$, where $G(t)$ is a piece-wise continuous function. Approximating, $S(n)=\sum_{j=1}^{n}G(\tau_j)(W(t_j)-W(t_{j-1}))$. For a Riemann integral, any $t_j<\tau_j<t_{j-1}$ will do. But for stochastic integration, the choice of $\tau_j$ matters. Ito integral: $\tau_j=t_{j-1}$. Stratonovich integral: $\tau_j=(t_j+t_{j-1})/2$. In neural modeling, people prefer Stratonovich if noise has correlations and we take limit as correlation time goes to zero. If $B$ is constant in Eqn (10.1), then both are equivalent.
From Eqn (10.3), $E[(W(t+h)-W(t))^2]=h$. Thus $E[dW(t)^2]=dt$.
Suppose $dx=a(x,t)dt+b(x,t)dW(t)$, and $y=f(x)$ twice differentiable.
Then $dy=f(x+dx)-f(x)=f'(x)dx+1/2f''(x)dx^2+...=f'(x)(a(x,t)dt+b(x,t)dW(t))+1/2f''(x)(a(x,t)dt+b(x,t)dW(t))^2+...$
Keeping only first order derivatives, $dy=(f'(x)a(x,t)+1/2f''(x)b^2(x,t))dt + f'(x)b(x,t)dW(t)$ ...Eqn (10.5) Ito's formula.
In the deterministic part there is an extra $1/2f''(x)b(x,t)$ term, unlike usual chain rule. [But we used $E[dW(t)^2]=dt$, which is valid only on average; even under an integral this would not work?]
Let $P(x,t)$ be the probability that a random variable $X=x$ at time $t$. Assuming the random process is Markovian, and let $M(x',x,t)dt$ denote the rate at which the process at state $X=x'$ at time $t$ jumps to $X=x$ at time $t+dt$. [Shouldn't it be 'jumps in time interval $(t,t+dt)$'?]. Then, the master equation is $\partial P/{\partial t} = \int (M(x',x,t)P(x',t)-M(x,x',t)P(x,t))dx'$ ...Eqn(10.6).
Let $Q(b,a,t)=M(a,a+b,t)$. Rewriting Eqn (10.6), $\partial P/{\partial t} = \int (Q(x-x',x',t)P(x',t)-Q(x'-x,x,t)P(x,t))dx'$. Changing variables, $y=x-x'$, $\partial P/{\partial t} = \int (Q(y,x-y,t)P(x-y,t)-Q(-y,x,t)P(x,t))dy$. For the second term (integral), replace $y\to-y$, and then convert the $-dy$ to an interchange of limits and combine the integrals again to get $\partial P/{\partial t} = \int (Q(y,x-y,t)P(x-y,t)-Q(y,x,t)P(x,t))dy$.
Third-order Kramers-Moyal expansion is an approximation by expanding in y to second order:
$$\partial P/\partial t = \int {\left(-y\frac{\partial Q(y,x,t)P(x,t)}{\partial x} + \frac{y^2}{2}\frac{\partial^2 Q(y,x,t)P(x,t)}{\partial x^2}\right) dy}$$Define $\alpha_1(x,t)=\int dy y Q(y,x,t)$ and $\alpha_2(x,t)=\int dy y^2 Q(y,x,t)$. These are the mean and variance respectively of the jump size from $x$. Thus,
$$\partial P/\partial t = -\frac{\partial \alpha_1(x,t)P(x,t)}{\partial x} + \frac{1}{2}\frac{\partial^2 \alpha_2(x,t)P(x,t)}{\partial x^2}$$In a diffusive process, all odd moments ($y$,$y^3$,$y^5$,etc.) vanish, and all higher even moments turn out to be expressible in terms of the second moment. Thus, for a diffusive process, the Kramers-Moyal expansion is exact [But the factor $\alpha_2$ will be different from (multiple of?) just the second moment?].
For a scalar Langevin equation based on Eqn (10.1), the discretization is $x(t+h)=x(t)+a(x(t),t)h+b(x(t),t)\sqrt{h}\hat{N}(0,1)$. This can be seen as a jump process in steps of h, with mean jump size $hA(x,t)$ and variance $hB^2(x,t)$. Since the master equation has mean and variance as rate per unit time, we divide by $h$, and get (intuitive derivation) $\partial P/\partial t = -\frac{\partial a(x,t)P(x,t)}{\partial x} + \frac{1}{2}\frac{\partial^2 b^2(x,t)P(x,t)}{\partial x^2}$.
For an n-dim Langevin equation Eqn (10.1), $$ \partial P(X,t)/\partial t = -\sum_{i=1}^N\frac{\partial A_i(X,t)P(X,t)}{\partial x_i} + \sum_{i,j=1}^N \frac{1}{2}\frac{\partial^2 B(X,t)B^T(X,t)P(X,t)}{\partial x_i \partial x_j} $$
Scalar version. Stochastic differential equation: $dx = a(x,t)dt+b(x,t)dW$. Let $y=h(x)$, $h$ is twice differentiable. Using Ito's formula: $dy=(h'(x)a(x,t)+1/2h''(x)b^2(x,t))dt+h'(x)b(x,t)dW$. Taking expectations, $\frac{d}{dt}E[h(x)]=E[h'(x)a(x,t)]+E[h''(x)b^2(x,t)/2]$.
If $\rho(x,t)$ is the probability distribution for $x$, then using the definition of expectation, $\frac{d}{dt}\int h(x)\rho(x,t)dx = \int \left( h'(x)a(x,t) + h''(x)b^2(x,t)/2 \right)\rho(x,t)dx$. Integrating RHS by parts, $\frac{d}{dt}\int h(x)\rho(x,t)dx = \text{const1} - \int ( h(x)(a(x,t)\rho(x,t))' + h'(x)(b^2(x,t)\rho(x,t)/2)' ) dx = \text{const2} + \int ( -(a(x,t)\rho(x,t))' + (b^2(x,t)\rho(x,t)/2)'' ) dx$. Since this must hold for any $h(x)$, we get the Fokker-Planck equation: $\frac{\partial}{\partial t}\rho(x) = \frac{\partial}{\partial x}( -a(x,t)\rho(x,t) + \frac{\partial}{\partial x}(b^2(x,t)\rho(x,t)/2) )$ [By defining a probability current, one can write this as a conservation of probability].
$dx=a(x,t)dt+\sigma dW(t)$ ...Eqn (10.10). Integrating, $x(t) = x(0) + \int_0^t a(x,t)dt + \sigma\int_0^t dW(t) $. By assuming additive noise, we avoid issues about interpretation of the stochastic integral [Gardiner 2004]. The forward Fokker-Planck equation is $\partial P/\partial t = -\frac{\partial a(x,t)P(x,t)}{\partial x} + \frac{\sigma^2}{2}\frac{\partial^2 P(x,t)}{\partial x^2}$. Defining a probability current (dimensions of $dx/dt$) $J(x,t)=a(x,t)P(x,t)-\frac{\sigma^2}{2}\frac{\partial P}{\partial x}$ (has an active drift term and a diffusive term), we can re-write the F-P eqn as a conservation law of probability or a continuity equation: $\frac{\partial P}{\partial t} + \frac{\partial}{\partial x} J(x,t) = 0$ [Looks like one can also do this when $b(x,t)$ is not a constant]. But this assumes infinitesimal jumps, it has to be amended for finite jumps. Various boundary/initial conditions and finite jumps are discussed in the text.
Scalar with constant noise Eqn (10.10). What is the distribution of exit times from the domain [a,b] for the stochastic process? Suppose $x(0)=x$ and $p(x',t|x,0)$ is the probability that process is at $x'$ at time $t$. Then $G(x,t)\equiv \int_a^b dx'p(x',t|x,0)$ is the probability that the process is in the interval [a,b] at time $t$. This is related to spike crossing and inter-spike-intervals. [Gardiner 2004] showed that $G(x,t)$ satisfies the backward [?] F-P equation $\frac{\partial}{\partial t}G(x,t) = a(x,t)\frac{\partial}{\partial x}G(x,t) + \frac{\sigma^2}{2}\frac{\partial^2}{\partial x^2}G(x,t) $ ...Eqn (10.15).
If $\rho(x,t)dt$ is the probability that $x$ exits the domain in the interval $(t,t+dt)$, then $G(x,t)=\int_t^\infty \rho(x,t')dt'$ [The process can go out in $(0,t)$ and come back in the domain at time $t$. How is that taken into account?]. [The book requires that $a$ be independent of time. How is that needed here?]. Mean first passage time is $\left< T\right> (x)=\int_0^\infty t'\rho(x,t')dt'$. Higher-order moments: $\left< T^n \right> (x)=\int_0^\infty t'^n\rho(x,t')dt'$.
Exercise 6: Integrating by parts,
$\left< T\right> (x) = (t'^n\int_0^t \rho(x,t')dt')|_0^\infty - \int_0^\infty t'^{n-1}\int_0^{t'} \rho(x,t) dt dt' = (t'^n\int_0^t \rho(x,t')dt')|_0^\infty - \int_0^\infty t'^{n-1}(\int_0^\infty \rho(x,t) dt - \int_{t'}^\infty \rho(x,t) dt) dt'$.
$\left< T^n\right> (x) = (t'^n\int_0^t \rho(x,t')dt')|_0^\infty - \int_0^\infty t'^{n-1}\int_0^\infty t^{n-1}\rho(x,t) dt dt' + \int_0^\infty t'^{n-1}\int_{t'}^\infty \rho(x,t) dt dt'$.
After cancelling the first two 'same infinity' terms (!) and replacing the definition of $G(x,t)$,
$\left< T^n\right> (x) = \int_0^\infty t'^{n-1}G(x,t')dt'$.
Multiplying by $t^{n-1}$ and integrating Eqn (10.15), LHS by parts, $t^{n-1}G(x,t)|_0^\infty - \int_0^\infty t^{n-2}G(x,t)dt = \int_0^\infty a(x,t)t^{n-1}\frac{\partial}{\partial x}G(x,t)dt + \frac{\sigma^2}{2}\frac{\partial^2}{\partial x^2} \left< T^n\right> ''(x)$ . Now, $G(x,\infty)=0$.
If $a$ is independent of time (see 2 paras above also), then $-1=a(x)\left< T\right>'(x) + \frac{\sigma^2}{2}\left< T\right>''(x)$.
In [ ]: