integrate log cdf and pdf

We want to solve the following integral in closed form:

$$ I_3 = \int_{-\infty}^\infty \log(\sigma(x)) \frac{1}{s} \phi\left( \frac{x - \mu} {s} \right) \, dx $$

Substitute:

$$ x = \mu + st $$

Therefore:

$$ dx = s\,dt $$

and:

$$ t = \frac{x - \mu} {s} $$

For the limits, we have:

$$ x_1 = -\infty, x_2 = \infty $$

Then, in terms of $t$, and bearing in mind $s > 0$, we have:

$$ t_1 = -\infty, t_2 = \infty $$

Therefore:

$$ I_3 = \frac{1}{s} \int_{-\infty}^\infty \log(\sigma(\mu + st)) \phi(t) s \, dt $$
$$ =\int_{-\infty}^\infty \log(\sigma(\mu + st)) \phi(t) \, dt $$
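As a quick sanity check on the change of variables, here is a minimal sketch that evaluates both forms of $I_3$ numerically. The values of `mu` and `s`, the truncation at 12 standard deviations, and the helper names are all arbitrary choices for illustration, not part of the derivation.

```python
import math

def log_sigmoid(x):
    # Stable log(sigma(x)): avoids overflow in exp() for large |x|.
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))

def phi(t):
    # Standard normal density.
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def trapezoid(f, a, b, n=4000):
    # Composite trapezoid rule on [a, b] with n panels.
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

mu, s = 0.7, 1.5   # arbitrary test values (assumption), with s > 0
T = 12.0           # truncation point: phi(12) is negligible

# Original form, integrating over x.
i3_x = trapezoid(lambda x: log_sigmoid(x) * phi((x - mu) / s) / s,
                 mu - s * T, mu + s * T)
# Substituted form, integrating over t.
i3_t = trapezoid(lambda t: log_sigmoid(mu + s * t) * phi(t), -T, T)

print(i3_x, i3_t)
```

The two printed numbers should agree to many digits, since the substitution only relabels the integration variable.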

Let's try differentiating with respect to $\mu$, as per 'fast dropout training':

$$ \partial_\mu I_3 = \int_{-\infty}^\infty \frac{\frac{\partial}{\partial \mu}\sigma(\mu + st)} {\sigma(\mu + st)} \phi(t) \, dt $$

Looking at:

$$ E_2 = \frac{\partial}{\partial \mu} \sigma(\mu + st) $$

Define $\mu + st = g(\mu)$. Then:

$$ E_2 = \frac{\partial}{\partial \mu} \sigma(g(\mu)) $$

By the chain rule, with $\frac{d\sigma(g)}{dg} = \sigma(g)(1 - \sigma(g))$ and $\frac{\partial g}{\partial \mu} = 1$:

$$ E_2 = \frac{d \sigma(g)}{d g} \frac{\partial g}{\partial \mu} $$
$$ = \sigma(g)(1 - \sigma(g))(1) $$
$$ = \sigma(\mu + st)(1 - \sigma(\mu+st)) $$
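The expression for $E_2$ can be checked with a central finite difference. This is only a sketch: the point `(mu, s, t)` and the step `h` are arbitrary assumptions.

```python
import math

def sigmoid(x):
    # Stable logistic sigmoid.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

mu, s, t = 0.3, 2.0, -0.8   # arbitrary evaluation point (assumption)
h = 1e-5                    # finite-difference step (assumption)

# Central difference of sigma(mu + s t) with respect to mu.
numeric = (sigmoid(mu + h + s * t) - sigmoid(mu - h + s * t)) / (2.0 * h)
# Closed form from the chain rule: sigma(g) (1 - sigma(g)) with g = mu + s t.
g = mu + s * t
analytic = sigmoid(g) * (1.0 - sigmoid(g))

print(numeric, analytic)
```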

Therefore:

$$ \partial_\mu I_3 = \int_{-\infty}^\infty \frac{\sigma(\mu + st)(1 - \sigma(\mu+st))} {\sigma(\mu + st)} \phi(t) \, dt $$
$$ = \int_{-\infty}^\infty (1 - \sigma(\mu + st))\phi(t)\,dt $$

Note that by symmetry, $(1 - \sigma(x)) = \sigma(-x)$. So we can write:

$$ \partial_\mu I_3 = \int_{-\infty}^\infty \sigma( -\mu - st) \phi(t)\,dt $$
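To see that differentiating under the integral gives the right thing, the sketch below compares a finite difference of $I_3$ in $\mu$ against the integral of $\sigma(-\mu - st)\phi(t)$. The test values, step size and truncation at $\pm 12$ are assumptions made for illustration.

```python
import math

def sigmoid(x):
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def log_sigmoid(x):
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))

def phi(t):
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def trapezoid(f, a, b, n=4000):
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

def i3(mu, s):
    # I_3 in its substituted form, by quadrature.
    return trapezoid(lambda t: log_sigmoid(mu + s * t) * phi(t), -12.0, 12.0)

def di3_dmu(mu, s):
    # The derived expression: integral of sigma(-mu - s t) phi(t) dt.
    return trapezoid(lambda t: sigmoid(-mu - s * t) * phi(t), -12.0, 12.0)

mu, s, h = 0.4, 1.3, 1e-4   # arbitrary test values and step (assumption)
finite_diff = (i3(mu + h, s) - i3(mu - h, s)) / (2.0 * h)
derived = di3_dmu(mu, s)

print(finite_diff, derived)
```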

Per 'fast dropout training', we want to get the integral in the form:

$$ I_1 = \int_{-\infty}^\infty \sigma(x) \mathcal{N}(x \mid \mu, s^2)\,dx = \int_{-\infty}^\infty \sigma(x) \frac{1}{s} \phi\left(\frac{x - \mu}{s} \right)\,dx $$

Let's try substituting $x = \mu +st$ back again:

$$ \partial_\mu I_3 = \int_{-\infty}^\infty \sigma(-x) \phi\left( \frac{x - \mu}{s} \right) \frac{1}{s} \, dx $$
$$ = \frac{1}{s}\int_{-\infty}^\infty \sigma(-x)\phi\left( \frac{x - \mu}{s} \right) \, dx $$

Let's use the $\sigma(-x) = 1 - \sigma(x)$ identity again.

Therefore:

$$ \partial_\mu I_3 = \frac{1}{s}\int_{-\infty}^{\infty} \phi\left(\frac{x - \mu}{s}\right)\,dx - \frac{1}{s} \int_{-\infty}^\infty \sigma(x) \phi\left( \frac{x - \mu}{s} \right)\,dx $$
$$ = 1 - I_1 $$
$$ \approx 1 - \sigma\left( \frac{\mu} {\sqrt{1 + \pi s^2 / 8}} \right) $$
$$ = \sigma \left( \frac{-\mu}{\sqrt{1 + \pi s^2 / 8}} \right) $$
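The $\approx$ step is the only approximation in the chain, so it is worth seeing how tight it is. The sketch below compares $1 - I_1$, computed by quadrature, against $\sigma(-\mu/\sqrt{1 + \pi s^2/8})$ on a small, arbitrarily chosen grid of $(\mu, s)$ values. The grid and the helper names are assumptions.

```python
import math

def sigmoid(x):
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def phi(t):
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def trapezoid(f, a, b, n=4000):
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

def one_minus_i1(mu, s):
    # 1 - I_1, with I_1 = integral of sigma(mu + s t) phi(t) dt, by quadrature.
    return 1.0 - trapezoid(lambda t: sigmoid(mu + s * t) * phi(t), -12.0, 12.0)

def closed_form(mu, s):
    # The approximation sigma(-mu / sqrt(1 + pi s^2 / 8)).
    return sigmoid(-mu / math.sqrt(1.0 + math.pi * s * s / 8.0))

# Arbitrary grid of test points (assumption).
gaps = [abs(one_minus_i1(mu, s) - closed_form(mu, s))
        for mu in (-2.0, 0.0, 1.0, 3.0)
        for s in (0.5, 1.0, 2.0)]
max_gap = max(gaps)
print(max_gap)
```

On this grid the gap should be small but not zero, which is why everything downstream is an approximation rather than an identity.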

But this is the partial derivative of $I_3$ with respect to $\mu$. So, we need to integrate it back up with respect to $\mu$.

Looking at http://mathworld.wolfram.com/SigmoidFunction.html , the indefinite integral of $\sigma(x)$ is:

$$ \int \sigma(x)\,dx = \ln(1 + \exp(x)) + C $$

We want:

$$ I_3 \approx \int \sigma \left( \frac{-\mu} {\sqrt{1 + \pi s^2/8}} \right) \, d\mu + C $$

Let's substitute:

$$ z = -\frac{\mu}{\sqrt{1 + \pi s^2/8}} $$

Therefore:

$$ dz = - \frac{d\mu}{\sqrt{1 + \pi s^2/8}} $$

and:

$$ d\mu = - \sqrt{1 + \pi s^2/8}\,dz $$

Therefore:

$$ I_3 \approx - \int \sigma(z) \sqrt{1 + \pi s^2 / 8 }\, dz + C $$
$$ = - \sqrt{1 + \pi s^2/8} \int \sigma(z) \, dz + C $$
$$ = - \sqrt{1 + \pi s^2/ 8} \log( 1 + \exp(z)) + C $$

And we have $z = - \frac{\mu}{\sqrt{1 + \pi s^2/8}}$. So:

$$ I_3 \approx - \sqrt{1 + \pi s^2/8} \log\left( 1 + \exp\left( - \frac{\mu}{\sqrt{1 + \pi s^2/8}} \right) \right) + C $$

... which is looking pretty close to equation (8) in the 'fast dropout training' paper :)
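One way to double-check the integration step is to differentiate the result numerically and confirm it gives back $\sigma(-\mu/\sqrt{1 + \pi s^2/8})$. The sketch below does that with $C = 0$. The names `kappa`, `softplus` and `antiderivative` are hypothetical helpers, and the test point is arbitrary.

```python
import math

def sigmoid(x):
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def softplus(z):
    # Stable log(1 + exp(z)).
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))

def kappa(s):
    # Shorthand for sqrt(1 + pi s^2 / 8).
    return math.sqrt(1.0 + math.pi * s * s / 8.0)

def antiderivative(mu, s):
    # -kappa * log(1 + exp(-mu / kappa)), taking C = 0.
    k = kappa(s)
    return -k * softplus(-mu / k)

mu, s, h = -0.6, 1.7, 1e-5   # arbitrary test point and step (assumption)
finite_diff = (antiderivative(mu + h, s) - antiderivative(mu - h, s)) / (2.0 * h)
target = sigmoid(-mu / kappa(s))

print(finite_diff, target)
```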

We have:

$$ \log(AB) = \log(A) + \log(B) $$

Substitute $C = -B$, $A = 1$ (here $C$ is just a placeholder, not the integration constant). Then we have:

$$ \log(-C) = \log(1) + \log(-C) = \log(-C) $$

Hmmm :P

Let's try:

$$ \log(A/B) = \log(A) - \log(B) $$

Substitute $A = 1$:

$$ \log(1/B) = -\log(B) $$

Therefore:

$$ I_3 \approx \sqrt{1 + \pi s^2/8} \log\left( \frac{1} {1 + \exp\left(- \mu / \sqrt{1 + \pi s^2/8}\right)} \right) + C $$
$$ = \sqrt{1 + \pi s^2 / 8} \log \left( \sigma \left( \frac{\mu}{\sqrt{1 + \pi s^2/8}} \right) \right) + C $$
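The two forms are algebraically identical, but they behave differently in floating point. The sketch below, with $C = 0$ and hypothetical function names, checks that they agree at moderate $\mu$ and shows that the literal $\log(1 + \exp(\cdot))$ form overflows for very negative $\mu$, while a stable $\log\sigma$ does not.

```python
import math

def kappa(s):
    return math.sqrt(1.0 + math.pi * s * s / 8.0)

def log_sigmoid(x):
    # Stable log(sigma(x)).
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))

def form_log1pexp(mu, s):
    # Literal transcription: -kappa * log(1 + exp(-mu / kappa)).
    k = kappa(s)
    return -k * math.log(1.0 + math.exp(-mu / k))

def form_log_sigmoid(mu, s):
    # kappa * log(sigma(mu / kappa)), using the stable log-sigmoid.
    k = kappa(s)
    return k * log_sigmoid(mu / k)

# Moderate values (arbitrary, assumption): both forms agree.
pairs = [(form_log1pexp(mu, 1.0), form_log_sigmoid(mu, 1.0))
         for mu in (-3.0, 0.0, 2.5)]

# Very negative mu: exp() in the literal form overflows.
try:
    form_log1pexp(-2000.0, 1.0)
    literal_form_ok = True
except OverflowError:
    literal_form_ok = False

stable_value = form_log_sigmoid(-2000.0, 1.0)
print(pairs, literal_form_ok, stable_value)
```

If this expression is going into code, the $\log\sigma$ form with a stable implementation is probably the safer one to use.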

Let's try solving for $C$, and see what happens. We need to find at least one known value, to 'fix' or 'ground' the $C$ value.

Let's look at the original integral again, before we differentiated it. It was:

$$ I_3 = \frac{1}{s} \int_{-\infty}^\infty \log(\sigma(x)) \, \phi \left( \frac{x - \mu }{s} \right) \, dx $$

... and we want to find some values for $\mu$ and $s$ that will give us a known value.

Fix any $s > 0$. (Setting $s=1$ would make it vanish from the expressions, but $C$ is only constant with respect to $\mu$, so in principle it could depend on $s$.) Then $\mu$ only appears in the $\phi(\cdot)$ factor, where it sets the centre of the Gaussian weight. As $\mu \rightarrow -\infty$, the weight moves to where $\log(\sigma(x)) \approx x$, so $I_3 \rightarrow -\infty$, which doesn't pin anything down. Therefore, let's choose to tend $\mu \rightarrow \infty$. Looking at the terms in the integrand:

  • $\sigma(x) > 0$ for all $x \in (-\infty, \infty)$
  • and $\sigma(x) < 1$ for all $x \in (-\infty, \infty)$
  • therefore $\log(\sigma(x)) < 0$ everywhere, and $\log(\sigma(x)) \rightarrow 0$ as $x \rightarrow \infty$
  • as $\mu \rightarrow \infty$, the weight $\frac{1}{s}\phi\left(\frac{x - \mu}{s}\right)$, which always integrates to $1$, concentrates on large $x$, where $\log(\sigma(x))$ is close to $0$

Therefore:

$$ \lim_{\mu \rightarrow \infty} I_3 = 0 $$
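The limit can be made quantitative. Since $-\log(\sigma(x)) = \log(1 + e^{-x}) \le e^{-x}$, taking the Gaussian expectation of $e^{-(\mu + st)}$ gives $|I_3| \le \exp(-\mu + s^2/2)$, which goes to $0$ as $\mu \rightarrow \infty$ for any fixed $s$. The sketch below checks this bound numerically. The choice $s = 1$, the $\mu$ values and the quadrature settings are assumptions.

```python
import math

def log_sigmoid(x):
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))

def phi(t):
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def trapezoid(f, a, b, n=4000):
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

def i3(mu, s):
    return trapezoid(lambda t: log_sigmoid(mu + s * t) * phi(t), -12.0, 12.0)

s = 1.0                      # arbitrary fixed s > 0 (assumption)
mus = (2.0, 5.0, 10.0)       # increasing mu (assumption)
values = [i3(mu, s) for mu in mus]
bounds = [math.exp(-mu + 0.5 * s * s) for mu in mus]

for mu, v, b in zip(mus, values, bounds):
    print(mu, v, b)
```

Each value should be negative, smaller in magnitude than its bound, and shrinking towards $0$ as $\mu$ grows.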

Meanwhile, looking at our final expression for $I_3$, we have:

$$ I_3 \approx \sqrt{1 + \pi s^2/8} \log \left( \sigma \left( \frac{\mu} {\sqrt{1 + \pi s^2/8}} \right) \right) + C $$

As $\mu \rightarrow \infty$, $\sigma\left(\mu / \sqrt{1 + \pi s^2/8}\right) \rightarrow 1$, and thus its logarithm tends to $0$.

Therefore, $\lim_{\mu \rightarrow \infty} I_3 = C$

Therefore, by comparison with the earlier result for $I_3$, based on the original integral, $C=0$.

Therefore we have:

$$ I_3 \approx \sqrt{1 + \pi s^2/8} \log \left( \sigma \left( \frac{\mu}{\sqrt{1 + \pi s^2/8}} \right) \right) $$

which matches the 'fast dropout training' paper :)
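As a final check, the sketch below compares the closed-form approximation against direct numerical integration of $I_3$ on a small grid. The grid, the truncation at $\pm 12$ and the function names are assumptions. The agreement is approximate, not exact, and in this range it tends to loosen as $s$ grows.

```python
import math

def log_sigmoid(x):
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))

def phi(t):
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def trapezoid(f, a, b, n=4000):
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

def i3_numeric(mu, s):
    # Direct quadrature of the integral of log(sigma(mu + s t)) phi(t) dt.
    return trapezoid(lambda t: log_sigmoid(mu + s * t) * phi(t), -12.0, 12.0)

def i3_approx(mu, s):
    # Closed form: kappa * log(sigma(mu / kappa)), kappa = sqrt(1 + pi s^2 / 8).
    k = math.sqrt(1.0 + math.pi * s * s / 8.0)
    return k * log_sigmoid(mu / k)

# Arbitrary grid of test points (assumption).
errors = {}
for mu in (-2.0, 0.0, 2.0):
    for s in (0.5, 1.0, 2.0):
        errors[(mu, s)] = abs(i3_numeric(mu, s) - i3_approx(mu, s))

max_error = max(errors.values())
print(max_error)
```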