This notebook shows how to choose intercepts so that the proportion of the represented space each neuron fires for is uniformly distributed, and shows that this improves decoder accuracy.
Distributing intercepts uniformly between -1 and 1 is clearly the right thing to do in one dimension. But does it also make sense in higher dimensions?
The intercept effectively defines what proportion of the represented space a neuron fires for. In one dimension, uniformly distributed intercepts give a uniform distribution over represented area: a neuron with an intercept of 0 is active for 50% of the space, and a neuron with an intercept of 0.5 is active for 25% of the space. But in two dimensions, a neuron with an intercept of 0.5 is active for less than 25% of the space, and this effect gets worse as the dimensionality grows. Indeed, in 32 dimensions we may end up with very large numbers of neurons that are either always on or always off, both of which are fairly useless.
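As a quick sketch (a small addition here, using only numpy), we can confirm the two-dimensional claim by Monte Carlo: sample points uniformly from the unit disk and count the fraction on the firing side of an intercept of 0.5.

import numpy as np
rng = np.random.RandomState(0)
# rejection-sample points uniformly from the unit disk
pts = rng.uniform(-1, 1, size=(100000, 2))
pts = pts[np.linalg.norm(pts, axis=1) <= 1.0]
# a neuron with encoder e fires when dot(e, x) exceeds its intercept
e = np.array([1.0, 0.0])
print(np.mean(pts.dot(e) > 0.5))  # about 0.195, noticeably less than 0.25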
In [1]:
%matplotlib inline
import pylab
import numpy as np
import nengo
import nengo.utils.ensemble  # for the tuning_curves helper used below
import seaborn
import pytry
import pandas
We start by empirically computing this distribution, just by generating tuning curves and counting the proportion of evaluation points for which each neuron's activity is non-zero.
In [18]:
def plot_intercept_distribution(ens):
    pylab.subplot(1, 2, 1)
    intercepts = ens.intercepts.sample(ens.n_neurons)
    seaborn.distplot(intercepts, bins=20)
    pylab.xlabel('intercept')
    pylab.subplot(1, 2, 2)
    pts = ens.eval_points.sample(n=1000, d=ens.dimensions)
    model = nengo.Network()
    model.ensembles.append(ens)
    sim = nengo.Simulator(model)
    _, activity = nengo.utils.ensemble.tuning_curves(ens, sim, inputs=pts)
    p = np.mean(activity > 0, axis=0)
    seaborn.distplot(p, bins=20)
    pylab.xlabel('proportion of pts neuron is active for')
In [3]:
for D in [1, 2, 4, 8, 16, 32]:
    ens = nengo.Ensemble(n_neurons=10000, dimensions=D, add_to_container=False)
    pylab.figure(figsize=(14,4))
    plot_intercept_distribution(ens)
    pylab.title('Dimensions = %d' % D)
Those peaks at 0% and 100% are very worrying! It looks like a lot of these neurons aren't really helping much.
To understand this a bit better, let's see if we can directly compute this mapping from intercept to proportion active. To do this, we need to compute the volume of a hyperspherical cap (https://en.wikipedia.org/wiki/Spherical_cap#Hyperspherical_cap).
The formula for the volume in general is $V = {1 \over 2} C_d r^d I_{2rh-h^2 \over r^2}({d+1 \over 2}, {1 \over 2})$ where $C_d$ is the volume of a unit hyperball of dimension $d$ and $I_x(a,b)$ is the regularized incomplete beta function.
In our case, if $x$ is the intercept, then $r=1$ and $h=1-x$, so the argument of the beta function simplifies: $2rh-h^2 = 2(1-x)-(1-x)^2 = 1-x^2$. We want the proportion rather than the volume, so we divide by $C_d$, leaving:
$p={1 \over 2} I_{1-x^2}({{d+1} \over 2}, {1 \over 2})$
Of course, this formula only works for $x \ge 0$. For $x < 0$ we can compute the value for $-x$ and then subtract the result from 1 at the end.
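As a quick check, in one dimension the formula reduces to the uniform case: $I_{1-x^2}(1, {1 \over 2}) = 1 - |x|$, so $p = (1-x)/2$, which is exactly the proportion of $[-1, 1]$ above the intercept. A minimal numeric sketch of this (assuming scipy is installed):

import scipy.special
x = 0.5
p = 0.5 * scipy.special.betainc((1 + 1) / 2.0, 0.5, 1 - x**2)
print(p)  # 0.25: an intercept of 0.5 in 1-D is active for 25% of the space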
In [49]:
import scipy.special

def analytic_proportion(x, d):
    # proportion of the d-dimensional unit ball on the firing side of
    # the hyperplane at intercept x (i.e. the hyperspherical cap)
    flip = False
    if x < 0:
        x = -x
        flip = True
    if x >= 1.0:
        value = 0
    else:
        value = 0.5 * scipy.special.betainc((d+1)/2.0, 0.5, 1 - x**2)
    if flip:
        value = 1.0 - value
    return value

print(analytic_proportion(0.5, 2))
This indicates that a neuron with an intercept of 0.5 in two dimensions fires for only 19.5% of the represented area.
Let's compare this analytic proportion to the empirically estimated proportion, just to check if it works.
In [57]:
def plot_intercept_distribution(ens):
    pylab.subplot(1, 3, 1)
    intercepts = ens.intercepts
    if isinstance(intercepts, nengo.dists.Distribution):
        intercepts = intercepts.sample(ens.n_neurons)
    seaborn.distplot(intercepts, bins=np.linspace(-1.2, 1.2, 25))
    pylab.xlabel('intercept')
    pylab.xlim(-1.5, 1.5)
    pylab.subplot(1, 3, 2)
    pts = ens.eval_points.sample(n=1000, d=ens.dimensions)
    model = nengo.Network()
    model.ensembles.append(ens)
    sim = nengo.Simulator(model)
    _, activity = nengo.utils.ensemble.tuning_curves(ens, sim, inputs=pts)
    p = np.mean(activity > 0, axis=0)
    seaborn.distplot(p, bins=20)
    pylab.xlabel('proportion of pts neuron is active for\n(sampled)')
    p2 = [analytic_proportion(x, ens.dimensions) for x in intercepts]
    pylab.subplot(1, 3, 3)
    seaborn.distplot(p2, bins=20)
    pylab.xlabel('proportion of pts neuron is active for\n(analytic)')

ens = nengo.Ensemble(n_neurons=10000, dimensions=16, add_to_container=False)
pylab.figure(figsize=(14,3))
plot_intercept_distribution(ens)
The sampled and analytic distributions look very similar! This suggests the analytic approach is working well.
Now, we need to reverse this effect. What we want is a function we can give a proportion $p$, and it will find the intercept $x$ that gives that proportion. Fortunately, scipy has an inverse for the regularized incomplete beta function (scipy.special.betaincinv), so a bit of algebra on $p={1 \over 2} I_{1-x^2}({{d+1} \over 2}, {1 \over 2})$ (set $I_{1-x^2} = 2p$ and invert) gives us
$x = \sqrt{1-I^{-1}_{2p}({{d+1} \over 2}, {1 \over 2})}$
This holds for $p \le 0.5$; for $p > 0.5$ we use symmetry, computing the intercept for $1-p$ and negating it.
In [58]:
def find_x_for_p(p, d):
    # invert analytic_proportion: find the intercept that makes a neuron
    # active for proportion p of the d-dimensional unit ball
    sign = 1
    if p > 0.5:
        p = 1.0 - p
        sign = -1
    return sign * np.sqrt(1 - scipy.special.betaincinv((d+1)/2.0, 0.5, 2*p))

print(find_x_for_p(0.7, 2))
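As a quick sanity check (a small addition here), applying analytic_proportion to the result should recover the proportion we asked for:

# round-trip: analytic_proportion(find_x_for_p(p, d), d) should recover p
for target in [0.1, 0.3, 0.5, 0.7, 0.9]:
    x = find_x_for_p(target, 16)
    print(target, analytic_proportion(x, 16))  # the two values should match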
Let's see what happens when we apply this transformation to the intercepts.
In [59]:
ens = nengo.Ensemble(n_neurons=10000, dimensions=16, add_to_container=False)
intercepts = ens.intercepts.sample(n=ens.n_neurons, d=1)[:,0]
intercepts2 = [find_x_for_p(x_int/2+0.5, ens.dimensions) for x_int in intercepts]
ens.intercepts = intercepts2
pylab.figure(figsize=(14,8))
pylab.subplot(2, 2, 1)
seaborn.distplot(intercepts)
pylab.xlabel('original intercepts')
pylab.subplot(2, 2, 2)
seaborn.distplot([analytic_proportion(x, ens.dimensions) for x in intercepts])
pylab.xlabel('proportion for original intercepts')
pylab.subplot(2, 2, 3)
seaborn.distplot(intercepts2)
pylab.xlabel('new intercepts')
pylab.subplot(2, 2, 4)
seaborn.distplot([analytic_proportion(x, ens.dimensions) for x in intercepts2])
pylab.xlabel('proportion for new intercepts')
The new intercept distribution does what we want! It gives a uniform distribution over the proportion of the represented area for which each neuron is active.
Let's confirm this in multiple dimensions.
In [60]:
for D in [1, 2, 4, 8, 16, 32]:
    ens = nengo.Ensemble(n_neurons=10000, dimensions=D, add_to_container=False)
    intercepts = ens.intercepts.sample(n=ens.n_neurons, d=1)[:,0]
    intercepts2 = [find_x_for_p(x_int/2+0.5, ens.dimensions) for x_int in intercepts]
    ens.intercepts = intercepts2
    pylab.figure(figsize=(14,4))
    plot_intercept_distribution(ens)
    pylab.title('Dimensions = %d' % D)
Okay, this formula works! Now, does it improve decoder accuracy?
To help check this, let's start by making a nengo.dists.Distribution that applies this transformation for us. It takes any base intercept distribution and transforms its samples as above. Note that we use the substitution $p = {{x + 1} \over 2}$ to turn a sampled intercept into a desired proportion, which turns the $2p$ in the formula into $x+1$.
In [70]:
class AreaIntercepts(nengo.dists.Distribution):
    # intercept distribution that is uniform over represented area:
    # samples from the base distribution are mapped through the inverse
    # hyperspherical-cap formula derived above
    dimensions = nengo.params.NumberParam('dimensions')
    base = nengo.dists.DistributionParam('base')

    def __init__(self, dimensions, base=nengo.dists.Uniform(-1, 1)):
        super(AreaIntercepts, self).__init__()
        self.dimensions = dimensions
        self.base = base

    def __repr__(self):
        return "AreaIntercepts(dimensions=%r, base=%r)" % (self.dimensions, self.base)

    def transform(self, x):
        # as in find_x_for_p, but with p = (x + 1) / 2, so 2p becomes x + 1
        sign = 1
        if x > 0:
            x = -x
            sign = -1
        return sign * np.sqrt(1 - scipy.special.betaincinv(
            (self.dimensions + 1) / 2.0, 0.5, x + 1))

    def sample(self, n, d=None, rng=np.random):
        s = self.base.sample(n=n, d=d, rng=rng)
        for i in range(len(s)):
            s[i] = self.transform(s[i])
        return s
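A minimal usage sketch (the ensemble sizes here are arbitrary example values): pass the distribution as the intercepts argument when building an ensemble.

with nengo.Network() as net:
    ens = nengo.Ensemble(n_neurons=800, dimensions=16,
                         intercepts=AreaIntercepts(dimensions=16))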
Now let's define a pytry.Trial to let us explore the accuracy of the system. As we vary the number of dimensions, we will see how well we can decode a constant function ($1$), a linear function ($x_i$), a squared value ($x_i^2$), and the quadratic cross terms ($x_i x_j$ for $i \ne j$).
In [73]:
class AreaInterceptTrial(pytry.Trial):
    def params(self):
        self.param('number of neurons per dimension', n_per_d=50)
        self.param('number of dimensions', d=1)
        self.param('use AreaIntercepts distribution', use_area=False)

    def evaluate(self, p):
        model = nengo.Network(seed=p.seed)
        intercepts = nengo.dists.Uniform(-1, 1)
        if p.use_area:
            intercepts = AreaIntercepts(p.d, base=intercepts)
        n_neurons = p.n_per_d * p.d
        with model:
            ens = nengo.Ensemble(n_neurons, p.d, intercepts=intercepts)

            def func_constant(x):
                return 1
            constant = nengo.Node(size_in=1)
            c_constant = nengo.Connection(ens, constant, function=func_constant)

            def func_linear(x):
                return x
            linear = nengo.Node(size_in=p.d)
            c_linear = nengo.Connection(ens, linear, function=func_linear)

            def func_square(x):
                return x**2
            square = nengo.Node(size_in=p.d)
            c_square = nengo.Connection(ens, square, function=func_square)

            def func_quad(x):
                # all pairwise cross terms x_i * x_j with i < j
                r = []
                for i, xx in enumerate(x):
                    for j, yy in enumerate(x[i+1:]):
                        r.append(xx * yy)
                return r
            count = len(func_quad(np.zeros(p.d)))
            if count > 0:
                quad = nengo.Node(size_in=count)
                c_quad = nengo.Connection(ens, quad, function=func_quad)
            else:
                c_quad = None
        # building the simulator solves for the decoders; report their RMS errors
        sim = nengo.Simulator(model)
        return dict(
            constant=np.mean(sim.data[c_constant].solver_info['rmses']),
            linear=np.mean(sim.data[c_linear].solver_info['rmses']),
            square=np.mean(sim.data[c_square].solver_info['rmses']),
            quad=np.mean(sim.data[c_quad].solver_info['rmses']) if c_quad is not None else 0,
        )
Now let's check how it performs.
In [75]:
seed = 1
print(AreaInterceptTrial().run(d=16, verbose=False, seed=seed))
print(AreaInterceptTrial().run(d=16, verbose=False, use_area=True, seed=seed))
This looks like an improvement in all cases except for the constant! Let's see if this holds up for other dimensions (and averaging over many runs with different seeds).
In [78]:
# be sure to change this range each time you run it, or it will
# re-generate the same data as before (the data is stored in the
# directory "decode2")
for seed in range(30, 40):
    print('seed', seed)
    for d in [1, 2, 4, 8, 16, 32]:
        AreaInterceptTrial().run(d=d, verbose=False, seed=seed, data_dir='decode2')
        AreaInterceptTrial().run(d=d, verbose=False, seed=seed, data_dir='decode2', use_area=True)
Now let's load up this data and plot it.
In [6]:
df = pandas.DataFrame(pytry.read('decode2'))
pylab.figure()
seaborn.barplot('d', 'constant', hue='use_area', data=df)
pylab.ylim(0, 0.06)
pylab.title('$y=1$', fontsize=20)
pylab.figure()
seaborn.barplot('d', 'linear', hue='use_area', data=df)
pylab.ylim(0, 0.06)
pylab.title('$y=x_i$', fontsize=20)
pylab.figure()
seaborn.barplot('d', 'square', hue='use_area', data=df)
pylab.ylim(0, 0.06)
pylab.title('$y=x_i^2$', fontsize=20)
pylab.figure()
seaborn.barplot('d', 'quad', hue='use_area', data=df)
pylab.ylim(0, 0.06)
pylab.title('$y=x_i x_j$', fontsize=20)
pylab.show()
The results are consistent: this intercept distribution usually improves decoding! It makes the constant function a little worse, and the linear function in 2 and 4 dimensions also gets slightly worse, but at higher dimensions it gives a sizeable improvement.