SYDE 556/750: Simulating Neurobiological Systems

Accompanying Readings: Chapter 6

Terry Stewart

Decoder Analysis

  • What functions $f(x)$ can we accurately compute from a group of neurons?
  • Examples:

$f(x)=x$


In [6]:
import numpy
import syde556
from pylab import *   # for plot, subplot, sqrt, mean, show (provided by %pylab inline in the notebook)

# Decode the identity function f(x)=x from 50 rate neurons
a = syde556.Ensemble(neurons=50, dimensions=1)

x = numpy.linspace(-1, 1, 100)
ideal = x
d = a.compute_decoder(x, ideal, noise=0.2)
A, xhat = a.simulate_rate(x, d)
print 'RMSE:', sqrt(mean((ideal-xhat)**2))
subplot(2,1,1)
plot(x, ideal, color='k')
plot(x, xhat, color='b')
subplot(2,1,2)
plot(x, A)
show()


RMSE: 0.0141812072763
  • What about $f(x)=10x$, or even larger scalings? (See the sketch after the list of other exponents below.)

$f(x)=x^2$


In [19]:
import syde556
a = syde556.Ensemble(neurons=50, dimensions=1)

x = numpy.linspace(-1, 1, 100)
ideal = x**2
d = a.compute_decoder(x, ideal, noise=0.2)
A, xhat = a.simulate_rate(x, d)
print 'RMSE:', sqrt(mean((ideal-xhat)**2))
subplot(2,1,1)
plot(x, ideal, color='k')
plot(x, xhat, color='b')
subplot(2,1,2)
plot(x, A)
show()


RMSE: 0.0408841434153
  • Other exponents?

    • $f(x)=x^3$
    • $f(x)=x^4$
    • $f(x)=x^5$
    • $f(x)=x^0$
  • Polynomials?

    • If it's fairly good at $f(x)=2x$ and fairly good at $f(x)=x^2$, what happens with $f(x)=x^2+2x$? (See the sketch after this list.)
  • Other functions?
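
Before moving on to non-polynomial functions, here is a quick check of the two questions above (a sketch reusing the same Ensemble interface as the cells above). Since the decoders are linear in the target function ($\Upsilon = A^T f(x)$), the decoders for $f(x)=10x$ are just 10 times those for $x$, and the decoders for $x^2+2x$ are the sum of the decoders for $x^2$ and $2x$:

import numpy
import syde556

a = syde556.Ensemble(neurons=50, dimensions=1)
x = numpy.linspace(-1, 1, 100)

# Try a scaled function and a polynomial built from pieces we can already decode
for name, ideal in [('10x', 10*x), ('x^2+2x', x**2 + 2*x)]:
    d = a.compute_decoder(x, ideal, noise=0.2)
    A, xhat = a.simulate_rate(x, d)
    print name, 'RMSE:', numpy.sqrt(numpy.mean((ideal-xhat)**2))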

$f(x)=e^{- (x-b)^2/(2c^2)}$


In [39]:
import syde556
a = syde556.Ensemble(neurons=100, dimensions=1)

x = numpy.linspace(-1, 1, 100)

c = 0.5
b = 0
ideal = exp(-(x-b)**2/(2*c**2))
d = a.compute_decoder(x, ideal, noise=0.2)
A, xhat = a.simulate_rate(x, d)
print 'RMSE:', sqrt(mean((ideal-xhat)**2))
subplot(2,1,1)
plot(x, ideal, color='k')
plot(x, xhat, color='b')
subplot(2,1,2)
plot(x, A)
show()


RMSE: 0.0292945716158
  • What about different values of $c$?
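
A quick sweep over widths (a sketch using the same interface as the cell above; the particular $c$ values are arbitrary): narrower bumps need higher-order components, so we should expect the error to grow as $c$ shrinks.

import numpy
import syde556

a = syde556.Ensemble(neurons=100, dimensions=1)
x = numpy.linspace(-1, 1, 100)
b = 0

# Decode Gaussian bumps of decreasing width (arbitrary choices of c)
for c in [1.0, 0.5, 0.2, 0.1]:
    ideal = numpy.exp(-(x-b)**2/(2*c**2))
    d = a.compute_decoder(x, ideal, noise=0.2)
    A, xhat = a.simulate_rate(x, d)
    print 'c =', c, 'RMSE:', numpy.sqrt(numpy.mean((ideal-xhat)**2))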

The Decoder Calculation

  • What are decoders good at?
  • How do we compute decoders?
    • Here's the simple noise-free case:
    • $\Gamma=A^T A$ and $\Upsilon=A^T f(x)$
    • $ d = \Gamma^{-1} \Upsilon $
  • Consider the $\Gamma$ matrix
    • This is known as a Gram matrix
    • Measure of similarity of neural responses (correlation matrix)
    • Tells us about the representation of the vector space by the neurons
  • Stepping through the decoder calculation:
    • ideally, $X = A d$
    • $A^T X = A^T A d$
    • $(A^T A)^{-1} A^T X = d$
  • In this ideal case, how do we take the inverse correctly?
    • This is likely to be a singular matrix (lots of neurons have similar tuning curves)
    • Less of an issue in practice, since we take noise into account by adding a term onto the diagonal
    • Instead, let's do SVD to decompose and approximately invert the matrix
  • Singular Value Decomposition
    • In general, $B_{M,N}=U_{M,M} S_{M,N} V^T_{N,N}$ where $S$ is a diagonal matrix of singular values
    • But here, $\Gamma$ is square and symmetric, so $\Gamma=U S U^T$ and the inverse of $\Gamma$ can be written as $U S^{-1} U^T$
    • When $\Gamma$ is singular (or nearly so), some elements of the diagonal of $S$ are zero (or very small), so the inverse has values that are infinite (or very large)
    • The "pseudo-inverse" sets $S^{-1}_i=0$ whenever $S_i=0$ (or is below some small threshold); a numpy sketch of this calculation follows this list

Usefulness of SVD

  • Regardless of the function we want to decode, we always invert the same $\Gamma$ matrix
    • identity decoder is just the special case $f(x)=x$
    • Understanding properties of $\Gamma$ can provide insight into all possible decodings from the population's activity $A$
  • The singular values $S$ tell us the importance of a particular $U$ vector, including
    • the error (relative to the variance of the population firing) if that $U$ vector were left out of the mapping
    • the amount of independent information about the population firing that could be extracted if we looked only at the data projected onto that $U$ vector
    • In general, we can think of the magnitude of the singular value as telling us how much that $U$ vector contributes to the matrix we have decomposed
    • Since that matrix is essentially the correlation matrix of the neuron tuning curves, the large singular values are the most important for accounting for the structure of those correlations
  • The vectors in $U$ are orthogonal
    • they are an ordered, orthogonal basis
    • The original $\Gamma$ matrix was generated by a non-ordered, non-orthogonal basis: the neuron tuning curves
  • Consider the neural activity for a particular $x$ value
    • A row of the $A$ matrix
    • The "activity space" of the neurons
    • Because the neural responses are all driven (non-independently) by the same variable $x$, only a small subspace of the possible neural activity values is ever reached by the population
    • the $\Gamma$ matrix, because it tells us the correlations between all neurons in the population, provides the information needed to determine that subspace
    • That subspace is $U$
  • So what?
    • So, let's rotate our neural activity curves $A$ into this subspace and see what they look like.
    • $\chi = A U$
    • These will be the orthogonal functions we can decode, ordered by accuracy

In [95]:
import syde556
a = syde556.Ensemble(neurons=300, dimensions=1, seed=0)

x = numpy.linspace(-1, 1, 100)
ideal = x
d = a.compute_decoder(x, ideal, noise=0.2)
A, xhat = a.simulate_rate(x, d)

Gamma = numpy.dot(A.T, A)        # Gram (correlation) matrix of the tuning curves
U,S,V = numpy.linalg.svd(Gamma)

chi = numpy.dot(A, U)            # rotate the tuning curves into the U basis

for i in range(5):
    plot(x, chi[:,i], label='$\chi_%d$=%1.3g'%(i, S[i]), linewidth=3)
legend(loc='best', bbox_to_anchor=(1,1))    
show()


  • Does anybody recognize those functions?

In [97]:
subplot(1,2,1)
for i in range(5):
    plot(x, chi[:,i], linewidth=3)

subplot(1,2,2)
for i in range(5):
    plot(x, numpy.polynomial.legendre.legval(x, numpy.eye(5)[i]))
show()


  • The Legendre polynomials
    • Or at least scaled versions thereof
  • So we should be able to approximate any function that can be approximated by polynomials
  • But remember that they are ordered by their singular values
  • See how quickly the singular values drop off:

In [92]:
plot(S)
show()


  • If you need higher and higher polynomials, you need many more neurons
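
A rough way to check this (a sketch reusing the same Ensemble interface as the cells above; the neuron counts and exponents are arbitrary choices): measure the RMSE for $f(x)=x^p$ as the exponent and the number of neurons are varied.

import numpy
import syde556

x = numpy.linspace(-1, 1, 100)
for neurons in [50, 200, 800]:               # arbitrary neuron counts
    a = syde556.Ensemble(neurons=neurons, dimensions=1, seed=0)
    for power in [1, 2, 3, 4, 5]:
        ideal = x**power
        d = a.compute_decoder(x, ideal, noise=0.2)
        A, xhat = a.simulate_rate(x, d)
        print neurons, power, numpy.sqrt(numpy.mean((ideal-xhat)**2))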

Modifying that basis

  • Are we stuck with this?
  • What can we do to modify these basis functions?

In [40]:
import syde556
a = syde556.Ensemble(neurons=100, dimensions=1, seed=0, 
                     intercept=(-0.3, 0.3))

x = numpy.linspace(-1, 1, 100)
ideal = x
d = a.compute_decoder(x, ideal, noise=0.2)
A, xhat = a.simulate_rate(x, d)

Gamma = numpy.dot(A.T, A)
U,S,V = numpy.linalg.svd(Gamma)

chi = numpy.dot(A, U)

figure()
plot(x, A)

figure()
for i in range(6):
    plot(x, chi[:,i], label='$\chi_%d$=%1.3g'%(i, S[i]), linewidth=3)
legend(loc='best', bbox_to_anchor=(1,1))    
show()


  • What about other distributions of intercepts?
  • Changing these distributions also affects how quickly accuracy improves as more neurons are added (see the sketch below)

  • Note that the accuracy curves are for $f(x)=x$
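
One way to probe this (a sketch; the narrow intercept range matches the cell above, while the wide range is an assumed stand-in for the default distribution): measure the RMSE for $f(x)=x$ as the intercept range and the number of neurons are varied.

import numpy
import syde556

x = numpy.linspace(-1, 1, 100)
for neurons in [50, 200, 800]:                       # arbitrary neuron counts
    for intercept in [(-1, 1), (-0.3, 0.3)]:         # (-1, 1) is an assumed stand-in for the default range
        a = syde556.Ensemble(neurons=neurons, dimensions=1, seed=0,
                             intercept=intercept)
        d = a.compute_decoder(x, x, noise=0.2)
        A, xhat = a.simulate_rate(x, d)
        print neurons, intercept, numpy.sqrt(numpy.mean((x-xhat)**2))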

Higher dimensions

  • Does this apply to vectors as well as scalars?
    • Of course
    • Same math, even
    • Just a little harder to plot

In [36]:
import syde556
a = syde556.Ensemble(neurons=300, dimensions=2, seed=0)

x0 = numpy.linspace(-1,1,50)
x1 = numpy.linspace(-1,1,50)

x0, x1 = numpy.array(meshgrid(x0,x1))
x = numpy.array([numpy.ravel(x0), numpy.ravel(x1)])

d = a.compute_decoder(x, x, noise=0.2)
A, xhat = a.simulate_rate(x, d)

Gamma = numpy.dot(A.T, A)
U,S,V = numpy.linalg.svd(Gamma)

chi = numpy.dot(A, U)

index = 0
basis = chi[:,index]       # pick one basis function...
basis.shape = 50,50        # ...and reshape it back onto the 2D grid

from mpl_toolkits.mplot3d.axes3d import Axes3D
fig = pylab.figure()
ax = fig.add_subplot(1, 1, 1, projection='3d')
p = ax.plot_surface(x0, x1, basis, linewidth=0, cstride=1, rstride=1, cmap=pylab.cm.jet)
pylab.title('Basis #%d (sv=%1.4g)'%(index, S[index]))
show()


  • Again, we get the polynomials
  • In general, we get cross-terms in the computable functions
  • For the 2D case:
    • $f(x)=a_0 + a_1 x_0 + a_2 x_1 + a_3 x_0^2 + a_4 x_1^2 + a_5 x_0 x_1 + ...$

  • In the singular-value plot, the grey line stays flat across each group of terms that should be decodable at about the same accuracy level
  • In higher dimensions, higher order functions are even harder
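
One way to see this (a sketch reusing the same Ensemble interface as the cells above; the sample grids and neuron count are arbitrary): compare how quickly the normalized singular values fall off for a 1D versus a 2D ensemble with the same number of neurons.

import numpy
import syde556

# 1D ensemble
x1 = numpy.linspace(-1, 1, 100)
a1 = syde556.Ensemble(neurons=300, dimensions=1, seed=0)
A1, _ = a1.simulate_rate(x1, a1.compute_decoder(x1, x1, noise=0.2))
S1 = numpy.linalg.svd(numpy.dot(A1.T, A1), compute_uv=False)

# 2D ensemble over a grid of sample points
g0, g1 = numpy.meshgrid(numpy.linspace(-1, 1, 50), numpy.linspace(-1, 1, 50))
x2 = numpy.array([g0.ravel(), g1.ravel()])
a2 = syde556.Ensemble(neurons=300, dimensions=2, seed=0)
A2, _ = a2.simulate_rate(x2, a2.compute_decoder(x2, x2, noise=0.2))
S2 = numpy.linalg.svd(numpy.dot(A2.T, A2), compute_uv=False)

semilogy(S1/S1[0], label='1D')
semilogy(S2/S2[0], label='2D')
legend()
show()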

  • This is all also going to be affected by distributions of encoders (as well as distributions of intercepts and maximum rates)

  • Given a function, can we automatically find good choices for tuning curves?
    • In general, this is a difficult problem
    • Indeed, we can cast this as exactly the standard multi-layer neural network learning problem
    • But analyzing it this way gives us hints
  • General techniques
    • If your function has a big zero area, see if you can get the tuning curves to also be zero in that area
    • Smooth, broad functions are best
    • For the special case of $f(x)=x_0 x_1$, arrange your encoders at $[1,1], [1,-1], [-1,-1], [-1, 1]$
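
As a rough illustration of that last point, here is a numpy-only sketch (rectified-linear tuning curves as stand-ins for the course's neuron model, and a simple ridge term standing in for the noise regularization) that places 100 encoders along those four diagonal directions and decodes $f(x)=x_0 x_1$. With encoders at the corners, the product reduces to a difference of two one-dimensional quadratics, $x_0 x_1 = \frac{1}{4}\left((x_0+x_1)^2 - (x_0-x_1)^2\right)$, which these neurons can approximate well.

import numpy

# Hypothetical stand-in model (not the course's syde556 code): rectified-linear
# "neurons" with encoders placed only along the four diagonal directions
rng = numpy.random.RandomState(0)
encoders = numpy.array([[1, 1], [1, -1], [-1, -1], [-1, 1]]) / numpy.sqrt(2.0)
encoders = numpy.repeat(encoders, 25, axis=0)              # 100 neurons, 25 per direction
gains = rng.uniform(50, 100, size=100)
biases = rng.uniform(-50, 50, size=100)

g0, g1 = numpy.meshgrid(numpy.linspace(-1, 1, 30), numpy.linspace(-1, 1, 30))
X = numpy.vstack([g0.ravel(), g1.ravel()]).T               # sample points, shape (900, 2)

A = numpy.maximum(gains * numpy.dot(X, encoders.T) + biases, 0)   # tuning curves, shape (900, 100)
target = g0.ravel() * g1.ravel()                           # f(x) = x0 * x1

# Regularized least squares (the diagonal term stands in for the noise term)
Gamma = numpy.dot(A.T, A) + len(X) * (0.2 * A.max())**2 * numpy.eye(100)
Upsilon = numpy.dot(A.T, target)
d = numpy.linalg.solve(Gamma, Upsilon)

xhat = numpy.dot(A, d)
print 'RMSE:', numpy.sqrt(numpy.mean((target - xhat)**2))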
