This Jupyter notebook contains code for learning a continuous attractor grid cell network for navigation in a 1D environment, based on a paper by Widloski & Fiete. The experiments in this notebook can all be freely modified (play with the parameters!). If you want to go even deeper, the Python code implementing the network can be obtained from the Htmresearch repository.

- Introduction & Background
- Learning
- Weight Patterns

- Simulation
- Assessment
- Conclusion

Grid cells, located in the entorhinal cortex, are hypothesized to form the basis of the brain's system for navigation (If you're not familiar with grid cells at all, try reading this article by the New York Times). Although grid cells have been an extremely active area of research for over a decade, precisely how function and learn remains something of an open question. Today, one of the best-known models for grid cells is the continuous attractor network (CAN) model demonstrated by Burak & Fiete, 2009, in which properly-structured recurrent connections between grid cells allow them to accurately keep track of changes in an animal's location even without sensory cues, a process known as path integration. In a later paper, Widloski & Fiete, 2014 proposed a model for how such a network might be learned, using nothing more complex than Hebbian learning in a familiar home environment.

This network, which we implement, has three different populations of neurons, two excitatory and one inhibitory. The excitatory neurons are broken down into one group hardwired to prefer "left" movement, and another that is hardwired to prefer "right" movement. It lacks connections between excitatory neurons; all CAN dynamics are based on inhibition.

In this formulation, each grid cell begins as being sensitive to one, and only one, region in the "home" environment used for learning; we can sort them by location preference to visualize them in 1D space. The general setup of the problem can be visualized as the following (picture from Widloski & Fiete):

During training, the animal moves back and forth across the space, with cells in each population activated together according to their location preference. The excitatory cells preferring left movement will be active primarily when the animal runs left across the environment; likewise, the right-preferring cells will be active during rightward runs. This causes the development of asymmetric weight patterns that enable path integration, as we will later see.

The dynamics of the network are described by the following equation, where $R$ is the firing rate of a population of cells, $S$ is the long-term outgoing synaptic activations of that population, $I$ is the input into the population and $\tau$ is a fixed time decay of cell activation (usually 0.03 seconds): $$ R(I) = \max(I, 0)\\ \frac{dS}{dt} = R(I) - \frac{S}{\tau} $$

One key simplification made here is our use of deterministic neurons for all learning (hence the rule $R(I) = \max(I, 0)$). In their work, Widloski & Fiete almost exclusively employ stochastic spiking neurons, with a different rule for calculating firing rate, although they also provide dynamics for deterministic neurons, which we employ.

The core of the dynamics of the system is the input $I$, which varies depending on which population the cell belongs to. In the equations that follow, we denote cells in the right-excitatory population by the superscript $R$, cells in the left-excitatory population by the superscript $L$, and inhibitory cells by the superscript $I$.

$$\begin{align*} \text{Right-excitatory}\quad & I^{R} = (1 + 0.9v)(F - W^{IR}S^{I})\\ \text{Left-excitatory}\quad & I^{L} = (1 - 0.9v)(F - W^{IL}S^{I})\\ \text{Inhibitory}\quad & I^{I} = F + W^{RI}S^{R} + W^{LI}S^{L} - W^{II}S^{II} \\ \end{align*}$$Where $R_{I}$, $R_{L}$ and $R_{R}$ are the activations of the inhibitory, left-excitatory and right-excitatory cells, respectively, $W^{PP'}$ is a weight matrix of connections from cells in population P to population P', and the row $W_n$ is the weights for connections incoming to cell $n$. $F$, meanwhile, is the feedforward input from the place code (a Gaussian bump around the current location, as in the diagram above). Note that the left-excitatory and right-excitatory cells are sensitive to velocity, with a positive velocity defined as rightward movement. Also, note that activations are constrained to always be positive or zero (this is equivalent to applying the RELU nonlinearity).

Learning is handled via spike-time dependent plasticity kernels. The full integral for the change in weight from cell $i$ in population $P$ to cell $j$ in population $P'$ is given as follows: $$\Delta W_{ij} = \eta_{PP'} \int_{t = 0}^{\infty} R_i(t)\; k_{+}^{PP'}(\tau)\; R_j(t - \tau) \; d\tau - \eta_{PP'}\int_{t = 0}^{\infty} R_i(t)\; k_{-}^{PP'}(\tau)\; R_j(t + \tau) \; d\tau$$

where $\eta_{AB}$ is a specific learning rate for synapses from population $P$ to population $P'$ and $k$ is an STDP kernel; there are separate kernels with slightly different parameters for each combination of cell types and for causal ($+$) and acausal ($-$) learning. We have three different kernels in use, with slightly different scales and time windows. Detailed parameters are available in the attached code, but the core is as follows:

- Inhibitory-to-inhibitory synapses learn with Hebbian STDP
- Inhibitory-to-excitatory synapses learn with anti-Hebbian STDP
- Excitatory-to-inhibitory synapses learn with anti-Hebbian STDP

Finally, after training, we impose the restriction that all weights must be positive, meaning that inhibitory neurons will inhibit cells and excitatory neurons will excite cells. In neuroscience, this is known as Dale's law, and is required for biological plausibility. It is also beneficial for the stability of the network; without it, the inhibitory cells would actually have a strong excitatory effect on their local neighborhoods, which would cause the network to diverge without the addition of further constraints on the dynamics.

We have modified the learning process and network introduced in Widloski & Fiete in several ways. First of all, we use anti-Hebbian STDP for excitatory-inhibitory synapses, which we find to be necessary to cause the network to path integrate in the intuitively "correct" direction. The possibility of using anti-Hebbian STDP here was suggested in the original paper, and constitutes a minor tweak at most. More substantially, we also allow the network to learn with recurrent connections active, unlike in the original paper, which disabled them for the entire training process to improve stability and did not show results for training with recurrent connections enabled. Finally, we include support for learning with random-speed trajectories; we find that this results in slightly worse path integration, but otherwise does not compromise results.

```
In [ ]:
```# Imports and notebook statements
%load_ext autoreload
%autoreload 2
%matplotlib notebook
from htmresearch.frameworks.grid_cell_learning.CAN import CAN1DNetwork
import numpy as np
from scipy import stats
import matplotlib
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.linear_model import LinearRegression
import warnings
warnings.filterwarnings('ignore')

```
In [ ]:
```network = CAN1DNetwork(numExcitatory=200,
numInhibitory=200,
learningRate=.005,
dt=0.01, # Time step for numerical integration
decayConstant=.03, # Decay rate of neuronal activation
stdpWindow=10, # How long a time window in time steps to use for STDP updates.
)

To learn the grid cell network, we set the animal to move back and forth across the environment (shown as the bump of activity in the top two figures when the attached code is run). Note that here we are treating the environment as being toroidal; this is simply a convenient simplifying assumption, and the same results can be achieved (albeit with slightly poorer path integration) if the environment is treated as being bounded.

During this learning process, observe the matrix of inhibitory-inhibitory connections (bottom figure produced by the code). As the animal proceeds back and forth across the environment, a distinct banded structure emerges, where inhibitory cells inhibit cells a fixed distance away from them, but do not inhibit cells very close to themselves. This is roughly equivalent to the "Mexican hat" connectivity pattern sometimes associated with CAN networks.

These weights develop due to asymmetries in the STDP integral, $\Delta W_{ij} = \eta_{PP'} \int_{t = 0}^{\infty} R_i(t)\; k_{+}^{PP'}(\tau)\; R_j(t - \tau) \; d\tau - \eta_{PP'}\int_{t = 0}^{\infty} R_i(t)\; k_{-}^{PP'}(\tau)\; R_j(t + \tau) \; d\tau$. The Mexican hat pattern learned in the inhibitory connections results directly from asymmetries between the STDP kernels $k^{II}_+$ and $k^{II}_-$, with the $k^{II}_{+}$ kernel having a longer time scale. These asymmetries cause connections formed further away from the main diagonal during a pass in one direction not to be entirely overwritten during a pass in the other direction, while connections near the main diagonal are. In the following diagram, you can see this effect:

Once we restrict weights to being non-negative (if we did not do this, inhibitory neurons would have an excitatory effect on cells near them!), we see the weight pattern that we expect to:

This can be seen directly in the I-I weight matrix during learning (bottom figure) -- observe how the weights near the main diagonal are overwritten while those further away are not.

For inhibitory-excitatory and excitatory-inhibitory connections, the learning mechanism is much simpler. During rightward runs, the right-excitatory cells will be highly active; thus, they will learn connections to cells to their right, which fire later, while cells to their left will connect to them as they fired earlier (note that this is flipped with anti-Hebbian STDP). However, right-excitatory cells will be less active on leftward runs, so their newly-updated weight patterns will not be overwritten. The same applies in reverse for left-excitatory cells.

In the original paper, learning was done with recurrent connections (i.e., all inputs to cells other than the feedforward input) disabled. One could argue, however, that it would be more biologically plausible if learning were done with all connections active. As an extension to the original work, we demonstrate that this is possible, if the learning rate is properly controlled. To test out this parameter, we recommend simply setting "recurrent=True" in the call below.

Similarly, we also enable support for movement across the environment at differing speeds. To test this, you can set "randomSpeed=True" in the learning call below.

Enabling both recurrent connections and movement at random speeds, however, can in fact cause substantial instability in the resulting synaptic weight patterns. To counteract this, we recommend learning with a schedule of decreasing learning rates.

```
In [ ]:
```# Do a set of two passes, starting from the left first and then from the right.
# We have to do one from each direction to learn the asymmetric weights.
network.learn(1, # passes in each direction
dir=1, # start going left-to-right first
periodic=True, # toroidal environment
recurrent=False, # Have the recurrent connections on during training. Slightly hurts performance.
randomSpeed=False, # Sets each have a random speed between 0.5 and 1.5x normal
# This should ordinarily be disabled; it is possible to learn the correct connections
# with it active, but it requires many more passes and gradual learning rate decay.
)
# Now do 19 more sets, without plotting to speed things up. Should take just over three minutes.
# If recurrent connections are disabled during training, the number of sets can be reduced dramatically and
# the learning rate increased instead.
network.plotting=False
network.learn(9, dir=1, periodic=True, recurrent=False, randomSpeed=False)
network.learn(10, dir=-1, periodic=True, recurrent=False, randomSpeed=False)
# Reactivate plotting for later
network.plotting=True

Now, we can visualize the resulting weights more clearly. Note that here we have enforced Dale's law, unlike in the above figures; this simply means that we have stopped inhibitory cells from having excitatory effects, and vice-versa. This primarily affects the main diagonal in each figure.

```
In [ ]:
```plt.matshow(network.weightsII, cmap=plt.cm.coolwarm)
plt.xlabel("Inhibitory-Inhibitory connections")
plt.show()

This I-I weight structure guarantees the stable, gridlike firing pattern exhibited by the network. It is not, however, sufficient for path integration. For that, we must have the left and right excitatory populations behave differently. Fortunately, simply Hebbian learning causes this to be the case, both in their connections to the inhibitory cells and in the inhibitory cells' connections to them.

Note the asymmetry in the following weight structures. The E-I connections have left excitatory cells drive inhibitory cells to their right (causing the pattern on the grid to shift left), while the right E-I connections do the reverse. This represents path integration. Unfortunately, this format is dependent on our choice of anti-Hebbian STDP for these connections; were we to use normal Hebbian STDP, the network would actually learn to path integrate backwards!

```
In [ ]:
```fig = plt.figure()
ax1 = fig.add_subplot(121)
ax2 = fig.add_subplot(122)
ax1.matshow(network.weightsELI, cmap=plt.cm.coolwarm)
ax1.set_xlabel("Left E-I connections")
ax2.matshow(network.weightsERI, cmap=plt.cm.coolwarm)
ax2.set_xlabel("Right E-I connections")
plt.tight_layout()
plt.show()

```
In [ ]:
```fig = plt.figure()
ax1 = fig.add_subplot(121)
ax2 = fig.add_subplot(122)
ax1.matshow(network.weightsIEL, cmap=plt.cm.coolwarm)
ax1.set_xlabel("I-Left E connections")
ax2.matshow(network.weightsIER, cmap=plt.cm.coolwarm)
ax2.set_xlabel("I-Right E connections")
plt.tight_layout()
plt.show()

We now simulate the network in two different modes; one where the animal is standing still, and one where the animal is moving. In the first case, we show that the continuous attractor network produces a stable network with multiple firing fields, and in the second we show that the continuous attractor network is capable of conducting accurate path integration.

Here, instead of providing a place code input to a single cell or set of cells, we provide a global feedforward input of constant magnitude to all cells. This makes up for the lack of any connections from excitatory cells to other excitatory cells, and enables the inhibition to thus regulate the network properly.

```
In [ ]:
```network.plotting = True
network.simulate(time=10,
feedforwardInputI=np.ones((200,)), # The global input to give all inhibitory cells. Constant.
feedforwardInputE=np.ones((200,)), # Also constant for excitatory cells.
v=0.0, # Standing still
dt=0.0001, # Use a smaller time step for simulation, for stability
recurrent=True, # Recurrent connections should now be active
inputNoise=None, # Noise can be added to the input without compromising results.
# Try using a noise value of 2.
)

```
In [ ]:
```trueVelocities, estimatedVelocities = network.calculatePathIntegrationError(time=25,
dt=0.001,
inputNoise=None,
)

```
In [ ]:
```g = sns.JointGrid(np.asarray(estimatedVelocities), np.asarray(trueVelocities), ratio=100)
g.plot_joint(sns.regplot)
g.annotate(stats.pearsonr)
g.ax_marg_x.set_axis_off()
g.ax_marg_y.set_axis_off()
plt.title("Path Integration Accuracy")
plt.xlabel("Estimated velocity (cells)")
plt.ylabel("True velocity (m/s)")
plt.tight_layout()
plt.show()

```
In [ ]:
```model = LinearRegression().fit(estimatedVelocities.reshape(-1, 1), trueVelocities)
predictedValues = model.predict(estimatedVelocities.reshape(-1, 1))
meanAbsoluteError = np.mean((np.abs(predictedValues - trueVelocities)))
print("MAE is {}".format(meanAbsoluteError))
percentError = meanAbsoluteError/np.mean(np.abs(trueVelocities))
print("Average error is {}% of average movement speed".format(str(percentError*100)[:4]))

This work by Widloski & Fiete is of great importance; they provide one of the first reasonable demonstrations of how a CAN-like grid cell network could be learned using only simple learning rules over a small amount of time (they estimate that <24 hours of exploration would be sufficient, even in a 2D case). The grid cell networks that result generally conform to our expectations of grid cell networks in almost every way, possessing gridlike firing fields, stable activation patterns when movement is not present, and reasonably-accurate path integration.

That said, this approach is dependent upon the assumption that grid cells begin as, effectively, place cells that are specifically bound to unique locations in the environment. Although this model demonstrates that path integration and continuous attractor dynamics can be learned provided such a mapping, it does not provide a mechanism for how such a mapping might be learned, either independently or in tandem with path integration.