SOM Retinotopy

This IPython notebook defines and explores the Kohonen SOM (self-organizing map) model of retinotopy described in pages 53-59 of:

Miikkulainen, Bednar, Choe, and Sirosh (2005), Computational Maps in the Visual Cortex, Springer.

If you can double-click on this text and edit it, you are in a live IPython Notebook environment where you can run the code and explore the model. Otherwise, you are viewing a static (e.g. HTML) copy of the notebook, which allows you to see the precomputed results only. To switch to the live notebook, see the notebook installation instructions.

This IPython notebook constructs the definition of the SOM retinotopy model in the Topographica simulator, and shows how it organizes.

A static version of this notebook may be viewed online. To run the live notebook and explore the model interactively, you will need both IPython Notebook and Topographica.

If you prefer the older Tk GUI interface or a command line, you may use the som_retinotopy.ty script distributed with Topographica, without IPython Notebook, as follows:

./topographica -g examples/som_retinotopy.ty

To run this Notebook version, you can run all cells in order automatically by:

  • Selecting Kernel -> Restart from the menu above.
  • Selecting Cell -> Run All.

Alternatively, you may run each cell in sequence, starting at the top of the notebook and pressing Shift + Enter.

Model definition

The next three cells define the SOM model in its entirety (copied from examples/som_retinotopy.ty). First, we import the required libraries and declare various parameters that allow the modeller to control the model's behavior:


In [ ]:
"""
Basic example of a fully connected SOM retinotopic map with ConnectionFields.

Contains a Retina (2D Gaussian generator) fully connected to a V1
(SOM) sheet, with no initial ordering for topography.

Constructed to match the retinotopic simulation from pages 53-59 of
Miikkulainen, Bednar, Choe, and Sirosh (2005), Computational Maps in
the Visual Cortex, Springer.  Known differences include:

 - The cortex_density and retina_density are smaller for
   speed (compared to 40 and 24 in the book).
 - The original simulation used a radius_0 of 13.3/40, which does work
   for some random seeds, but a much larger radius is used here so that
   it converges more reliably.
"""

import topo
import imagen

from math import exp, sqrt

import param

from topo import learningfn,numbergen,transferfn,pattern,projection,responsefn,sheet

import topo.learningfn.projfn
import topo.pattern.random
import topo.responsefn.optimized
import topo.transferfn.misc

# Disable the measurement progress bar for this notebook
from topo.analysis.featureresponses import pattern_response
pattern_response.progress_bar = False

# Parameters that can be passed on the command line using -p
from topo.misc.commandline import global_params as p
p.add(

    retina_density=param.Number(default=10.0,bounds=(0,None),
        inclusive_bounds=(False,True),doc="""
        The nominal_density to use for the retina."""),

    cortex_density=param.Number(default=10.0,bounds=(0,None),
        inclusive_bounds=(False,True),doc="""
        The nominal_density to use for V1."""),

    input_seed=param.Number(default=0,bounds=(0,None),doc="""
        Seed for the pseudorandom number generator controlling the
        input patterns."""),

    weight_seed=param.Number(default=0,bounds=(0,None),doc="""
        Seed for the pseudorandom number generator controlling the
        initial weight values."""),

    radius_0=param.Number(default=1.0,bounds=(0,None),doc="""
        Starting radius for the neighborhood function."""),

    alpha_0=param.Number(default=0.42,bounds=(0,None),doc="""
        Starting value for the learning rate."""))

Customizing the model parameters

The cell below may be used to modify any of the parameters defined above, allowing you to explore the parameter space of the model. To illustrate, the retina and cortex densities are explicitly set to 10 (their default values), but you may change the value of any parameter in the same way:


In [ ]:
p.retina_density=10
p.cortex_density=10

The online version of this notebook normally uses the full retinal and cortical densities of 24 and 40, respectively, but densities of 10 are the default because they should be sufficient for most purposes. Using the parameters and definitions above, the following code defines the SOM model of retinotopy by specifying the input patterns, the Sheets, and the connections between Sheets:


In [ ]:
# Using time dependent random streams.
# Corresponding numbergen objects should be given suitable names.
param.Dynamic.time_dependent=True
numbergen.RandomDistribution.time_dependent = True

# input pattern
sheet.GeneratorSheet.period = 1.0
sheet.GeneratorSheet.phase = 0.05
sheet.GeneratorSheet.nominal_density = p.retina_density

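# Circular Gaussian input stimulus; the size expression is in Sheet
# coordinates (the 24.0 corresponds to the book's retina density of 24)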
input_pattern = pattern.Gaussian(scale=1.0,size=2*sqrt(2.0*0.1*24.0)/24.0,
    aspect_ratio=1.0,orientation=0,
    x=numbergen.UniformRandom(name='xgen', lbound=-0.5,ubound=0.5, seed=p.input_seed+12),
    y=numbergen.UniformRandom(name='ygen', lbound=-0.5,ubound=0.5, seed=p.input_seed+56))


# Seed controlling the initial weight values (see the weight_seed parameter above)
param.random_seed = p.weight_seed + 86

topo.sim['Retina'] = sheet.GeneratorSheet(input_generator=input_pattern)

topo.sim['V1'] = sheet.CFSheet(
    nominal_density = p.cortex_density,
    # Original CMVC simulation used an initial radius of 13.3/40.0 (~0.33)
    output_fns=[transferfn.misc.KernelMax(density=p.cortex_density,
        kernel_radius=numbergen.BoundedNumber(
            bounds=(0.5/40,None),
            generator=numbergen.ExponentialDecay(
                starting_value=p.radius_0,
                time_constant=40000/5.0)))])

topo.sim.connect('Retina','V1',name='Afferent',delay=0.05,
    connection_type=projection.CFProjection,
    weights_generator = pattern.random.UniformRandom(),
    nominal_bounds_template=sheet.BoundingBox(radius=1.0), # fully connected network.
    learning_rate=numbergen.ExponentialDecay(
        starting_value = p.alpha_0,
        time_constant=40000/6.0),
    response_fn = responsefn.optimized.CFPRF_EuclideanDistance_opt(),
    learning_fn = learningfn.projfn.CFPLF_EuclideanHebbian())

'Loaded the self-organizing map model (som_retinotopy)'

Exploring the model


In [ ]:
%reload_ext topo.misc.ipython
import numpy as np
from holoviews import Matrix
from holoviews.operation import RGBA

Now that the model has been defined, we can explore it. The structure of the loaded model is shown in this screenshot taken from the Tk GUI's Model Editor:

Model structure

The large circle indicates that the Retina Sheet is fully connected to the V1 Sheet.

Initial weights

The plot below shows the initial set of weights from a 10x10 subset of the V1 neurons (i.e., every neuron with the reduced cortex_density, or every fourth neuron for cortex_density=40):


In [ ]:
topo.sim.V1.Afferent.grid()

As we can see, the initial weights are uniform random. Each neuron receives a full set of connections from all input units, and thus each has a 24x24 or 10x10 array of weights (depending on the retina_density).

Initial Center-of-Gravity (CoG) plots

We can visualize the center of gravity (CoG) of the V1 Afferent weights using the following measurement command:


In [ ]:
from topo.command.analysis import measure_cog, measure_or_pref
cog_data = measure_cog()

The center of gravity (a.k.a. centroid or center of mass) is computed for each neuron using its set of incoming weights. The plot below shows each neuron represented by a point, with a line segment drawn from each neuron to each of its four immediate neighbors so that neighborhood relationships (if any) will be visible.
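
As an aside, the center of gravity is simply the weighted mean position of each neuron's incoming weights. Here is a minimal plain-NumPy illustration of the idea (a sketch only, not the measure_cog implementation):


In [ ]:
import numpy as np

def center_of_gravity(weights):
    """Return the (x, y) centroid of a 2D weight array, in array coordinates."""
    rows, cols = np.indices(weights.shape)
    total = weights.sum()
    return (cols * weights).sum() / total, (rows * weights).sum() / total

rng = np.random.RandomState(0)
w = rng.uniform(size=(10, 10))    # uniform random weights, like the initial CFs
print(center_of_gravity(w))       # close to the array center, i.e. near (4.5, 4.5)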


In [ ]:
cog_data.CoG.Afferent

From this plot it is clear that all of the neurons have a CoG near the center of the retina, which is to be expected because the weights are fully connected and evenly distributed (and thus all have an average (X,Y) value near the center of the retina). This plot combines the center-of-gravity values computed along the X and Y directions, which are shown separately below:


In [ ]:
cog_data.XCoG.Afferent + cog_data.YCoG.Afferent

The V1 X CoG plot shows the X location preferred by each neuron, and the V1 Y CoG plot shows the preferred Y locations. The monochrome values are scaled so that the neuron with the smallest X preference is colored black, and that with the largest is colored white, regardless of the absolute preference values (due to normalization being enabled with the .N property). Thus the absolute values of the X or Y preferences are not visible in these plots. (Without normalization, values below 0.0 are cropped to black, so only normalized plots are useful for this particular example.)
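
A minimal sketch of this kind of min-max normalization (plain NumPy; an assumption about what the .N property computes, not its actual implementation):


In [ ]:
def normalized(arr):
    """Rescale so the smallest value maps to 0.0 (black) and the largest to 1.0 (white)."""
    lo, hi = arr.min(), arr.max()
    return (arr - lo) / (hi - lo) if hi > lo else np.zeros_like(arr)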


In [ ]:
%%channels X CoG * Y CoG * Blank => RGBA []
blue_channel = Matrix(np.zeros(topo.sim.V1.shape), label='Blank')
cog_data.XCoG.Afferent.N * cog_data.YCoG.Afferent.N * blue_channel

The colorful plot above shows a false-color visualization of the CoG values, where the amount of red in the plot is proportional to the X CoG, and the amount of green in the plot is proportional to the Y CoG. Where both X and Y are low, the plot is black or very dark, and where both are high the plot is yellow (because red and green light together appears yellow). This provides a way to visualize how smoothly the combined (X,Y) position is mapped, although at this stage of training it is not particularly useful.
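
A plain-NumPy sketch of this red/green channel composition, using placeholder data (in this notebook the actual plot is produced by the %%channels magic):


In [ ]:
x_cog = y_cog = np.linspace(0, 1, 100).reshape(10, 10)   # placeholder CoG arrays in [0, 1]
rgb = np.dstack([x_cog, y_cog, np.zeros_like(x_cog)])    # R = X CoG, G = Y CoG, B = 0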

Initial activity

The two plots below show the response of each Sheet in the model, which is zero at the start of the simulation (and thus both plots are black).


In [ ]:
topo.sim.Retina[:] + topo.sim.V1[:]

Training stimuli

We can have a look at the training patterns that will be used to train the model, without having to run it. In the cell labelled In [4] (in the model definition), the training patterns are defined in a variable called input_pattern. The circular Gaussian stimulus has x and y values drawn from a random distribution by two independent numbergen.UniformRandom objects. We can now view what 30 frames of training patterns will look like:

The videos below will only be displayed by default if your browser supports WebM HTML5 video


In [ ]:
%view webm:20

If you have problems with the animation, change webm to gif in the magic above. Note that gif output results in huge notebook file sizes and does not allow animations to be paused and restarted, so it is best to use a browser with HTML5/WebM support.


In [ ]:
input_pattern.anim(frames=30, offset=0.05)
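
Each frame contains a Gaussian at an independently drawn position. As a plain-NumPy illustration of the same sampling (not the model's numbergen streams, which are tied to simulation time):


In [ ]:
rng = np.random.RandomState(int(p.input_seed))
centers = rng.uniform(-0.5, 0.5, size=(30, 2))   # one (x, y) center per frame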

Preparing animated plots

At the end of the notebook, we will generate a set of animations showing how the plots we have already examined evolve over development. We now create a Collector object that collects all the information needed for plotting and animation. We will collect the same information we have just examined, then advance the simulation one iteration:


In [ ]:
from topo.analysis import Collector
from topo.command.analysis import measure_or_pref, measure_cog

c = Collector()
c.collect(measure_or_pref)
c.collect(measure_cog)

c.Activity.Retina =      c.collect(topo.sim.Retina)
c.Activity.V1 =          c.collect(topo.sim.V1)
c.Activity.V1Afferent =  c.collect(topo.sim.V1.Afferent)
c.CFs.Afferent =         c.collect(topo.sim.V1.Afferent, grid=True, rows=10, cols=10)

Activity after a single training iteration

After running the model a single iteration, the sheet activities now look as follows:


In [ ]:
data = c(times=[1])

In [ ]:
data.Activity.Retina + data.Activity.V1

In the Retina plot, each photoreceptor is represented as a pixel whose shade of grey codes the response level, increasing from black to white. As expected, this particular example matches the first frame we visualized in the training stimulus animation, and the location of the Gaussian is near the border of the retina. The V1 plot shows the response to that input, which for a SOM is initially a large Gaussian-shaped blob centered around the maximally responding unit.

Projection Activity

To see what the responses were before SOM’s neighborhood function forced them into a Gaussian shape, you can look at the Projection Activity plot, which shows the feedforward activity in V1:


In [ ]:
data.Activity.V1Afferent

Here these responses are best thought of as Euclidean proximity, not distance. This formulation of the SOM response function actually subtracts the distances from the max distance, to ensure that the response will be larger for smaller Euclidean distances (as one intuitively expects for a neural response). The V1 feedforward activity appears random because the Euclidean distance from the input vector to the initial random weight vector is random.
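
A plain-NumPy illustration of the formulation just described (a sketch, not the optimized CFPRF_EuclideanDistance_opt implementation):


In [ ]:
def proximity_response(input_vec, weight_vecs):
    """Larger responses for weight vectors closer to the input, as described above."""
    dists = np.linalg.norm(weight_vecs - input_vec, axis=1)
    return dists.max() - dists   # subtract from the maximum distance

x = np.random.uniform(size=100)            # a flattened input pattern
W = np.random.uniform(size=(25, 100))      # one random weight vector per unit
r = proximity_response(x, W)               # random-looking, as in the plot above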

Learning after a few iterations

If you look at the weights now that we have run a single training iteration, you'll see that most of the neurons have learned new weight patterns based on this input:


In [ ]:
data.CFs.Afferent

Some of the weights to each neuron have now changed due to learning. In the SOM algorithm, the unit with the maximum response (i.e., the minimum Euclidean distance between its weight vector and the input pattern) is chosen, and the weights of units within a circular area defined by a Gaussian-shaped neighborhood function around this neuron are updated.
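
The update just described is the standard Kohonen rule. A minimal NumPy sketch of one such step (an illustration of the algorithm, not Topographica's KernelMax/CFPLF_EuclideanHebbian implementation):


In [ ]:
def som_step(weights, x, alpha, radius):
    """One Kohonen update. weights: (rows, cols, n) array; x: (n,) input vector."""
    dists = np.linalg.norm(weights - x, axis=-1)             # distance per unit
    wr, wc = np.unravel_index(dists.argmin(), dists.shape)   # winning unit
    rows, cols = np.indices(dists.shape)
    d2 = (rows - wr)**2 + (cols - wc)**2                     # grid distance to winner
    h = np.exp(-d2 / (2.0 * radius**2))                      # Gaussian neighborhood
    weights += alpha * h[..., None] * (x - weights)          # move toward the input
    return weights

W = som_step(np.random.uniform(size=(10, 10, 100)),          # 10x10 map, 100 weights each
             np.random.uniform(size=100), alpha=0.42, radius=3.0)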

This effect is visible in this plot – a few neurons around the winning unit at the top middle have changed their weights. Let us run a few more iterations before having another look:


In [ ]:
topo.sim.run(4)
topo.sim.V1.Afferent.grid()

The weights have been updated again - it is clear after these five iterations that the input patterns are becoming represented in the weight patterns, though not very cleanly yet. We also see that the projection activity patterns are becoming smoother, since the weight vectors are now similar between neighboring neurons.


In [ ]:
topo.sim.V1.Afferent.projection_view().N

We will now advance to simulation time 250 because we will want to make animations regularly sampled every 250 steps:


In [ ]:
topo.sim.run(245)

At 5000 iterations

Let us use our Collector object c to collect more measurements. We will start collecting measurements every 250 steps until we complete 5000 iterations, which should take a few seconds at the default densities, and may be a minute or two at the higher densities:


In [ ]:
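# topo.sim.time() is now 250, so this samples every 250 steps up to 5000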
times = [topo.sim.time()*i for i in range(1,21)]
print("Running %d measurements between iteration %s and iteration %s" % (len(times), min(times), max(times)))

In [ ]:
data = c(data, times=times)

We see that the topographic grid plot and the activity evolve over this period as follows:

In the live notebook you can remove the ``.last`` property to view an animation to 5000 iterations
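
For example (a plausible display cell, following the access pattern used in the cells below):


In [ ]:
(data.CoG.Afferent + data.Activity.V1).last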

The X and Y CoG plots are now smooth, but not yet the axis-aligned gradients (e.g. left to right and bottom to top) that an optimal topographic mapping would have:


In [ ]:
(data.XCoG.Afferent.N + data.YCoG.Afferent.N).last

The false-color visualization has also smoothed out:


In [ ]:
%%channels X CoG * Y CoG * Blank => RGBA []
blue_channel = Matrix(np.zeros(topo.sim.V1.shape), label='Blank')
(data.XCoG.Afferent.N * data.YCoG.Afferent.N * blue_channel).last

The weight patterns are still quite broad and not very selective for typical input patterns:


In [ ]:
data.CFs.Afferent.last

At 10000 iterations

Additional training up to 10000 iterations (which becomes faster due to a smaller neighborhood radius) leads to a nearly flat, square map:


In [ ]:
times = [5000+(250*i) for i in range(1,21)]
print("Running %d measurements between iteration %s and iteration %s" % (len(times), min(times), max(times)))

In [ ]:
data = c(data, times=times)

In the live notebook you can remove the ``.last`` property to view an animation to 10000 iterations


In [ ]:
data.CoG.Afferent.last

Yet the weight patterns are still quite broad and not very selective for typical input patterns, because the neighborhood radius (initially 1.0 in Sheet coordinates, i.e. larger than the entire V1) is still large enough to force neighbors to respond to a wide range of patterns:


In [ ]:
topo.sim.V1.output_fns[0].kernel_radius

In [ ]:
data.CFs.Afferent.last

At 30000 iterations

By 30000 iterations the map has good coverage of the available portion of the input space:


In [ ]:
times = [10000+(250*i) for i in range(1,81)]
print("Running %d measurements between iteration %s and iteration %s" % (len(times), min(times), max(times)))

In [ ]:
data = c(data, times=times)

In [ ]:
data.CoG.Afferent.last

The final projection plot at 30000 now shows that each neuron has developed weights concentrated in a small part of the input space, matching a prototypical input at one location:


In [ ]:
topo.sim.V1.output_fns[0].kernel_radius

In [ ]:
data.CFs.Afferent.last

The final topographic mapping may have any one of a number of possible arrangements: flipped along either or both axes, or rotated by 90 degrees. E.g. a vertically flipped map will have blobs in each CF that are at the bottom for the top row of neurons, but at the top for the bottom row of neurons. Nothing in the network reliably drives it to have any particular overall orientation or mapping apart from aligning to the square shape, and each of these possible organizations is functionally equivalent.
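
As an illustration, the eight functionally equivalent orientations of a square map (four rotations, each optionally flipped) can be enumerated with NumPy, using a stand-in array for the map:


In [ ]:
cog_map = np.arange(100).reshape(10, 10)              # stand-in for a CoG array
variants = [np.rot90(cog_map, k) for k in range(4)]   # the four rotations
variants += [np.fliplr(v) for v in variants]          # each optionally flipped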

Animations over 30000 iterations

We can watch how the topographic mapping unfolds over all 30000 iterations we have run:


In [ ]:
%%opts Gravity_Contours [rescale_individually=False]
data.CoG.Afferent

And how the XCoG and YCoG components unfold:


In [ ]:
data.XCoG.Afferent  + data.YCoG.Afferent

As well as the false-color CoG plot, where a perfectly organized mapping will have black in one corner and yellow in the opposite corner, with green and red in the two remaining corners, and smooth color gradients between them:


In [ ]:
%%channels X CoG * Y CoG * Blank => RGBA []
blue_channel = Matrix(np.zeros(topo.sim.V1.shape), label='Blank')
data.XCoG.Afferent.N * data.YCoG.Afferent.N * blue_channel

The weights (generating this animation may take a couple of minutes):


In [ ]:
data.CFs.Afferent

Snapshots of the retinal and V1 activity over development. Notice how the activity in V1 becomes more focused over time as the kernel radius decreases:


In [ ]:
data.Activity.Retina + data.Activity.V1

Exploring the parameters of the SOM

Now, you can re-run the basic simulation by restarting the IPython Notebook kernel (Kernel -> Restart) or by using the keyboard shortcut Ctrl-M . (see Help -> Keyboard Shortcuts for the full list). Then you can change one of the parameter values, either by editing the model definition above before running those cells, or (usually clearer) by setting the parameters in the cell labelled In [2].

For instance, the starting value of the neighborhood radius (from which all future values are calculated by exponential decay) is 1.0. You can change this value as you see fit, e.g. to 0.1, by setting p.radius_0=0.1. With such a small neighborhood radius, global ordering is unlikely to happen, and one can expect the topographic grid not to flatten out (despite local order in patches).
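
For reference, numbergen.ExponentialDecay is assumed here to follow the standard form value(t) = starting_value * exp(-t / time_constant), which lets you estimate the neighborhood radius at any iteration:


In [ ]:
from math import exp

def decayed(starting_value, t, time_constant):
    return starting_value * exp(-t / time_constant)

decayed(1.0, 30000, 40000/5.0)   # approximate kernel radius after 30000 iterations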

Similarly, consider changing the initial learning rate from 0.42 to e.g. 1.0 (e.g. by setting p.alpha_0=1.0 in cell 2). The retina and V1 densities cannot be changed after the simulation has started, but again, they can be changed by providing their values above and restarting the notebook.

You can also try changing the input_seed (p.input_seed=XX), to get a different stream of inputs, or weight_seed (p.weight_seed=XX), to get a different set of initial weights. With some of these values, you may encounter cases where the SOM fails to converge even though it seems to be working properly otherwise. For instance, some seed values result in topological defects.