Symbols and Symbol-like representation in neurons

  • We've seen how to represent vectors in neurons
    • And how to compute functions on those vectors
    • And dynamical systems
  • But how can we do anything like human language?
    • How could we represent the fact that "the number after 8 is 9"
    • Or "dogs chase cats"
    • Or "Anne knows that Bill thinks that Charlie likes Dave"
  • Does the NEF help us at all with this?
    • Or is this problem still too hard?

Traditional Cognitive Science

  • Lots of theories that work with structured information like this
  • Pretty much all of them use some representation framework like this:
    • after(eight, nine)
    • chase(dogs, cats)
    • knows(Anne, thinks(Bill, likes(Charlie, Dave)))
  • Or perhaps
    • [number:eight next:nine]
    • [subject:dogs action:chase object:cats]
    • [subject:Anne action:knows object:[subject:Bill action:thinks object:[subject:Charlie action:likes object:Dave]]]
  • Cognitive models manipulate these sorts of representations
    • mental arithmetic
    • driving a car
    • using a GUI
    • parsing language
    • etc etc
  • Seems to match behavioural data well, so something like this should be right
  • So how can we do this in neurons?
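For concreteness, these slot-filler structures can be written down directly in any conventional programming language. A plain-Python sketch (ordinary data structures, not a cognitive model):

```python
# Slot-filler structures as nested dictionaries (illustration only)
eight_nine = {'number': 'eight', 'next': 'nine'}
dogs_cats = {'subject': 'dogs', 'action': 'chase', 'object': 'cats'}
anne = {'subject': 'Anne', 'action': 'knows',
        'object': {'subject': 'Bill', 'action': 'thinks',
                   'object': {'subject': 'Charlie', 'action': 'likes',
                              'object': 'Dave'}}}

# Symbolic models manipulate structures like these by following slots:
innermost = anne['object']['object']['object']
print(innermost)  # Dave
```

The challenge is that nothing in this notation says how pools of spiking neurons could hold or traverse such structures.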

Possible solutions

  • Oscillations
    • "red square and blue circle"
    • Different patterns of activity for RED, SQUARE, BLUE, and CIRCLE
    • Have the patterns for RED and SQUARE happen, then BLUE and CIRCLE, then back to RED and SQUARE
    • More complex structures possible too:
    • E.g. the LISA architecture

  • Problems

    • What controls this oscillation, and how?
    • How do we deal with the exponential explosion of nodes needed?
  • Implementing Symbol Systems in Neurons

    • Build a general-purpose symbol-binding system
    • Lots of temporary pools of neurons
    • Ways to temporarily associate them with particular concepts
    • Ways to temporarily associate pools together
    • Neural Blackboard Architecture

  • Problems
    • Very particular structure (doesn't seem to match biology)
    • Uses a very large number of neurons (~500 million) to be flexible enough for simple sentences
    • And that's just to represent the sentence, never mind controlling and manipulating it

Vector Symbolic Architectures

  • There is an alternative approach
  • Something that's similar to the symbolic approach, but much more closely tied to biology
    • Most of the same capabilities as the classic symbol systems
    • But not all
  • Based on vectors and functions on those vectors

    • There is a vector for each concept
    • Build up structures by doing math on those vectors
  • Example

    • blue square and red circle
    • can't just do BLUE+SQUARE+RED+CIRCLE
    • need some other operation as well
    • requirements
      • input 2 vectors, get a new vector as output
      • reversible (given the output and one of the input vectors, generate the other input vector)
      • output vector is highly dissimilar to either input vector
        • unlike addition, where the output is highly similar
  • Lots of different options
    • for binary vectors, XOR works pretty well
    • for continuous vectors we use circular convolution
  • Why?
    • Extensively studied (Plate, 1997: Holographic Reduced Representations)
    • Easy to approximately invert (circular correlation)
  • BLUE $\circledast$ SQUARE + RED $\circledast$ CIRCLE

  • Lots of nice properties

    • Can store complex structures
      • [number:eight next:nine]
      • NUMBER $\circledast$ EIGHT + NEXT $\circledast$ NINE
      • [subject:Anne action:knows object:[subject:Bill action:thinks object:[subject:Charlie action:likes object:Dave]]]
      • SUBJ $\circledast$ ANNE + ACT $\circledast$ KNOWS + OBJ $\circledast$ (SUBJ $\circledast$ BILL + ACT $\circledast$ THINKS + OBJ $\circledast$ (SUBJ $\circledast$ CHARLIE + ACT $\circledast$ LIKES + OBJ $\circledast$ DAVE))
    • But gracefully degrades!
      • as the representation gets more complex, the accuracy of breaking it apart decreases
    • Keeps similarity information
      • if RED is similar to PINK then RED $\circledast$ CIRCLE is similar to PINK $\circledast$ CIRCLE
  • But rather complicated

    • Seems like a weird operation for neurons to do
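Before looking at the neural implementation, the binding and unbinding operations themselves can be checked outside of neurons. A NumPy sketch (illustrative only; the vector names are from the example above):

```python
import numpy as np

D = 512  # high dimensionality makes random vectors nearly orthogonal
rng = np.random.RandomState(seed=1)

def unit_vector():
    v = rng.randn(D)
    return v / np.linalg.norm(v)

def cconv(a, b):
    # circular convolution = binding
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

def ccorr(a, b):
    # circular correlation = approximate inverse (unbinding)
    return np.real(np.fft.ifft(np.conj(np.fft.fft(a)) * np.fft.fft(b)))

BLUE, SQUARE, RED, CIRCLE = (unit_vector() for _ in range(4))
scene = cconv(BLUE, SQUARE) + cconv(RED, CIRCLE)

# "what goes with SQUARE?" -- unbind SQUARE and compare to the vocabulary
guess = ccorr(SQUARE, scene)
sims = {name: float(np.dot(guess, v))
        for name, v in [('BLUE', BLUE), ('RED', RED),
                        ('SQUARE', SQUARE), ('CIRCLE', CIRCLE)]}
best = max(sims, key=sims.get)
```

Unbinding gives back a noisy version of BLUE; comparing against the vocabulary (a "clean-up") removes the noise.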

Circular convolution in the NEF

  • Or is it?
  • Circular convolution is a whole bunch ($D^2$) of multiplies
  • But it can also be written as a Fourier transform, an elementwise multiply, and an inverse Fourier transform
  • The discrete Fourier transform is just a linear operation
  • So that's just $D$ pairwise multiplies
  • In fact, circular convolution turns out to be exactly what the NEF shows neurons are good at
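A quick numerical check of that claim: circular convolution computed from its $D^2$-multiply definition matches a DFT, $D$ elementwise multiplies, and an inverse DFT, where each DFT is just a fixed matrix multiply (and hence something neural connection weights can do for free):

```python
import numpy as np

D = 8
rng = np.random.RandomState(seed=2)
a, b = rng.randn(D), rng.randn(D)

# direct definition: D^2 pairwise multiplies
direct = np.array([sum(a[j] * b[(i - j) % D] for j in range(D))
                   for i in range(D)])

# the DFT as an explicit matrix: W[k, n] = exp(-2*pi*i*k*n/D) -- linear
n = np.arange(D)
W = np.exp(-2j * np.pi * np.outer(n, n) / D)

# Fourier transform, D elementwise multiplies, inverse Fourier transform;
# only the elementwise products are nonlinear
via_dft = np.real(np.linalg.inv(W) @ ((W @ a) * (W @ b)))

assert np.allclose(direct, via_dft)
```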

Building a memory in Nengo


In [2]:
import nengo
import nengo.spa as spa

D = 64  # dimensionality of the semantic pointer vectors

model = spa.SPA(label='Binding')
with model:
    model.a = spa.Buffer(D)
    model.b = spa.Buffer(D)
    model.c = spa.Buffer(D)
    model.q = spa.Buffer(D)
    model.r = spa.Buffer(D)
    model.cortical = spa.Cortical(spa.Actions(
        'c = a*b',    # bind a and b, feeding the result into c
        'c = c',      # recurrent connection, so c acts as a memory
        'r = c*~q'),  # unbind the query q from c
        synapse=0.1)

    probe_a = nengo.Probe(model.a.state.output)
    probe_b = nengo.Probe(model.b.state.output)
    probe_c = nengo.Probe(model.c.state.output)
    probe_q = nengo.Probe(model.q.state.output)
    probe_r = nengo.Probe(model.r.state.output)
  • How does this work so well?
    • Exploiting the features of high-dimensional space

  • Memory capacity increases with dimensionality
    • Also dependent on the number of different possible items in memory (vocabulary size)
  • 512 dimensions are sufficient to store ~8 pairs, with a vocabulary of 100,000 terms
    • Note that this is what's needed for storing simple sentences
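This capacity claim can be checked numerically. A rough NumPy sketch (not a neural simulation; a 5,000-term vocabulary is used instead of 100,000 to keep the run fast):

```python
import numpy as np

D = 512
num_pairs = 8
vocab_size = 5000  # smaller than 100,000 for speed; the principle is the same
rng = np.random.RandomState(seed=3)

# random unit vectors for every term in the vocabulary
vocab = rng.randn(vocab_size, D)
vocab /= np.linalg.norm(vocab, axis=1, keepdims=True)

def cconv(a, b):
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

def ccorr(a, b):
    return np.real(np.fft.ifft(np.conj(np.fft.fft(a)) * np.fft.fft(b)))

# bind 8 key-value pairs into a single 512-D memory trace
terms = rng.choice(vocab_size, num_pairs * 2, replace=False)
keys, values = terms[:num_pairs], terms[num_pairs:]
trace = sum(cconv(vocab[k], vocab[v]) for k, v in zip(keys, values))

# recover each value: unbind its key, then clean up by finding the
# most similar vocabulary vector
recovered = [int(np.argmax(vocab @ ccorr(vocab[k], trace))) for k in keys]
accuracy = np.mean([r == v for r, v in zip(recovered, values)])
```

With these sizes the clean-up recovers essentially all of the stored pairs; pushing `num_pairs` higher makes recovery degrade gracefully rather than fail outright.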

Symbol-like manipulation

  • Can do a lot of standard symbol stuff
  • Have to explicitly bind and unbind to manipulate the data
  • Less accuracy for more complex structures
  • But we can also do more with these representations

Raven's Progressive Matrices

  • An IQ test that's generally considered to be the best at measuring general-purpose "fluid" intelligence
    • nonverbal (so it's not measuring language skills, and fairly unbiased across cultures, hopefully)
    • fill in the blank
    • given eight possible answers; pick one

  • This is not an actual question on the test

    • The test is copyrighted
    • They don't want the test to leak out, since it's been the same set of 60 questions since 1936
    • But they do look like that
  • How can we model people doing this task?

  • A fair number of different attempts

    • None neural
    • Generally use the approach of building in a large set of different types of patterns to look for, and then trying them all in turn
    • Which seems wrong for a test that's supposed to be about flexible, fluid intelligence
  • Does this vector approach offer an alternative?

  • First we need to represent the different patterns as a vector

    • This is a hard image interpretation problem
    • Still ongoing work here
    • So we'll skip it and start with things in vector form

  • How do we represent a picture?

    • SHAPE $\circledast$ ARROW + NUMBER $\circledast$ ONE + DIRECTION $\circledast$ UP
    • can do variations like this for all the pictures
    • fairly consistent with standard assumptions about how people represent complex scenes
    • but that part is not being modelled (yet!)
  • We have shown that it's possible to build these sorts of representations up directly from visual stimuli

    • With a very simple vision system that can only recognize a few different shapes
    • And where items have to be shown sequentially as it has no way of moving its eyes
  • The memory of the list is built up by using a basal ganglia action selection system to control feeding values into an integrator

    • The thought bubble shows how close the decoded values are to the ideal
    • Notice the forgetting!
  • The same system can be used to do a version of the Raven's Matrices task

  • S1 = ONE $\circledast$ P1
  • S2 = ONE $\circledast$ P1 + ONE $\circledast$ P2
  • S3 = ONE $\circledast$ P1 + ONE $\circledast$ P2 + ONE $\circledast$ P3
  • S4 = FOUR $\circledast$ P1
  • S5 = FOUR $\circledast$ P1 + FOUR $\circledast$ P2
  • S6 = FOUR $\circledast$ P1 + FOUR $\circledast$ P2 + FOUR $\circledast$ P3
  • S7 = FIVE $\circledast$ P1
  • S8 = FIVE $\circledast$ P1 + FIVE $\circledast$ P2

  • what is S9?

  • Things to note
    • Memory slowly decays
    • If you push in a new pair for too long, it can wipe out the old pair(s)
      • Note that this relies on the saturation behaviour of NEF networks
      • Kind of like implicit normalization
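The picture encoding above (e.g. SHAPE $\circledast$ ARROW + NUMBER $\circledast$ ONE + DIRECTION $\circledast$ UP) can be stored and queried outside of neurons as a sanity check. A NumPy sketch (illustration only; no decay or saturation is modelled here):

```python
import numpy as np

D = 512
rng = np.random.RandomState(seed=4)

def unit_vector():
    v = rng.randn(D)
    return v / np.linalg.norm(v)

names = ['SHAPE', 'ARROW', 'NUMBER', 'ONE', 'DIRECTION', 'UP']
vocab = {name: unit_vector() for name in names}

def cconv(a, b):
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

def ccorr(a, b):
    return np.real(np.fft.ifft(np.conj(np.fft.fft(a)) * np.fft.fft(b)))

# SHAPE (*) ARROW + NUMBER (*) ONE + DIRECTION (*) UP
picture = (cconv(vocab['SHAPE'], vocab['ARROW'])
           + cconv(vocab['NUMBER'], vocab['ONE'])
           + cconv(vocab['DIRECTION'], vocab['UP']))

# "which direction?" -- unbind DIRECTION, clean up against the vocabulary
guess = ccorr(vocab['DIRECTION'], picture)
answer = max(vocab, key=lambda name: float(np.dot(guess, vocab[name])))
```

In the neural version the same trace sits in an integrator, so the queries work identically but the answers get noisier as the memory decays.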

Cognitive Control

  • How do we control these systems?
    • Lots of components
    • Each component computes some particular function
    • Need to selectively route information between components
  • Standard cortex-basal ganglia-thalamus loop

  • Compute functions of cortical state to determine the utility of each action
  • Basal ganglia select the action with the highest utility
  • Thalamus has routing connections between cortical areas

    • If an action is not selected, its routing neurons are inhibited
  • The timing of this loop matches behavioural data well

Example: Simple Association


In [ ]:
import nengo
import nengo.spa as spa

model = spa.SPA(label="SPA1")
with model:
    model.state = spa.Buffer(16)
    model.motor = spa.Buffer(16)
    # each action: a utility expression --> an effect to apply when selected
    actions = spa.Actions(
        'dot(state, DOG) --> motor=BARK',
        'dot(state, CAT) --> motor=MEOW',
        'dot(state, RAT) --> motor=SQUEAK',
        'dot(state, COW) --> motor=MOO',
    )
    model.bg = spa.BasalGanglia(actions)     # selects the highest-utility action
    model.thalamus = spa.Thalamus(model.bg)  # routes the selected action's effect

    probe_state = nengo.Probe(model.state.state.output)
    probe_motor = nengo.Probe(model.motor.state.output)
    probe_utility = nengo.Probe(model.bg.input)
    probe_actions = nengo.Probe(model.thalamus.actions.output)

Example: Sequence


In [ ]:
import nengo
import nengo.spa as spa

model = spa.SPA(label="SPA2")
with model:
    model.state = spa.Buffer(16)
    # cycle through the sequence A -> B -> C -> D -> E -> A
    actions = spa.Actions(
        'dot(state, A) --> state=B',
        'dot(state, B) --> state=C',
        'dot(state, C) --> state=D',
        'dot(state, D) --> state=E',
        'dot(state, E) --> state=A',
    )
    model.bg = spa.BasalGanglia(actions)
    model.thalamus = spa.Thalamus(model.bg)

    probe_state = nengo.Probe(model.state.state.output)
    probe_utility = nengo.Probe(model.bg.input)
    probe_actions = nengo.Probe(model.thalamus.actions.output)

Example: Input


In [ ]:
import nengo
import nengo.spa as spa

model = spa.SPA(label="SPA3")
with model:
    model.state = spa.Buffer(16)
    actions = spa.Actions(
        'dot(state, A) --> state=B',
        'dot(state, B) --> state=C',
        'dot(state, C) --> state=D',
        'dot(state, D) --> state=E',
        'dot(state, E) --> state=A',
    )
    model.bg = spa.BasalGanglia(actions)
    model.thalamus = spa.Thalamus(model.bg)

    def state_in(t):
        # present C for the first 100 ms, then no input ('0' is the zero vector)
        if t < 0.1:
            return 'C'
        else:
            return '0'
    model.input = spa.Input(state=state_in)

    probe_state = nengo.Probe(model.state.state.output)
    probe_utility = nengo.Probe(model.bg.input)
    probe_actions = nengo.Probe(model.thalamus.actions.output)

Example: Routing


In [ ]:
import nengo
import nengo.spa as spa

model = spa.SPA(label="SPA4")
with model:
    model.vision = spa.Buffer(16)
    model.state = spa.Buffer(16)
    actions = spa.Actions(
        # route vision into state whenever vision contains one of the letters
        'dot(vision, A+B+C+D+E) --> state=vision',
        'dot(state, A) --> state=B',
        'dot(state, B) --> state=C',
        'dot(state, C) --> state=D',
        'dot(state, D) --> state=E',
        'dot(state, E) --> state=A',
    )
    model.bg = spa.BasalGanglia(actions)
    model.thalamus = spa.Thalamus(model.bg)

    def vision_in(t):
        # present C for the first 100 ms, then no input ('0' is the zero vector)
        if t < 0.1:
            return 'C'
        else:
            return '0'
    model.input = spa.Input(vision=vision_in)

    probe_state = nengo.Probe(model.state.state.output)
    probe_vision = nengo.Probe(model.vision.state.output)
    probe_utility = nengo.Probe(model.bg.input)
    probe_actions = nengo.Probe(model.thalamus.actions.output)

Spaun

  • This process is the basis for building Spaun