Cognitive Computing 3 -- Cognition illuminated

In my last two lectures, you saw a good portion of Python -- enough to do quite a lot, either in your own (academic or professional) life or in cognitive science, building programs that perform cognitive functions.

I now want to talk about some concepts and issues in cognitive science that are illuminated by (Python) programming, as well as some issues in computing that have implications for cognition.

The plan for this lecture:

  1. What is a 'computational model' of cognition, and why do we build them?
  2. Marr's levels of analysis of cognitive/computing systems
  3. Modularity (and preview of Wednesday's lecture on Modularity of Mind)
  4. A very informal, nontechnical introduction to 'algorithmic complexity'

To illustrate these points, let's consider the following problem:

When I'm listening to someone speak, how do I decipher the sounds of the words they utter? For example, how do I determine if someone said 'bad' or 'bat'? What distinguishes 'd' from 't'?

There are many physical characteristics of the sound hitting my ear that differentiate 'd' from 't', but one is the length of the vowel before these two consonants. The 'a' before 'd' is longer than the 'a' before 't'. You can probably demonstrate this to yourselves right now by comparing your pronunciations of 'bad' and 'bat'.

And in fact, humans do use such cues to categorize speech sounds. So let's imagine the following very simple...

Computational model of speech sound categorization


In [1]:
def d_or_t(preceding_vowel_length):
    # imagine that vowel length is in milliseconds
    if preceding_vowel_length > 200: #i'm just guessing that this is a reasonable boundary!
        return 'd'
    else:
        return 't'

print( d_or_t( preceding_vowel_length=250 ) )
print( d_or_t( preceding_vowel_length=150 ) )


d
t

So that is a computational model of cognition because it is a computationally-explicit theory of the process by which we accomplish some cognitive function. Put another way, it specifies an algorithm by which we do this particular function.

Awesome! We solved speech recognition!

Well, not really. Let's also consider the following: people speak at different speeds, and correspondingly, people vary in their typical boundaries between consonants like 'd' and 't'.

Okay, let's try to account for that:


In [2]:
def d_or_t_extended(preceding_vowel_length, speaker):
    
    if speaker == 'Russell': # imagining if I was an atypically fast speaker
        boundary = 150
    else:
        boundary = 200

    if preceding_vowel_length > boundary: 
        return 'd'
    else:
        return 't'
        
print( d_or_t_extended( preceding_vowel_length=175, speaker='Russell' ) )
print( d_or_t_extended( preceding_vowel_length=175, speaker='Garrett' ) )


d
t

Why do we make such computational models of cognition?

These models were pretty simple and we could intuit their behavior pretty easily. But it doesn't take much extra complexity before the models' behaviors become less predictable. At that point, it helps to run an implemented model and see what behaviors follow from the model. Sometimes you'll find that the behavior of the model isn't what you expected, and/or that it doesn't produce the phenomenon that it was supposed to! And sometimes in the process of programming the model, you'll be forced to make explicit some aspect of a theory that was only vague before. Maybe at that point you realize that the theory doesn't actually make sense!

Questions??

We can now also use these two different versions of speech categorization to discuss...

Marr's levels of analysis

(Apologies for this section being text-heavy)

If you read the assigned Bermudez, some of this exposition is going to be a bit of review (which is a good thing!).

David Marr was a vision scientist concerned with specifying what computations the visual system carries out, how it does so, and what the physical basis of the system is.

  1. Computational level: what computation or function is being performed? What is its goal (or goals)? What information does the system start with (inputs), and what information must it produce (outputs)?
  2. Representation and algorithm level: how does the system do what it does? Specifically, what representations does it use and what processes does it employ to build and manipulate the representations? What is the algorithm by which the computation/function from level 1 is accomplished?
  3. Hardware implementation level: how is the system physically realised? A biological brain composed of networks of neurons? A silicon circuitboard (as in a laptop)? How do the neurons or silicon circuits implement the algorithm(s) from level 2?

Marr claimed that (one of) the task(s) of the visual system was to derive a 3D representation of an object's shape and orientation, such that the object could later be recognized (as a dog, cat, whatever). This is basically a computational-level theory of the visual system, a claim about what function the system performs.

I won't go into Marr's algorithmic and implementational level theories of that task. Instead, let's shift our attention back to our d_or_t functions and illustrate these levels there.

Both our implemented functions have similar computational level analyses: their task/goal is to determine if the sound was a 'd' or a 't'. The only respect in which they differ at the computational level is the information/inputs they are provided -- our second function has speaker information, while the first does not.

Where they differ more substantially is at the algorithmic level. d_or_t simply checks if the preceding vowel length is above or below a single threshold. But the algorithm of d_or_t_extended also changes the threshold depending on who is speaking.
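To make the distinction between these first two levels concrete with a toy example outside speech (the task and function names here are my own illustration, not from Marr), consider two functions that perform the same computation -- summing the integers 1 through n -- via different algorithms:

```python
def sum_by_loop(n):
    # algorithm 1: add the numbers one at a time
    total = 0
    for i in range(1, n + 1):
        total = total + i
    return total

def sum_by_formula(n):
    # algorithm 2: use the closed-form formula n*(n+1)/2
    return n * (n + 1) // 2

print(sum_by_loop(100))     # 5050
print(sum_by_formula(100))  # 5050
```

At the computational level the two are identical: same inputs, same outputs, same goal. They differ only at the algorithmic level -- and such algorithmic differences can matter a lot for resources like time, a point we'll return to below.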

For the time being, it's hard for us to get much more specific about the implementational level, and that's pretty true of cognitive science in general. We might have fairly elaborate theories of how, for example, we parse an English sentence into a syntactic tree, but we don't have as detailed theories about how a brain does that.

Questions?

Modularity

In the next class, I'm going to lecture on a fundamental theory in cognitive science called Modularity of Mind. Its basic claim is that (portions of) minds are made up of different parts -- called modules -- that have different functions/purposes, and these parts communicate with each other to only a limited extent.

Programming, and functions in particular, will instantiate this concept of modularity nicely -- Python functions are basically modules. Let's think about our 'd' or 't' categorization problem again. To remind you all:


In [3]:
def d_or_t(preceding_vowel_length):
    # imagine that vowel length is in milliseconds
    if preceding_vowel_length > 200: #i'm just guessing that this is a reasonable boundary!
        return 'd'
    else:
        return 't'
    
def d_or_t_extended(preceding_vowel_length, speaker):
    if speaker == 'Russell': # imagining if I was an atypically fast speaker
        boundary = 150
    else:
        boundary = 200

    if preceding_vowel_length > boundary: 
        return 'd'
    else:
        return 't'

Suppose the d/t sound comes after 'rea?' -- the '?' is representing the d/t sound that needs to be categorized. As far as I know, 'read' is a word, but 'reat' is not. So, a categorization function could theoretically rely on that information and be less likely to categorize a sound as 't' after 'rea'.

But neither of our current d_or_t models is sensitive to that information. We could say that they are modularized with respect to the word context of the d/t sound.

So of course, the addition of the variable preceding does not change the operation of d_or_t in this next cell.


In [4]:
preceding = 'rea'
whole_word = preceding + d_or_t(150)
print(whole_word)


reat

That shows how the function d_or_t is encapsulated from certain information outside the function. Modularity also often includes the reverse relation: that the function doesn't expose all of its information to the outside. In modularity of mind, we'll learn that that is called cognitive impenetrability.

Our extended function illustrates this. Inside the extended function, a variable called boundary is defined. However, that information only exists inside the function. If we try to access that variable outside the function, we fail.


In [5]:
d_or_t_extended( preceding_vowel_length=175, speaker='Russell' )
print(boundary)


---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-5-90ecb57b2d8a> in <module>()
      1 d_or_t_extended( preceding_vowel_length=175, speaker='Russell' )
----> 2 print(boundary)

NameError: name 'boundary' is not defined

So to recap, Python functions and the contexts in which they are called are modularized with respect to one another, at least to varying degrees.

We'll develop this theory further on Wednesday. And we'll see that the extent to which functions like speech sound categorization are modularized -- i.e., insensitive to this or that kind of outside information -- is actually a huge debate, raging to this day.

Questions?

Algorithmic complexity

Let's talk briefly about properties of the second of Marr's levels. That is, properties of particular algorithms -- properties like how much time and memory an algorithm requires.

If you were to take CSE classes, you'd study different algorithms for functions like sorting (e.g., sequences of numbers) or parsing. For example, if you are sorting n items, some algorithms take n * log(n) time, on average, to finish, whereas others take n**2 time (on average).
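As a loose numerical sketch (my own toy calculation, not something you need for this course), here's how those two growth rates pull apart as n grows:

```python
import math

# compare n * log2(n) against n**2 for increasing n
for n in [10, 100, 1000, 10000]:
    print(n, round(n * math.log2(n)), n ** 2)
```

By n = 10,000, the n**2 algorithm is doing roughly 750 times more work than the n * log(n) one.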

For this class, we won't get nearly so technical. I just want to illustrate in a general way how the resource requirements of a computational model can depend on its input, and have you all start thinking about what those requirements mean for cognition. In particular, if an algorithm is overly greedy of time and memory, it probably isn't a realistic model of human cognition.

To loosely illustrate this idea of algorithmic complexity, let's try to model that last idea about speech categorization, where categorization depends on whether a particular decision of 'd' or 't' makes a real English word or not.


In [6]:
def d_or_t_final(preceding_vowel_length, preceding_context, english_words):
    # two new variables: preceding_context and english_words
    
    t_completion = preceding_context + 't' # word made by a 't' decision
    d_completion = preceding_context + 'd' # word made by a 'd' decision
    
    # each of these new variables will be true if the completion is in english_words
    # and False otherwise
    d_plausible = d_completion in english_words
    t_plausible = t_completion in english_words
    
    # equivalent to `if (t_plausible == True) and (d_plausible == False)`
    # in English, if the t_completion is plausible but the d_completion is not...or vice versa
    if t_plausible and not d_plausible:
        return 't'
    if d_plausible and not t_plausible:
        return 'd'

    # if the function has gotten to this point, the t_completion and d_completion
    # are equally (im)plausible -- both are real words or neither is -- so the
    # function falls back on the preceding vowel length
    if preceding_vowel_length > 200: 
        return 'd'
    else:
        return 't'

In [7]:
my_words = ['read','bead','beat','cat'] # ignore the fact that cad is a real if rare word...
d_or_t_final(100, preceding_context='rea',english_words = my_words)


Out[7]:
'd'

Can someone walk me through how, exactly, d_or_t_final produced that output? What are the variables' values at each line of the function?

Let's test out some other possible inputs. We'll use the assert statement. (This will come up on the PS, so I'm showing it now.) It basically means "This expression had better be true, or I'm raising an error!".

So the next line works fine.


In [8]:
assert 1 != 0

But this line fails.


In [9]:
assert 1 == 0


---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-9-34d369bab8db> in <module>()
----> 1 assert 1 == 0

AssertionError: 

So let's try three classes of test cases:

  1. In the first two cases, one word completion is sensible (e.g., 'read') while the other is not (e.g., 'reat'), so the function chooses 't' or 'd' on that basis, even if the vowel length favors the other consonant!
  2. In the next two cases 'beat' and 'bead' are both equally good, so the function relies on preceding_vowel_length.
  3. In the final two cases, neither completion is in my_words, so the function relies on preceding_vowel_length

In [10]:
assert d_or_t_final(150, preceding_context='rea',english_words = my_words) == 'd'
assert d_or_t_final(250, preceding_context='ca', english_words = my_words) == 't'

assert d_or_t_final(150, preceding_context='bea',english_words = my_words) == 't'
assert d_or_t_final(250, preceding_context='bea',english_words = my_words) == 'd'

assert d_or_t_final(150, preceding_context='scra',english_words = my_words) == 't'
assert d_or_t_final(250, preceding_context='scra',english_words = my_words) == 'd'

If our function is behaving properly, the above cell shouldn't raise any errors.

Let's now think of the time complexity of d_or_t_final. Does anyone have any guesses about which input to the function could get bigger and bigger and bigger, and hence slow down the model? (As a reminder, the inputs are preceding_vowel_length, preceding_context, and english_words.) Why would size of that particular input influence the speed of the model? (This very well might be hard to guess!)

.

.

.

.

.

The size of english_words will influence how long the model takes to run. This is because the two `in` checks (which compute d_plausible and t_plausible) scan the words in english_words one by one, looking for a match to the two possible completions. The longer the list, the longer the search takes.

This can be deduced from reasoning about the code, but we can also simply show it with testing. We'll just record how long it takes the function to run with english_words of two different sizes!

(Don't worry about some of the new functionality in this next cell like import time.)


In [11]:
import time

start_time = time.time()
my_words = ['read','bead','beat','cat']
categorization = d_or_t_final(250, preceding_context='scra',english_words = my_words)
duration   = time.time() - start_time

print(duration)


0.0001270771026611328

But if my_words was a very long list.....again, don't worry about how exactly I'm reading in this file of words...


In [12]:
with open('wordlist.txt') as f:
    my_words = [x.strip() for x in f.readlines()]
print(my_words[:10])
print(len(my_words))


['a', 'a-horizon', 'a-ok', 'aardvark', 'aardwolf', 'ab', 'aba', 'abaca', 'abacist', 'aback']
69905

Almost 70,000 words now...


In [13]:
start_time     = time.time()
categorization = d_or_t_final(250, preceding_context='scra', english_words=my_words)
new_duration   = time.time() - start_time

print(new_duration)
print(new_duration/duration)


0.003420114517211914
26.913696060037523

That took way longer -- about 27 times longer!
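As an aside beyond what this course requires: Python also has a set data structure whose `in` check doesn't slow down as the collection grows. If english_words were stored as a set rather than a list, d_or_t_final's lookups would take roughly constant time regardless of vocabulary size. A minimal sketch:

```python
# same words, two different data structures
words_list = ['read', 'bead', 'beat', 'cat']
words_set = set(words_list)

print('read' in words_list)  # True -- found by checking word after word
print('read' in words_set)   # True -- found by hashing, near-instant even for huge sets
```

This is a nice example of how an implementation choice at the algorithmic level changes a model's resource requirements without changing what function it computes.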

The homework asks you to reason about a similar computational model of cognition. As with this case, you can reason about the model by analyzing its code, or you can test it empirically.

Questions??

Wrapping up

The homework will ask you to analyze similar, fairly simple cognitive models in terms of Marr's Levels, modularity, run-time, and more: it'll ask you to muse on what's unrealistic about a model, and it'll ask you to finish programming a model.

I urge everyone to start working on the problem set in the Jupyter notebook, and probably also to try the optional exercises. Please come to office hours before it's too late!

Questions???