Monday Musing: The Palm Pilot and the Human Brain, Part III

Part III: How Brains Might Work, Continued…

In Part I of this twice-extended column, I tried to explain how very complex machines such as computers (like the Palm Pilot) are designed and built by using a hierarchy of concepts and vocabularies. I then used this idea to segue into how attempts to understand the workings of the brain must reverse-engineer the design which, in that case, has been provided by natural selection. In Part II, I began presenting an interesting new theory of how the brain works, put forth in the book On Intelligence by Jeff Hawkins, the inventor of the Palm Pilot, who is also a respected neuroscientist. Today, I want to wrap up that presentation. While it is not completely necessary to read Part I to understand what I will be talking about today, it is necessary to read at least Part II. Please do that now.

Last time, at the end of Part II, I was speaking of what Hawkins calls invariant representation. This is what allows us, for example, to recognize a dog as a dog, whether it is a Great Dane or a poodle. The idea of “dogness” is invariant at some level in the brain, and it ignores the specific differences between different breeds of dog, just as it would ignore the specific differences in how the same individual dog, Rover say, is presented to our senses in different circumstances, and would recognize it as Rover. Hawkins points out that this sense of invariance in mental representation has been remarked upon for some time, and even Plato’s theory of forms (if stripped of its metaphysical baggage) can be seen as a description of just this sort of ability for invariant representation.

This is not just true for the sensory side of the brain. The same invariant representations are present at the higher levels of the motor side. Imagine signing your name on a piece of paper in a two-inch-wide space. Now imagine signing your name on a large blackboard so that your signature sprawls several feet across it. Despite the fact that completely different nerve and muscle commands are used at the lower levels to accomplish the two tasks (in the first case, only your fingers and hand are really moving, while in the second case those parts are held still while your whole arm and other parts of your body move), the two signatures will look very much the same, and could be easily recognized as your signature by an expert. So your signature is represented in an abstract way somewhere higher up in your brain. Hawkins says:

Memories are stored in a form that captures the essence of relationships, not the details of the moment. When you see, feel, or hear something, the cortex takes the detailed, highly specific input and converts it to an invariant form. It is the invariant form that is stored in memory, and it is the invariant form of each new input pattern that it gets compared to. Memory storage, memory recall, and memory recognition occur at the level of invariant forms. There is no equivalent concept in computers. (On Intelligence, p. 82)

We’ll be coming back to invariant representations later, but first some other things.

PREDICTION

Imagine, says Jeff Hawkins, opening your front door and stepping outside. Most of the time you will do this without ever thinking about it, but suppose I change some small thing about the door: the size of the doorknob, or the color of the frame, or the weight of the door, or I add a squeak to the hinges (or take away an existing squeak). Chances are you’ll notice right away. How do you do this? Suppose a computer were trying to do the same thing. It would have to have a large database of all the door’s properties, and would painstakingly compare every property it senses with the whole database. If this is how our brains did it, then, given how much slower neurons are than computers, it would take 20 minutes instead of the two seconds that it takes your brain to notice anything amiss as you walk through the door. What is actually happening at all times at the lower-level sensory portions of your brain is that predictions are being made about what is expected next. Visual areas are making predictions about what you will see, auditory areas about what you will hear, etc. What this means is that neurons in your sensory areas become active in advance of actually receiving sensory input. Keep in mind that all this occurs well below the level of consciousness. These predictions are based on past experience of opening the door, and span all your senses. The only time your conscious mind will get involved is if one or more of the predictions are wrong. Perhaps the texture of the doorknob is different, or the weight of the door. Otherwise, the predicting goes on unnoticed, and this is what the brain is doing all of the time. Hawkins says the primary function of the brain is to make predictions and this is the foundation of intelligence.
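For readers who like to see an idea in code, here is a toy sketch in Python of the "only wrong predictions get noticed" principle. It is my own illustration, not anything from Hawkins's book, and it makes no claim about how neurons actually do this. The function name surprises, the property names, and their values are all made up for the example.

```python
# Toy sketch: predictions built from past experience are compared with what
# the senses report, and only the mismatches are passed upward.
# Every name and value below is a hypothetical illustration.

def surprises(expected, observed):
    """Return the properties whose observed value differs from the prediction."""
    return {
        prop: (expected[prop], value)
        for prop, value in observed.items()
        if prop in expected and expected[prop] != value
    }

# What past experience of the front door predicts (made-up values).
expected_door = {"knob_texture": "smooth", "weight": "heavy", "hinge_sound": "squeak"}

# Today someone has oiled the hinges, so the squeak is gone.
observed_door = {"knob_texture": "smooth", "weight": "heavy", "hinge_sound": "silent"}

noticed = surprises(expected_door, observed_door)
print(noticed)  # {'hinge_sound': ('squeak', 'silent')}
```

When everything matches the predictions, the result is empty and nothing needs attention; only the missing squeak is reported.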

Even when you are asleep the brain is busy making its predictions. If a constant noise (say the loud hum of a bad compressor in your refrigerator) suddenly stops, it may well awaken you. When you hear a familiar melody, your brain is already expecting the next notes before you hear them. If one note is off, it will startle you. If you are listening to a familiar album, you are already expecting the next song as one ends. When you hear the words “Please pass the…” at a dinner table, you simultaneously predict many possible words to follow, such as “butter,” “salt,” “water,” etc. But you do not expect “sidewalk.” (This is why a certain philosopher of language rather famously managed to say “Fuck you very much” to a colleague after a talk, while the listener heard only the expected thanks.) Remember, predictions are made by combining what you have experienced before with what you are experiencing now. As Hawkins puts it:

These predictions are our thoughts, and, when combined with sensory input, they are our perceptions. I call this view of the brain the memory-prediction framework of intelligence. (Ibid., p. 104)
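The "Please pass the…" example can be mimicked with a very small Python sketch that remembers which word followed a given run of words in the past and offers those as its predictions. This is only an illustration under my own assumptions: the functions learn and predict and the list past_dinners are invented for the example, and simple counting is a stand-in for whatever the cortex really does.

```python
from collections import Counter, defaultdict

def learn(sentences, context_len=3):
    """Count which word followed each run of context_len words."""
    memory = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for i in range(len(words) - context_len):
            context = tuple(words[i:i + context_len])
            memory[context][words[i + context_len]] += 1
    return memory

def predict(memory, context):
    """Return the set of words that have followed this context before."""
    key = tuple(word.lower() for word in context)
    return set(memory.get(key, ()))

# Hypothetical past experience at the dinner table.
past_dinners = [
    "please pass the butter",
    "please pass the salt",
    "please pass the water",
    "please pass the salt",
]

memory = learn(past_dinners)
expected = predict(memory, ["Please", "pass", "the"])
print(sorted(expected))
print("sidewalk" in expected)
```

Several continuations are predicted at once, because each has been experienced before; "sidewalk" never has been, so it is not among them.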

HOW THE CORTEX WORKS

Let us focus on vision for a moment, as this is probably the best understood of the sensory areas of the brain. Imagine the cortex as a stack of four pancakes. We will label the bottom pancake V1, the one above it V2, the one above that V4, and the top one IT. These represent the four visual regions involved in the recognition of objects. Sensory information flows into V1 (over one million axons from your retinas feed into it), but information also flows down from each region to the one below. While parts of V1 correspond to parts of your visual field in the sense that neurons in a part of V1 will fire when a certain feature (say an edge) is present in a certain part of the retina, at the topmost level, IT, there are cells which become active when a certain object is anywhere in your visual field. For example, a cell may only fire if there is a face present anywhere in your visual field. This cell will fire whether the face is tilted, seen at an angle, light, dark, whatever. It is the invariant representation for “face”. The question, obviously, is how to get from the chaos of V1 to the stability of the representation at the IT level.

The answer, according to Hawkins, lies in feedback. There are as many or more axons going from IT to the level below it (feedback) as there are in the upward direction (feedforward). At first people did not pay much attention to these feedback connections, but if you are going to be making predictions, then you are going to have to have axons going down, as well as up. The axons going up carry information on what you are seeing, while the axons going the other way carry information on what you expect to see. Of course, exactly the same thing occurs in all the sensory areas, not just vision. (There are also association areas even higher up which connect one sense to another, so that, for example, if I hear my cat meowing and the sound is approaching from around the corner, then I expect to see it in the next instant.) Hawkins’s claim is that there is a sort of invariant representation at each level of the cortex, of the more fragmented sensory input from the level below. It is only when we get to the levels available to consciousness like IT that we can give these invariant representations easily understood names like “face.” Nevertheless, V2 forms invariant representations of what V1 is feeding it, by making predictions of what should come in next. In this way, each level of cortex develops a sort of vocabulary in terms that are built upon repeated patterns from the layer below. So now we see that the problem was not how to construct invariant representations in IT, like “face,” from the three layers below it. Rather, each layer forms invariant representations based on what comes into it. In the same way, association layers above IT may make invariant representations of objects based on the input of multiple senses. Notice that this also goes along well with Mountcastle’s idea that all parts of the cortex basically do the same thing! (Keep in mind that this is a simplified model of vision, ignoring much complexity for the sake of expository convenience.)

In other words, every single cortical region is doing the same thing: it is learning sequences of patterns coming in from the layer below and organizing them into invariant representations that can be recalled. This is really the essence of Hawkins’s memory-prediction framework. Here’s how he puts it:

Each region of cortex has a repertoire of sequences it knows, analogous to a repertoire of songs… We have names for songs, and in a similar fashion, each cortical region has a name for each sequence it knows. This “name” is a group of cells whose collective firing represents the set of objects in the sequence… These cells remain active as long as the sequence is playing, and it is this “name” that gets passed up to the next region in the hierarchy. (Ibid., p. 129)
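The idea of a region replacing a known sequence with its "name" and passing that name upward can be sketched in a few lines of Python. This is a deliberately crude illustration of my own: the function recognize, the two repertoires, and the pattern labels such as "arc" and "curve" are all hypothetical, and real cortical regions learn their repertoires rather than having them typed in.

```python
def recognize(stream, repertoire):
    """Replace each known sequence found in stream by its name.

    repertoire maps a name to the tuple of lower-level patterns it stands
    for. Items that start no known sequence are passed up unchanged.
    """
    out = []
    i = 0
    while i < len(stream):
        for name, seq in repertoire.items():
            # Skip empty sequences so that the position always advances.
            if seq and tuple(stream[i:i + len(seq)]) == seq:
                out.append(name)
                i += len(seq)
                break
        else:
            out.append(stream[i])
            i += 1
    return out

# Hypothetical repertoires for two stacked regions.
lower_region = {
    "eye": ("arc", "circle", "arc"),
    "nose": ("line", "curve"),
    "mouth": ("curve", "line", "curve"),
}
upper_region = {"face": ("eye", "eye", "nose", "mouth")}

raw_input = ["arc", "circle", "arc", "arc", "circle", "arc",
             "line", "curve", "curve", "line", "curve"]

names_from_lower = recognize(raw_input, lower_region)
names_from_upper = recognize(names_from_lower, upper_region)
print(names_from_lower)  # ['eye', 'eye', 'nose', 'mouth']
print(names_from_upper)  # ['face']
```

Eleven rapidly changing low-level patterns become four names one level up and a single stable name at the top, and both levels run exactly the same function on different repertoires.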

This is how greater and greater stability is created as we move up in the hierarchy, until we get to stages which have “names” for the common objects of our experience, and which are available to our conscious minds as things like “face.” Much of the rest of the book is spent on describing details of how the cortical layers are wired to make all this feedforward and feedback possible, and you should read the book if you are interested enough.

HIERARCHIES AGAIN

As I mentioned six weeks ago when I wrote Part I of this column, complexity in design (whether done by humans or by natural selection) is achieved through hierarchies which build layer upon layer of complexity. Hawkins takes this idea further and says that the neocortex is built as a hierarchy because the world is hierarchical, and the job of the brain, after all, is to model the world. For example, a person is usually made of a head, torso, arms, legs, etc. The head has eyes, a nose, a mouth, etc. A mouth has lips, teeth, and so on. In other words, since eyes and a nose and a mouth occur together most of the time, it makes sense to give this regularity in the world (and in the visual field) a name: “face.” And this is what the brain does.

Have a good week! My other Monday Musing columns can be seen here.