What is Thought that a Large Language Model Ought to Exhibit (But Won’t)?

by David J. Lobina

Not looking good.

Artificial General Intelligence, however this concept is to be defined exactly, is upon us, say two prominent AI experts. Not exactly an original statement, as this sort of claim has come up multiple times in the last year or so, often followed by various qualifications and the inevitable dismissals (Gary Marcus has already pointed out that this last iteration involves not a little goalpost-shifting, and it doesn’t really stand up to scrutiny anyway).

I’m very sceptical too, for the simple reason that modern Machine/Deep Learning models are huge correlation machines and that’s not the sort of process that underlies whatever we might want to call an intelligent system. It is certainly not the way we know humans “think”, and the point carries yet more force when it comes to Language Models, those guess-next-token-based-on-statistical-distribution-of-huge-amounts-of-data systems.[1]

This is not to say that a clear definition of intelligence is in place, but we are on firmer ground when discussing what sort of abilities and mental representations are involved when a person has a thought or engages in some thinking. I would argue, in fact, that the account some philosophers and cognitive scientists have put together over the last 40 or so years on this very question ought to be regarded as the yardstick against which any artificial system needs to be evaluated if we are to make sense of all these claims regarding the sapience of computers calculating huge numbers of correlations. That’s what I’ll do in this post, and in the following one I shall show how most AI models out there happen to be pretty hopeless in this regard (there is a preview in the photo above).

The notion of a “thought” is most appropriately identified with what philosophers usually call content, or a proposition – the sort of thing that a simple statement such as James Joyce is the author of Stephen Hero denotes, and in this sense the object of our beliefs and the like. Or, to put it another way, if one believes that James Joyce wrote Stephen Hero, then one is holding or entertaining the proposition that JAMES JOYCE IS THE AUTHOR OF STEPHEN HERO (I am not shouting; I’m writing propositions in capital letters). In this sense, a proposition can be understood as the mental act of assigning a property (in the case at hand, authorship of Stephen Hero) to a mental object (viz., one’s internal representation of James Joyce, or JAMES JOYCE).
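By way of a toy sketch only – the encoding below is mine and makes no claim about how the mind actually stores such things – the property-to-object assignment can be pictured as a small data structure:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Concept:
    """A mental particular, written in capitals in the text (e.g. JAMES JOYCE)."""
    name: str

@dataclass(frozen=True)
class Proposition:
    """A structured content: a property ascribed to a mental object."""
    prop: Concept   # the ascribed property, e.g. AUTHOR OF STEPHEN HERO
    obj: Concept    # the mental object it is ascribed to, e.g. JAMES JOYCE

JAMES_JOYCE = Concept("JAMES JOYCE")
AUTHOR_OF_STEPHEN_HERO = Concept("AUTHOR OF STEPHEN HERO")

# The proposition JAMES JOYCE IS THE AUTHOR OF STEPHEN HERO:
thought = Proposition(prop=AUTHOR_OF_STEPHEN_HERO, obj=JAMES_JOYCE)
```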

To have a thought, according to this view, is to entertain a proposition (or a group of propositions), while “to think” is to combine propositions in various ways, from employing so-called logical connectives such as “and” and “or” to the premises-and-conclusion organisation typical of reasoned thought. At first sight, the internal structure of propositions appears to be the key issue. Human cognition exhibits a high degree of flexibility and creativity in what is usually termed belief fixation, as evidenced in the very common phenomenon in which different types of perceptual inputs – auditory, visual, what have you – can be combined with each other and with many other beliefs during the construction of a thought, and this would mean that propositions are decomposable into atomic elements of various kinds. An explanation of cognitive flexibility may be approached by taking the constituents of thought to be what philosophers and psychologists call concepts, the mental particulars that underlie propositions. Concepts are abstract, and therefore amodal (not attached to any modality, be it visual, aural, or any other), as well as stable, and thus re-usable, allowing for the combination of mental representations into ever more complex representations.

The sort of mental reality I envision for propositions and concepts is a steady and permanent one; namely, they must constitute some sort of structure in long-term memory, a type of mental representation rather than a token, even though these representations would be tokened in causal mental processes such as reasoning, with belief fixation being the most general one. In this sense, by the phrase mental particular I do not mean particular tokens of concepts, but simply the much more general point that the concept SAINT, for instance, is a different mental particular to the concept SAGE, even if both concepts can be combined into a more complex mental representation, a proposition such as THE ISLAND OF SAGES AND SAINTS, for instance.

I certainly also do not mean to suggest that all possible propositions are stored in long-term memory, even though many of them would be; all we need here is a set of concepts and some sort of combinatory/compositional system to create complex representations out of them, but the particular tokening of the thought THE ISLAND OF SAGES AND SAINTS would still remain distinct from the mental particular of the corresponding thought type.

At the very least, then, thought necessitates structured propositions, but what does this entail exactly? Gareth Evans’s Generality Constraint is a good place to start. Put simply, this constraint states that thoughts must be structured, not only in terms of their internal elements, but also in terms of the capacity to exercise what Evans called ‘distinct conceptual abilities’. By this turn of phrase, Evans is drawing attention to the apparent fact that if one can entertain a thought in which a given property, call it F, can be ascribed to one individual, a, and another property, this time call it G, to an individual b, thereby putting together the thought that Fa and Gb (note the connective combining the two propositions here), then one can also entertain the thought that Fb and Ga. And so on and on; or, to use a concrete example, take a to be a cat, F the property of being an animal, G the property of being a mammal, H that of being a feline, etc. We can think of a in all these terms, and more.
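Purely for illustration (the toy encoding is my own, not Evans’s), the constraint can be pictured as closure under recombination: a thinker who commands the atomic thoughts thereby commands every conjunction built out of them.

```python
from itertools import product, permutations

individuals = ["a", "b"]   # e.g. two particular objects of thought
predicates = ["F", "G"]    # e.g. two properties one can ascribe to them

# Every predicate can be ascribed to every individual...
atomic_thoughts = [f"{P}{x}" for P, x in product(predicates, individuals)]
# ['Fa', 'Fb', 'Ga', 'Gb']

# ...and any two atomic thoughts can be conjoined, so a thinker who can
# entertain 'Fa and Gb' can also entertain 'Fb and Ga', as the constraint demands.
conjunctions = [f"{p} and {q}" for p, q in permutations(atomic_thoughts, 2)]
assert "Fa and Gb" in conjunctions and "Fb and Ga" in conjunctions
```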

This is clearly one of the main abilities of human thought, but there are some other general principles that are just as important (and related to this Constraint). The philosopher John Campbell has proposed two other constraints in relation to our conceptual repertoire. In order to apprehend a concept, Campbell argues, one must be able to grasp it from within a system of concepts, the “surround” of a concept making it intelligible in the first place, and he calls this the intelligibility constraint. That seems reasonable enough, for the truth or interpretation of a concept may well depend upon properties of the conceptual repertoire overall, as in the example of the concept CAT (the properties of being an animal and being a feline are naturally interrelated).

Further, and in relation to Evans’s Generality Constraint, Campbell proposes the permutability constraint, according to which one should be able to grasp a wide range of thoughts from a set of “propositional attitudes” (e.g., the belief that P, the desire that Q, etc.) by permuting the internal elements in various ways, this permutability perhaps accounted for by a computational operation of the mind (Campbell takes this principle to underlie Evans’s own constraint, in fact). And when the intelligibility, generality, and permutability constraints are all put together, Campbell submits, one gets the conceptual creativity I have described before, which is partly based on the fact that thoughts can be combinations of rather complex concepts.

These mental abilities are not too far away from what the philosopher Jerry Fodor called the systematicity of thought, the claim that our ability to entertain some thoughts is intrinsically connected to our ability to entertain similar thoughts. This ability is argued to be a reflection of constituent structure, given that the stated similarity amongst thoughts is a matter of the form these thoughts have, and not of their actual content. Thus, if one can entertain the thought that if P, then Q, one can also entertain the thought that if Q, then P – this is not due to the content of the propositions P and Q, but to the “form” the representations involving these propositions take.
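A further toy sketch of mine, with the same caveats as before: the operation that builds a conditional thought is sensitive to form alone, so whoever can apply it to one ordering of P and Q can apply it to the other.

```python
def conditional(antecedent: str, consequent: str) -> str:
    """Build a conditional thought; only the positions (the form) matter."""
    return f"if {antecedent}, then {consequent}"

p, q = "P", "Q"
print(conditional(p, q))   # if P, then Q
print(conditional(q, p))   # if Q, then P -- same operation, contents swapped
```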

In combination, these constraints and principles yield a number of desiderata that any supposedly intelligent system must exhibit to be regarded as, well, intelligent. To wit, any system underlying thought must be able to clearly demonstrate these four abilities (a toy sketch of how they might be operationalised follows the list):

  1. appropriately represent the contents of thoughts (as in representing possible and actual affairs, and discriminating between the two);
  2. accurately distinguish the contents of different thoughts (as in being able to distinguish Fa and Gb from Fb and Ga, for instance);
  3. faithfully represent the propositional attitudes (e.g., the belief or desire that P);
  4. play a causal role in mental processes such as reasoning, remembering, and deciding.
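To preview how such an evaluation might be organised, here is a purely illustrative scaffold of my own (the names and the pass/fail framing below come from me, not from the literature), in which each desideratum becomes a named check a candidate system would have to pass:

```python
DESIDERATA = {
    "content_representation":  "represents possible and actual affairs, and tells them apart",
    "content_differentiation": "distinguishes Fa-and-Gb from Fb-and-Ga",
    "attitude_representation": "represents the belief, desire, etc. that P",
    "causal_role":             "supports reasoning, remembering and deciding",
}

def failed_desiderata(results: dict[str, bool]) -> list[str]:
    """Return the desiderata a candidate system has not demonstrated."""
    return [name for name in DESIDERATA if not results.get(name, False)]

# Hypothetical verdict for a system that only manages content representation:
print(failed_desiderata({"content_representation": True}))
# ['content_differentiation', 'attitude_representation', 'causal_role']
```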

As such, these four requirements provide the relevant yardstick against which AI systems ought to be measured. I shall undertake to do this for a relevant sample of AI systems in next month’s post; in the remainder here, I would like to describe these four desiderata a bit better and provide some further pointers regarding what human thought is like.

The first desideratum, content representation, can be achieved by the combination of abstraction (the mental particulars I have been calling concepts), the Generality Constraint (allowing for a fine-grained view of concepts), systematicity (pointing to the form of constituent structure), and the intelligibility constraint (the latter placing a given concept against other mental particulars).

The second, content differentiation, would instead necessitate a different subset of the principles and constraints I have described, as it would require the permutability constraint and a slightly different understanding of abstraction (in this case, abstraction would refer to the ability to employ two concepts as a single unit for further computation, as in the complex but primitive concept PET FISH, which is not just the combination of the concepts PET and FISH).

Regarding the representation of the propositional attitudes, this can be brought about in a similar fashion; the attitudes would simply involve different concepts – a belief is a different concept from a desire – and therefore different structural relations with their respective propositions. But I won’t be too concerned with this here or next month (there is no reason to believe, in any case, that any LLM has an understanding of beliefs or desires).

Finally, conceptual representations must be such that mental processes can be supported at all, and that imposes pretty stringent (structural) requirements on the representations so manipulated. Indeed, it must be the case that thought processes manipulate structured objects that combine and give rise to new objects in ways that respect their structural properties and interrelations; it is this very aspect of the systematicity property that forces us to postulate structured constituents, for the chains of day-to-day inferences we typically carry out would not be licensed otherwise. For instance, and to offer a pretty mundane example, when crossing the street one may not explicitly verbalise what one is doing, but the underlying cognition is not trivial (e.g., if I cross the street after this car, I will be safe, etc.).
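To render that mundane example as a toy computation – my illustration, not a model of belief fixation – note that the inference goes through in virtue of the form of the stored representations, not in virtue of what the strings happen to mean:

```python
def modus_ponens(beliefs: set) -> set:
    """Derive new facts from conditionals whose antecedents are already believed."""
    facts = {b[1] for b in beliefs if b[0] == "fact"}
    conditionals = [b for b in beliefs if b[0] == "if"]
    derived = {("fact", consequent)
               for _, antecedent, consequent in conditionals
               if antecedent in facts}
    return beliefs | derived

beliefs = {("if", "I cross after this car", "I will be safe"),
           ("fact", "I cross after this car")}
print(modus_ponens(beliefs))   # now also contains ('fact', 'I will be safe')
```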

As mentioned, the third desideratum won’t feature as much as the other three, and it is indeed the other desiderata – content representation and differentiation, and their role in belief fixation – that will engage me the most, the actual point under study being whether AI systems are able to represent content accurately, distinguish similarly structured contents appropriately, and support the relevant mental processes.

Rather importantly, the properties I have listed are not too far from what other scholars have seen in thought, and this will bode well for the evaluation I will carry out next month. To offer but a small sample. The philosopher Christopher Peacocke has pointed to four Fregean properties of thought: it must have truth values; be composite and structured; be the object of the attitudes; and be capable of being entertained by two different thinkers. Similarly, another philosopher, Peter Carruthers, talks of thoughts as being discrete, semantically evaluable, causally effective states possessing component structure. And the linguist Ray Jackendoff lists six features of meaning, a concept closely connected to propositions: it must be linked to pronunciation; be compositional; have referential function; possess inferential roles; be unconscious; and be effable (the property according to which any content expressible in one language is also expressible in any other). How will modern AI systems such as LLMs fare, I wonder.

 


[1] It can be worse: Geoffrey Hinton has recently given what can only be regarded as bizarre answers to such questions as whether current AI models exhibit intelligence in the way humans do (clearly yes, says Hinton) or whether they will ever develop self-consciousness (certainly, says Hinton). If anything, this sorry display shows once again that a sophisticated thinker in one field (in this case, Machine/Deep Learning) can easily spout silly nonsense in another (viz., the study of cognition).