AI and Consciousness

by Dwight Furrow

The question of whether AI is capable of having conscious experiences is not an abstract philosophical debate. It has real consequences, and getting the wrong answer is dangerous. If AI is conscious, then we will face substantial pressure to confer human and individual rights on AI entities, especially if they report experiencing pain or suffering. If AI is not conscious and thus cannot experience pain and suffering, that pressure will be relieved, at least up to a point.

It strikes me as extraordinarily dangerous to give super-human intelligence the robust autonomy entailed by human rights. On the other hand, if we deny such rights to AI and it turns out to be conscious, we incur the substantial moral risk of treating a conscious being as a labor-saving device. A fully sentient, super-human intelligence poorly treated will not be happy.

I hear and read a good deal of discussion coming out of Silicon Valley about whether AI is at least potentially conscious. The problem is that—and I mean this quite literally—no one knows what they’re talking about. Because no one knows what consciousness is. This is an enormously complex question with a very long history of debate in philosophy and the sciences. And we are not close to resolving it. There are at least eight main theories of consciousness and countless others striving for attention. We are unlikely to settle this question quickly, so it’s important not to make unwarranted assumptions about such a consequential issue.

This is not the place to articulate all the nuances of that debate and its implications for AI, but I think there is a way of bringing some focus to the question by organizing these various theories as competing views on the status of subjective experience, or, to use the more or less technical term, the status of “qualia.”

The American philosopher Thomas Nagel explained the problem of qualia in a particularly insightful way. In his 1974 article “What Is It Like to Be a Bat?” Nagel argued that consciousness has an irreducibly subjective character—what it is like to be a particular organism. He claimed that physicalist theories that treat the mind as identical to the brain fail to capture this inner perspective, since no objective, third-person description of neural architecture can explain the first-person experience of another creature. He chose bats as an example because, unlike humans, they navigate through echolocation. Given that difference, we can never know what it’s like to be a bat. Science can describe how echolocation works but is not equipped to describe what it feels like to use echolocation to move about in the world. Nagel concluded that an account of the mind must acknowledge that subjective experience (qualia) is a fundamental aspect of reality, resisting reduction to physical or functional states. Without acknowledging the reality of first-person, subjective experience, any theory of consciousness remains incomplete, lacking access to this essential dimension of any sentient creature’s experience.

But was Nagel correct?

With regard to this problem of consciousness as private, subjective experience, there are basically three positions one can adopt, each with different implications for AI consciousness.

(1) Qualia Exist but Are Reducible to Functional Organization

The theories in this collection try to preserve the reality of subjective experience—there really is “something-that-it’s-like” to see red, feel bored, listen to Radiohead, or be a bat—but they also insist that these experienced textures are not fundamental features of reality. Instead, qualia are emergent properties of certain patterns of cognitive or computational structure.

There are three versions of what kind of structure is required to generate qualia. Higher-Order Thought theorists argue that the feel of a state is explained by your higher-order representation of being in it: you feel pain because you’re aware of yourself as feeling pain. Thus, consciousness requires the capacity for self-reflection. According to this theory, subjectivity is important. The higher-order thought must not only represent the lower-order state but also embed it in a first-personal framework: “I am having this experience.” Without that first-person perspective there is no consciousness.

Information flow theories, by contrast, argue that consciousness is a matter of the functional architecture of the brain and how information moves through that architecture. There are two variants of this view. Global Workspace Theory argues that what it is like to be conscious just is having information globally broadcast across a variety of neural (or computational) modules. That is to say that there are subsystems of the brain—sensory systems, memory, cognition, etc.—and when the outputs of a sufficient number of those subsystems are simultaneously talking to each other in a shared workspace, consciousness emerges. The second variant, Integrated Information Theory, argues that a system is conscious to the extent that it constitutes a single, unified entity with a degree of intrinsic causal interaction high enough to generate that unity. Qualia emerge from that integrated causal interaction.

And finally, there are representational theories that belong in this category: conscious states represent the world in particular ways, and the “feel” of an experience is identical to what the experience represents. Thus the “feel” of an experience is determined by the way consciousness represents the world, not by any internal mental “glow.” When I see a red apple, I am in a conscious state that is about that red apple, and what it’s like to experience a red apple is nothing more than that aboutness relation. If my conscious state were about something else, what it’s like to experience it would be different.

The theories in this category are the most friendly to the idea that an AI might be conscious. If you build a machine with the right architecture—one that is self-reflective with a sense of self, that has a workspace where subsystems interact or a high degree of causal interaction, or that is capable of having mental states that are about something—then the machine is conscious.

Most theorists think current Large Language Models lack this architecture. They lack the robust memory and feedback loops needed to achieve the degree of causal interaction or reflective sense of self required by these theories. LLMs are largely, at this point, feed-forward mechanisms, anticipating the next word in a sentence (although the latest models appear to have much improved memory). But future versions of AI may well have an architecture with sufficient causal integration to achieve consciousness.

However, these theories in general are unsatisfying as theories of consciousness. I call them jack-in-the-box theories: get the box wired in just the right way and consciousness just pops up when you open it. What I mean is that there is an explanatory gap. Even if you describe all the information flow, representations, or self-reflective behavior, it remains tempting to ask: “But why should this architecture feel like this?” Why should we think a machine with such an architecture could feel anything at all? We have good reason to think feelings like pleasure, pain, and what it’s like to be x are the result of our biological makeup; the stuff we’re made of matters. The theories in this category tend to be functionalist, arguing that it’s the organization of matter into functional relations, not the kind of matter being organized, that creates consciousness. But feeling states are an obstacle to this group of theories because it isn’t obvious how a non-biological being could have them.

(2) Qualia Exist, Are Central to Consciousness, and Are Not Reducible to Functional Organization

This second group of theories treats qualia not as conceptual artifacts or reducible constructs but as primitive features of reality—unfolding within, or emerging from, but not identical to, functional organization. This is what Nagel was arguing. Most contemporary versions of this view hold that it is biology, not functional organization, that generates consciousness. It’s the specific properties of biological neurons, not only their organization, that produce consciousness. Properly organized silicon chips are not genuine neurons.

This group of theories includes:

Phenomenology: Consciousness just is first-person, biologically embodied, lived experience. Any account that leaves this out has failed before it starts.

Panpsychism: Subjectivity is baked into the world’s fabric; even atoms have some proto-consciousness. Human consciousness is a high-level organization of this experiential base. The idea here is that there is no weird emergence of consciousness from physical stuff. Even physical stuff at the level of atomic structure has some degree of consciousness. Full-blown human consciousness just has more of it.

Some versions of Integrated Information Theory: I discussed this theory in the first category, but some versions belong here because they argue that qualia are not caused by integrated information systems; rather, qualia are an intrinsic feature of the integrated system itself. There is a deep identity between experience and physical structure, but that physical structure is more than an organization of abstract functions. Again, biological neurons are likely necessary for that structure.

These theories are highly resistant to AI consciousness. No matter how smart or eloquent an AI is, unless it has subjective, first-person, embodied experience as a fundamental feature of reality, it is not conscious. Functional mimicry is not sufficient. Qualia cannot be reduced to functional organization.

But these theories risk being scientifically intractable. Science requires quantifiable, observable, publicly available phenomena to study. It is not equipped to study subjectivity because scientific observation lacks direct, first-person access to subjective states. No one feels my pain except me, and so what it’s like to feel pain, or anything else for that matter, is not directly accessible. Observing behavior is not direct access.

Critics often accuse these views of endorsing a view of consciousness that is inherently mysterious because it cannot be explained by science. But subjectivists reply that they are the hard-headed realists. You can’t make subjectivity go away just because it’s inconvenient for your theory.

(3) Qualia Are Illusions or Are Not Central to Consciousness

This third position is iconoclastic: it sees the whole debate around qualia as a philosophical misfire. It grants that people think they have private inner experiences but denies that this is metaphysically or scientifically meaningful. Consciousness, on this view, is entirely explainable in terms of functional, behavioral, or representational capacities and qualia are simply irrelevant.

There are various versions of this. Illusionism argues that qualia are an illusion, a trick of introspective self-modeling, not a real property. The “what-it’s-like” talk is just a poetic label for how complex self-monitoring systems narrate their states. We tell stories about what it’s like to experience a red object, but there is no fact behind the story. To be conscious just is to say and believe that we are conscious; there is nothing beyond the saying and believing.

Some hard-line representationalists and functionalists argue that the feel of red is just the disposition to behave and report in red-typical ways. There is no “red quale” over and above how a system behaves in response to red. They advocate focusing on what systems do, not on what they seem to feel. According to this view, a machine is conscious once it claims to be, convincingly and spontaneously, in the same way we are. Qualia never existed, so replicating our behaviors and self-models is all that matters. AI will be conscious as soon as it believes it is.

The main challenge is that this view appears to contradict lived reality. If anything seems beyond doubt, it’s that experience feels like something. This view risks denying consciousness only to save a scientific worldview. Many find this position too revisionist to accept, even if it’s theoretically parsimonious.

Pain is a particular hurdle for these theories. According to “illusionists,” what we call pain is actually a series of objective events: tissue damage, nerve firing, and pain reactions like grimacing or more intense, aversive physiological responses. From this perspective, you are in pain if you are in a state that triggers avoidance behavior and aversive cognitive processing. The “story” we tell about a private, ineffable feeling is simply a distorted way of describing these real physical states.

This is dangerously close to claiming the pain you feel isn’t real; it’s just a story you tell about tissue damage. Don’t ask me how philosophers can persuade themselves of such a thing.

The fate of qualia is the fate of consciousness. If we consider consciousness to be essentially having private subjective experiences, then any theory that denies qualia is amputating the subject from the outset. If, however, we believe that subjective feel is an emergent artifact of systems that process information in complex self-referential ways, then qualia can be explained or simulated into existence.

And if we believe qualia are a mirage, a poetic ghost in the machine of cognition, then the last word on consciousness might belong not to phenomenology or metaphysics but to engineering and the machine that one day tells us, with apparent sincerity, “I feel.”

So which metaphysics of mind are you willing to bet the soul of the machine on?