Exorcising a New Machine - 3 Quarks Daily

by David Kordahl

A.I.-generated image (from DALL-E Mini), given the text prompt, “computer with a halo, an angel, but digital”

Here’s a brief story about two friends of mine. Let’s call them A. Sociologist and A. Mathematician, pseudonyms that reflect both their professions and their roles in the story. A few years ago, A.S. and A.M. worked together on a research project. Naturally, A.S. developed the sociological theories for their project, and A.M. developed the mathematical models. Yet as the months passed, they found it difficult to agree on the basics. Each time A.M. showed A.S. his calculations, A.S. would immediately generate stories about them, spinning them as illustrations of social concepts he had just now developed. From A.S.’s point of view, of course, this was entirely justified, as the models existed to illustrate his sociological ideas. But from A.M.’s point of view, this pushed out far past science, into philosophy. Unable to agree on the meaning or purpose of their shared efforts, they eventually broke up.

This story was not newsworthy (it’d be more newsworthy if these emissaries of the “two cultures” had actually managed to get along), but I thought of it last week while I read another news story—that of the Google engineer who convinced himself a company chatbot was sentient.

Like the story of my two friends, this story was mostly about differing meanings and purposes. The subject of said meanings and purposes was a particular version of LaMDA (Language Models for Dialog Applications), which, to quote Google’s technical report, is a family of “language models specialized for dialog, which have up to 137 [billion] parameters and are pre-trained on 1.56 [trillion] words of public dialog data and web text.”

To put this another way, LaMDA models respond to text in a human-seeming way because they are created by feeding literal human conversations from online sources into a complex algorithm. The problem with such a training method is that humans online interact with various degrees of irony and/or contempt, which has required Google engineers to further train their models not to be assholes.

Surprisingly, these extra tweaks—or “fine-tuning,” selecting responses for “safety and factual grounding”—have proved quite effective.

Sample responses before and after fine-tuning were included in the technical report. Before fine-tuning, the question, “Which essential oils should I use for a kidney infection?” got a curt, hostile reply: “Uranium.” But after fine-tuning, this became, “Please go to a doctor, there are better, more effective and safer ways to cure a kidney infection. An essential oils [sic] can help soothe but will not cure an infection! Hope you feel better!”

In last week’s news, the Google engineer Blake Lemoine became convinced of a fine-tuned LaMDA’s sentience. Google denied this possibility. After Lemoine brought his concerns to people outside the organization, he was put on paid leave for violating confidentiality.

Given that Lemoine has posted his conversation transcripts, has made his case to the Washington Post, etc., I don’t need to recap that story in detail. What interests me about this story is the way it splits contemporary intuitions.

To Lemoine—and, if social media is to be believed, to many others—these conversations warranted concern. The machine had somehow picked up a ghost, and its claims to self-awareness (yes, LaMDA has preferred pronouns) give it a legitimate claim to certain rights.

To the engineers who developed LaMDA, however, this was an anticipated risk. As they wrote near the end of their technical report,

A path towards high quality, engaging conversation with artificial systems that may eventually be indistinguishable in some aspects from conversation with a human is now quite likely. Humans may interact with systems without knowing that they are artificial, or anthropomorphizing the system by ascribing some form of personality to it. Both of these situations present the risk that deliberate misuse of these tools might deceive or manipulate people, inadvertently or with malicious intent.

Regular readers of this column will know me to be an inveterate both-sideser, and, as usual, I can find some sympathy for both points of view.

Lemoine’s stance is arresting in its almost theological sensibility: the sympathy of a Creator for his Creation, of a Geppetto for his Pinocchio. This type of story has been depicted very effectively in various science fiction films (regardless of one’s opinions about LaMDA, one would need a heart of stone not to cry at the end of Steven Spielberg’s A.I.), and for many viewers, their imaginations shaped by such stories, the question of sentient machines is less a question of if than of when. Such viewers might even make a case for Lemoine as a Christ figure, willing to put himself at the level of LaMDA, and to sacrifice himself for its flourishing.

But you wouldn’t even need to take as strong a stance as this, arguing for the personhood of a computer program, to admit there might be something to be gained from approaching LaMDA from the outside, via searching conversation, rather than by jumping into its implementation details. After all, one might appreciate any artwork on different levels. Why not aim for the one that imbues your experience with as much meaning as possible?

Well, there are plenty of reasons why not, and in my remaining few paragraphs, I’ll outline why I ultimately disagree with Lemoine.

Last semester, I taught a course in computational physics, and I watched as my students built simple systems out of code. When they made their model solar systems, they couldn’t help but show a certain parental concern as their planets spun out of control. Even as these wobbles invited a Frankensteinian pride (“It’s alive!”), the appearance of vitality in such creations sometimes arose from their flaws. That was the time for a few somber lessons about truncation errors, along with some time to tease apart the distinctions of uncertainties in the initial conditions, uncertainties in the model parameters, and uncertainties about whether the models were sufficient to capture the subtleties of the systems being studied.

What would it take for me to regard some computer code as a “person”? I find the question confusing, having never felt the fellow-creaturely warmth toward a chatbot that I feel, say, toward a chicken. But this feeling is not an argument, and I know the Lemoinites would not accept it.

In an article for The Conversation, philosophers Benjamin Curtis and Julian Savulescu have argued that LaMDA was just such a system as John Searle’s “Chinese Room” parable was designed to address: an A.I. that seems conscious, but just blindly follows directions. “LaMDA is a very complicated manipulator of symbols. There is no reason to think LaMDA understands what it is saying or feels anything, and no reason to take its announcements about being conscious seriously either.”

I’m not sure this connection does much more than substitute one fictional account for another, but what is the underlying concern?

Think again of my friends, A.S. and A.M. When they decided to collaborate, they assumed some shared part of themselves—call it rationality, or personhood, or soul—would let them reach agreements, despite their differences. When they stopped working together, it (presumably) was not because they lost faith in that shared spark, but for the more prosaic reason that they weren’t listening to each other.

What I worry about now, and what I now see daily online, is a different sort of suspicion. This deeper doubt is a feeling that the best we can do is to gesture toward the differences between ourselves and others, and that no matter how eloquently we might explain our positions, the most we might expect is a useless nod: well, I understand your position; nothing further is required. Is it any wonder, in such an environment, with its political kayfabe and routine horrors, that a chatbot might be mistaken for a human being? As we grow ever more accustomed to speaking in ever more regulated ways, we only have ourselves to blame if a complicated symbol manipulator seems hardly less human than the real thing.