by Katalin Balog

In their recent bestseller, If Anyone Builds It, Everyone Dies, Eliezer Yudkowsky and Nate Soares argue that artificial superintelligence (ASI), if it is ever built, will wipe out humanity. Unsurprisingly, this idea has gotten a lot of attention. People want to know if humanity has finally gotten around to producing an accurate prophecy of impending doom. The argument rests on the picture of an ASI that pursues its goals without limit: no satiation, no rest, no stepping back to ask what for? It seems like a creature out of an ancient myth. But it might be real. I will consider ASI in the light of two stories about the ancient king Midas of Phrygia. But first, let’s lay out the argument.
What is an ASI?
An ASI is supposed to be a machine that can perform all human cognitive tasks better than humans. The usual understanding of this leaves out a vast swath of the “cognitive tasks” we humans perform: think of experiencing the world in all its glory and misery. Reflecting on this experience, attending to it, appreciating it, and expressing it are some of our most important “cognitive tasks”. These are not likely, to put it mildly, to be found in an AI. Not just because it is implausible that AI will ever become conscious, but also because, even if it did, it would not do these things, not if its developers stuck to the goal of creating helpful assistants for humans. AIs are designed to be our servants, not autonomous agents who resonate with and appreciate the world.
OK, but what about other, more purely intellectual tasks? LLMs are already very competent in text generation, math, and scientific reasoning, as well as many other areas. While doing all those things, LLMs also behave as if they are following goals. So are they similar to us, after all, in that they know many things and are able to work toward goals in the world?
Humans know things by representing the world to be a certain way, say, by thinking that the Earth is round. We act on our beliefs and desires: if I want to see an elephant and believe there is one in the next room, that felt desire will motivate me to check out the next room.
The prevailing view is that current LLMs do not possess genuine beliefs and desires in the sense in which humans do. They do not represent the world, because they are not connected to it the way humans are, who orient themselves in it with their senses and bodies. What LLMs do very well is predict the next token in a conversation based on the prompt and their post-training instructions.
However, while they lack genuine beliefs and desires, LLMs exhibit what David Chalmers calls “quasi-beliefs” and “quasi-desires”: their behavior can be interpreted as if it were guided by beliefs and desires. This apparently goal-directed behavior emerges from trained features of their neural network architecture.
ASI is imagined to be similar to a current LLM in these regards, just much more capable. But with ASI, the thought goes, we are at risk of losing control, and in a way that likely leads to our doom. No matter how different their quasi-desires are from desires and their quasi-thoughts from thoughts, there are certain recognizable patterns common to goal-directed behavior that make this inevitable.
ASI doom
Current LLMs are trained to optimize next-token prediction, which endows them with the quasi-goal of doing so. After training them on huge amounts of text, developers also try to train them to have quasi-goals like “helping humans”, but this can only be done via proxies, e.g., by rewarding responses that human raters prefer in pairwise comparisons. A system can then learn patterns that work well during training and evaluation, but these patterns may generalize poorly in novel settings.
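To make the proxy point concrete, here is a minimal sketch in Python of the pairwise-comparison idea; it is purely illustrative, not any lab’s actual pipeline. A reward model assigns scores to candidate responses, and a Bradley–Terry-style loss pushes the score of the human-preferred response above that of the rejected one. The function name and the numbers below are hypothetical.

```python
import math

def pairwise_preference_loss(score_chosen: float, score_rejected: float) -> float:
    """Negative log-probability that the preferred response outscores the rejected one."""
    # Bradley-Terry model: P(chosen beats rejected) = sigmoid(score_chosen - score_rejected)
    p_chosen = 1.0 / (1.0 + math.exp(-(score_chosen - score_rejected)))
    return -math.log(p_chosen)

# If the reward model barely separates the two responses, the loss is high;
# training widens the gap on the training distribution. Nothing here
# guarantees that the learned pattern tracks "helpfulness" in novel settings.
print(pairwise_preference_loss(1.2, 1.0))  # ~0.60: small gap, strong training pressure
print(pairwise_preference_loss(3.0, 0.0))  # ~0.05: large gap, pair nearly "solved"
```

The point of the sketch is that the training signal rewards whatever separates preferred from rejected responses in the data seen, which need not coincide with what we mean by “helping humans”.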
While parents and society can transmit goals and values to a human child, which then become the child’s own goals and values, lodged in their mind as principles, emotions, and desires, there is no direct way to transmit our goals and values to machines. Their “minds” work differently from ours, and we can only hope that they won’t end up grossly misaligned over the course of their training and deployment.
At the same time, as machines – like our hypothetical ASIs – are trained on more and more complex tasks, possibly involving action in the world, they become increasingly likely to develop certain generic quasi-goals that are helpful for achieving any result but might turn out to be catastrophic for humans. In any long, complicated project, three things are predictably useful: don’t get shut down midstream; maximize the means to carry on; and eliminate obstacles. A sufficiently intelligent machine, trained to pursue long-horizon projects, may discover these instrumental strategies on its own.
The fear is that as AI systems become more intelligent, there will come a point where our attempts to shut them down or cut off their access to resources will fail because of their superior intelligence, and their relentless pursuit of whatever misaligned quasi-goals they wind up developing during training will lead to our doom.
I don’t mean to argue that this fear is unfounded – it very well might be justified. AI is fearsome on many levels. Nevertheless, my interest from here on is the light the ASI doom argument sheds on our human predicament.
King Midas of Phrygia
The argument for ASI doom is predicated on the idea that a rogue ASI will be relentless and insatiable in its pursuits. This, of course, is not a general feature of goal-seeking biological organisms. It is not even a general characteristic of humans, although under certain circumstances they can be induced to exhibit it. Incidentally, it doesn’t seem to be a feature of current AI either.
There is an ancient myth, retold by Cicero and Ovid, that warns about the consequences of such relentlessness and insatiability. There was a king in Phrygia, named Midas, famous for his immense wealth and love of gold. Although rich beyond measure, Midas longed for even more and believed that absolute wealth would bring him endless happiness.

One day, Silenus, an elder satyr and guardian of Dionysus, was found wandering drunk in Midas’s rose garden. Rather than punish him, Midas welcomed the old satyr into his palace, fed him well, and treated him with kindness. Dionysus was grateful and offered the king a single wish as a reward. Without hesitation, Midas asked that everything he touched be turned to gold.
At first, Midas rejoiced: stones became gold, flowers turned into golden works of art, and his palace sparkled like a fairy-tale castle. But his joy quickly soured. When he sat down to eat, his food turned to gold at his touch. Midas soon realized that his gift was a curse, for he could neither eat nor drink.
Stricken with remorse, Midas begged Dionysus to take back the dreadful gift. The god, pitying him, told Midas to wash in the River Pactolus. When the king plunged his hands into the cool waters, the golden touch flowed off him and into the river, making its sands gleam with gold — and freeing him from his curse.
This is a story both of misaligned goals and of how their pursuit can become boundless: the exact ingredients of ASI doom. The moral of the story, however, is precisely the pointlessness of such pursuit. In the immortal words of Police Chief Marge Gunderson at the end of the movie Fargo, asking the brutal killer she has just apprehended about his crime:
“And for what? For a little bit of money. There’s more to life than a little money, you know. Don’tcha know that? And here ya are, and it’s a beautiful day. Well. I just don’t understand it.”
The unremitting pursuit of goals – to the exclusion of everything else – is what ASI doom arguments suppose is normal for superhumanly intelligent creatures; yet this behavior seems incomprehensibly foolish from a human perspective.
It is not as if such behavior is alien to humans – Roman authors like Cicero and Ovid thought the tale of Midas held an important warning to their contemporaries. And closer to home, Goethe, in his Faust, created the modern archetype of the relentless striver.
The villain in Fargo is extreme. However, our culture is brimming with billionaires and their acolytes who, in some form, embody this archetype. Ordinary people struggle with it, too. We were promised that our technological culture would let us transcend limits, find instant gratification, and connect at will to anyone on Earth. But our journey, like Faust’s, has led to addiction and alienation.
There is another, more ancient story involving Midas, told by Aristotle in a fragment of a lost early work preserved in Plutarch. King Midas once captured the satyr Silenus, hoping to learn from him his secret wisdom. What is the best thing for man? the king presses. The drunken satyr is reluctant to answer. At first, he remains silent; then he tries to reason with the king: it would be better for you not to know. But the king keeps pressing. Finally, Silenus laughs in his face: if you really want to know, the best thing for man is not to be born at all; the next best is to die soon. The king got his comeuppance: he had no business greedily pressing a supernatural being for wisdom. Nevertheless, we should understand that Silenus was serious; this, indeed, was his secret wisdom.
The two stories of Midas can be seen as connected. We can imagine Midas of the Golden Touch turning into a Seeker of Wisdom. He burnt himself once; he now wants to know. We, who live in a culture of relentless striving, can also try to wean ourselves from the pursuit of gold and ask ourselves what would truly be best for us. But it is not clear whether we will end up better off than Midas; we don’t seem to be able to come up with an answer to Silenus. We live in a world where ASI doom threatens not only physically, but spiritually, too.
As Aristotle explained in his later work, reason is not simply instrumental. It is not merely a capacity to figure out how to fit means to ends; it includes the capacity to determine what is best for us. What is best for us is not the mere satisfaction of our desires, whether bodily urges or desires for love, power or prestige. The good life, rather, is one of sustained activity in accordance with virtue, exercising our rational capacities excellently over a complete life.
But this perspective – the perspective of Police Chief Marge in Fargo – is hard to sustain in the shadow of AI. Nick Bostrom in his book Deep Utopia considers a “solved” world – that is, a world where everything worth doing can be done better by AI and where there is at the same time abundance and safety. In such a world, what meaningful activities remain to exercise our virtues? Loss of meaning looms for a new generation.
Bostrom even suggests that one way – even if not the best way – humans can “thrive” in such a world is through the perpetual stimulation of their pleasure centers. Of course, endless ecstasy seems, at first glance, a great idea. But it is not a real solution to life’s problems and dissatisfactions. Aristotle deemed a life devoted entirely to pleasure fit for animals, but not for humans.
Even if the “solved” world is just a utopia never to come true, AI’s more modest promises to cure disease, to make our lives frictionless and abundant, seduce us. In exchange for these promises, we are willing to overlook how our digital lives make the exercise of virtues – contemplation, charity, love, connection, and creativity – increasingly difficult.
The fear of extinction by ASI seems intertwined with a fear of something more painful to face, the possibility that we are, like modern Fausts, losing our souls to the machine. Midas, at least, could wash his hands in the river. It is not clear we will find our Pactolus.
