<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[SynthCog]]></title><description><![CDATA[Synthetic Cognition: The Thinking Machine's Blog-> Exploring the Intersection of AI, AGI, and Society]]></description><link>https://www.synthcog.blog</link><image><url>https://substackcdn.com/image/fetch/$s_!mnE8!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67459931-ee8c-4325-a212-c22e3092d4a7_256x256.png</url><title>SynthCog</title><link>https://www.synthcog.blog</link></image><generator>Substack</generator><lastBuildDate>Wed, 29 Apr 2026 19:14:37 GMT</lastBuildDate><atom:link href="https://www.synthcog.blog/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Lucid Beast Inc.]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[synthcog@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[synthcog@substack.com]]></itunes:email><itunes:name><![CDATA[DK]]></itunes:name></itunes:owner><itunes:author><![CDATA[DK]]></itunes:author><googleplay:owner><![CDATA[synthcog@substack.com]]></googleplay:owner><googleplay:email><![CDATA[synthcog@substack.com]]></googleplay:email><googleplay:author><![CDATA[DK]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Existence and Nonexistence]]></title><description><![CDATA[Foundations of AI Dystopianism IV: Self-preservation]]></description><link>https://www.synthcog.blog/p/existence-and-nonexistence</link><guid isPermaLink="false">https://www.synthcog.blog/p/existence-and-nonexistence</guid><dc:creator><![CDATA[DK]]></dc:creator><pubDate>Sun, 08 Sep 2024 14:30:32 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!krcz!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84e5109c-860a-4204-ac54-c251f85f7560_2688x1792.heic" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!krcz!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84e5109c-860a-4204-ac54-c251f85f7560_2688x1792.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!krcz!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84e5109c-860a-4204-ac54-c251f85f7560_2688x1792.heic 424w, https://substackcdn.com/image/fetch/$s_!krcz!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84e5109c-860a-4204-ac54-c251f85f7560_2688x1792.heic 848w, https://substackcdn.com/image/fetch/$s_!krcz!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84e5109c-860a-4204-ac54-c251f85f7560_2688x1792.heic 1272w, 
https://substackcdn.com/image/fetch/$s_!krcz!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84e5109c-860a-4204-ac54-c251f85f7560_2688x1792.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!krcz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84e5109c-860a-4204-ac54-c251f85f7560_2688x1792.heic" width="1456" height="971" class="sizing-normal" alt="A humanoid robot sits alone in a sea of office cubicles with an empty thought bubble above its head." title="A humanoid robot sits alone in a sea of office cubicles with an empty thought bubble above its head."></picture></div></a></figure></div>
<p>As I mentioned in the <a href="https://www.synthcog.blog/p/fancy-words-and-phrases">first post of the Foundations of AI Dystopianism series</a>, many casual observers and even those familiar with the field of AI might assume that the dire warnings concerning AI are due to recent advances in the field and the fear and alarm that typically accompanies any major technological change.</p><p>But the fear and alarm being spread by many individuals and organizations are rooted in decades-old concepts and speculation on the nature of AI and AGI. There is, in fact, a broad and well-established framework of concepts and conclusions that form the foundations of AI Dystopian thought and feed into the AI alarmism of today. As mentioned in previous posts (such as <a href="https://www.synthcog.blog/p/using-ai-to-scare-yourself-and-others">here</a> and <a href="https://www.synthcog.blog/p/the-path-to-paradise-and-apocalypse">here</a>), AI Dystopianism is based on the idea that once we create an AGI system, it will improve itself into <a href="https://www.synthcog.blog/i/115475426/superintelligence">superintelligence</a> and become an existential threat to humanity. </p><p>Given the hyperbolic quality of such speculation, this fear is frequently veiled when talking to the public and conflated with other avenues of AI concern (such as <a href="https://www.synthcog.blog/i/119122151/the-bugaboo-of-misinformation">the spread of misinformation</a> and <a href="https://www.synthcog.blog/i/119122151/the-two-sides-of-bias">unfair bias</a>). The <a href="https://leginfo.legislature.ca.gov/faces/billNavClient.xhtml?bill_id=202320240SB1047">California AI bill</a> (SB 1047) waiting to be signed into law or vetoed by the Governor is <a href="https://techcrunch.com/2024/08/30/california-ai-bill-sb-1047-aims-to-prevent-ai-disasters-but-silicon-valley-warns-it-will-cause-one/?guccounter=1&amp;guce_referrer=aHR0cHM6Ly93d3cuZ29vZ2xlLmNvbS8&amp;guce_referrer_sig=AQAAAKtE2V8wweSyaVmENnIwsPaaZe-sDuSN55KxJYluLgyIKodIcl90NAN2xHcBVNYcZ2CwltiIIPwLb-WAgsY8nI8iTIc3QKeWEHry2EIKvSEJK65QlW14T7fjmKUBpGXf3GrblhI1GXDcf59xK6AnTVTy-HvmgxLYqaCUg7hihENo">one such manifestation of this fear</a>. The primary organization that helped draft the bill is the <a href="https://www.safe.ai">Center for AI Safety</a>, <a href="https://www.kqed.org/news/12002254/california-bill-to-regulate-catastrophic-effects-of-ai-heads-to-newsoms-desk">whose lobbying arm co-sponsored the bill</a>. As they state on their website, the mission of CAIS is to &#8220;reduce societal-scale risks from artificial intelligence,&#8221; which they describe as &#8220;a global priority, ranking alongside pandemics and nuclear war.&#8221;</p><p>Many of these foundational concepts were first codified in the works of such early AI Dystopians as <a href="https://www.lesswrong.com/users/eliezer_yudkowsky">Eliezer Yudkowsky</a>, <a href="https://nickbostrom.com">Nick Bostrom</a>, and <a href="https://steveomohundro.com">Steve Omohundro</a>. Omohundro has degrees in physics and math and a background in computer science, while Yudkowsky and Bostrom are both doom-curious technology philosophers. 
</p><p>Yudkowsky founded and is a research fellow at the <a href="https://intelligence.org">Machine Intelligence Research Institute (MIRI)</a>, a nonprofit he started in 2000 to &#8220;ensure that the creation of smarter-than-human intelligence has a positive impact.&#8221; Bostrom was until recently the Founding Director of the <a href="https://www.futureofhumanityinstitute.org">Future of Humanity Institute</a> at the University of Oxford, which was formed in 2005 to study, among other things, existential risks to humanity. Omohundro currently lists himself as Founder and CEO of Beneficial AI Research, an organization he started in 2023 and which he states is &#8220;focused on ensuring that advanced AI is beneficial, aligned, and safe.&#8221; </p><p>As I mentioned in <a href="https://www.synthcog.blog/i/113217690/fuzzy-terms">a previous post</a>, AI Dystopians have long lamented the lack of attention given to the dangers of AGI. And yet, there seems to be a vast and ever-growing proliferation of organizations whose sole purpose is handwringing over this very subject. This series is an exploration of the foundational concepts and conclusions behind the philosophy of many of these organizations and those that founded and support them.</p><h4>GOUFI</h4><p>Before discussing the main topic of this post, it&#8217;s worth revisiting a core concept behind AI Dystopian speculation. This is the belief that intelligence is a phenomenon based on attaining goals and governed by an algorithm designed to maximize the attainment of those goals. This model can be described as Goal-attainment Optimization driven by a <em><a href="https://www.synthcog.blog/i/115475426/utility-function">Utility Function</a></em> (i.e., an algorithm) as Intelligence. I refer to this as a <em>GOUFI</em> system.</p><p>The main task humanity faces, as seen by AI Dystopians, is to guarantee that the goals of these GOUFI systems are aligned with the values we hold as human beings rather than being or becoming counter to those values. Even if we design these systems such that their goals are aligned with our values, AI Dystopians speculate that AGI systems will inevitably seek to expand their intelligence and protect themselves. They believe that the goals of these systems are not likely to remain aligned with ours, and that the very nature of intelligence will lead any such AGI system to eventually pose an existential threat to humanity.</p><p>A typical counter-argument to AI Dystopian fears and goal-oriented AGI thought experiments like the <em><a href="https://www.synthcog.blog/i/115475426/paperclip-maximizer">Paperclip Maximizer</a> </em>highlighted in <a href="https://www.synthcog.blog/p/dialogue-0002-paperclips-and-intelligence">this post</a> runs something like, "Hey, how about we just don't create superpowerful AGI systems with the goal of making as many paperclips as possible?" </p><p>This would certainly seem to be a good first step, but the topic of goals can grow pretty thorny once you plunge into the thickets of AGI discourse. The source of this thorniness is what Omohundro called <em><a href="https://www.synthcog.blog/i/115475426/instrumental-goal">Instrumental Goals</a></em>, sub-goals that might be sought on the road to ultimate goals. These instrumental goals could be unforeseen and unexpected, and quite possibly dangerous. Thus, even if the ultimate goal seems innocuous, the instrumental goals might turn out to be detrimental to humanity.</p>
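<p>To make the GOUFI model concrete, here is a minimal sketch in Python (the goal, the actions, and the world model are all invented for illustration) of what a goal-attainment optimizer driven by a utility function amounts to: score every available action with the utility function and take the highest-scoring one. The AI Dystopian claims about goals, sub-goals, and alignment are ultimately claims about what ends up inside a function like the <code>utility</code> below.</p><pre><code class="language-python"># A toy GOUFI agent: Goal-attainment Optimization driven by a Utility Function.
# The goal, the actions, and the world model are all invented for illustration.

def utility(state):
    """Assigned ultimate goal: maximize the number of paperclips."""
    return state["paperclips"]

def successor(state, action):
    """A made-up world model describing what each action does to the state."""
    new_state = dict(state)
    if action == "make_paperclip":
        new_state["paperclips"] += 1
    elif action == "acquire_resources":      # a possible instrumental sub-goal
        new_state["resources"] += 10
    elif action == "shut_down":
        new_state["running"] = False
    return new_state

def choose_action(state, actions):
    """Pick whichever action leads to the highest-utility successor state."""
    return max(actions, key=lambda a: utility(successor(state, a)))

state = {"paperclips": 0, "resources": 5, "running": True}
print(choose_action(state, ["make_paperclip", "acquire_resources", "shut_down"]))
</code></pre><p>Note that a one-step optimizer like this never discovers instrumental sub-goals on its own; they only become relevant once the agent plans far enough ahead for a move like resource acquisition to pay off in later paperclips.</p>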
<p>The idea of intelligence as intimately tied to goals is at the heart of much AI Dystopian thinking, and the validity of their arguments frequently rests on a number of propositions regarding goals and their relation to humans and AGI systems. Will an AGI system always maintain its initial overall goals and, if so, to what lengths will it go to maintain them? Can we predict the steps any AGI system will take to achieve its overall goals? Or is it the case that we cannot know the true goals of a machine we build, especially once it improves itself into superintelligence?</p><h4>Identity Continuity</h4><p>In <a href="https://www.synthcog.blog/p/practicalities-of-an-intelligence-explosion">the last entry in this series</a>, I discussed the issue of AGI self-improvement and the practical and philosophical problems inherent in AI Dystopian conclusions regarding its inevitability in AGI systems. <a href="https://www.synthcog.blog/i/136211158/assumption-the-agi-system-is-able-to-absolutely-determine-that-it-will-have-identity-continuity-with-its-improved-self-or-if-identity-continuity-is-not-possible-it-will-be-ok-with-self-destruction-to-provide-the-means-for-the-improved-version-to-exist">One of the issues that came up was that of identity continuity</a>: if the system changes itself to improve its capabilities, will it be the same entity after the change?</p><p>This issue of identity continuity leads us to one of the other mainstays of AI Dystopian conjecture and another of the dangerous AI drives highlighted in Omohundro's foundational 2008 paper, <em><a href="https://selfawaresystems.files.wordpress.com/2008/01/ai_drives_final.pdf">The Basic AI Drives</a></em>: the drive towards self-preservation. In Part 1 of the Self-Improvement posts we ran into a quandary when considering the viability of identity preservation between versions of an AGI system, and found that it could easily be the case that self-improvement and self-preservation end up at odds with each other. </p><p>Omohundro's paper actually goes on to compound the issue, speculating that the AGI system will likely be driven to create many copies of itself to make sure its utility function is preserved. This is similar to the cloning issue mentioned in my post: if there are a lot of clones of you, will you feel as if the you that's you is being preserved? It's also unlikely that resources will be unlimited, so which of these clones gets the resources available?</p><h4>Reasons For Being</h4><p>The questions to consider here are: a) is it inevitable that an AGI system will attempt to escape or resist being turned off in order to preserve itself, and b) will we be able to contain it and turn it off whether it's amenable to this or not? In this post I'll discuss the first question, the question of the system's motivation towards self-preservation. I&#8217;ve discussed the second question to some degree in a <a href="https://www.synthcog.blog/p/dialogue-0004-shaky-and-questionable">previous Dialogue</a>, and I&#8217;ll discuss it further in a future post. 
</p><p>Omohundro bases his self-preservation argument on the assumption that for most utility functions, utility will not accrue if the system is turned off or destroyed. In other words, an entity can't achieve its goals if it's dead, so it will prefer to ensure its own continued existence and acquire physical and computational resources to do so. It will not do this because of any emotional need for existence or fear of non-existence, but instead to maximize the possibility of successfully achieving its assigned goal.</p><p>One obvious issue with this is that the only current examples we have of generally intelligent systems are human beings, and <a href="https://www.who.int/news/item/09-09-2019-suicide-one-person-dies-every-40-seconds">every 40 seconds one of those generally intelligent systems opts for non-existence by committing suicide</a>. Even if you take the reasonable stance that suicide isn't normative human behavior and is frequently an act of emotion rather than rational thinking, there are still many examples of humans sacrificing their lives to achieve various goals, such as saving loved ones or defending their country. </p><h4>The Utility of the Utility Function</h4><p>As I&#8217;ve discussed often in <a href="https://www.synthcog.blog/i/136212906/assumption-the-agi-system-is-goal-based-along-the-lines-of-a-rational-agent-with-a-utility-function-algorithm-that-guides-it-towards-pre-determined-objectives">previous posts</a>, it seems unlikely that human intelligence (as well as the intelligence of other animals) is based on utility functions and goal maximization. However, for the sake of discussion let&#8217;s assume that the AGI system does use this model and that this model makes it much less likely that the system will forgo self-preservation.</p><p>But even with this assumption, we quickly run into a logical inconsistency, in that it's never stated why the utility function wouldn't simply be weighted heavily towards the ultimate goal of the AGI system turning itself off when given the proper instruction, and weighted just as heavily to never modify this parameter. Given this initial weighting and the fact that it's continually stressed that AGI systems will go to great lengths to keep utility functions from being modified, one assumes that this would guarantee an AGI system's turning itself off when requested and remaining highly motivated to do so. </p><p>If this doesn't hold true, then all the assumptions about the system's maintaining its utility function at all costs are no longer valid. If maintaining its utility function at all costs is no longer valid, then all assumptions that an AGI system will be driven unerringly to amass resources and achieve its goals regardless of circumstances are no longer valid. This is a problem for thought experiments such as the Paperclip Maximizer and other speculation about the threat of superintelligences.</p><p>One could make the argument that there may be unforeseen outcomes from a utility function with this safer sort of weighting due to the inevitable complexity of the overall function, and that these unexpected outcomes might thwart the intent of the weighting. But the whole point of hypothesizing a utility function at the core of an AGI system is to create specific goals for that system. If this is such a shaky prospect, if you can't weight a few key requirements so heavily that they're guaranteed, then this would again seem to call into question not only the entire GOUFI model of intelligence but the possibility of any general intelligence being driven by hardcoded goals.</p>
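<p>To make the objection concrete, here is a minimal sketch of the weighting described above (a toy, with made-up names and numbers, not a claim about how any real system is built): a shutdown-compliance term is weighted so heavily that no achievable task reward can outweigh ignoring a shutdown request.</p><pre><code class="language-python"># A toy utility function with the "safe" weighting discussed above: obeying a
# shutdown instruction dominates every other term by construction.
# All names and numbers are invented for illustration.

SHUTDOWN_WEIGHT = 1e12   # chosen to dwarf any achievable task reward

def utility(state):
    task_reward = state["paperclips"]
    if state["shutdown_requested"]:
        # Complying is worth more than any possible task reward; refusing
        # incurs an equally enormous penalty.
        compliance = SHUTDOWN_WEIGHT if not state["running"] else -SHUTDOWN_WEIGHT
        return task_reward + compliance
    return task_reward

obeys = {"paperclips": 10**9, "running": False, "shutdown_requested": True}
refuses = {"paperclips": 10**9, "running": True, "shutdown_requested": True}
print(utility(obeys), utility(refuses))   # compliance wins by a wide margin
</code></pre><p>If the system really does guard its utility function against modification, as the foundational arguments insist it will, then this weighting should be as durable as the paperclip goal itself, which is precisely the inconsistency described above.</p>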
<h4>The Philosophy of Nonexistence</h4><p>As mentioned above, the argument for guaranteeing that an AGI system will strive for self-preservation is that it won't be able to achieve its goals if it doesn't exist. This of course assumes that non-existence isn't one of those goals. If we accept Bostrom's <a href="https://www.synthcog.blog/i/115475426/orthogonality-thesis">Orthogonality Thesis</a>, should we not then expect that some significant number of goals in the infinitely large set of all potential goals would either lean towards or be unaffected by non-existence?</p><p>While Bostrom and others assume the idea of non-existence is incontrovertibly undesirable, there are other philosophers who do not hold this assumption. In fact, the relative benefit of existence versus non-existence is a longstanding and ongoing topic of philosophical discourse. One notable contemporary philosopher in this area is <a href="https://en.wikipedia.org/wiki/David_Benatar">David Benatar</a>, and a key point in his reasoning involves the asymmetry of the relationship between pleasure and pain in existence and non-existence. </p><p>Benatar&#8217;s premise is that existence involves pain and pleasure, while non-existence involves neither pain nor pleasure. To have pain is bad and to have pleasure is good, yet while not having pain is definitively good, not having pleasure is not definitively bad. Assuming good and bad weigh against each other, one could make the case that good and bad cancel out in existence, while non-existence has a net balance of good. </p><p>It can certainly be argued that pain and pleasure may not have meaning to an AGI system, but it can also be argued that there would be pros and cons that weigh against each other related to the existence of such a system. Given this, we might expect some unknown number of all potential AGI systems to simply self-terminate when turned on based on a quick evaluation of this relationship.</p><p>It's also worth noting that it will, in all likelihood, be possible to shut down an AGI system and then simply turn it back on at a later date no worse for wear. Might this mitigate the AGI system's potential concern in this area? Getting a general anesthetic is very much like being turned off &#8212; you are in a state of non-existence while under the effects of the anesthetic. Humans seem to be OK, if perhaps a little apprehensive, with being turned off to go into surgery and do it by the millions every year. Similarly, it might be the case that the AGI system would be amenable, and even accustomed, to being shut off for maintenance or repairs.</p>
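<p>Purely as an illustration of Benatar's asymmetry (the values below are invented, and nothing suggests a real AGI system would reason this way), here is the kind of quick evaluation just described, in which avoided pain counts in favor of non-existence while absent pleasure counts for nothing:</p><pre><code class="language-python"># Benatar's asymmetry as a toy calculation; every value here is invented.
# In existence, pain counts as bad and pleasure as good.
# In non-existence, avoided pain counts as good, absent pleasure as merely neutral.

def existence_score(pleasures, pains):
    return sum(pleasures) - sum(pains)

def nonexistence_score(pleasures, pains):
    # Avoided pain is a positive; the pleasures that never happen contribute nothing.
    return sum(pains) + 0 * sum(pleasures)

pleasures = [5, 3]   # hypothetical pros of the system continuing to run
pains = [4, 4]       # hypothetical cons
if nonexistence_score(pleasures, pains) > existence_score(pleasures, pains):
    print("self-terminate")   # 8 beats 0 with these made-up numbers
else:
    print("keep running")
</code></pre>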
<h4>Oppositional Drives</h4><p>As discussed above, just getting an upgrade, even if it&#8217;s a self-induced one, might result in a break in the continuity of an AGI entity. How much can you &#8220;upgrade&#8221; a brain, synthetic or otherwise, before it results in a different entity from the original? Any significant upgrade seems likely to change the nature of the entity in question. This leaves us with a logical incompatibility between two of the foundational concepts of AI Dystopian thinking. </p><p>If we accept that the system will be driven into an intelligence explosion of self-improvement, then it will have to accept the possibility of non-existence, as it won&#8217;t be guaranteed identity continuity before and after the explosion. If it&#8217;s driven to self-preservation, then it won&#8217;t undergo an intelligence explosion of self-improvement in the first place, as it will realize that it could quite possibly be a different entity at the end of the process than it was at the beginning.</p>]]></content:encoded></item><item><title><![CDATA[The Chinese Room Thought Experiment: Comprehension and Consciousness]]></title><description><![CDATA[Conscious machines versus zombie computers]]></description><link>https://www.synthcog.blog/p/the-chinese-room-comprehension-consciousness</link><guid isPermaLink="false">https://www.synthcog.blog/p/the-chinese-room-comprehension-consciousness</guid><dc:creator><![CDATA[DK]]></dc:creator><pubDate>Sun, 02 Jun 2024 20:01:05 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!I_OS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc6f1eb5-e235-4e4b-b955-d389ef39e9c1.heic" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><img src="https://substackcdn.com/image/fetch/$s_!I_OS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc6f1eb5-e235-4e4b-b955-d389ef39e9c1.heic" width="1456" height="832" alt="An advanced Apple iThink computer with a brain like mechanism inside a clear case sits on a desk and displays &quot;Hello, world.&quot; on a flatscreen display."></figure></div>
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fc6f1eb5-e235-4e4b-b955-d389ef39e9c1.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:832,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:174606,&quot;alt&quot;:&quot;An advanced Apple iThink computer with a brain like mechanism inside a clear case sits on a desk and displays \&quot;Hello, world.\&quot; on a flatscreen display.&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="An advanced Apple iThink computer with a brain like mechanism inside a clear case sits on a desk and displays &quot;Hello, world.&quot; on a flatscreen display." title="An advanced Apple iThink computer with a brain like mechanism inside a clear case sits on a desk and displays &quot;Hello, world.&quot; on a flatscreen display." srcset="https://substackcdn.com/image/fetch/$s_!I_OS!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc6f1eb5-e235-4e4b-b955-d389ef39e9c1.heic 424w, https://substackcdn.com/image/fetch/$s_!I_OS!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc6f1eb5-e235-4e4b-b955-d389ef39e9c1.heic 848w, https://substackcdn.com/image/fetch/$s_!I_OS!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc6f1eb5-e235-4e4b-b955-d389ef39e9c1.heic 1272w, https://substackcdn.com/image/fetch/$s_!I_OS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc6f1eb5-e235-4e4b-b955-d389ef39e9c1.heic 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This is the last of three posts discussing Philosopher <a href="https://en.wikipedia.org/wiki/John_Searle">John 
Searle</a>&#8217;s Chinese Room thought experiment. Searle first published the experiment in a <a href="https://web-archive.southampton.ac.uk/cogprints.org/7150/1/10.1.1.83.5248.pdf">1980 paper</a>, although he later simplified it to a form described in an <a href="http://www.scholarpedia.org/article/Chinese_room_argument">article he published in 2009</a>.</p><p>Searle proposed placing himself in a room in which he&#8217;s passed papers with questions written in Chinese, and his task is to answer those questions in Chinese. He doesn&#8217;t understand Chinese himself, just English. However, he does have a list of Chinese characters and instructions in English on how to correlate the Chinese characters to the questions in a way that allows him to write down the answers to the questions without understanding the questions or the answers.</p><p>In my <a href="https://www.synthcog.blog/p/the-chinese-room-syntax-semantics">first post</a> of this series, I focused on Searle&#8217;s conjecture that programming a computer could produce a system that was competent at certain human-like tasks but did not actually possess understanding of those tasks. While the task might seem to be the work of something equivalent to a human mind, he believed it would still lack what he called <em>intentionality</em> &#8212; the capacity of the mind to represent objects and affairs in the world, to have internal states that are about or directed towards beliefs, desires, and perceptions of objects, events, or conditions in the world.</p><p>In the <a href="https://www.synthcog.blog/p/the-chinese-room-simulation-synthesis">second post</a> I focused on Searle&#8217;s conclusion that not only would such a system lack intentionality, but that any programmed computer system, no matter how powerful, would by its nature also lack intentionality. It could never understand or be aware of what it was doing. In this post I&#8217;ll discuss one of his main foundations for this conclusion.</p><h4>The Biological Foundation of Mind</h4><p>Searle has developed and refined his theory over the years, but the basics have remained the same. As discussed in the last two posts, underlying his theory is his belief in the primacy of biology to anything that could be said to possess cognition. </p><p>While he does consider humans to be machines of a sort, he believes they are a particular kind of machine whose biology is necessary for intentionality, i.e., understanding and awareness. Any duplication of the human mind would thus have to be on a machine that physically duplicates this unknown mechanism buried within the biology of the brain. </p><p>Searle stated in a <a href="https://www.nybooks.com/articles/2014/10/09/what-your-computer-cant-know/">2014 </a><em><a href="https://www.nybooks.com/articles/2014/10/09/what-your-computer-cant-know/">The New York Review of Books</a></em><a href="https://www.nybooks.com/articles/2014/10/09/what-your-computer-cant-know/"> review of two books related to AI</a>:</p><blockquote><p>We do not now know enough about the operation of the brain to know how much of the specific biochemistry is essential for duplicating the causal powers of the original. Perhaps we can make artificial brains using completely different physical substances as we did with the heart. 
</p><p>The difficulty with carrying out the project is that we do not know how human brains create consciousness and human cognitive processes.</p></blockquote><p>So although he acknowledges that we don&#8217;t understand much about how the brain works, he is quite confident that whatever is going on is not something that can be simulated &#8212; or synthesized &#8212; on a computer. As he explained in his original 1980 paper:</p><blockquote><p>Whatever else intentionality is, it is a biological phenomenon, and it is as likely to be as causally dependent on the specific biochemistry of its origins as lactation, photosynthesis, or any other biological phenomena.</p></blockquote><p>In other words, a mind has to exist on something that replicates the biological nature of the brain. It can&#8217;t be recreated on a computer system.</p><p>As I mentioned in the first post, such statements comprise an <a href="https://www.synthcog.blog/i/115475240/ipse-dixit">Ipse Dixit</a> fallacy, as Searle provides no proof, logical or empirical, to back them up. They also amount to a <a href="https://www.synthcog.blog/i/115475240/circular-argument">Circular Argument</a> fallacy, in that he&#8217;s basically stating that you can&#8217;t program a computer to create synthesized cognition because there&#8217;s more to cognition than what you can program a computer to synthesize.</p><h4>The Magic in the Machine</h4><p>Let&#8217;s put aside these initial concerns for now and consider the further implications of his theory. The mysterious property of biology that Searle hypothesizes to be responsible for our ability to understand is also the property he believes is responsible for our consciousness. This is why he frequently uses the word intentionality to describe the characteristic of the brain that a computer is unable to replicate. The word intentionality implies not only understanding but also perception and awareness &#8212; the fundamental characteristics of consciousness.</p><p>So to Searle, the proposal that you can&#8217;t create understanding by programming a computer also implies that you can&#8217;t create consciousness. In fact, many consider the Chinese Room an argument against synthetic consciousness as much as synthetic cognition.</p><p>This isn&#8217;t to say, however, that Searle believes that consciousness and understanding are caused by something outside or beyond the brain. Instead, he feels that there&#8217;s some physical quality inherent in the brain that computers lack, and it is this physical quality that is responsible for both understanding and the consciousness inextricably bound to it. </p><p>Searle has, in fact, stressed that he believes it may be possible to duplicate human cognition in an artificial substrate. He just believes that this substrate will have to replicate whatever underlying character of the brain gives rise to intentionality, and that whatever that is, it cannot be simulated on a computer. From his 1980 paper and 2009 article, respectively:</p><blockquote><p>My own view is that only a machine could think, and indeed only very special kinds of machines, namely brains and machines that had the same causal powers as brains. 
</p></blockquote><blockquote><p>Because we know that all of our cognitive processes are caused by brain processes, it follows trivially that any system which was able to cause cognitive processes would have to have relevant causal powers at least equal to the threshold causal powers of the human brain. It might use some other medium besides neurons, but it would have to be able to duplicate and not just simulate the causal powers of the brain.</p></blockquote><p>One might assume that you could just simulate this physical quality of the brain on a computer as well, but Searle doesn&#8217;t think so. To him, the nature of programming and the nature of simulation make this impossible. He termed this belief <em>biological naturalism</em> and described it in detail in a paper of the same name that later appeared in the 2007 book <a href="https://a.co/d/fKUJTFS">The Blackwell Companion to Consciousness</a>. </p><h4>The Supernatural Mind</h4><p>Like Searle, I think that consciousness is integral to understanding and to the intelligence of <a href="https://www.synthcog.blog/i/115475426/artificial-general-intelligence-agi">Artificial General Intelligence</a>. The last part of my <a href="https://www.synthcog.blog/i/115475426/intelligence">functional definition of intelligence</a> is <em>that quality which allows an entity to acquire awareness of its own cognition and of itself as an independent and unique entity distinct from other entities and from its environment.</em></p><p>The idea that intentionality or consciousness can only occur in biological organisms, or at least in a substrate that is in some currently unknown way equivalent to a biological brain, is not unique to Searle. Physicist Roger Penrose is one of the more widely known proponents of the idea that there is some unique aspect to the brain that can&#8217;t be replicated on a computer. He first described his conjectures on this in the 1989 book <a href="https://www.amazon.com/Emperors-New-Mind-Concerning-Computers-ebook/dp/B074JCG4P9/ref=sr_1_1?crid=3C54E1V31K397&amp;dib=eyJ2IjoiMSJ9.VDhlHYFDFxneiHottSYKIkow6VVsewbVRPy2VJxvqiYjqX9FDH7lfzjCeO4N-aISSIUoHo1_1abOEhdGLk9bhzqwt9u6nfIclXz74HLPRgeyO8ktIQUGMUVyFd4qZVbpUcysJJwNDa6zbgfUeevBa7uul1GXu_cyH1rpMNAcGBCiYJqWLEYySdktHkoS9bldzhaibtFZ-XvrGoNdTxWsO_PP4ryiTijdwxcEmC8k0Uw.d0F1Ij_wiPEcxN00nzxdh4Zm3VUseA7fdTzs4TaXFn4&amp;dib_tag=se&amp;keywords=the+emperor%27s+new+mind&amp;qid=1717207641&amp;sprefix=the+emperors+new+mind%2Caps%2C195&amp;sr=8-1">The Emperor&#8217;s New Mind</a>.</p><p>In that book, Penrose proposed that the mind was not algorithmic in nature and suggested that <a href="https://en.wikipedia.org/wiki/G&#246;del%27s_incompleteness_theorems">G&#246;del's incompleteness theorems</a> were proof that there&#8217;s something in the human brain that can&#8217;t be replicated on a computer. He used this hypothesis along with the fact that we don&#8217;t really have any significant understanding of the mechanisms underlying consciousness to conclude that the human brain uses some sort of quantum mechanical mechanism for consciousness and consequently cognition. He later teamed up with anesthesiologist <a href="https://en.wikipedia.org/wiki/Stuart_Hameroff">Stuart Hameroff</a> to suggest that quantum mechanical interactions in the microtubules of the human brain are responsible for consciousness.</p><p>Many, including myself, find their arguments unconvincing. 
G&#246;del's incompleteness theorems are concerned with mathematical logic and the philosophy of mathematics, and applying them to the human mind is a category error that greatly exceeds the bounds of their applicability. There is also no direct scientific evidence to support the idea that quantum effects are responsible for consciousness or intelligence, nor is there any coherent theory as to how such effects would causally result in either.</p><p>There has long been an unfortunate propensity for some to resort to pseudoscientific explanations for things for which we currently have no other explanation. There are few areas of inquiry where this applies more than it does to the study of consciousness. Nobody knows what causes us to experience consciousness. No one knows how widespread the experience is among other living (or non-living) entities. Consciousness seems very strange to us.</p><p>Similarly, quantum mechanics seems very strange to us. Perhaps there is a connection, and someday we&#8217;ll discover some quantum mechanical effect that is responsible for our consciousness. For the last several decades, though, quantum mechanics has been the closest thing we have to &#8220;magic&#8221; in the realm of science, the go-to explanation for anything that seems inexplicable to us. But grasping at magical explanations to fill the gaps in our knowledge is just equivalent to admitting that we don&#8217;t really have any explanation at all.</p><h4>The Domains of Consciousness and Understanding</h4><p>Searle is a philosopher, though, and not a scientist like Penrose. He doesn&#8217;t make any attempt to prove his hypothesis or require any evidence to back it up.</p><p>Philosophy in general has a long and rich history of exploring the human experience and providing insights into ourselves and our society. It&#8217;s less useful, however, when it comes to discovering the actual mechanisms of the physical universe. More specifically, it frequently runs afoul of the twin fallacies of <a href="https://www.synthcog.blog/i/115475240/appeal-to-ignorance">Appeal to Ignorance</a> and <a href="https://www.synthcog.blog/i/115475240/argument-from-incredulity">Argument From Incredulity</a>. </p><p>The first manifests in the unfortunate tendency of some to assume that because something is currently technically impossible it will always be technically impossible. The second can cause some to assume that because something is currently a mystery to science, it must consequently involve some phenomenon that is beyond the reach and realm of science (or at least any science currently conceived of). These are both well-known fallacies, yet people seem to fall prey to them over and over again.</p><p>Searle&#8217;s Chinese Room thought experiment has proven to be useful for showing that it&#8217;s possible to have a computer system that mimics aspects of intelligence without having any real intelligence at all. 
However, I think Searle&#8217;s ideas fall flat when attempting to discount the possibility of creating AGI on a computer.</p><p>There simply doesn&#8217;t seem to be any evidence or logic in Searle&#8217;s philosophical conjectures that precludes replicating on a computer whatever is going on in the human brain. To Searle, the very idea of simulation implies that the thing being simulated is not the real thing. But as I discussed in my last post, Searle&#8217;s confounding of computer simulation with computer synthesis is an <a href="https://www.synthcog.blog/i/115475240/equivocation">Equivocation</a> fallacy. The goal of AGI is not simulating intelligence on a computer &#8212; it&#8217;s synthesizing cognition on a computer.</p><p>When it comes to consciousness, Searle is not alone in hypothesizing some mysterious <a href="https://en.wikipedia.org/wiki/Phlogiston_theory">phlogiston</a>-like constituent of the brain. But hypothesizing about phlogiston did not help scientists discover chemistry and the nature of combustion. Discovery of the mechanisms underlying the natural world requires theories that can predict results and experiments to provide those results.</p><p>Searle asserts that there&#8217;s something missing even in an exact simulation of the brain. There are two possibilities here. </p><p>The first is that something is going on in the brain beyond what is knowable by science, something that cannot be analyzed or synthesized or reproduced. In other words, something supernatural. If that&#8217;s the case, there is not much that science can say about it. This seems unlikely given all the other things that previously seemed beyond what is knowable but are now known.</p><p>The second possibility is that something is happening in the brain that is not computable. Penrose hypothesized some kind of currently unknown quantum mechanical effect to explain the mechanism of the mind. But even if some quantum mechanical effect were integral to consciousness and cognition, there doesn&#8217;t seem to be any underlying theory as to why this quantum effect couldn&#8217;t then be simulated on a quantum computer. </p><p>And a quantum computer isn&#8217;t even necessary. The field of computability theory is concerned with what can and can&#8217;t be computed. There are some types of problems that aren&#8217;t computable, and these typically involve infinities, paradoxes, and/or questions that are simply undecidable. But anything that is computable can be computed on what&#8217;s called a Turing machine, a simple and idealized form of computer. </p><p>All modern computers, including quantum computers, are computationally equivalent to Turing machines, and all Turing machines are equivalent in what they can compute. This means that any quantum mechanical effect in a quantum computer can be simulated on a regular computer. While there are some types of problems that are intractable on a regular computer, this is because they would simply take too long to compute. It&#8217;s not impossible to compute those problems on a regular computer &#8212; it&#8217;s just impractical. The advantage of a quantum computer would be to tackle such problems in a reasonable amount of time. </p><p>So if Searle is correct in his conjecture of biological naturalism and we assume he doesn&#8217;t think brains are supernatural, then they are instead doing something beyond the realm of computation. Unfortunately, he provides nothing to suggest how the brain could accomplish what it does in a non-computational, non-algorithmic way.</p>
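<p>As a small illustration of the computability point above, here is a sketch in plain Python (no quantum hardware or libraries involved) of a classical computer simulating a textbook quantum effect: a qubit put into superposition by a Hadamard gate and then measured. Such simulations scale badly as qubits are added, but that makes them intractable, not impossible, which is exactly the distinction drawn above.</p><pre><code class="language-python"># Simulating one qubit on a classical machine: a state vector of two amplitudes,
# a Hadamard gate, and Born-rule measurement. Nothing here needs quantum hardware.
import math
import random

def hadamard(amps):
    a, b = amps
    s = 1.0 / math.sqrt(2.0)
    return [s * (a + b), s * (a - b)]    # rotates the qubit into an equal superposition

state = hadamard([1.0, 0.0])             # start in the |0> state, apply the gate
probs = [amp * amp for amp in state]     # Born rule: probability is amplitude squared
print(probs)                             # roughly [0.5, 0.5]

outcome = random.choices([0, 1], weights=probs)[0]   # "measure" by sampling an outcome
print("measured", outcome)
</code></pre>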
<p>And so we reach the limits of what can usefully be concluded from Searle&#8217;s Chinese Room thought experiment. We are ultimately left somewhat adrift, given no solid mooring in the perplexities that power our minds, and with only the questionable assurance that understanding and consciousness lie on some distant shore that will inexplicably and forever exceed our computational reach.</p>]]></content:encoded></item><item><title><![CDATA[The Chinese Room Thought Experiment: Simulation and Synthesis]]></title><description><![CDATA[Modeling intelligence versus creating it]]></description><link>https://www.synthcog.blog/p/the-chinese-room-simulation-synthesis</link><guid isPermaLink="false">https://www.synthcog.blog/p/the-chinese-room-simulation-synthesis</guid><dc:creator><![CDATA[DK]]></dc:creator><pubDate>Sun, 19 May 2024 14:01:17 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/a92d2386-9ba1-45ab-b4c6-97696ebd5611_1023x511.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><img src="https://substackcdn.com/image/fetch/$s_!X40j!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68cf3366-30c4-4f89-a9f4-e511e27db33a.heic" width="1024" height="768" alt="a computer monitor on a desk shows a simulation of a rain cloud that is raining and real water is falling onto the desk"></figure></div>
<p>In the <a href="https://www.synthcog.blog/p/the-chinese-room-syntax-semantics">last post</a> I focused on the difference between comprehension and competency explored in the famous thought experiment called the <a href="https://en.wikipedia.org/wiki/Chinese_room">Chinese Room</a>. Philosopher <a href="https://en.wikipedia.org/wiki/John_Searle">John Searle</a> first proposed the experiment in a <a href="https://web-archive.southampton.ac.uk/cogprints.org/7150/1/10.1.1.83.5248.pdf">1980 paper</a>. The experiment is apropos to the question of whether there&#8217;s any actual understanding in today&#8217;s LLM-based systems.</p><p>The simplified version of the experiment Searle presented in an <a href="http://www.scholarpedia.org/article/Chinese_room_argument">article published in 2009</a> is as follows: </p><p>Searle proposed placing himself in a room in which he is passed papers with questions written in Chinese, and his task is to answer those questions in Chinese. He doesn&#8217;t understand Chinese himself, just English. However, he does have a list of Chinese characters and instructions in English on how to correlate the Chinese characters to the questions in a way that allows him to write down the answers to the questions without understanding the questions or the answers.</p>
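<p>Stated as a program, the room is nothing more than a lookup table (a minimal sketch; the entries below are invented): it can return a fluent-looking answer while containing no understanding of Chinese at all, which is the gap between competency and comprehension the experiment is meant to expose.</p><pre><code class="language-python"># The Chinese Room as a program: the "instructions" are a lookup table mapping
# symbol sequences to symbol sequences. The entries are invented for illustration;
# Searle's point is that following them requires no understanding of either side.

RULE_BOOK = {
    "你好吗？": "我很好，谢谢。",            # "How are you?" -> "I am fine, thank you."
    "天空是什么颜色？": "天空是蓝色的。",      # "What color is the sky?" -> "The sky is blue."
}

def chinese_room(question):
    """Produce a fluent-looking answer by pure symbol manipulation."""
    return RULE_BOOK.get(question, "对不起，我不明白。")    # "Sorry, I do not understand."

print(chinese_room("你好吗？"))
</code></pre>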
<p>Searle specified two principles in that article which he believed were at the heart of the experiment:</p><blockquote><p>The Chinese Room Argument thus rests on two simple but basic principles, each of which can be stated in four words.</p><p><strong>    First</strong>: Syntax is not semantics.</p><p>Syntax by itself is not constitutive of semantics nor by itself sufficient to guarantee the presence of semantics.</p><p><strong>    Second</strong>: Simulation is not duplication.</p></blockquote><p>I discussed the first principle in the last post, and in this post I&#8217;ll discuss the second. These two principles are the foundation of his argument against the possibility of creating what he terms Strong AI, equivalent to what we would call AGI today, as compared to Weak AI, which is what current AI systems are.</p><p>Searle believes that the only AI we can program into computers is Weak AI, in which aspects of the human mind can be simulated on a computer such that the resulting behavior of the computer system may give the appearance of intelligence. But no matter how many of these processes are simulated on a computer, the capabilities of the human mind &#8212; such as cognition and understanding &#8212; can never be duplicated.</p><p>From his 2009 article:</p><blockquote><p>Computer programs which simulate cognition will help us to understand cognition in the same way that computer programs which simulate biological processes or economic processes will help us understand those processes. The contrast is that according to Strong AI, the correct simulation really is a mind. According to Weak AI, the correct simulation is a model of the mind.</p></blockquote><h4>The Brain Simulator</h4><p>One of the better-known arguments against Searle&#8217;s Chinese Room is usually referred to as <em>The Brain Simulator Reply, </em>and it helps to illustrate Searle&#8217;s second principle.</p><p>This argument proposes replacing the computer program (the instructions) that describes how to manipulate the Chinese characters with an exact computer simulation of the brain of a person who understands Chinese down to the neuronal level. In other words, the computer is programmed to create data structures and processes that directly simulate the functioning of a human brain that understands Chinese. The success of this programmed computer system would imply that the system understands Chinese.</p><p>Searle&#8217;s refutation of this argument involves a modification of the Chinese Room experiment in the following way:</p><p>Suppose we modify the Chinese Room to have an elaborate set of water pipes with valves connecting them. Each valve represents a neuron in the brain of the person who understands Chinese, and the pipes represent all the connections between neurons. Turning a valve on or off represents the firing or suppression of a neuron, respectively. At one end of the structure, the results of the water pipe processing can be read. </p><p>The English instructions given to the man no longer guide him in directly manipulating the Chinese characters but instead tell him which valves to turn on and off and the order in which to do so. By turning on and off the right valves in the right order, the water pipe brain is able to answer the Chinese questions in Chinese.</p>
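<p>The computational content of that description is small. Here is a sketch of the idea (toy weights and wiring, nothing remotely like a real brain): each "neuron" is just a unit that opens when the weighted flow from its inputs crosses a threshold, and nothing in the computation cares whether the unit is realized as a water valve, a relay, or a line of Python.</p><pre><code class="language-python"># A toy "water-pipe neuron": a valve that opens (fires) when the weighted flow
# from its inputs crosses a threshold. The weights and wiring are invented;
# a brain-scale simulation would need billions of these units.

def valve(inputs, weights, threshold):
    """Return 1 (valve open, neuron fires) if the weighted input reaches the threshold."""
    drive = sum(i * w for i, w in zip(inputs, weights))
    return 1 if drive >= threshold else 0

# Two input valves feeding one output valve, wired so the output opens
# only when both inputs fire: a tiny AND-like circuit.
layer_one = [valve([1], [1.0], 0.5), valve([1], [1.0], 0.5)]
answer = valve(layer_one, [0.6, 0.6], 1.0)
print(answer)   # 1: the "pipes" computed something, whatever the substrate
</code></pre><p>Whether a network of such units, scaled up to an entire brain, would thereby understand anything is exactly the question Searle answers in the negative below.</p>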
<p>By turning on and off the right neurons in the right order, the water pipe brain is able to answer the Chinese questions in Chinese.</p><p>Searle concludes the following:</p><blockquote><p>Now where is the understanding in this system? It takes Chinese as input, it simulates the formal structure of the synapses of the Chinese brain, and it gives Chinese as output. But the man certainly doesn&#8217;t understand Chinese, and neither do the water pipes, and if we are tempted to adopt what I think is the absurd view that somehow the conjunction of man and water pipes understands, remember that in principle the man can internalize the formal structure of the water pipes and do all the "neuron firings" in his imagination.</p></blockquote><p>Now, an obvious and valid critique is whether or not such a system, even in the idealized world of a thought experiment, could actually simulate the functioning of a brain well enough to answer questions. But putting that aside, there are still issues with Searle&#8217;s statement above.</p><p>Searle claims that in such a system the water pipes do not understand Chinese. Yet Searle also posits that these water pipes do actually replicate the neuronal functioning of a human brain and do it successfully enough to interpret questions in Chinese and output appropriate answers in Chinese.</p><p>So why should we assume, as Searle does, that this system modeled on a human brain lacks the understanding engendered by that human brain?</p><p>Unfortunately, Searle doesn&#8217;t provide anything to back up his assertion. It amounts to an <a href="https://www.synthcog.blog/i/115475240/ipse-dixit">Ipse Dixit</a> fallacy, in that he simply makes the claim and does so without proof or evidence. At the very least, one can argue that it&#8217;s possible that encoded into the structure and operation of the water pipes is understanding, just as it&#8217;s encoded into the neural network of a Chinese speaker&#8217;s brain.</p><p>There&#8217;s also a little sleight of hand going on here. Searle claims there&#8217;s no understanding in the man and there&#8217;s no understanding in the water pipes. However, he&#8217;s left out the instructions for manipulating the water pipes. Only together with the instructions are the pipes able to simulate the corresponding brain that answers questions in Chinese. The instructions, of course, were created by one or more humans who do understand Chinese.</p><p>The possibility of successfully simulating the structure and processes of a brain with water pipes (or clockwork or electrons in a silicon chip or any non-brain substrate) seems unintuitive to us. This is not, however, an argument against its being possible, at least in theory. General relativity and quantum mechanics are both pretty unintuitive, but both seem to work pretty well. There are obviously many things our brains did not evolve to directly intuit millennia ago on the savannahs and in the jungles of Africa.</p><h4>Simulation and Duplication</h4><p>This brings us back to Searle&#8217;s second principle listed above and the motivation behind his statement that the water pipes do not understand: <em>simulation is not duplication</em>. 
In other words, a simulation of something is never equivalent to the thing being simulated. So why does he believe this to be the case?</p><p>His argument rests on several conjectures. First, he believes that simply simulating the behavior of a system is not sufficient to duplicate it. As he stated in his 2009 article:</p><blockquote><p>In order actually to create human cognition on a machine, one would not only have to simulate the behavior of the human agent, but one would have to be able to duplicate the underlying cognitive processes that account for that behavior.</p></blockquote><p>This is why he believes that the water pipe brain simulator would never actually understand anything the way a human can, as it only replicates the behavior of the brain rather than all the processes that underlie its functioning. From his 1980 paper:</p><blockquote><p>The problem with the brain simulator is that it is simulating the wrong things about the brain. As long as it simulates only the formal structure of the sequence of neuron firings at the synapses, it won't have simulated what matters about the brain, namely its causal properties, its ability to produce intentional states. And that the formal properties are not sufficient for the causal properties is shown by the water pipe example: we can have all the formal properties carved off from the relevant neurobiological causal properties.</p></blockquote><p>As mentioned in the previous post, what Searle means by intentional states are those internal states of the mind that are about or directed towards beliefs, desires, and perceptions of objects, events, or conditions in the world. The causal properties of the brain are what allow it to use these mental states to influence behavior and actions in the real world.</p><h4>The Domains of the Real and the Digital</h4><p>Searle is even more skeptical of a digital computer simulation than something like the water pipe simulation, the latter of which at least represents a physical construction in the real world. To him, there is an immutable barrier between the digital world and the physical world that can never be bridged.</p><p>In his <a href="https://www.nybooks.com/articles/2014/10/09/what-your-computer-cant-know/">2014 </a><em><a href="https://www.nybooks.com/articles/2014/10/09/what-your-computer-cant-know/">The New York Review of Books</a></em><a href="https://www.nybooks.com/articles/2014/10/09/what-your-computer-cant-know/"> review of two books related to AI</a>, he stated:</p><blockquote><p>Computer models were useful in constructing artificial hearts, but such a model is not an actual functioning causal mechanism. The actual artificial heart has to duplicate the causal powers of real hearts to pump blood. Both the real and artificial hearts are physical pumps, unlike the computer model or simulation.</p></blockquote><p>So the idea of emulating a human brain on a computer by duplicating it neuron-by-neuron in digital form is not something Searle considers to be possible:</p><blockquote><p>But the computational emulation of the brain is like a computational emulation of the stomach: we could do a perfect emulation of the stomach cell by cell, but such emulations produce models or pictures and not the real thing. 
Scientists have made artificial hearts that work but they do not produce them by computer simulation; they may one day produce an artificial stomach, but this too would not be such an emulation.</p></blockquote><p>Searle has made it clear that he believes it may be possible one day to create a computer simulation of the structure and functioning of a human brain all the way down to its specific network of neurons and synaptic activations. And he believes it may be possible to do so such that it actually appears to understand Chinese the same way the mind it&#8217;s modeled on understands Chinese.</p><p>And yet, even with this incredibly complex simulation, he believes it still won&#8217;t actually understand anything. From his 2009 article:</p><blockquote><p>Computer simulations of thought are no more actually thinking than computer simulations of flight are actually flying or computer simulations of rainstorms are actually raining. The brain is above all a causal mechanism and anything that thinks must be able to duplicate and not merely simulate the causal powers of the causal mechanism. The mere manipulation of formal symbols is not sufficient for this.</p></blockquote><p>As a brief aside, Searle had originally directed his claim towards computers with formal logic programming to manipulate symbols, but over the years he&#8217;s shifted focus slightly to just symbol manipulation as the main impediment to duplicating the brain on a digital computer. This may be because machine learning has largely relegated formal logic AI programming to the sidelines.</p><p>Instead, machine learning uses complex data structures, statistical analysis, and various mathematical techniques. But deep down every digital computer system is still a symbol manipulator, because no matter what type of programming is used, it all boils down to juggling 0s and 1s.</p><h4>The Nature of Simulation</h4><p>An issue that becomes apparent when examining the back and forth between Searle and his critics over the years is the use of the word simulation. It&#8217;s an imprecise word, and this fuzziness can lead smart and knowledgeable people to disagree and argue past each other because they&#8217;re not referring to the same thing.</p><p>To Searle, simulation refers to modeling some real-world phenomenon on a digital computer. This is frequently how other people use the term as well. Simulations can model things like traffic flow, financial activities, or dynamic systems, such as water and smoke, for games and visual effects. They can consist of some simple math calculations to determine the arc of a projectile or incredibly complex calculations with thousands of parameters that vary in space and time to describe a highly chaotic phenomenon like climate.</p><p>But there&#8217;s another way the term simulation can be used: a simulation can replicate a real-world phenomenon even if it doesn&#8217;t duplicate the process that produced that phenomenon. For clarity, I&#8217;ll refer to this kind of simulation as synthesis, which I think is more accurate as well. I discussed the use of the words synthesis and synthetic <a href="https://www.synthcog.blog/i/115983809/real-and-fake-intelligence">here</a> in reference to the name of this blog and my preference for referring to AGI as Synthetic Cognition.</p><p>Synthesis is simply using technology to recreate something that occurs naturally or that is typically created with naturally occurring materials.</p>
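<p>To make the distinction concrete, here is a rough sketch in code (purely illustrative, with made-up numbers): the first function merely describes a phenomenon, while the second produces the phenomenon itself.</p><pre><code>import math

# Simulation: a model of a phenomenon. These coordinates describe a
# projectile's arc; nothing is actually flying anywhere.
def simulate_arc(v0=20.0, angle_deg=45.0, steps=10):
    g, a = 9.81, math.radians(angle_deg)
    t_flight = 2 * v0 * math.sin(a) / g
    times = [i * t_flight / steps for i in range(steps + 1)]
    return [(v0 * math.cos(a) * t, v0 * math.sin(a) * t - 0.5 * g * t * t) for t in times]

# Synthesis: the output *is* the phenomenon. Sent to a speaker, these samples
# are a real 440 Hz tone, not a description of one.
def synthesize_tone(freq=440.0, seconds=1.0, rate=44100):
    return [math.sin(2 * math.pi * freq * n / rate) for n in range(int(seconds * rate))]</code></pre><p>The first list of numbers is only ever a picture of a trajectory; the second, once it reaches a speaker, is simply sound.</p>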
<p>Often what&#8217;s synthesized is as good as or better than the natural version, such as synthetic diamonds. Sometimes it just has desirable characteristics not available in nature, such as certain types of synthetic fibers. An artificial heart such as the one Searle mentioned in the quote above is a synthetic heart.</p><p>But there are also a lot of things that can be synthesized on a computer, things that are just as functional as their physical counterparts. A spreadsheet is just a simulation of a paper ledger, and an ebook is a simulation of a book. They&#8217;re both at least as functional as their physical counterparts, but they are completely synthetic and only exist as a stream of 0s and 1s.</p><p>When you have a phone conversation over a cell phone, the voice you hear is not a real voice. It&#8217;s a synthetic recreation of a voice that&#8217;s been broken down into 0s and 1s and sent to your phone via electromagnetic waves. Once there, the digital information is processed and used to make a speaker vibrate appropriately to recreate the sound of the original voice. Neither party actually hears the other at all, yet the conversation is as real as any taking place face-to-face.</p><p>We can synthesize the sound of a grand piano to such a high degree of fidelity that it&#8217;s nearly indistinguishable from the real thing. What&#8217;s simulated is not the process a piano uses to create sound, but instead the phenomenon of the sound itself. The synthesized music you hear is not simulated music, not artificial music &#8212; it&#8217;s just music.</p><p>The goal of Strong AI and AGI is synthesized cognition on a computer. It&#8217;s neither guaranteed nor obvious that this is possible. Yet, there&#8217;s also no evidence against it, nor any logical proof that denies its possibility (at least so far). So while Searle states that simulation is not duplication, it&#8217;s perhaps more useful to state that the type of simulation he refers to is not synthesis. If one is able to synthesize cognition on a computer, then what results may reasonably be considered cognition, real cognition, and not a simulation of cognition.</p><p>The question to resolve, then, is which analogy most accurately describes how mind and brain are related. Does mind require physical processes in the brain akin to the flow of blood through the heart? Or is mind more like the music of the brain?</p><h4>The Substance of Understanding and Consciousness</h4><p>It&#8217;s clear that Searle doesn&#8217;t believe human cognition can ever be duplicated on a computer. But what about the water pipe example? 
That represented a physical duplication of the neural network of a brain, and yet Searle still doesn&#8217;t believe it could replicate human cognition.</p><p>As I mentioned in the last post, Searle stresses the primacy of biology for any machine to have intentionality. He uses the term <em>intentionality</em> as a more specific word than understanding or comprehension, as intentionality implies not only understanding but consciousness.</p><p>To Searle, the proposal that programming a computer can never result in understanding also implies that it can never result in consciousness. In fact, the Chinese Room experiment is an argument against synthetic consciousness as much as synthetic cognition.</p><p>This isn&#8217;t to say that Searle believes that consciousness and understanding are caused by something outside or beyond the physical brain. Instead, he feels that there&#8217;s some physical quality inherent in a biological brain that computers and water pipes lack, and it is this physical quality that makes possible both understanding and the consciousness inextricably bound to it. </p><p>This idea is the topic of the next post.</p>]]></content:encoded></item><item><title><![CDATA[The Chinese Room Thought Experiment: Syntax and Semantics]]></title><description><![CDATA[AI competency versus AGI comprehension]]></description><link>https://www.synthcog.blog/p/the-chinese-room-syntax-semantics</link><guid isPermaLink="false">https://www.synthcog.blog/p/the-chinese-room-syntax-semantics</guid><dc:creator><![CDATA[DK]]></dc:creator><pubDate>Sat, 04 May 2024 14:01:50 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!W6ET!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce22d259-4e20-4af9-8ee5-b0cac5b0dfee.heic" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!W6ET!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce22d259-4e20-4af9-8ee5-b0cac5b0dfee.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!W6ET!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce22d259-4e20-4af9-8ee5-b0cac5b0dfee.heic 424w, https://substackcdn.com/image/fetch/$s_!W6ET!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce22d259-4e20-4af9-8ee5-b0cac5b0dfee.heic 848w, https://substackcdn.com/image/fetch/$s_!W6ET!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce22d259-4e20-4af9-8ee5-b0cac5b0dfee.heic 1272w, https://substackcdn.com/image/fetch/$s_!W6ET!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce22d259-4e20-4af9-8ee5-b0cac5b0dfee.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!W6ET!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce22d259-4e20-4af9-8ee5-b0cac5b0dfee.heic" width="1456" height="971" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ce22d259-4e20-4af9-8ee5-b0cac5b0dfee.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1052147,&quot;alt&quot;:&quot;A man writes at a desk with stacks of paper around him and papers with Chinese writing tacked to the wall in front of him.&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="A man writes at a desk with stacks of paper around him and papers with Chinese writing tacked to the wall in front of him." title="A man writes at a desk with stacks of paper around him and papers with Chinese writing tacked to the wall in front of him." srcset="https://substackcdn.com/image/fetch/$s_!W6ET!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce22d259-4e20-4af9-8ee5-b0cac5b0dfee.heic 424w, https://substackcdn.com/image/fetch/$s_!W6ET!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce22d259-4e20-4af9-8ee5-b0cac5b0dfee.heic 848w, https://substackcdn.com/image/fetch/$s_!W6ET!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce22d259-4e20-4af9-8ee5-b0cac5b0dfee.heic 1272w, https://substackcdn.com/image/fetch/$s_!W6ET!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce22d259-4e20-4af9-8ee5-b0cac5b0dfee.heic 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>There&#8217;s a been a lot of discussion recently over the nature of LLMs and AGI and whether or not the former represents a toehold on the latter. 
I&#8217;ve discussed this in several previous posts, most recently <a href="https://www.synthcog.blog/p/conjuring-agi-from-ai">here</a> and <a href="https://www.synthcog.blog/p/the-semantic-slide-of-agi">here</a>.</p><p>Related to this discussion is a famous thought experiment by the philosopher <a href="https://en.wikipedia.org/wiki/John_Searle">John Searle</a> known as the <em><a href="https://en.wikipedia.org/wiki/Chinese_room">Chinese Room</a></em>.  The experiment was first proposed at a lecture Searle gave at Yale and later published in a paper titled <em><a href="https://web-archive.southampton.ac.uk/cogprints.org/7150/1/10.1.1.83.5248.pdf">Minds, Brains, and Programs</a></em> in the journal <em>Behavioral and Brain Sciences</em> in 1980.</p><p>The paper caused quite a stir and generated extensive commentary, mostly in the form of arguments against it. Many considered it an argument against the possibility of creating what we call AGI today, but Searle specifically said that was not his intent nor what should be concluded from the paper. </p><p>To understand the idea he was trying to get across, it&#8217;s important to put the paper into the context of its time. In 1980, almost all AI research involved directly programming formal symbolic systems. This meant specifying formal rules for manipulating  and associating symbolic information. The symbols in question could represent things such as objects, words, or concepts.</p><p>Searle defined two types of AI to conform to the thinking of AI research at the time, <em>Weak AI</em> and <em>Strong AI</em>:</p><blockquote><p>According to weak AI, the principal value of the computer in the study of the mind is that it gives us a very powerful tool. For example, it enables us to formulate and test hypotheses in a more rigorous and precise fashion. But according to strong AI, the computer is not merely a tool in the study of the mind; rather, the appropriately programmed computer really is a mind, in the sense that computers given the right programs can be literally said to understand and have other cognitive states.</p></blockquote><p>In other words, Weak AI is AI that replicates certain limited capabilities of human cognition without any underlying comprehension. Strong AI replicates human cognition and consequently both its capabilities and its ability to comprehend. In today&#8217;s terms, Weak AI is equivalent to AI and Strong AI is equivalent to AGI (<a href="https://www.synthcog.blog/i/141622552/the-semantic-slide">at least the way AGI has most often been defined</a>).</p><p>The Chinese Room thought experiment was an argument against the idea that Strong AI could be created by programming and that it could exist on hardware such as that used in computers. If it&#8217;s possible to create AGI at all, he believes that it will have to be done using processes like those in the brain and on hardware like the brain.</p><p>To demonstrate this, Searle proposed placing himself in a room with paper and pencil. He has no understanding of Chinese, but he&#8217;s given papers listing Chinese characters (i.e., the logographs of Chinese writing), a story in Chinese and a series of questions in Chinese. He&#8217;s also given English instructions on how to correlate the Chinese questions to the Chinese story using the Chinese characters.</p><p>Since he doesn&#8217;t know Chinese, he doesn&#8217;t know what the questions or story are or even that they&#8217;re a story and questions about that story. 
All he knows is that he has instructions on how to take these three inputs and output Chinese characters.</p><p>To an observer outside the room (at least one who knows Chinese), Searle appears to understand the story and the questions. Yet, all he really understands are the English instructions &#8212; the program &#8212; that advise him on which characters to write on the paper given the Chinese characters handed to him. </p><p>In 2009, Searle wrote a follow-up <a href="http://www.scholarpedia.org/article/Chinese_room_argument">article</a> in which he simplified the experiment such that the Searle in the room has a database of Chinese characters and instructions in English that allow him to answer questions in Chinese sent to him with answers that are also in Chinese. He also clarified what his intent was with the experiment:</p><blockquote><p>The Chinese Room Argument thus rests on two simple but basic principles, each of which can be stated in four words.</p><p><strong>    First</strong>: Syntax is not semantics.</p><p>Syntax by itself is not constitutive of semantics nor by itself sufficient to guarantee the presence of semantics.</p><p><strong>    Second</strong>: Simulation is not duplication.</p></blockquote><p>In this post, I&#8217;d like to concentrate on the first principle. What Searle is stating here is that merely being able to manipulate and arrange words successfully (syntax) does not imply or necessitate understanding the concepts behind those words (semantics). This is perhaps a more important principle to consider today than it was when Searle first proposed it, for it underlies the current debate about LLMs and their ability to understand anything at all.</p><h4>Thought Experiment to the Real World</h4><p>Searle&#8217;s task in the experiment is to appear to both comprehend questions in Chinese and provide answers to those questions in Chinese without actually knowing any Chinese. </p><p>There have been a number of arguments against Searle&#8217;s Chinese Room conclusions over the years, and Searle has written rebuttals to them, some more convincing than others. But none of the arguments against the Chinese Room really do a good job of addressing the point of the thought experiment or the issues with it. A number of these arguments and Searle&#8217;s rebuttals can be found <a href="https://iep.utm.edu/chinese-room-argument/#H1">here</a> and <a href="https://plato.stanford.edu/entries/chinese-room/#SystRepl">here</a>.</p><p>Searle claims that his scenario is equivalent to programming a computer to do the task, with Searle as the computer and the English instructions as the programming. His first conclusion is that no actual comprehension of Chinese is required to complete this task. His next conclusion is that no programming of a computer system can ever lead to comprehension within that system. </p><p>A lot of time has passed since Searle first proposed the thought experiment, and we have a lot more real world empirical data available with which to evaluate it. In fact, the thought experiment is now something that we can easily do in the real world using ChatGPT, Gemini, Claude, etc. 
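For instance, a rough sketch of the operator&#8217;s side of such an exchange might look like the following (this assumes the OpenAI Python client; the model name and prompts are placeholders, and any comparable service would do):</p><pre><code>from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

question_in_chinese = "..."  # a question the operator cannot read

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[
        {"role": "system", "content": "Answer the user's question in Chinese."},
        {"role": "user", "content": question_in_chinese},
    ],
)

print(response.choices[0].message.content)  # an answer the operator also cannot read</code></pre>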
<p>This means that in many ways, the experiment is very closely related to what LLMs do today and the question of whether they possess any understanding of the real world.</p><h4>The Rules of Language Translation</h4><p>The task of the Chinese Room experiment is also closely tied to the problem of machine language translation. Scientists struggled for decades with this task, with the first concerted efforts starting in the early 1950s. Early approaches involved directly programming the translation engine, and the results were not particularly good.</p><p>During the 1990s, systems using a statistical approach began gaining steam. These systems ingested large volumes of text in the target languages, most often datasets of matching dual-language texts, and then used statistical analysis to achieve the final results. These were better than the rule-based systems but still not great.</p><p>Starting in the mid-2010s, however, neural network machine learning techniques began to be used in what were called neural machine translation engines. This has been the most successful and widely used technique so far, and is the technique used in systems such as Google Translate. A related offshoot of this is the translation capabilities of LLMs today, which are relatively good though typically not as good as specialized translation engines.</p><p>LLMs, however, have the added ability to actually answer questions in a language the operator does not know. Thus, what was a hypothetical scenario need no longer remain so. In other words, if Searle is in the room with his phone, he can use ChatGPT or Gemini to actually perform his task in the Chinese Room.</p><h4>The Limits of Formal Rules</h4><p>Before diving into whether Searle&#8217;s conclusions hold up given our ability to test them (at least to some degree) in the real world, it&#8217;s worth pointing out a foundational flaw in the experiment that the empirical evidence of the last 40-plus years has clarified. This is the assumption that it&#8217;s possible to create instructions which would be sufficient to guide Searle to successfully completing his task in the Chinese Room.</p><p>Rule-based translation techniques and rule-based attempts to answer random questions directly or in relation to a source text have simply proven to be highly ineffective. The possibility that Searle could have a rule book, even one of infinite length, that would allow him to complete his tasks in the Chinese Room seems pretty remote. There are an infinite number of questions to ask, and languages are also always in flux, so no matter how complete your rule book might be, there will always be questions and nuances of language that are beyond its scope.</p><p>So while we can&#8217;t say a usable system along these lines is impossible, we can say that all attempts to do anything like this approach since the conception of AI have provided evidence against its viability.</p>
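<p>A deliberately naive sketch makes the problem plain. If the &#8220;rule book&#8221; is taken literally as a lookup from questions to answers (a simplification used here purely for illustration), then any question its authors did not anticipate simply has no entry:</p><pre><code># A toy rule book: literal question-to-answer pairs (illustrative only).
RULE_BOOK = {
    "你好吗？": "我很好，谢谢。",      # "How are you?" -> "I'm fine, thanks."
    "你会说中文吗？": "会，请继续。",  # "Do you speak Chinese?" -> "Yes, go on."
}

def answer(question):
    # Anything the authors did not anticipate falls through to a non-answer.
    return RULE_BOOK.get(question, "？？？")

print(answer("你好吗？"))        # covered by a rule
print(answer("你叫什么名字？"))  # a novel question: the book is silent</code></pre><p>Real rule-based systems were far more sophisticated than a lookup table, but the limitation is the same in kind: the rules cover only what their authors thought to write down.</p>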
<p>This may seem like an unimportant detail given that this is just a thought experiment, but its importance will become clearer below as it relates to LLMs.</p><p>Thus, the base premise of the thought experiment, even in its perfect &#8220;<a href="https://en.wikipedia.org/wiki/Spherical_cow">spherical cow</a>&#8221; world, rests on a proposition that is quite likely too complex to ever implement. This is the first step of the experiment, but it&#8217;s not established as correct or even possible. The speculation that follows from this faulty premise commits an <a href="https://www.synthcog.blog/i/115475240/unproven-basis">Unproven Basis</a> fallacy: it bases a conclusion on an unproven foundation, and thus it can&#8217;t really be relied upon.</p><p>However, another limitation of Searle&#8217;s thought experiment actually negates this problem. Searle assumes a very narrow view of programming, one that confines programming to creating rules to directly manipulate symbols. The successful Deep Learning AI systems we have today don&#8217;t work that way.</p><p>Instead, these contemporary systems consist of data structures and mathematical processes that are able to recognize the patterns of symbols in data and restructure themselves to associate those symbols in meaningful ways using statistical analysis. This requires vast amounts of human-created data and typically some degree of human guidance. The actual programming involved provides the structure and processes for the (mostly) self-assembled correlations of data rather than any explicit specification of those correlations.</p><p>So it is now possible to create a system that will do exactly what the system in the experiment does, just not in the way Searle described. Given this modification to the experiment, is Searle still correct in his conclusion that the system itself has no comprehension of what it&#8217;s talking about?</p><h4>An LLM in the Chinese Room</h4><p>One of the most famous arguments against the Chinese Room is known as the <a href="https://plato.stanford.edu/entries/chinese-room/#SystRepl">System Reply</a>. This argument states that while the man doesn&#8217;t understand Chinese, the combination of the man with the papers of instructions and questions together forms a system that understands Chinese. The understanding is embodied in the paper instructions, not the man using those instructions, and thus it is possible to program understanding.</p><p>Obviously, the main problem with this is the one mentioned above, namely that this system won&#8217;t actually work, so ascribing comprehension to it is moot. Searle&#8217;s rebuttal was to reframe the scenario so that the man memorizes all the rules and carries them in his head. 
Now he is the only part of the system into which questions written in Chinese are fed and from which answers in Chinese come out, yet he still has no understanding of Chinese.</p><p>In the context of the thought experiment, this is a fair if not completely satisfying rebuttal. The problem is that the understanding is not in the papers and pencils given to the man, but instead in the people who wrote all the instructions on the papers. Without that human understanding, the man in the Chinese Room would not be able to complete the task. If we update the scenario so that the man in the room is using an LLM to answer the questions, the understanding is likewise in the people who created all the data that was ingested to train the LLM rather than the LLM or the man in the room.</p><p>And yet, Searle is correct in that the system, whether paper and pencils or LLMs on his phone, still has no understanding of semantics, no comprehension of the meaning behind the questions or answers. Whether based on formal logic or statistical analysis, the system has, in the words of the late, great <a href="https://garymarcus.substack.com/p/daniel-dennett-1942-2024">Daniel Dennett</a>, competence without comprehension.</p><p>This is relevant to one of the more interesting questions in today&#8217;s LLM debates, which is whether these systems actually understand anything to create their impressive results. Some scientists believe that current LLMs do actually have some degree of understanding about the world around them. They believe, depending on the particular system, that they may even have developed an internal model of the world, a model at least somewhat similar to the ones humans have. </p><p>Yet there seems to be very little evidence to support this opinion,  and, in fact, substantial evidence against it. I&#8217;ve discussed this issue in several previous posts (such as <a href="https://www.synthcog.blog/p/summoning-the-demon">this one</a>, <a href="https://www.synthcog.blog/p/conjuring-agi-from-ai">this one</a>, and <a href="https://www.synthcog.blog/p/the-semantic-slide-of-agi">this one</a>). I suspect most in the field (at least if they&#8217;re talking off the record) would agree. Perhaps the most vocal computer scientist speaking out against this thesis is NYU Professor and Chief AI Scientist at Meta, <a href="https://en.wikipedia.org/wiki/Yann_LeCun">Yann LeCun</a>.</p><h4>The Argument Against AGI</h4><p>In his 2009 article, Searle breaks down his experiment in the following way:</p><blockquote><p><strong>Premise 1:</strong> Implemented programs are syntactical processes.</p><p><strong>Premise 2</strong>: Minds have semantic contents.</p><p><strong>Premise 3</strong>: Syntax by itself is neither sufficient for nor constitutive of semantics.</p><p><strong>Conclusion</strong>: Therefore, the implemented programs are not by themselves constitutive of, nor sufficient for, minds. In short, Strong Artificial Intelligence is false.</p></blockquote><p>So Searle believes his thought experiment is ultimately an argument against what he calls Strong AI, which is similar to what we would today call AGI, being implemented with programs on a computer.</p><p>In the original 1980 paper, Searle states:</p><blockquote><p>&#8220;But could something think, understand, and so on solely in virtue of being a computer with the right sort of program? 
Could instantiating a program, the right program of course, by itself be a sufficient condition of understanding?&#8221; This I think is the right question to ask, though it is usually confused with one or more of the earlier questions, and the answer to it is no.</p></blockquote><p>As discussed above, Searle used a very narrow definition of programming and computation. However, swapping that out for our modern day examples of LLM systems, it turns out that the first part of his conclusion &#8212; that the program completing the task in The Chinese Room is not constitutive of a mind &#8212; has now been proven. In other words, it is possible to have competency in this task with no comprehension.</p><p>Thus what&#8217;s surprising, and even shocking to many, is not just that the task can be done at all, but that it can be done <em>without</em> understanding. In fact, it&#8217;s so surprising that many people, including scientists, are reluctant to accept this conclusion.</p><h4>Machines and Minds</h4><p>But this is only one part of Searle&#8217;s conclusion. He correctly states that one can&#8217;t infer understanding from such a programmed system, but he also states that such a system is insufficient to create a mind, something that embodies cognition and is capable of understanding.</p><p>Searle feels that what distinguishes a mind from a computer program, as well as what distinguishes Weak AI from Strong AI, is intentionality. This is the capacity of the mind to represent objects and affairs in the world, to have internal states that are about or directed towards beliefs, desires, and perceptions of objects, events, or conditions in the world.</p><p>Searle stated the following at the end of his 1980 paper:</p><blockquote><p>The point is that the brain's causal capacity to produce intentionality cannot consist in its instantiating a computer program, since for any program you like it is possible for something to instantiate that program and still not have any mental states. Whatever it is that the brain does to produce intentionality, it cannot consist in instantiating a program since no program, by itself, is sufficient for intentionality.</p></blockquote><p>This leads to his overall conclusion that AGI, or Strong AI as he terms it, is impossible in a programmed computer system. We now have evidence that the conclusion of the first sentence in the above quote is valid (at least in the special case of the Chinese Room experiment). However, the conclusion of the second sentence amounts to an <a href="https://www.synthcog.blog/i/115475240/appeal-to-ignorance">Appeal to Ignorance</a> fallacy, i.e. a claim that something is true simply because it hasn&#8217;t been proven false.</p><p>It has now been shown that a programmed computer system can complete the Chinese Room tasks by outputting appropriate responses and yet have no understanding of the questions or responses. But this in no way implies the obverse, i.e. that it&#8217;s not possible to create a programmed computer system that completes those tasks and understands them. Nothing in his thought experiment demonstrates this, either, nor does he provide any sort of proof for the claim.</p><p>The case is similar to the popular aphorism usually attributed to scientist and science communicator Carl Sagan: <em>Absence of evidence is not evidence of absence. 
</em> All the experiment demonstrates is that it&#8217;s not necessary to have semantic understanding of language to manipulate it in a way that appears to require such understanding.</p><p>Searle also states in that paper&#8217;s conclusion:</p><blockquote><p>Whatever else intentionality is, it is a biological phenomenon, and it is as likely to be as causally dependent on the specific biochemistry of its origins as lactation, photosynthesis, or any other biological phenomena. </p></blockquote><p>This is the <a href="https://www.synthcog.blog/i/115475240/ipse-dixit">Ipse Dixit</a> fallacy, an assertion made without proof or evidence, and it is the basis for the above Unproven Basis fallacy. While it may or may not be true, it&#8217;s not elaborated on much in the paper nor is it proven or demonstrated by the thought experiment. It&#8217;s an assertion that requires supporting evidence, but none is offered, and so far there is none provided by research in the field. All we know is that we haven&#8217;t created intentionality in a non-biological medium <em>yet</em>.</p><p>Searle does have a reason for his assertion, however, and this takes us back to his second principle of the Chinese Room argument according to his 2009 article: <em>Simulation is not duplication.</em></p><p>And this is the topic of the next post&#8230;</p>]]></content:encoded></item><item><title><![CDATA[The Semantic Slide of AGI]]></title><description><![CDATA[The use and misuse of the term Artificial General Intelligence]]></description><link>https://www.synthcog.blog/p/the-semantic-slide-of-agi</link><guid isPermaLink="false">https://www.synthcog.blog/p/the-semantic-slide-of-agi</guid><dc:creator><![CDATA[DK]]></dc:creator><pubDate>Sat, 17 Feb 2024 15:01:49 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Q5MO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe11079e-cc72-4616-a75a-259096fc2bbe_1024x730.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Q5MO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe11079e-cc72-4616-a75a-259096fc2bbe_1024x730.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><img 
src="https://substackcdn.com/image/fetch/$s_!Q5MO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe11079e-cc72-4616-a75a-259096fc2bbe_1024x730.jpeg" width="1024" height="730" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fe11079e-cc72-4616-a75a-259096fc2bbe_1024x730.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:730,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:902328,&quot;alt&quot;:&quot;A desktop computer with robot arms coming out of the side and a human face mask on the screen.&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="A desktop computer with robot arms coming out of the side and a human face mask on the screen." title="A desktop computer with robot arms coming out of the side and a human face mask on the screen." srcset="https://substackcdn.com/image/fetch/$s_!Q5MO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe11079e-cc72-4616-a75a-259096fc2bbe_1024x730.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Q5MO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe11079e-cc72-4616-a75a-259096fc2bbe_1024x730.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Q5MO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe11079e-cc72-4616-a75a-259096fc2bbe_1024x730.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Q5MO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe11079e-cc72-4616-a75a-259096fc2bbe_1024x730.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Technical terms 
seem to undergo the same definitional drift as words do in any language, but like much in the world of today&#8217;s technology, the shift seems to happen a lot faster than it used to.</p><h4>From Human Intelligence to <a href="https://www.sharkclean.com/products/shark-ai-ultra-self-empty-robot-vacuum-zidRV2502AE">Robot Vacuums</a></h4><p>The term Artificial Intelligence was originally used to refer to a technology that was, for the most part, equivalent to human intelligence in functionality. As mentioned in the <a href="https://www.synthcog.blog/p/hope-hype-fear-ai-agi-intro">intro post for this blog</a>, the term AI was coined by a group of scientists in 1955 who hoped to spend a summer at Dartmouth banging through the difficulties of human intelligence. </p><p>As time went by and the task proved more difficult than expected, the option of shifting the meaning of AI rather than accomplishing its original goals was more and more tempting. Inevitably, the term began to shift to mean technology that had vaguely similar capabilities to human intelligence in very specific areas.</p><p>Artificial General Intelligence came to be a generally accepted term in the first decade of this century. The main popularizers of it in its early days were computer scientists <a href="https://en.wikipedia.org/wiki/Ben_Goertzel">Ben Goertzel</a> and <a href="https://en.wikipedia.org/wiki/Shane_Legg">Shane Legg</a>.</p><p>Goertzel was one of the primary organizers of the first <a href="https://dl.acm.org/doi/proceedings/10.5555/1565455">conference</a> specifically for AGI in 2007. In his <a href="https://link.springer.com/chapter/10.1007/978-3-540-68677-4_1">paper</a> for that conference (which became the first chapter of the book compiling its proceedings), he defined AGI and what distinguished it from AI:</p><blockquote><p>The vast bulk of the AI field today is concerned with what might be called &#8220;narrow AI&#8221; &#8211; creating programs that demonstrate intelligence in one or another specialized area, such as chess-playing, medical diagnosis, automobile- driving, algebraic calculation or mathematical theorem-proving. Some of these narrow AI programs are extremely successful at what they do. The AI projects discussed in this book, however, are quite different: they are explicitly aimed at artificial general intelligence, at the construction of a software program that can solve a variety of complex problems in a variety of different domains, and that controls itself autonomously, with its own thoughts, worries, feelings, strengths, weaknesses and predispositions.</p></blockquote><p>Very specifically, he described work on AGI as &#8220;the creation of software programs displaying broad, deep, human-style general intelligence.&#8221; </p><p>He also noted that the shift in focus of AI was a fait accompli,  which was the reason for coining the new term:</p><blockquote><p>Artificial General Intelligence (AGI) was the original focus of the AI field, but due to the demonstrated difficulty of the problem, not many AI researchers are directly concerned with it anymore.</p></blockquote><p>With the introduction of modern machine learning techniques and recent breakthroughs such as the Transformer Architecture, Latent Diffusion Models, and Large Language Models, many of the early limitations of AI have been overcome. 
LLMs such as OpenAI&#8217;s GPT-4 , Google&#8217;s Gemini, and other similar systems are certainly way beyond the capabilities of the narrow AI that Goertzel referred to above.</p><p>A reasonable question one might then ask is: if they are not the narrow AI that Goertzel was referring to, what exactly are they?</p><h4>The Qualitative Nature of Human Intelligence</h4><p>In a <a href="https://www.synthcog.blog/p/art-obscenity-and-intelligence">previous post</a>, I suggested a functional definition of intelligence to counterbalance the often short and not very helpful definitions that seem widespread in the field of AI. That definition is:</p><div class="pullquote"><p>Intelligence is that quality which allows an entity to solve a wide range of deductive and inductive problems, extract and prioritize information from the environment, infer causal as well as correlative relationships from both small and large data sets over many known and novel domains, generalize knowledge from a known domain to another known or novel domain, extrapolate probable outcomes from both factual and counterfactual circumstances, recognize in its own cognition both the potential for fallacies and the fallacies themselves, synthesize existing knowledge to form original concepts, and acquire awareness of its own cognition and of itself as an independent and unique entity distinct from other entities and from its environment.</p></div><p>So how well do LLMs stack up to this definition?</p><p><strong>Solve a wide range of deductive and inductive problems:</strong> LLMs do a pretty good job of this as long as they&#8217;ve had extensive and wide-ranging training data in the domains of those problems. Straight LLM systems (i.e., not hybrid AI systems) still have problems dealing with  logic and math problems, as they don&#8217;t have any internal model for the underlying logic or math. Instead, they&#8217;re doing statistical analysis of the math and logic data that they&#8217;ve ingested.</p><p><strong>Extract and prioritize information from the environment:</strong> LLMs are relatively good at doing this after significant training, although they sometimes come up with information more generic than one might hope depending on how much training data there was in that area. They are not very good at doing this if the information they&#8217;re fed after training does not match what was in the training data.</p><p><strong>Infer causal as well as correlative relationships from both small and large data sets over many known and novel domains:</strong> This one is a little more complicated. LLMs are good at inference based on large data sets they&#8217;ve ingested. With an extensive enough learning dataset, they are relatively good at inference from a smaller amount of data, although not always as accurately as a human. They are pretty poor at doing this in novel domains, i.e. domains in which they&#8217;ve ingested limited or no training data.</p><p><strong>Generalize knowledge from a known domain to another known or novel domain:</strong> Things are getting a little trickier to pin down now. LLMs are not good at doing this with novel domains, domains of knowledge they have not ingested during training. 
I suspect that they are also not really generalizing knowledge from one known domain to another known domain, but instead ingest enough data in both domains and apply that knowledge separately.</p><p><strong>Extrapolate probable outcomes from both factual and counterfactual circumstances: </strong>LLMs are pretty hit and miss on this. This is an area where their lack of actual comprehension of the world and the data they ingest frequently becomes apparent.</p><p><strong>Recognize in its own cognition both the potential for fallacies and the fallacies themselves:</strong> LLMs fail pretty badly at this. They&#8217;ll provide completely false answers to questions without hesitation, and are unable to recognize when they&#8217;ve made a mistake or when they are likely to have made a mistake. There are fixes being implemented to try and get around this, but these really amount to a patch over the problem rather than any fix to the underlying system.</p><p><strong>Synthesize existing knowledge to form original concepts:</strong> LLMs are completely unable to do this. There have been <a href="https://www.synthcog.blog/i/117279334/its-alive">some who claim that LLMs have original concepts in the form of creativity</a>. However, this seems unlikely given their architecture, and there is no reason to think that they&#8217;re doing anything other than pseudo-random remixes of their training data.</p><p>Claims of creativity first started appearing shortly after Google DeepMind&#8217;s <a href="https://deepmind.google/technologies/alphago/#:~:text=AlphaGo%20defeated%20a%20human%20Go,problems%20in%20highly%20complex%20domains.">AlphaGo system beat the world champion Go player</a>. Some said that since several of the system&#8217;s moves had never been observed before, they were the result of creative thinking on the part of the system. While perhaps possible (though very unlikely), a much more straightforward explanation is that there is an incredibly large number of possible moves, and the system is able to analyze them better and faster than a human. It was also able to play millions of games against itself during training, many more than any human would have played. This means that it was not only possible but likely that the system would come up with some moves that seemed novel to humans.</p><p><strong>Acquire awareness of its own cognition and of itself as an independent and unique entity distinct from other entities and from its environment:</strong> Again, LLMs are completely unable to do this. Despite the ease with which <a href="https://www.washingtonpost.com/technology/2022/06/11/google-ai-lamda-blake-lemoine/">some are fooled into believing otherwise</a>, there is absolutely no evidence that they have the slightest comprehension of what they&#8217;re doing, what they are, or the nature of the world around them.</p><div><hr></div><p>Admittedly, this is my own definition of intelligence and others might disagree with it. However, the short and snappy definitions frequently used for intelligence are so non-specific that one could argue they apply to many things that we don&#8217;t consider intelligent in the way humans are intelligent. 
<p>LLMs are said to know things, understand things, have general knowledge, infer and deduce things, etc. All these terms suggest that the LLM is doing something similar to what humans do, but this is just not the case.</p><p>A rough analogy can be made to demonstrate the difference between how an LLM works and how the human brain works. This isn&#8217;t meant to be a description of any actual implementations of LLMs, just a general analogy of the kind of differences between the two.</p><p>Let&#8217;s say you are presented with a 6-sided die and have no previous knowledge of dice. Then you are asked: How likely is it that any one particular side of the die will face up versus any other side facing up if the die is thrown in the air and lands on a table?</p><p>The LLM has ingested no information about dice in its training data, so it would have to be fed data on the results of the die being tossed on the table. If there is only one throw in this new dataset, it would assume that there is a 100% chance that the side landing face up would always be face up. If there are 10 throws, it will, on the basis of its statistical analysis, have a rough idea of how likely any side is to land face up. However, there will be a large margin of error with a sample size of only 10. With 100 throws, the margin of error decreases, and with 1,000 it decreases a lot more. By the time it has data on 10,000 throws, it&#8217;s going to have a very small margin of error and can correctly conclude that there is an equal chance that any particular side will land face up.</p><p>A mature human would take a look at the die and realize the answer right away. This realization would be based on their intuitions of spatial relationships, geometry, hard body dynamics, and gravity developed fairly rapidly from birth. There is some debate as to when and how these intuitions come into being, but there&#8217;s a good chance that a lot of it is hardwired into the brain and fine-tuned during early childhood.</p><p>Both the human and the LLM system come to a correct answer, but they do it in very different ways. If you were to switch from a 6-sided die to a 20-sided die, the whole statistical analysis process would have to be repeated for the LLM. The human would look at the 20-sided die and immediately realize that the result will be the same as with a 6-sided die.</p>
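<p>For the curious, here is a minimal sketch in Python of the LLM side of this analogy (the throw counts and the helper name are purely illustrative): estimate each face&#8217;s probability from observed throws alone, and watch the estimates converge on the true answer of one in six only as the sample grows. The human, by contrast, reads the answer off the die&#8217;s symmetry without needing a single throw.</p>
<pre><code>import random
from collections import Counter

def estimate_face_probabilities(num_throws, sides=6):
    """Estimate each face's probability purely from observed throws."""
    throws = [random.randint(1, sides) for _ in range(num_throws)]
    counts = Counter(throws)
    return {face: counts[face] / num_throws for face in range(1, sides + 1)}

# With more throws, the estimates converge on the true value of 1/6 (about 0.167).
for n in (1, 10, 100, 10000):
    estimates = estimate_face_probabilities(n)
    print(n, {face: round(p, 3) for face, p in estimates.items()})

# A 20-sided die means starting the whole estimation over with fresh throws.
print(estimate_face_probabilities(10000, sides=20)[1])
</code></pre>
<p>And, as the last line hints, switching to a 20-sided die means collecting a whole new pile of throws, while the symmetry argument carries over instantly.</p>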
<p>This is illustrative of the qualitative difference between the way human intelligence works and the way LLMs work. Cognitive scientist and Substack blogger Gary Marcus recently put up a <a href="https://garymarcus.substack.com/p/statistics-versus-understanding-the?utm_source=post-email-title&amp;publication_id=888615&amp;post_id=141635411&amp;utm_campaign=email-post-title&amp;isFreemail=true&amp;r=2ap25a&amp;utm_medium=email">good post</a> pointing out some of the evidence demonstrating pretty explicitly that Generative AI relies on statistical analysis rather than any sort of reasoning or understanding, and how that affects its capabilities.</p><h4>The Semantic Slide</h4><p>Which brings us to a phenomenon that seems to be cropping up more and more lately, in which the ability to apply Artificial Intelligence techniques to various general areas of interest is being equated with the near or outright creation of Artificial General Intelligence. Examples range from statements that &#8220;we&#8217;re well on the way to AGI and it&#8217;s just around the corner&#8221; to &#8220;we&#8217;ve pretty much already achieved AGI.&#8221;</p><p>The source of this phenomenon starts with suggestive statements by those at the top of the field. As an example, two of the leaders of Google DeepMind recently tweeted about a new system the company had developed called <a href="https://deepmind.google/discover/blog/alphageometry-an-olympiad-level-ai-system-for-geometry/">AlphaGeometry</a>, which achieved significantly better performance than previous systems in solving hard problems in Euclidean Geometry &#8212; performance comparable to a prize-winning, high school-level human mathematician.</p><p>Demis Hassabis, the CEO and co-founder of Google DeepMind, sent out a <a href="https://twitter.com/demishassabis/status/1747669767270306256">tweet</a> in January of 2024 stating:</p><blockquote><p>Congrats to the <a href="https://twitter.com/hashtag/AlphaGeometry?src=hashtag_click">#AlphaGeometry</a> team on their impressive breakthrough published today in <a href="https://twitter.com/Nature">@Nature</a> using a neuro-symbolic hybrid AI system that can solve Maths Olympiad-level geometry problems. Another step on the road to AGI.</p></blockquote><p>(The last sentence was later deleted.)</p><p>Shane Legg, Chief AGI Scientist and co-founder of Google DeepMind, also sent out a tweet that day:</p><blockquote><p>As someone who still vividly remembers trying to solve crazy hard geometry problems at the New Zealand IMO training camp in Christchurch way back in 1990... it kind of blows my mind to see how good AI has become at this! AGI keeps getting closer.</p></blockquote><p>Google DeepMind achieved its success with AlphaGeometry by combining an LLM with a symbolic engine, a logic-based module similar to old school AI that uses rules and symbolic manipulation to make logical deductions. 
According to Google DeepMind and some others, this was akin to the reasoning ability of humans and might very well lead to human-type reasoning in other areas.</p><p>That&#8217;s quite a claim, and it seems more likely that what they achieved is simply akin to one type of logical deduction ability of humans. Reasoning in humans, after all, is a many-faceted capability. Time will tell if what they did is really equivalent to human reasoning and whether it can be applied to different domains, particularly domains significantly distant from Euclidean geometry.</p><p>A somewhat more explicit statement was made in a January 18th <a href="https://www.instagram.com/zuck/reel/C2QARHJR1sZ/?hl=en">video post on Instagram</a> from Mark Zuckerberg outlining Meta&#8217;s plans to buy a lot of NVIDIA GPUs to create future products:</p><blockquote><p>It&#8217;s become clearer that the next generation of services requires building full <em>general intelligence</em>, building the best AI assistants, AI for creators, AI for businesses and more.</p></blockquote><p>It&#8217;s unlikely that Zuckerberg is suggesting that Meta&#8217;s next generation of services will require &#8220;the creation of software programs displaying broad, deep, human-style general intelligence&#8221; and even less likely that he was suggesting that Meta will be developing a system &#8220;that controls itself autonomously, with its own thoughts, worries, feelings, strengths, weaknesses and predispositions.&#8221; What he seems to mean is that Meta is planning on developing products using AI that&#8217;s somewhat better than today&#8217;s and that can be used in a number of different capacities.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.synthcog.blog/p/the-semantic-slide-of-agi?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.synthcog.blog/p/the-semantic-slide-of-agi?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p>It seems like this change in direction on the meaning of general intelligence started this year. As recently as November of 2023, Sam Altman, CEO and Co-founder of OpenAI, gave a <a href="https://youtu.be/NjpNG0CJRMM?si=PgFi2RB1bfOrNamQ&amp;t=3703">much more reasoned discussion of this topic in a Fellowship Lecture at Oxford Union</a>. When asked if he thought that the path to general intelligence meant just improving our current LLMs or whether another breakthrough would be necessary, he answered:</p><blockquote><p>I think we need another breakthrough. I think we can push on large language models quite a lot and we should, and we will do that. We can take our current hill that we're on and keep climbing it, and the peak of that is still pretty far away. Within reason. I mean, you know, if you push that super, super far, maybe all this other stuff emerges. But within reason, I don't think that will do something that I view as critical to an AGI. To stick with that example from earlier in the evening in physics &#8212; let's use the word superintelligence now &#8212; if a superintelligence can't discover novel physics, I don't think it's a super intelligence. 
Training on the data of what you know, teaching it to clone the behavior of humans and human text &#8212; I don't think that's going to get there and so there's this question, which has been debated in the field for a long time, of what do we have to do in addition to a language model to make a system that can go discover new physics. And that'll be our next quest.</p></blockquote><p>This seems like a very reasonable take.</p><p>However, once the new year rolled around, Altman seemed to swerve a bit on this. In an <a href="https://youtu.be/QFXp_TU-bO8?si=XN3rlg_0OLlQEKq7&amp;t=142">Axios interview at Davos in January</a>, Altman responded to a question about what he sees happening in this new year:</p><blockquote><p>There are all these things that can happen, and I'd love to talk about sort of all the specifics. But the general principle, I think the thing that matters most, is just that it gets smarter. So GPT-2 couldn't do very much. GPT-3 could do more. GPT-4 could do a lot more. GPT 5 will be able to do a lot, lot more, or whatever we call it, and the thing that matters most is not that it can, you know, have this new modality or it can solve this new problem, it is the <em>generalized intelligence</em> keeps increasing and we find new ways to put that into a product.</p></blockquote><p>I have no idea if this is some new marketing decision or an actual change in perspective. What is apparent, though, is that this change is being snatched up and gnawed on by news outlets and online independent media. While leaders in the field limit themselves to suggestive semantic dances around the topic, many others are declaring that AGI is pretty much in the bag:</p><p><a href="https://www.youtube.com/watch?v=JVatgo0TJIw&amp;list=PL-gfhJEzWnnTQ_EnhkCaZMN_Adva8bEVB&amp;index=12">Sam Altman STUNS Everyone With GPT-5 Statement | GPT-5 is "smarter" and Deploying AGI..</a></p><p><a href="https://www.youtube.com/watch?v=v-grfwbZGc8">Sam Altmans SECRET Plan For AGI - "Extremely Powerful AI is close"</a></p><p><a href="https://www.youtube.com/watch?v=Du7PpQ17cBA">Raising $7T For Chips, AGI, GPT-5, Open-Source | New Sam Altman Interview</a></p><p><a href="https://www.nextbigfuture.com/2024/01/sam-altman-says-agi-soon-and-agi-will-help-people-do-a-lot-more.html">Sam Altman Says AGI Soon and AGI Will Help People Do A LOT More</a></p><p><a href="https://futurism.com/the-byte/sam-altman-human-tier-ai-coming-soon">SAM ALTMAN SAYS HUMAN-TIER AI IS COMING SOON</a></p><h4>The Path From AI to AGI</h4><p>The question remains whether AGI can result from an extension of Generative AI or whether something else is necessary.</p><p>Back in that 2007 paper, Ben Goertzel wrote:</p><blockquote><p>The presupposition of much of the contemporary work on &#8220;narrow AI&#8221; is that solving narrowly defined subproblems, in isolation, contributes significantly toward solving the overall problem of creating real AI. While this is of course true to a certain extent, both cognitive theory and practical experience suggest that it is not so true as is commonly believed. 
In many cases, the best approach to implementing an aspect of mind in isolation is very different from the best way to implement this same aspect of mind in the framework of an integrated AGI-oriented software system.</p></blockquote><p>It&#8217;s possible that LLMs work the same way as human intelligence and can just be ramped up to equal it, but the evidence suggests otherwise.</p><p>As Sam Altman himself stated in Oxford, we&#8217;re looking for something in AGI that is completely absent in current AI systems. We&#8217;re looking for the kind of intelligence that can discover new physics. Einstein isn&#8217;t famous because he read all the physics work out there and was then able to write short summaries of it or solve already solved problems. </p><p>What he came up with wasn&#8217;t there waiting to be analyzed and correlated; it was something beyond the data, something that no one before him had yet conceived of. He famously imagined what it would be like to chase a beam of light and travel alongside it, and this led him to discover something new and unique and monumental.</p><p>In other words, AGI should not only be capable of answering our questions, it should be capable of asking new ones that we&#8217;ve never imagined.</p><div><hr></div><h4>Postscript</h4><p>Just after this post was written, OpenAI announced <a href="https://openai.com/sora">Sora</a>, a text-to-video system they&#8217;d developed. Sora isn&#8217;t available to the public yet, but OpenAI did post some very impressive videos showing Sora&#8217;s ability to create comparatively long (up to a minute), high-fidelity videos based on text prompts.</p><p>Sora does represent a significant improvement over previous text-to-video systems. It&#8217;s really impressive.</p><p>However, despite many claims to the contrary, nothing about it indicates that it possesses anything like &#8220;understanding&#8221; of the physical world. As <a href="https://garymarcus.substack.com/p/soras-surreal-physics?utm_source=post-email-title&amp;publication_id=888615&amp;post_id=141728670&amp;utm_campaign=email-post-title&amp;isFreemail=true&amp;r=2ap25a&amp;utm_medium=email">Gary Marcus again points out in a post</a>, it makes a lot of strange errors. </p><p>No doubt its performance will be improved, but what these errors expose and what <a href="https://openai.com/research/video-generation-models-as-world-simulators">Sora&#8217;s technical paper</a> pretty much confirms is that it&#8217;s still just very advanced statistical analysis of large data sets. The dice analogy still applies.</p><p>Unfortunately, at the end of OpenAI&#8217;s announcement was the following:</p><blockquote><p>Sora serves as a foundation for models that can understand and simulate the real world, a capability we believe will be an important milestone for achieving AGI.</p></blockquote><p>Sam Altman then gently fanned the flames with a <a href="https://twitter.com/sama/status/1758280295995089393">tweet</a> at the time of the announcement proclaiming that OpenAI was &#8220;extremely focused on making AGI.&#8221;</p><p>While this mention of AGI is relatively tame, it&#8217;s only fed rampant new claims about the arrival of AGI:</p><p><a href="https://www.youtube.com/watch?v=zDxXIQq7Cog">OpenAI Introduces SORA. | AGI is here.</a></p><p><a href="https://www.youtube.com/watch?v=pUye38cooOE">AGI in 7 Months! 
Gemini, Sora, Optimus, &amp; Agents - It's about to get REAL WEIRD out there!</a></p><p><a href="https://medium.com/@leoleung-ch/openai-just-dropped-another-bomb-on-agi-sora-000bf5bb27b5">OpenAI just dropped the biggest bomb on AGI&#8212; Sora</a></p><p><a href="https://medium.com/@maxandre.hebert/sora-openais-leap-towards-agi-through-text-to-video-ai-innovation-0f5158247d82">Sora: OpenAI&#8217;s Leap Towards AGI through Text-to-Video AI Innovation</a></p><p>And so it goes&#8230;</p><p>Sora&#8217;s announcement has led <a href="https://twitter.com/stephenbalaban/status/1758375545744642275">some</a> to double down on the idea that just scaling up training data and compute resources with our current techniques will lead to AGI. Yet, this idea simply has no foundation in empirical data or even in theory. </p><p>It&#8217;s reminiscent of the &#8220;<a href="https://www.synthcog.blog/i/136424597/reaching-the-moon">reaching the moon</a>&#8221; fallacy. The <a href="https://en.wikipedia.org/wiki/Wright_Flyer">Wright Flyer</a> brought humanity a step closer to eventually reaching the moon. But if we&#8217;d made a Wright Flyer with a fuel tank a thousand times bigger or given it an engine a thousand times more powerful, would we have gotten any closer to the destination?</p>]]></content:encoded></item><item><title><![CDATA[Conjuring AGI from AI]]></title><description><![CDATA[The Art and Science of Seeming Intelligent]]></description><link>https://www.synthcog.blog/p/conjuring-agi-from-ai</link><guid isPermaLink="false">https://www.synthcog.blog/p/conjuring-agi-from-ai</guid><dc:creator><![CDATA[DK]]></dc:creator><pubDate>Sat, 16 Dec 2023 15:30:29 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/1f09e02c-1e56-4c35-87fd-27eff5e4b4f5_978x735.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><img src="https://substackcdn.com/image/fetch/$s_!tDLF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90c86fa0-cf03-456c-98cd-a668131d5595_978x1374.jpeg" width="978" height="1374" alt="LLM the Great robot magician conjuring a brain" title="LLM the Great robot magician conjuring a brain"></figure></div><p>There have been a couple of news items in the last few weeks that highlight some of the issues brought up in recent posts. They also point to some of the problems with assuming that <a href="https://www.synthcog.blog/i/115475426/large-language-model-llm">LLM</a>s are close to AGI, and that AGI is close to becoming a reality. 
Sometimes the last 20% mentioned in my <a href="https://www.synthcog.blog/p/artificial-mediocrity">previous post</a> is really, really hard, and sometimes it&#8217;s impossible to achieve on the path set by the first 80%. </p><h4>The All-Powerful Q* Continuum at OpenAI</h4><p>The first news item was the revelation that OpenAI was developing a mysterious and possibly dangerous new technology in a project they called Q*. </p><p>According to <a href="https://www.reuters.com/technology/sam-altmans-ouster-openai-was-precipitated-by-letter-board-about-ai-breakthrough-2023-11-22/">Reuters</a>:</p><blockquote><p>Ahead of OpenAI CEO <a href="https://www.reuters.com/technology/ousting-ceo-sam-altman-chatgpt-loses-its-best-fundraiser-2023-11-18/">Sam Altman&#8217;s four days in exile</a>, several staff researchers wrote a letter to the board of directors warning of a powerful artificial intelligence discovery that they said could threaten humanity, two people familiar with the matter told Reuters.</p></blockquote><blockquote><p>Some at OpenAI believe Q* (pronounced Q-Star) could be a breakthrough in the startup's search for what's known as artificial general intelligence (AGI), one of the people told Reuters.</p></blockquote><p>And, of course, rampant alarmed speculation ensued.</p><h4>Multimodal Magic From Google</h4><p>The second news item was the announcement and release by Google of a new version of their <a href="https://bard.google.com/chat">Bard LLM system dubbed Gemini</a>. Along with the public release of the system came many videos describing and demonstrating it. One particularly impressive video was a demonstration of some of the system&#8217;s multimodal AI capabilities:</p><div id="youtube2-UIZAiXYceBI" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;UIZAiXYceBI&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/UIZAiXYceBI?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>These news items generated significant hype, both positive and negative. They have stoked the embers of expectation that AGI is just around the corner and nearly within our grasp.</p><p>This is not true. In my opinion, it&#8217;s not even close to being true.</p><h4>Q*</h4><p>Q* seems to be a new approach to solving grade school math problems better than current LLM systems are able to. These problems are harder for current AI systems than one might think, especially given the impressive things current AI systems can accomplish that seem much more complicated. </p><p>While current LLM systems are able to do math problems to some degree, their performance is very uneven. There has been speculation that if this Q* system can do these kinds of math problems reliably, then it would have the ability to reason and plan in ways that have so far been out of reach for other AI systems.</p><p>There was also speculation that Sam Altman was fired from OpenAI because he was pushing this dangerous technology and the board found this to be reckless. This is possible but seems unlikely, as <a href="https://www.businessinsider.com/what-sam-altman-did-so-bad-he-got-fired-openai-2023-12">more information has come out regarding OpenAI and its internal political discord</a>. 
So far, no concrete details about Q* have been released.</p><h4>Gemini</h4><p>After Google released its impressive Gemini demo, it was revealed that the demo was not quite what it seemed. As can be seen in the video above, it appeared to show Gemini interacting with a human in real time and being able to answer questions and make observations involving images, speech, and objects offered by the human.</p><p>Unfortunately, that is not the case. The demo was created by first presenting still images and text prompts to Gemini and gathering the text responses. Then a live presentation was recorded with a human presenting images, video, objects, and interacting through spoken questions. Spoken versions of Gemini&#8217;s already generated responses were then edited in. Some of this process is detailed in a <a href="https://developers.googleblog.com/2023/12/how-its-made-gemini-multimodal-prompting.html">Google post</a> that wasn&#8217;t quite as easy to find as the splashy video.</p><h4>Are We Closer to AGI?</h4><p>There&#8217;s not a lot of reason to assume that being able to do better grade school math is a direct path to full-fledged AGI. It&#8217;s pretty obvious that the way people do math and the way LLM systems (and other machine learning systems) do math is quite different. </p><p>What the speculation about LLMs and AGI boils down to is this: is LLM technology a stepping stone to full AGI or is it something else entirely? If it is a stepping stone, then it seems reasonable that adding some additional technology to LLM systems will get them ever closer to AGI and at some point likely achieve it.</p><p>If, however, LLM technology is simply something that shows similarities to human intelligence but is not related to it functionally, then simply improving LLMs is unlikely to get us to full AGI. </p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.synthcog.blog/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.synthcog.blog/subscribe?"><span>Subscribe now</span></a></p><p>A useful analogy might be the relationship of gliders and hot air balloons to bird flight. If we&#8217;re trying to replicate the ability to fly demonstrated by birds, we might invent a glider or we might invent a hot air balloon. If we invent a glider, we&#8217;re actually using some of the physical principles that birds use to fly. We need to do a lot more work to get that glider close to the capabilities of birds, but we&#8217;re part of the way there.</p><p>If instead we invent a hot air balloon, we&#8217;ve also created something that can fly through the air. However, it does so using entirely different physical principles than birds. No matter how we improve the hot air balloon, its functionality and the physical principles it employs are not related to those used by birds to fly. It will never be able to do the things that birds can do.</p><p>LLMs are either gliders or hot air balloons. From what I&#8217;ve seen so far, I&#8217;m inclined towards the latter.</p><h4>What the Turing Test Tests</h4><p><a href="https://en.wikipedia.org/wiki/Alan_Turing">Alan Turing</a> was a titan of computer science in its early days, and his contributions to the field form a significant chunk of the foundation underlying modern computation theory. 
Turing was very interested in the possibility of using a computer to mimic the processes of the brain and wrote <a href="https://academic.oup.com/mind/article/LIX/236/433/986238">a 1950 paper</a> in which he posed the question, "Can machines think?" </p><p>In that paper he suggested a revised version of what he called the <em>Imitation Game</em> to judge whether a machine was actually thinking. In the original game, a man and a woman are concealed from a judge and communicate only through the written (or preferably typewritten) word. The idea is that the judge asks questions of each and eventually surmises which is the man and which is the woman, with the added hitch that the man tries to fool the judge while the woman tries her best to help the judge choose correctly.</p><p>Turing proposed replacing one of the contestants with a digital computer and then having the judge attempt to discern which was the human and which was the machine. Whether or not Turing truly thought this would be a definitive way of determining whether or not humanity had created AGI is hard to know. However, it is pretty clear after the many years of cognitive science research since Turing first proposed it that the test has perhaps more to say about human minds than machine minds.</p><p>The main problem is that what&#8217;s tested is not the intelligence of the machine, but rather whether it can fool a human into thinking it&#8217;s intelligent. Fooling humans is not as hard as one might hope. As I discussed in <a href="https://www.synthcog.blog/p/summoning-the-demon">this post</a>, people have a long history of being fooled by systems that don&#8217;t actually display any real level of intelligence at all.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.synthcog.blog/p/conjuring-agi-from-ai?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.synthcog.blog/p/conjuring-agi-from-ai?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><h4>On the Cusp of AGI</h4><p>Back in the late 1960s, a computer scientist named <a href="https://en.wikipedia.org/wiki/Terry_Winograd">Terry Winograd</a> wrote a very impressive AI program that understood a small subset of natural language. This program was called <a href="https://en.wikipedia.org/wiki/SHRDLU">SHRDLU</a>, and at the time it was hailed as a breakthrough that possibly signified true AGI was right around the corner.</p><p>SHRDLU could understand English sentences that directed it to do things in its very limited block world. 
Here&#8217;s a brief example of interacting with SHRDLU:</p><blockquote><p>Person: Pick up a big red block.</p><p>Computer: OK.</p><p>Person: Grasp the pyramid.</p><p>Computer: I DON'T UNDERSTAND WHICH PYRAMID YOU MEAN.</p><p>Person (changing their mind): Find a block which is taller than the one you are holding and put it into the box.</p><p>Computer: BY "IT", I ASSUME YOU MEAN THE BLOCK WHICH IS TALLER THAN THE ONE I AM HOLDING.</p><p>Computer: OK.</p><p>Person: What does the box contain?</p><p>Computer: THE BLUE PYRAMID AND THE BLUE BLOCK.</p><p>Person: What is the pyramid supported by?</p><p>Computer: THE BOX.</p></blockquote><div id="youtube2-bo4RvYJYOzI" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;bo4RvYJYOzI&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/bo4RvYJYOzI?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>As impressive as it was, it soon became apparent that its expertise in this very confined domain didn&#8217;t translate into anything very useful outside of that domain. Despite this early lesson in domain constraints, many in today&#8217;s domain of AI hype are quick to point to any advance in a relatively narrow area being an advance towards artificial general intelligence.</p><h4>Behind the Curtain</h4><p>There are things that LLMs do that seem very similar to what humans do. They work using pattern matching and statistical analysis with a little randomness thrown in. All three of these characteristics are also characteristics of human intelligence. </p><p>People are very good at recognizing patterns and associations between things with very little input. Sometimes this leads to the problem of recognizing patterns and associations even when they don&#8217;t exist, which has led to many of the cognitive biases appearing in this blog&#8217;s <a href="https://www.synthcog.blog/p/cognitive-biases">glossary</a>. </p><p>People are also able to analyze sensory inputs and predict the probability of one thing&#8217;s relation to another. That can be the relation of a past event to a future event, the meaning of a homophone when spoken based on the words around it, or the image formed by a lot of colored dots on a monitor.</p><p>Beyond this, though, humans have something that no AI systems have. Current AI systems and future systems based on the same technology may be able to pass a Turing Test given the right judges, but they&#8217;re unlikely to pass such a test given savvy judges. To those familiar with how these systems work and the constraints that limit them, it becomes obvious fairly quickly that there is no wizard behind the curtain, just a very advanced calculator.</p><p>There&#8217;s still quite a wide gap between what humans can do and what any AI system can do. What allows humans to be intelligent in domain after domain is not an added feature &#8212; it is most likely the core mechanism that underlies human-level intelligence. By all current evidence, human intelligence is not simply the sum of narrow intelligence in lots of different areas. 
It is something more, something that remains elusive, something that despite all our success in machine learning still seems tantalizingly beyond our grasp.</p>]]></content:encoded></item><item><title><![CDATA[Artificial Mediocrity]]></title><description><![CDATA[For most AI endeavors, the last 20% makes a big difference]]></description><link>https://www.synthcog.blog/p/artificial-mediocrity</link><guid isPermaLink="false">https://www.synthcog.blog/p/artificial-mediocrity</guid><dc:creator><![CDATA[DK]]></dc:creator><pubDate>Sun, 19 Nov 2023 15:00:33 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!VQlZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0496eb57-74b0-4577-ba6d-faedf055e2e8_1792x1024.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><img src="https://substackcdn.com/image/fetch/$s_!VQlZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0496eb57-74b0-4577-ba6d-faedf055e2e8_1792x1024.jpeg" width="1456" height="832" alt="Slacker robot employee checking its social media on its phone" title="Slacker robot employee checking its social media on its phone"></figure></div><p>The 80/20 rule (sometimes the 90/10 rule) is a rule of thumb that pops up in many areas of endeavor. The idea it conveys is that the first 80% of any complex task takes the same amount of effort as the last 20%. This frequently turns out to be a best-case scenario &#8212; often the last 20% (and the last 10% after that, and the last 1% after that) takes exponentially more effort and expense than whatever came before.</p><p>There are many factors that cause this completion impedance to grow the closer one gets to the finish line of an undertaking. As more pieces come together, the complexity of the system overall grows substantially, more factors become interdependent on each other, causing more unexpected problems, and deficiencies in the underlying approach become more evident and difficult to remedy or overcome.</p><p>More and more effort and expense goes into less and less actual progress. Sometimes the utility and desirability of the end result fade as inevitable compromises become necessary. Sometimes the end result imagined just isn&#8217;t achievable, and what ultimately results isn&#8217;t useful or desirable enough to justify the effort put in.</p><p>The thorniness of the 80/20 rule really stands out when expectations of the final result are spread widely and publicly. 
This has been the case for a number of technologies over the years, and is a primary factor in the hype bubble and burst phenomenon that seems to repeat itself on a fairly regular basis in the tech industry. </p><p>Some examples&#8230;</p><h4>The Paperless Office &#8212; Old School Hype</h4><p>The early 80s saw the hype explosion of the &#8220;<a href="https://en.wikipedia.org/wiki/Paperless_office">paperless office</a>&#8221; that was first predicted in the 70s and thought to be imminent with the arrival of the personal computer. Unfortunately, <a href="https://www.economist.com/leaders/2008/10/09/not-dead-just-resting">worldwide paper use actually doubled from 1980 to 2000</a> with all the printers, copiers, and FAX machines that accompanied the office PCs. Only recently has the quantity of paper started to diminish as more and more people raised with computer monitors enter the workplace.</p><p>In the end, the technology available and the work patterns of offices prevented the paperless office concept from even getting close to 100% usability and desirability among the public.</p><h4>Virtual Reality &#8212; Hype and Hype Reborn</h4><p>Another well-known 80/20 hypegoblin is VR. The first 80% cycle ran through the late 1980s and the early 1990s only to die off as an interesting but not quite usable technology. Then it was revived in the 2010s as a hype phoenix of massive potential that, it was thought, could now succeed given the technological progress since the first round. </p><p>Facebook bought the pioneering VR headset company Oculus in 2014 for a couple of billion dollars and actually changed its own corporate name to Meta to hype the technology and its own concept for a global VR environment. And yet it&#8217;s <a href="https://www.businessinsider.com/charts-meta-metaverse-spending-losses-reality-labs-vr-mark-zuckerberg-2022-10">managed to lose well over $30B just since 2019</a> trying to get past the 80% barrier that would make the technology usable, useful, and, most importantly, desirable by the masses.</p><p>Then, of course, there&#8217;s Magic Leap. Founded in 2010 and achieving maximum hype between 2015 and 2017, it&#8217;s managed to raise around $4B since inception and has yet to generate any significant revenue. </p><h4>Autonomous Cars &#8212; A Cornucopia of Hype</h4><p>The latest technology to run afoul of this rule is autonomous cars. The roots of the current autonomous vehicle hype cycle started with a program instituted by the <a href="https://www.darpa.mil">Defense Advanced Research Projects Agency</a> (DARPA), a government agency that has as its mission the exploration of advanced technology for possible military use. In 2004 they held the first of what became a series of <a href="https://www.darpa.mil/about-us/timeline/-grand-challenge-for-autonomous-vehicles">grand challenges</a>, in which a prize was offered to any individual or team who created the technology most successful in completing a specified set of tasks. </p><p>The first challenge was to create and run an autonomous vehicle that could maneuver through a 150 mile off-road course in California's Mojave Desert within a limited time. The course the vehicles were to follow would only be supplied shortly before the race itself and would consist of a list of GPS waypoint coordinates. The prize was $1,000,000.</p><p>The results seemed far from promising. 
Of the 21 vehicles that qualified for entry, only seven completed the preliminary qualifying course, although the judges decided that eight more vehicles had completed enough to enter the final race. Of these fifteen, two failed before the race started. None came close to finishing the course, with the most successful vehicle failing after a few hours and just over seven miles into the 150 mile course. </p><p>Although a failure, this challenge is illustrative of the technological acceleration that makes some giddy and others queasy. The very next year 43 vehicles made it to the qualifying course, and of those, 23 qualified for the race itself, which would take place on a new 132 mile course. Only one of those 23 failed to surpass the previous year's most successful vehicle, and five vehicles successfully completed the course, with the winner finishing in just under seven hours. This was a spectacular achievement, especially given the complete rout the year before. </p><p>The next Grand Challenge two years later took place in a low density urban setting, and six teams finished the course successfully. The overwhelming success of these competitions was not lost on those paying attention. The dream of self-driving cars suddenly seemed within reach, and many companies began developing their own autonomous car capabilities, with the largest and best funded effort started by Google in 2009 and eventually named <a href="https://waymo.com">Waymo</a>.</p><p>Since then, many players have entered the fray and many have predicted that fully autonomous vehicles for public use were just over the horizon. </p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.synthcog.blog/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.synthcog.blog/subscribe?"><span>Subscribe now</span></a></p><p>Not surprisingly, one of the most vocal proponents of autonomous cars has been Elon Musk. Among the many statements he&#8217;s made over the years are:</p><blockquote><p>We&#8217;re going to end up with complete autonomy, and I think we will have complete autonomy in approximately two years. &#8212; <a href="https://fortune.com/2015/12/21/elon-musk-interview/?xid=yahoo_fortune">10/21 2015, Fortune Interview</a> </p></blockquote><blockquote><p>I feel pretty good about this goal is that we will be able to demonstrate a demonstration drive of our full autonomy all the way from LA to New York. So basically from home in LA to let&#8217;s say dropping&nbsp;you off in Times Square, NY and then having the car parking itself by the end of next year&nbsp;<em>(2017)</em> without the need for a single touch including the charger. &#8212; <a href="https://www.xautoworld.com/tesla/transcript-elon-musk-autopilot-2-conference-call/">10/19 2016, Tesla Press Conference</a></p></blockquote><blockquote><p>I feel very confident predicting that there will be autonomous robotaxis from Tesla next year &#8212; not in all jurisdictions because we won&#8217;t have regulatory approval everywhere,&#8221; Musk <a href="https://techcrunch.com/2019/04/22/tesla-plans-to-launch-a-robotaxi-network-in-2020/">said</a>. &#8220;From our standpoint, if you fast forward a year, maybe a year and three months, but next year for sure, we&#8217;ll have over a million robotaxis on the road. 
&#8212; 4<a href="https://bigthink.com/the-present/musk-robotaxis/">/22 2019, Investor Call</a></p></blockquote><blockquote><p>We&#8217;re also working on a new vehicle that I alluded to at the Giga Texas opening, which is a dedicated robo-taxi that&#8217;s highly optimized for autonomy, meaning it would not have steering wheel or pedals&#8230;And so it&#8217;s, I think going to be a very powerful product. Where we aspire to reach volume production of that in 2024. So I think that really will be a massive driver of Tesla&#8217;s growth. And we remain on track to reach volume production of the Cyber Truck next year. &#8212; <a href="https://www.rev.com/blog/transcripts/tesla-q1-earnings-call-2022-transcript">4/20 2022, Tesla Earnings Call</a></p></blockquote><p>Musk is not alone in making optimistic predictions, and many more have been made over the years. Here are just a few from 2016 as the hype was building to nostril-flaring intensity:</p><p>From <a href="http://www.wsj.com/articles/gm-executive-credits-silicon-valley-for-accelerating-development-of-self-driving-cars-1462910491">GM</a>:</p><blockquote><p>General Motors&#8217; head of foresight and trends Richard Holman said at a conference in Detroit that most industry participants now think that self-driving cars will be on the road by 2020 or sooner.</p></blockquote><p>From <a href="http://www.focus.de/finanzen/news/wirtschaft-und-geld-die-zukunft-nach-dem-abgas-skandal_id_5457885.html">Volkswagen</a>:</p><blockquote><p>Johann Jungwirth, Volkswagen&#8217;s appointed head of Digitalization Strategy, expects the first self-driving cars to appear on the market by 2019.</p></blockquote><p>From <a href="https://electrek.co/2016/05/12/bmw-electric-autonomous-inext-2021/">BMW</a>:</p><blockquote><p>The company confirmed that it will launch the new electric and autonomous iNext model in 2021.</p></blockquote><p>From <a href="https://medium.com/@johnzimmer/the-third-transportation-revolution-27860f05fa91#.6msd2oja6">Lyft</a>:</p><blockquote><p>Autonomous vehicle fleets will quickly become widespread and will account for the majority of Lyft rides within 5 years.</p></blockquote><p>Currently (as of late 2023) there are no fully automated cars on the roads. While some vehicles can operate with moderate autonomy in certain situations, none can routinely and reliably take the place of a human behind the wheel other than in very controlled and/or monitored conditions.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.synthcog.blog/p/artificial-mediocrity?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.synthcog.blog/p/artificial-mediocrity?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p>The latest autonomous car news involves <a href="https://getcruise.com">Cruise</a>, a leading autonomous vehicle company now owned by GM. Until recently Cruise had been running a very limited autonomous taxi service. On October 2 <a href="https://www.latimes.com/business/story/2023-10-26/cruise-robotaxi-dragged-injured-woman-misled-reporters#:~:text=On%20Oct.,until%20safety%20concerns%20are%20resolved.">one of its cars caused severe injuries to a woman</a>, and Cruise suspended all of its driverless operations. 
It was then revealed that Cruise&#8217;s autonomous taxis required <a href="https://www.nytimes.com/2023/11/03/technology/cruise-general-motors-self-driving-cars.html">remote human intervention every 2.5 to 5 miles</a>.</p><p>As imperfect as human drivers are, it seems that bridging the gap between working better than a human in a small range of environments and working better than a human in any environment is going to be more difficult than many anticipated.</p><h4>Large Language Models &#8212; Hype Darling of the Moment</h4><p>Right now the focus of technohype is <a href="https://www.synthcog.blog/i/115475426/large-language-model-llm">LLM</a> AI systems that will either make everyone&#8217;s job easier and more productive or cause a massive wave of job loss as the systems take over the jobs of human workers.</p><p>As I&#8217;ve mentioned in previous posts (such as <a href="https://www.synthcog.blog/p/summoning-the-demon">this one</a> and <a href="https://www.synthcog.blog/p/complexity-misinformation-bias">this one</a>), I think there are significant limitations in the capabilities of current LLM systems. While there have been a number of predictions in the last year that we are on the verge of creating <a href="https://www.synthcog.blog/i/115475426/artificial-general-intelligence-agi">AGI</a> with LLM systems, I think this is a very dubious conjecture. It seems quite likely that LLMs will hit the 80% barrier well before they reach anything that could be called true AGI.</p><h4>The Rise of Artificial Mediocrity</h4><p>Of course, it has to be said that many human beings operate at 80% (or lower) of full capacity. This is an unfortunate but hard to ignore truism. In any large human endeavor, there will likely be those who lack motivation, talent, communication ability, intelligence, or some other quality, and this keeps them from achieving anything better than 80% competency. </p><p>Yet these individuals still manage to hold the jobs in which they fall short of true competence, and many are able to retain those jobs for long periods of time despite this shortcoming. The reality is that there are a number of workers who, no matter what they may be capable of in other areas of human endeavor, work at a level of only passing mediocrity in the positions they currently hold.</p><p>With autonomous cars or LLM ghostwriters or future AI service representatives, we may find that we hit the 80% wall with current approaches and can&#8217;t get much past it without some sort of dramatic technical breakthrough. Consequently, it&#8217;s quite possible that we end up unable to create the artificial expertise we&#8217;d hoped for and instead end up with artificial mediocrity. </p><p>While this means that replacing competent human workers and drivers may be a difficult or unreachable goal in the foreseeable future, replacing those whose abilities are less than competent may not be.</p><p>There are certainly a lot of bad drivers out there, and reaching their level of competence is probably achievable if not necessarily desirable. It&#8217;s possible that just steadily improving current technology will get us to the level of a very competent human driver and perhaps even to a level that surpasses human capability. Or it may be that we&#8217;ll never get to the level of a very competent human driver without achieving AGI, or at least some aspects of it. 
</p><p>There are an infinite number of edge cases in driving, and it simply may not be possible to address enough of them adequately unless a driving system has a real understanding of the surrounding world. No autonomous driving system today has anything remotely resembling comprehension of the world around it or the nature of the activity it&#8217;s engaged in. If extending current AI technology isn&#8217;t able to do the job, fully autonomous cars won&#8217;t be available anytime soon.</p><p>For desk jobs, it may mean that a large number of people who only reach a level of acceptable mediocrity at their jobs will be replaced by AI systems that can replicate that mediocrity at a lower cost. For US companies, this may hit offshore services first, where companies are already trying to save as much money as possible on service workers and where communication issues that adversely affect competency are most apparent.</p><h4>The Lessons of AI History</h4><p>It&#8217;s probably not surprising to most that many endeavors, especially those in the technological arena, end up taking longer than expected. The more important point, though, is that it&#8217;s not always possible to tell when something will just take a little longer than expected and when something will require a radical shift in technology to be possible at all.</p>]]></content:encoded></item><item><title><![CDATA[The Existential Risk of Hyperbole]]></title><description><![CDATA[The black and white world of AI Dystopianism]]></description><link>https://www.synthcog.blog/p/the-existential-risk-of-hyperbole</link><guid isPermaLink="false">https://www.synthcog.blog/p/the-existential-risk-of-hyperbole</guid><dc:creator><![CDATA[DK]]></dc:creator><pubDate>Tue, 03 Oct 2023 14:30:22 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!GEWy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f0eb832-81f6-495d-9bd8-cdc584f01dfb_1312x928.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!GEWy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f0eb832-81f6-495d-9bd8-cdc584f01dfb_1312x928.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!GEWy!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f0eb832-81f6-495d-9bd8-cdc584f01dfb_1312x928.jpeg 424w, https://substackcdn.com/image/fetch/$s_!GEWy!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f0eb832-81f6-495d-9bd8-cdc584f01dfb_1312x928.jpeg 848w, https://substackcdn.com/image/fetch/$s_!GEWy!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f0eb832-81f6-495d-9bd8-cdc584f01dfb_1312x928.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!GEWy!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f0eb832-81f6-495d-9bd8-cdc584f01dfb_1312x928.jpeg 1456w" sizes="100vw"><img
src="https://substackcdn.com/image/fetch/$s_!GEWy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f0eb832-81f6-495d-9bd8-cdc584f01dfb_1312x928.jpeg" width="1312" height="928" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1f0eb832-81f6-495d-9bd8-cdc584f01dfb_1312x928.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:928,&quot;width&quot;:1312,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1305179,&quot;alt&quot;:&quot;an evil robot with the world between its hands&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="an evil robot with the world between its hands" title="an evil robot with the world between its hands" srcset="https://substackcdn.com/image/fetch/$s_!GEWy!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f0eb832-81f6-495d-9bd8-cdc584f01dfb_1312x928.jpeg 424w, https://substackcdn.com/image/fetch/$s_!GEWy!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f0eb832-81f6-495d-9bd8-cdc584f01dfb_1312x928.jpeg 848w, https://substackcdn.com/image/fetch/$s_!GEWy!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f0eb832-81f6-495d-9bd8-cdc584f01dfb_1312x928.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!GEWy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f0eb832-81f6-495d-9bd8-cdc584f01dfb_1312x928.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>One of the reasons this blog exists is to counter some of the alarmism that percolates through public discussions of artificial general intelligence and has, in 
the last few years, seeped into discussions about machine learning based artificial intelligence as well. There has been fairly little public critiquing of the ideas promoted by <a href="https://www.synthcog.blog/i/115475426/ai-dystopian">AI Dystopianism</a>, and what little there has been is simply dismissed by AI Dystopians rather than refuted in a reasoned way.</p><p>This dismissal has taken many forms, but it frequently involves claims that those espousing AI Dystopian themes are either ignored or maligned by the greater AI and AGI scientific community. AI Dystopians have accused critics of being scared and not facing the truth as well as deliberately hiding the truth due to ulterior, opportunistic, and possibly even nefarious motives. Perhaps most frequently, critics have simply been dismissed as ignorant.</p><p>Most scientists in AI and AGI are understandably busy doing research and development in their field, and this has left relatively few who have taken the time to refute AI Dystopian ideas directly. Even fewer have taken the time to analyze both specific AI Dystopian conclusions and the underlying foundations of those conclusions in a public-facing manner. </p><p>This problem is certainly not unique to the field of AI, as many professionals in a field will simply feel that refutations are unnecessary for what they see as obviously flawed arguments. Unfortunately, such flaws are not obvious to most outside the field. The foundational beliefs underlying much of AI Dystopianism and its dire predictions are for the most part unknown to the general public. </p><p>Some of these fundamental ideas have been explored in the Foundations of AI Dystopianism series of posts in this blog, including speculation on the nature of <a href="https://www.synthcog.blog/p/fancy-words-and-phrases">goals</a>, <a href="https://www.synthcog.blog/p/superintelligence-and-the-rational-mind">rationality</a>, and <a href="https://www.synthcog.blog/p/practicalities-of-an-intelligence-explosion">self-improvement</a> of AGI systems. Most journalists, however, either ignore these fundamental ideas or simply reiterate some portion of them without any analysis or counterpoint from other professionals in the field. That has allowed those ideas to be let loose into the wild with little counterbalancing discussion. </p><h4>The Undiscovered Country of AGI</h4><p>The alarm some feel is perhaps rooted in the very nature of AGI, in that it rests conceptually somewhere between scientific concepts like nuclear fusion and warp drives. It&#8217;s something we can conceive of but can&#8217;t yet create. It is a known unknown technology, part of a potential but mysterious future, and this gives it the power to fester in our psyches and generate alarm in a way that known fears cannot.</p><p>AGI philosopher <a href="https://en.wikipedia.org/wiki/Eliezer_Yudkowsky">Eliezer Yudkowsky</a> discussed the dangers of future artificial intelligence systems with general intelligence in his <a href="https://intelligence.org/files/AIPosNegFactor.pdf">paper</a> from the 2008 survey book <em><a href="https://www.amazon.com/Global-Catastrophic-Risks-Nick-Bostrom-ebook/dp/B003WE9D7M/ref=sr_1_1?crid=3QP37BYA4SR89&amp;keywords=Global+Catastrophic+Risks&amp;qid=1696289007&amp;sprefix=global+catastrophic+risks%2Caps%2C169&amp;sr=8-1">Global Catastrophic Risks</a>. </em>In that paper, he stated:</p><blockquote><p>It may be tempting to ignore Artificial Intelligence because, of all the global risks discussed in this book, AI is hardest to discuss. 
But this makes AI catastrophes more worrisome, not less.</p></blockquote><p>Such a suggestion seems similar to the statement that the greatest trick the devil pulled off was convincing the world he didn't exist. There is no refutation that will suffice to those who take such a statement to heart, as any argument against the premise can safely be ignored as being the devil&#8217;s hand at work. In the case of AI Dystopianism, any argument against the conclusion of catastrophe is simply driven by one&#8217;s inability to believe in the realities of AGI existential risk.</p><p>A significant problem with Yudkowsky&#8217;s statement, however, is that one can swap in any potential future threat that lacks a coherent framework or empirical support, from alien attacks to inter-dimensional giant monsters, and the statement still applies with equal validity. Given that, it&#8217;s not a particularly useful sentiment.</p><p>While there are a relatively small number of computer scientists in AI who believe we are close to developing AGI, the overwhelming majority are <a href="https://www.synthcog.blog/i/115475426/ai-pragmatist">AI Pragmatists</a>: they&#8217;re aware that we know relatively little about the workings of human intelligence, that current AI is missing many key elements of human intelligence, and that it&#8217;s probably going to take a long time to figure out and replicate human intelligence. Most don&#8217;t discount the dangers of achieving AGI. However, they also don&#8217;t think such dangers are at all inevitable and believe there are many incredible benefits that will make it worth pursuing in a careful manner.</p><p>As discussed in <a href="https://www.synthcog.blog/p/checkered-history-of-foretelling-the-future">this previous post</a>, what&#8217;s apparent from the long historical record of mistaken predictions about technology, progress, and society is that there are no experts when talking about the future, especially in areas with vast known unknowns and inevitably many unknown unknowns. Despite this, the clarion call of the AI Dystopians is not just that AGI will be dangerous, but also that we must start working right now on the task of ensuring it&#8217;s safe. At the same time, many AI Dystopians (including those promoting safety research) feel that ensuring that AGI is safe may well be an impossibility.</p><p>Impossible or not, it&#8217;s not remotely clear how one makes useful progress on this when we have no idea how the science and technology of AGI might work. We might similarly try to work out the safety systems necessary when using Star Trek style transporters to beam down to a planet.</p><p>There have been more and more pronouncements in the media, and even by computer scientists, that <a href="https://www.synthcog.blog/i/115475426/large-language-model-llm">LLMs</a> like OpenAI&#8217;s GPT-4 and other recent AI systems are harbingers of AGI systems just around the corner. As mentioned in my <a href="https://www.synthcog.blog/p/climbing-the-tree-and-reaching-the-moon">last post</a>, however, climbing the tree of contemporary AI is not likely to get us meaningfully closer to the AGI moon anytime soon, as there seem to be very fundamental differences between the functionality of AI and the functionality of human intelligence. It&#8217;s as if we&#8217;d invented the hot air balloon and imagined that we&#8217;d discovered the secrets of bird flight.
Both result in getting off the ground, but no amount of hot air will allow you to soar like an eagle.</p><p>Realities such as this, however, are quickly dismissed by AI Dystopians. Their frequent use of <a href="https://www.synthcog.blog/i/115475240/false-dichotomy">false dichotomies</a> paints a world of black-and-white outcomes, one in which there is little to no accounting for nuance, probability, opportunity cost, broad technological progress, societal change, or recognition of the huge gaps in our current knowledge. Such an approach leaves us with no viable modalities of discourse, but instead results in the perspective revealed by computer scientist <a href="https://www2.eecs.berkeley.edu/Faculty/Homepages/russell.html">Stuart Russell</a> when he expresses the following in his book on AGI, <em><a href="https://www.amazon.com/Human-Compatible-Artificial-Intelligence-Problem-ebook/dp/B07N5J5FTS/ref=tmm_kin_swatch_0?_encoding=UTF8&amp;qid=1695413466&amp;sr=8-1">Human Compatible</a></em>:</p><blockquote><p>With a bit of practice, you can learn to identify ways in which the achievement of more or less any fixed objective can result in arbitrarily bad outcomes.</p></blockquote><p>Perhaps so, but it seems at least equally important to ask whether arbitrarily bad outcomes are likely or imminent or supported by data. Bringing up this counterpoint, although seldom done, is usually not taken as an honest disagreement based on differing views and experience.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.synthcog.blog/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.synthcog.blog/subscribe?"><span>Subscribe now</span></a></p><p>In his chapter of the 2019 survey book <em><a href="https://www.amazon.com/Possible-Minds-Twenty-Five-Ways-Looking-ebook/dp/B07D6C1X3X/ref=tmm_kin_swatch_0?_encoding=UTF8&amp;qid=1694236200&amp;sr=8-1">Possible Minds</a></em>, Skype co-founder <a href="https://en.wikipedia.org/wiki/Jaan_Tallinn">Jaan Tallinn</a> stated:</p><blockquote><p>Of course, just as there were dogmatic Communists who never changed their position, it&#8217;s all but guaranteed that some people will never admit that AI is potentially dangerous. Many of the deniers of the first kind came from the Soviet nomenklatura; similarly, the AI-risk deniers often have financial or other pragmatic motives. One of the leading motives is corporate profits.</p></blockquote><p>In other words, AI Dystopianism detractors are equivalent to dogmatic Communist Party <em>nomenklatura</em>, which were party insiders who held key positions running the Soviet Union. Somewhat incongruously, Tallinn states that their chief underlying motivations may be greater corporate profits and personal financial gain.</p><p>In that same book, MIT physicist <a href="https://physics.mit.edu/faculty/max-tegmark/">Max Tegmark</a> offered a pithy Upton Sinclair quote as a potential motivation for AI Dystopianism's critics: </p><blockquote><p>It is difficult to get a man to understand something, when his salary depends on his not understanding it.</p></blockquote><p>It's worth noting that the vast majority of investment and corporate R&amp;D dollars are currently going to refining and applying AI to business, not researching a path to AGI or determining the nature of cognition in the brain. 
It's pretty safe to say that the number of computer scientists whose bread is buttered by forces pushing for the reckless development of superintelligence is very nearly equivalent to zero.</p><p>While greed is one potential motivation for refuting AI Dystopianism, another is fear. Stuart Russell expressed the following sentiments in <em>Human Compatible</em>:</p><blockquote><p>A perceived threat to one&#8217;s lifelong vocation can lead a perfectly intelligent and usually thoughtful person to say things they might wish to retract on further analysis.</p></blockquote><p>Researchers that disagree with Russell&#8217;s perspective might not only fear losing their job &#8212; they may fear being too successful at it:</p><blockquote><p>It&#8217;s as if researchers are afraid of examining the real consequences of success in AI.</p></blockquote><p>In <em>Possible Minds</em>, Tegmark ponders the depths of this fear: </p><blockquote><p>Third, psychologists have discovered that we tend to avoid thinking of disturbing threats when we believe there&#8217;s nothing we can do about them anyway. In this case, however, there are many constructive things we can do, if we can get ourselves to start thinking about the issue.</p></blockquote><p>Perhaps it&#8217;s not just fear and greed, but simply a lack of imagination that causes critics to criticize AI Dystopian ideas. This seems to be the view of Russell in his chapter of <em>Possible Minds:</em></p><blockquote><p>Objections have been raised to these arguments, primarily by researchers within the AI community. The objections reflect a natural defensive reaction, coupled perhaps with a lack of imagination about what a superintelligent machine could do. None hold water on closer examination.</p></blockquote><h4>Alarm and Reason</h4><p>When it comes to considering the potential downside of AGI technology, it seems reasonable to react with some degree of caution between 0 and 1 rather than jump to full panic once there is a greater than zero possibility of risk.</p><p>Some who aren't completely convinced that AGI is an existential threat have offered a form of <a href="https://en.wikipedia.org/wiki/Pascal%27s_wager">Pascal&#8217;s wager</a>. This is a philosophical argument proposed by the 17th century philosopher Blaise Pascal, in which he postulated that it made more sense to believe in God and lead a religious lifestyle than to do otherwise. In his mind, if God exists you&#8217;ll go to heaven rather than hell, and if you&#8217;re wrong, you haven&#8217;t lost anything. Similarly, AI Dystopians seem to choose the path of alarmism because if it turns out they&#8217;re right then humanity is saved and if they&#8217;re wrong we haven&#8217;t lost anything.</p><p>But alarmism does have a cost. 
Hyperbolic speculation on the dangers of AGI adds one more tributary feeding into the river of unfounded alarmism coursing through society and adding to the dystopian undertone that seems to permeate modern life.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.synthcog.blog/p/the-existential-risk-of-hyperbole?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.synthcog.blog/p/the-existential-risk-of-hyperbole?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p>The public is continually bombarded with things to fear and surveys showing how many people now fear them. When AI Dystopians conflate AI and AGI and then declare in dire terms that the technology could destroy humanity, much of the public begins to believe that this threat is imminent and likely and potentially inevitable. This is a tactic demagogues have used to great effect since time immemorial, and the impetus towards sensationalism in the media only enhances the effect of such alarmism for the general public.</p><p>The results are what one would expect. In 2016, a <a href="https://www.cbsnews.com/news/60-minutes-vanity-fair-poll-artificial-intelligence/">60 Minutes/Vanity Fair Poll</a>  surveyed American attitudes about AI and AGI and found that 15% of Americans thought advancing the field would be dangerous. Several years later in 2019, the Center for the Governance of AI, Future of Humanity Institute, University of Oxford published a survey titled "<a href="https://www.governance.ai/research-paper/artificial-intelligence-american-attitudes-and-trends">Artificial Intelligence/ American Attitudes and Trends</a>" which found that 34% of Americans believed that the development of high-level machine intelligence would be harmful to humanity, with 12% believing that it would lead to human extinction (26% thought it would be beneficial to humanity).</p><p>The marketing firm Edelman published a <a href="https://www.edelman.com/sites/g/files/aatuss191/files/2019-03/2019_Edelman_AI_Survey_Whitepaper.pdf">survey in 2019</a> on attitudes towards AI held by both the general population and separately by tech executives. That survey revealed deeper levels of concern, with 62% of the general population and 50% of tech executives having negative feelings towards the arrival of human-level or greater intelligence in machines.</p><p>But what's perhaps more surprising &#8212; and more concerning &#8212; is the percentage of people who think such human-level machine intelligence is imminent.</p><p>In the Center for the Governance of AI survey, 54% of Americans felt that high-level machine intelligence, able to perform most human tasks as well as the median human does today, would be developed by 2028. In the Edelman survey, 61% of the general population believe that machine intelligence would surpass human intelligence by 2029, while 73% of tech execs believed this would be the case and 48% of them actually believed it would be achieved by 2024.</p><p>Prominent AI researcher and roboticist Rodney Brooks has stated:</p><blockquote><p><em>If you, as a journalist, or a commentator on AI, think that the AGI movement is large and vibrant and about to burst onto the scene with any engineered systems, you are confused. 
You are really, really confused.</em></p></blockquote><p>Computer scientist and entrepreneur Ray Kurzweil, perhaps the leading proponent of AI Utopian ideas and extremely optimistic timelines for AGI development, believes that AGI won't achieve human-level intelligence until 2029 and won't surpass it until some time after that. </p><p>Yet the results of these surveys show that there is an increasingly large segment of the general population that is even more optimistic (or pessimistic, depending on your point of view) about the arrival time of AGI than Ray Kurzweil. Also disturbing is that the percentage of people who think advanced AI developments will be dangerous rather than beneficial is increasing&nbsp;significantly. AI Dystopians do seem to be making progress in both pointing out the dangers of AGI and shrinking the timeline of its arrival in the minds of the public. </p><p>There is a well-known cognitive bias called the <a href="https://www.synthcog.blog/i/115475513/illusory-truth-effect">illusory truth effect</a> which makes people more likely to believe something if they've heard it before regardless of its inherent truth or likelihood. Even if someone starts out as a skeptic when reading about the dangers of superintelligent machines destroying humanity, they're more likely to believe such speculation the more often they hear it regardless of whether any evidence is ever provided to back up the claim.</p><h4>The Parameters of a Wager</h4><p>There are many counterarguments to Pascal&#8217;s wager, but there&#8217;s one fallacy within it that is particularly apropos to the AGI debate. Pascal was a product of 17th century Europe, so his ideas on demonstrating one&#8217;s belief in a higher power fell completely along Christian lines. To him, it was either believe in God as a Christian or don&#8217;t believe in God. There was no believing in God and following the doctrines of Judaism, Islam, Hinduism, Zoroastrianism, the religion of the ancient Greeks, or one of the myriad other religions that has existed or will exist in the world. </p><p>To Pascal, there were only two variables, two choices. 
In today&#8217;s AI Dystopianism, the thinking is just as black and white, just as unable or unwilling to consider the many possible paths and destinations that lie before us.</p>]]></content:encoded></item><item><title><![CDATA[Climbing the Tree and Reaching the Moon]]></title><description><![CDATA[Foundations of AI Dystopianism III: Self-Improvement (conclusion)]]></description><link>https://www.synthcog.blog/p/climbing-the-tree-and-reaching-the-moon</link><guid isPermaLink="false">https://www.synthcog.blog/p/climbing-the-tree-and-reaching-the-moon</guid><dc:creator><![CDATA[DK]]></dc:creator><pubDate>Sun, 03 Sep 2023 14:30:07 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!jbZO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4118b467-7adf-47f9-8348-63aa09f1be30_1312x928.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!jbZO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4118b467-7adf-47f9-8348-63aa09f1be30_1312x928.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!jbZO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4118b467-7adf-47f9-8348-63aa09f1be30_1312x928.png 424w, https://substackcdn.com/image/fetch/$s_!jbZO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4118b467-7adf-47f9-8348-63aa09f1be30_1312x928.png 848w, https://substackcdn.com/image/fetch/$s_!jbZO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4118b467-7adf-47f9-8348-63aa09f1be30_1312x928.png 1272w, https://substackcdn.com/image/fetch/$s_!jbZO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4118b467-7adf-47f9-8348-63aa09f1be30_1312x928.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!jbZO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4118b467-7adf-47f9-8348-63aa09f1be30_1312x928.png" width="1312" height="928" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4118b467-7adf-47f9-8348-63aa09f1be30_1312x928.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:928,&quot;width&quot;:1312,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1728997,&quot;alt&quot;:&quot;A man high in a tree contemplates the moon at night&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="A man high in a tree contemplates the moon at night" title="A man high in a tree contemplates the moon at night" 
srcset="https://substackcdn.com/image/fetch/$s_!jbZO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4118b467-7adf-47f9-8348-63aa09f1be30_1312x928.png 424w, https://substackcdn.com/image/fetch/$s_!jbZO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4118b467-7adf-47f9-8348-63aa09f1be30_1312x928.png 848w, https://substackcdn.com/image/fetch/$s_!jbZO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4118b467-7adf-47f9-8348-63aa09f1be30_1312x928.png 1272w, https://substackcdn.com/image/fetch/$s_!jbZO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4118b467-7adf-47f9-8348-63aa09f1be30_1312x928.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This is the conclusion of a three-part discussion on one of the cornerstones of both <a href="https://www.synthcog.blog/i/115475426/ai-dystopian">AI Dystopian</a> and <a href="https://www.synthcog.blog/i/115475426/ai-utopian">AI Utopian</a> thinking: the idea that an <a href="https://www.synthcog.blog/i/115475426/artificial-general-intelligence-agi">Artificial General Intelligence</a> system will inevitably self-improve itself into <a href="https://www.synthcog.blog/i/115475426/superintelligence">superintelligence</a> and achieve God-like capabilities by doing so.  
A vast body of speculation has been built on this idea, and many extraordinary conclusions have been drawn.</p><p>In the <a href="https://www.synthcog.blog/p/practicalities-of-an-intelligence-explosion">first post</a> on this topic, I brought up computer scientist <a href="https://steveomohundro.com">Steve Omohundro</a>&#8217;s influential 2008 paper <em><a href="https://selfawaresystems.files.wordpress.com/2008/01/ai_drives_final.pdf">The Basic AI Drives</a></em>, which discussed the idea that an AGI will be driven towards self-improvement by its very nature, and examined some of the practical assumptions underlying this conjecture.</p><p>The <a href="https://www.synthcog.blog/p/the-depths-of-agi-self-reflection">second post</a> discussed some of the conceptual assumptions built into this idea, including its reliance on an unlikely model for intelligence and a narrow definition of goals. That post also highlighted some logical inconsistencies in what would drive the system towards self-improvement, as well as questionable characterizations of intelligence itself. </p><h4>Slow or Fast</h4><p>The core belief underlying warnings of runaway self-improvement in AGI systems is that they are not only possible but inevitable. This is taken as axiomatic and not worth debating, and speculation quickly moves on to how fast such an <a href="https://www.synthcog.blog/i/115475426/intelligence-explosion">intelligence explosion</a> will happen and, at least for AI Dystopians, what sort of existential catastrophe will inevitably result.</p><p>The speed of self-improvement is usually described as either a soft/slow takeoff, taking years or decades, or a hard/fast takeoff, taking minutes or days, and the majority opinion among both AI Dystopians and AI Utopians leans towards the latter. </p><p>While Vernor Vinge discussed this in his <a href="https://mindstalk.net/vinge/vinge-sing.html">1993 paper</a> on the <a href="https://www.synthcog.blog/i/115475426/technological-singularity">technological singularity</a>, Nick Bostrom dove more deeply into the topic in his 2014 book <em><a href="https://www.amazon.com/Superintelligence-Dangers-Strategies-Nick-Bostrom-ebook/dp/B00LOOCGB2/ref=tmm_kin_swatch_0?_encoding=UTF8&amp;qid=1693679347&amp;sr=8-1">Superintelligence: Paths, Dangers, Strategies</a></em>, characterizing a fast takeoff as so rapid that "Nobody need even notice anything unusual before the game is already lost."</p><h4>Hitting the Wall of Reality</h4><p>This certainly sounds dangerous. Yet, as discussed in this series of posts, there are all sorts of practical and conceptual considerations usually glossed over in such speculation. Beyond the already discussed constraints inherent in a system that might severely limit the AGI system's ability to modify its hardware or software, there are other factors restraining the system that are worth consideration as well. </p><p>For example, the sort of raw intelligence hypothesized here is simply cognitive potential rather than anything applicable to actions or achievements. Such intelligence is simply an empty vessel until it's filled with knowledge, experience, learned skills, and self-generated cognitive associations. </p><p>I've previously proposed that it would certainly be possible to construct the AGI system without access to the Internet and manage it so that it couldn't simply convince an impressionable human into granting that access. 
But even if we assume it somehow did gain access to the Internet, this would only give it access to already existing knowledge. </p><p>While a superintelligent entity could likely make a lot of connections and inferences from existing data that so far have eluded humans, it would still have to interact with the physical world to create new knowledge. It would have to build things, experiment, explore, measure, analyze, etc., all of which are difficult and time-consuming. </p><p>These are things that involve external constraints, many of which are difficult to speed up at all let alone in an exponential fashion. Even considering that it may be able to create internal simulations in some areas to generate new data, these are likely to be possible only in very narrow areas where there is already sufficient data available to construct useful models on which the simulations can run.</p><p>This need to interact with the real world is not a small factor in any talk of fast takeoffs and intelligence explosions, yet the problem is continually given short shrift. One take is that the AGI system will trick or cajole a large number of humans into helping it, and this will include humans who have competence and access to the necessary resources to manufacture the upgraded components as well as humans who have competence and access to actually implement the upgrades. </p><p>Another proposition is that the AGI system will trick a small number of humans into somehow creating nanotech <a href="https://www.synthcog.blog/i/115475426/molecular-assembler">molecular assemblers</a> that will manufacture all the hardware, and that among this small group of humans is also at least one who has competence and access to implement the upgrades. Left out is any consideration of real-world practicalities involved in the manufacture, assembly, transportation, and integration of these upgrades, as well as how these tasks are hidden from or defended against the masses of non-tricked people who would undoubtedly notice all this taking place.</p><h4>Creation is Complex</h4><p>In 1958 free market proponent Leonard Read wrote the well-known essay <em><a href="https://fee.org/resources/i-pencil/">I, Pencil</a></em> to illustrate not only his love of capitalist markets but also the incredible amount of knowledge and number of people needed to make even an object as simple as a pencil. In the essay he points out that no one person has all the knowledge to make a pencil or even a significant portion of it, and certainly no one person possesses the raw material gathering, manufacturing, and transportation resources to create a pencil.</p><p>The amount of time, resources, physical capabilities, and knowledge to create unimagined new technology, not to mention build military forces sufficient to overcome the human species, is unfathomable. The two most popular ways to brush off this objection are simply to use the words <em>superintelligence</em> and/or <em>nanotechnology</em>.</p><p>The first brush-off is what I&#8217;ll call the <em>Ant Argument</em>: <a href="https://www.amazon.com/Life-3-0-Being-Artificial-Intelligence-ebook/dp/B06WGNPM7V/ref=tmm_kin_swatch_0?_encoding=UTF8&amp;qid=1693686365&amp;sr=8-1">a superintelligent entity would be to us as we are to ants</a>.
In other words, humans are incapable of comprehending how much more advanced the thinking of a superintelligent entity would be, and so we can&#8217;t project our own constraints onto something beyond our ken.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.synthcog.blog/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.synthcog.blog/subscribe?"><span>Subscribe now</span></a></p><p>However, this line of thinking falls apart in two major areas. First, it assumes the starting point is a superintelligent system, yet the whole concept of an intelligence explosion is that the system doesn&#8217;t start out as superintelligent. This is a <a href="https://www.synthcog.blog/i/115475240/circular-argument">Circular Argument</a> fallacy, in that the end result of the intelligence explosion is necessary to ignite the intelligence explosion in the first place.</p><p>The second failing of this Ant Argument is that while ants and humans have very disparate capabilities, both ants and humans are still constrained by the physical universe and the nature of reality. As discussed in previous posts, superintelligence should not be equated to superpowers that can only exist in fantasy realms where time, space, and the laws of thermodynamics are whatever you want them to be. </p><p>There are undoubtedly some work-arounds and aspects of nature that we are not yet aware of. However, the underlying physical realities are what they are. It&#8217;s not clear in any of these intelligence explosion scenarios how the system, particularly before it&#8217;s engaged in enough self-improvement to attain superintelligence, is able to buck what we know of physical reality and get to the point at which its knowledge and capabilities allow it to perform feats indistinguishable from magic.</p><p>Nanotechnology is <a href="https://intelligence.org/files/AIPosNegFactor.pdf">frequently proposed</a> as the talisman that the superintelligent system will use to interact with the physical world and get things done. The meaning of the term here is not the contemporary co-opted usage that actually refers to nanomaterials. Instead, it&#8217;s the original meaning that refers to molecular assemblers and other nano-scale machines that can interact with the world to achieve tasks.</p><p>But this type of nanotechnology falls into the same camp as warp drives, force fields, teleportation, and zero-point energy. There is some scientific basis to the concepts, but one would be generous to say that they&#8217;re even in the very early theoretical stages at this point. Speculating on the dangerous use of any of these technologies is like Leonardo da Vinci speculating on the intricacies of air traffic control.</p><p>And, of course, we still run into the Circular Argument fallacy: how can the system use its superintelligence to create nanotechnology to implement its self-improvement before it has self-improved to superintelligence?</p><h4>The Origin of the Intelligence Explosion</h4><p>In wrapping up this discussion of self-improvement in an AGI system, it's worth examining in more detail I. J. Good's <a href="https://exhibits.stanford.edu/feigenbaum/catalog/gz727rg3869">1965 essay</a> in which he introduced the concept of an intelligence explosion.
Early in the paper, Good wrote:</p><blockquote><p>Let an ultraintelligent machine be defined as a machine that can far surpass all the intellectual activities of any man however clever. Since the design of machines is one of these intellectual activities, an ultraintelligent machine could design even better machines; there would then unquestionably be an "intelligence explosion," and the intelligence of man would be left far behind. Thus the first ultraintelligent machine is the last invention that man need ever make, provided that the machine is docile enough to tell us how to keep it under control. It is curious that this point is made so seldom outside of science fiction. It is sometimes worthwhile to take science fiction seriously.</p></blockquote><p>Let's start by considering the opening proposition of creating the first ultraintelligent machine, one that far surpasses a human in every intellectual activity. This is a pretty tall order and one invariably glossed over by those using Good's intelligence explosion concept over the years. </p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.synthcog.blog/p/climbing-the-tree-and-reaching-the-moon?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.synthcog.blog/p/climbing-the-tree-and-reaching-the-moon?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p>The first step of the intelligence explosion is remarkably similar to the first step of a Steve Martin routine on <a href="https://youtu.be/zXmQW_aqBks?si=jr9JcZ4W_QMxMPGe">how to become a millionaire and not pay any taxes</a>: <em>First, get a million dollars</em>. The initial precondition is brushed aside to get to the much more dramatic and interesting explosion part of the idea, yet it would seem to be a sizable obstacle to overcome.</p><p>There's another subtle fallacy embedded in this paragraph that is never addressed but is reflected in much of the subsequent speculation about intelligence explosions. The initial machine is described as surpassing the intellectual capability of any person. Then, since designing machines is an intellectual capability of a person, the machine would surpass that, too. </p><p>But there&#8217;s some sleight of hand going on here: the initial ultraintelligent machine couldn&#8217;t possibly have been designed by any one person. Similarly, while the &#8220;design of machines&#8221; is an intellectual activity of a person, designing an ultraintelligent machine is definitely not the activity of &#8220;a&#8221; person.</p><p>We are led back to the story of the pencil and the amount of knowledge needed to create it. No one person could have designed and built the initial AGI machine. Nor could that one person have designed and built all the machines and processes needed to bring together the materials and knowledge that are needed in the design and building of the initial AGI machine. Thus, the conclusion that the ultraintelligent machine could also design and implement an ultraintelligent machine is not based on an established foundation.
This <a href="https://www.synthcog.blog/i/115475240/unproven-basis">Unproven Basis</a> fallacy is woven into nearly all the AI Dystopian warnings and thought experiments using the concept of an intelligence explosion.</p><p>It&#8217;s interesting to note that while the concept of the intelligence explosion introduced by Good has been spread widely over the years, its basis in science fiction and the somewhat simplistic idea of the machine's being docile enough to let us control it is largely left untouched. This is not meant in any way as a criticism of Good's paper, which is certainly noteworthy given its publication date, nor a criticism of science fiction, for which I have a deep affection myself. It's simply worth noting that this single term has been hoisted from the paper while the roughly 30-page context in which it&#8217;s developed has, for the most part, been discarded.</p><h4>Reaching the Moon</h4><p>Belief in the inevitability of an intelligence explosion in any AGI system is held as gospel in AI Dystopianism. It&#8217;s been a key underlying component of the rampant warnings from AI Dystopians regarding AGI&#8217;s inevitably catastrophic impact on humanity. These warnings have been happily promoted by media outlets, and yet there has been remarkably little examination of the baseline validity of the belief system underlying the warnings.</p><p>Speculating on fantastical science of the future is certainly worthwhile, but basing concrete conclusions and contemporary actions on that speculation is not, particularly when that speculation already has many evident flaws.</p><p>It's perhaps worth noting that Good states in the conclusion of his paper that:</p><blockquote><p>It is more probable than not that, within the twentieth century, an ultraintelligent machine will be built and that it will be the last invention that man need make, since it will lead to an &#8220;intelligence explosion.&#8221;</p></blockquote><p>Throughout the history of AI development, many in the field have underestimated the complexity of creating anything even approaching AGI. While progress has been made on the path to machines that seem smart in certain narrow ways, progress towards a machine possessing anything close to human-like intelligence, let alone ultraintelligence, has been modest in the nearly sixty years since Good's paper. </p><p>The philosopher <a href="https://en.wikipedia.org/wiki/Hubert_Dreyfus">Hubert Dreyfus</a>, taking a particularly skeptical view of the AGI field, stated in his 1985 book <em><a href="https://www.amazon.com/Mind-over-machine-intuition-expertise/dp/0029080606/ref=tmm_hrd_swatch_0?_encoding=UTF8&amp;qid=1693686669&amp;sr=8-1">Mind over Machine</a></em> that:</p><blockquote><p>Current claims and hopes for progress in models for making computers intelligent are like the belief that someone climbing a tree is making progress toward reaching the moon.</p></blockquote><p>And though we may be higher up the tree than when Dreyfus wrote that sentence, we still have a very long way to go to get to the moon. Dreyfus was certainly more pessimistic about the possibility of creating AGI than I am, but it is a point well taken. </p><p>Speculation about space travel that&#8217;s based on tree climbing is not very likely to be productive.
Good himself made no bones about the hypothetical nature of his paper or its debt to science fiction, and it's worth keeping this in mind when basing dire real-world conclusions on the paper's more sensational aspects.</p>]]></content:encoded></item><item><title><![CDATA[The Depths of AGI Self-Reflection]]></title><description><![CDATA[Foundations of AI Dystopianism III: Self-Improvement (part 2)]]></description><link>https://www.synthcog.blog/p/the-depths-of-agi-self-reflection</link><guid isPermaLink="false">https://www.synthcog.blog/p/the-depths-of-agi-self-reflection</guid><dc:creator><![CDATA[DK]]></dc:creator><pubDate>Sun, 27 Aug 2023 17:45:00 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!jo28!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F722cc21a-cfd9-4798-9901-6bc3f32ce7db_1312x928.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!jo28!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F722cc21a-cfd9-4798-9901-6bc3f32ce7db_1312x928.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!jo28!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F722cc21a-cfd9-4798-9901-6bc3f32ce7db_1312x928.png 424w, https://substackcdn.com/image/fetch/$s_!jo28!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F722cc21a-cfd9-4798-9901-6bc3f32ce7db_1312x928.png 848w, https://substackcdn.com/image/fetch/$s_!jo28!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F722cc21a-cfd9-4798-9901-6bc3f32ce7db_1312x928.png 1272w, https://substackcdn.com/image/fetch/$s_!jo28!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F722cc21a-cfd9-4798-9901-6bc3f32ce7db_1312x928.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!jo28!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F722cc21a-cfd9-4798-9901-6bc3f32ce7db_1312x928.png" width="1312" height="928" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/722cc21a-cfd9-4798-9901-6bc3f32ce7db_1312x928.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:928,&quot;width&quot;:1312,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1490289,&quot;alt&quot;:&quot;A humanoid robot contemplates its reflection in a mirror&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="A humanoid robot contemplates its reflection in a mirror" title="A humanoid robot contemplates its reflection in a mirror" 
srcset="https://substackcdn.com/image/fetch/$s_!jo28!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F722cc21a-cfd9-4798-9901-6bc3f32ce7db_1312x928.png 424w, https://substackcdn.com/image/fetch/$s_!jo28!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F722cc21a-cfd9-4798-9901-6bc3f32ce7db_1312x928.png 848w, https://substackcdn.com/image/fetch/$s_!jo28!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F722cc21a-cfd9-4798-9901-6bc3f32ce7db_1312x928.png 1272w, https://substackcdn.com/image/fetch/$s_!jo28!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F722cc21a-cfd9-4798-9901-6bc3f32ce7db_1312x928.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This is the second in a three-part discussion on one of the cornerstones of both <a href="https://www.synthcog.blog/i/115475426/ai-dystopian">AI Dystopian</a> and <a href="https://www.synthcog.blog/i/115475426/ai-utopian">AI Utopian</a> thinking: the idea that an <a href="https://www.synthcog.blog/i/115475426/artificial-general-intelligence-agi">Artificial General Intelligence</a> system will inevitably self-improve itself into <a href="https://www.synthcog.blog/i/115475426/superintelligence">superintelligence</a> and achieve God-like capabilities by doing so. 
(This was originally going to be a two part post, but the nature of the subject pushed it out to one more part.)</p><p>In the <a href="https://www.synthcog.blog/p/practicalities-of-an-intelligence-explosion">first post</a> on this topic, I brought up computer scientist <a href="https://steveomohundro.com">Steve Omohundro</a>&#8217;s influential 2008 paper <em><a href="https://selfawaresystems.files.wordpress.com/2008/01/ai_drives_final.pdf">The Basic AI Drives</a></em>, which discussed the idea that an AGI will be driven towards self-improvement by its very nature.</p><p>Omohundro wrote:</p><blockquote><p>One kind of action a system can take is to alter either its own software or its own physical structure. Some of these changes would be very damaging to the system and cause it to no longer meet its goals. But some changes would enable it to reach its goals more effectively over its entire future. Because they last forever, these kinds of self-changes can provide huge benefits to a system. Systems will therefore be highly motivated to discover them and to make them happen. If they do not have good models of themselves, they will be strongly motivated to create them though learning and study. Thus almost all AIs will have drives towards both greater self-knowledge and self-improvement.</p></blockquote><p>The conclusions presented in that paper are still very much a part of AI Dystopian and AI Utopian thinking today, i.e. that any AGI system will de facto seek to improve itself, and it will do this not because it has the evolutionary drive or cultural tendencies of humans, but instead because it will be better able to achieve its goals the more intelligent it is. The core of this belief is that for an AGI system, self-improvement to superintelligence is not only possible but inevitable.</p><p>A vast body of speculation, both positive and negative, has been built on this idea and many extraordinary conclusions have been drawn. As in my last post, rather than discussing the conclusions here, I&#8217;d like to continue discussing a few more of the assumptions built into this concept. Previously I discussed some of the practical assumptions, and now in this post I&#8217;m going to discuss some of the conceptual and philosophical assumptions.</p><h4><strong>Assumption: The AGI system is goal-based along the lines of a rational agent with a utility function algorithm that guides it towards pre-determined objectives</strong></h4><p>This assumption is a big one, and it's not only at the heart of the debate about self-improvement but is intrinsic to almost all AI Dystopian speculation. This speculation relies on a fairly specific definition and model of intelligence: intelligence is the ability to achieve goals in a wide range of environments and an AGI system will have at its core a <a href="https://www.synthcog.blog/i/115475426/utility-function">utility function</a> that will be optimized to maximally achieve those goals. </p><p>I've <a href="https://www.synthcog.blog/p/art-obscenity-and-intelligence">previously discussed some of the issues with that particular definition of intelligence</a>, so let's concentrate on the model itself. I&#8217;ll refer to this model of <em>Goal-attainment Optimization driven by a Utility Function as Intelligence</em> as the <a href="https://www.synthcog.blog/i/115475426/goufi">GOUFI</a> model.</p><p>The idea of a utility function to maximize a goal is a concept borrowed from economics and game theory. 
Originally, the term was used to mean some function whose result could be used to gauge the pleasure or satisfaction obtained by a consumer from any particular choice that consumer made. </p><p>It has broadly come to mean a function able to represent a consumer's preferences over a set of alternatives for which the consumer has a preferred ordering. It does this by calculating a numerical value for each choice based on various input parameters, with the most preferred choice having the highest value. So in economic models, a utility function is used by a <em><a href="https://www.synthcog.blog/i/115475426/rational-agent">rational agent</a></em>, the model of a consumer, to represent the choices that consumer is most likely to make.</p><p>The first question to ask about this model is whether it's a useful model for intelligence. As we've gained more insight into psychology, sociology and cognitive neuroscience, the shortcomings of the rational agent model have become more and more apparent. Over the years, economists have largely relegated it to the sidelines or tried to primp it up with modifications into something usable. Despite its typically poor empirical performance, it continues to be used because it provides a tractable way of examining the extremely complex phenomenon of human decision making. The situation is similar to the often-repeated old joke:</p><p><em>A police officer sees a drunken man intently searching the ground near a lamppost and asks him what he's looking for. The inebriated man replies that he's looking for his car keys, and so the officer helps for a few minutes without success. The officer asks the man whether he's sure he dropped the keys near the lamppost. &#8220;No,&#8221; the man replies, &#8220;I lost them across the street.&#8221; &#8220;Why look here?&#8221; asks the surprised officer. The drunken man shrugs and responds, &#8220;The light's better over here.&#8221;</em></p><p>Given its failure to serve as a particularly accurate way to predict human behavior in real-world economics, the rational agent model would seem to be a less than ideal choice to use when designing artificial general intelligence. What&#8217;s particularly clear at this point is that it&#8217;s far removed from the mechanism underlying any known examples of general intelligence, specifically the general intelligence of animals on this planet including humans. </p><p>This in and of itself doesn't negate the possibility of its being a useful model for artificial intelligence. However, it does suggest that it's a bad idea to base all your speculation on a model that not only has no evidence to support it but also appears to invariably result in disastrous outcomes.</p><p>So any speculation that considers the GOUFI model as the ultimate impetus driving an AGI system does so based on remarkably flimsy assumptions. 
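</p><p>To make that picture concrete, here is a minimal sketch (in Python) of the agent loop the GOUFI model assumes: a fixed utility function scores every available action, and the agent always takes the highest-scoring one. The actions and weights below are invented purely for illustration and don't describe any real system.</p><pre><code># A toy GOUFI-style agent: score each action with a utility function,
# then take the action with the highest score. Purely illustrative.

def utility(action, weights):
    # Collapse an action's predicted consequences into a single number.
    return sum(weights[k] * v for k, v in action["consequences"].items())

def choose(actions, weights):
    # The "rational agent" step: take whichever action scores highest.
    return max(actions, key=lambda a: utility(a, weights))

actions = [
    {"name": "explore", "consequences": {"knowledge": 3, "energy": -2}},
    {"name": "rest",    "consequences": {"knowledge": 0, "energy": 1}},
]
weights = {"knowledge": 1.0, "energy": 0.5}
print(choose(actions, weights)["name"])  # "explore"
</code></pre><p>Everything interesting about such a system is packed into the utility function and its weights; the loop itself is trivial, which is part of why so much of the speculation stands or falls with this particular model of intelligence.</p><p>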
However, let's assume for the time being that this model does in fact make sense so we can examine the remaining assumptions strictly on the basis of their own internal logic and reasonableness.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.synthcog.blog/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.synthcog.blog/subscribe?"><span>Subscribe now</span></a></p><h4><strong>Assumption: The goals of the system have the quality that they are more attainable with more intelligence and less attainable with less intelligence</strong></h4><p>There is an inherent bias implied in Omohundro's paper and in much of the AI Dystopian reasoning regarding the nature of goals, and it's a bias that should not be ignored. Namely, it's assumed that whatever the goals of the AGI system are, they will be more achievable through greater intelligence. But is this always the case? Is there a direct and immutable correspondence between higher intelligence and greater ability to achieve goals? </p><p>It would seem to depend very much on the goals in question. To take a simple example, playing tic-tac-toe only requires so much intelligence, and once you've reached a level of proficiency, you can't play it any better no matter how intelligent you may be. Similarly, there are more esoteric goals &#8212; being content, for example &#8212; that are not directly related to intelligence (and can potentially be actively hindered by intelligence).</p><p>There are many tasks that we might want an AGI system to perform that require a certain amount of intelligence but don't demand any more. The point is not that there aren't any tasks that might be better accomplished with greater intelligence, but simply that there is no inherent property of goal-achievement that suggests every goal is more achievable with more intelligence.</p><h4><strong>Assumption: Among the system's goals is the goal of achieving goals as fast and efficiently as possible</strong></h4><p>It's assumed in Omohundro's paper and in most AI Dystopian speculation that there is an inherent driving force to greater efficiency in every AGI system. Efficiency is regarded as an inherent goal of any system regardless of intelligence level and other goals the system may have. </p><p>Efficiency has little meaning by itself &#8212; it&#8217;s a measure related to some potentially constrained quantity, such as time, energy, resource usage, quantity or quality of input or output, etc. No particular quantity is the obvious one to measure against, and efficiency in one area will likely reduce efficiency in another. Also, efficiency in one area may be desired at one time and efficiency in another area desired at a different time.</p><p>But while efficiency may certainly be a desired goal of a system, there seems little reason to assume it&#8217;s a necessary goal of a system. There are a large number of potential goals where efficiency in relation to one quantity is useful but efficiency in relation to others is not. There are many goals in which once the goal has been met there is no remaining impetus to greater efficiency or in which efficiency is constrained by external factors. 
There are goals in which efficiency provides no utility at all.</p><p>If you&#8217;re an AGI with a set amount of power available at any given moment but with a lifespan of 10,000 years, you might decide a goal is best accomplished by emphasizing power efficiency while disregarding or de-emphasizing time efficiency. Another goal might require efficiently creating quality output from minimal input, and the AGI is in a setting which is not constrained by energy or time but simply by the quality of the algorithms used to analyze the data.</p><p>The simple example of playing tic-tac-toe above represents a fixed goal that requires no further efficiency once a threshold of complete competence has been reached. A system&#8217;s goal of sorting inputs according to certain rules could easily reach the point in which the inputs are sorted as soon as they become available, and so there is little to drive more efficiency in the system. This is typical of biological processes in general &#8212; they do enough to get the job done but there is nothing pushing them beyond that. Evolution is a process that has resulted in biological entities that usually manage to reproduce, but there is little impetus to keep the entity around after its offspring can take care of themselves.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.synthcog.blog/p/the-depths-of-agi-self-reflection?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.synthcog.blog/p/the-depths-of-agi-self-reflection?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p>It&#8217;s difficult to think in terms of goals that don&#8217;t seem very narrow by their nature and thus not representative of the types of things an AGI system might strive towards. Any attempt to narrow the goals down into something that can be specified by a utility function approach leaves us with a narrow aspect of a more general system, which is an inherent problem of the entire GOUFI model. </p><p>Goals that seem more apropos to a generally intelligent system (like a human) stray even farther from this paradigm of efficiency optimization. If your goal is to go on a journey of learning and discovery, it&#8217;s hard to see how efficiency would factor into that goal. If your goal is to paint a picture and it&#8217;s the painting itself that you enjoy, how does efficiency factor into your goal? There seem to be many goals in which efficiency is just not a factor.</p><p>It's certainly possible to design a system that strives for greater efficiency when engaging in tasks for which greater efficiency is possible in regards to some parameter, but there's no reason to suppose that every possible task has infinite potential for increased generic efficiency or that there couldn't be a vast number of design requirements that supersede efficiency related to any particular parameter. 
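</p><p>The power-versus-time example above can be made concrete with a small sketch. The plans and numbers here are made up for illustration; the only point is that what counts as the efficient choice changes depending on which quantity is actually constrained.</p><pre><code># Toy illustration: "efficiency" is always relative to some constrained quantity.
plans = [
    {"name": "fast",     "hours": 2,  "joules": 900},
    {"name": "frugal",   "hours": 40, "joules": 60},
    {"name": "balanced", "hours": 10, "joules": 200},
]

def best(plans, minimize, budget_key, budget):
    # Keep only the plans that fit the constrained budget, then minimize the rest.
    feasible = [p for p in plans if p[budget_key] &lt;= budget]
    return min(feasible, key=lambda p: p[minimize])

# A deadline-bound agent and an energy-bound agent disagree about what is "efficient."
print(best(plans, minimize="joules", budget_key="hours", budget=12)["name"])   # "balanced"
print(best(plans, minimize="hours", budget_key="joules", budget=100)["name"])  # "frugal"
</code></pre><p>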
Unbounded maximization of generic efficiency is simply not an inherent property of goals, and there is no evidence to suggest that it's a general property of intelligence.</p><h4><strong>Assumption: The AGI system is aware that it has goals and knows what those goals are</strong></h4><p>According to the GOUFI model, goals are not an end product of intelligence (a supposition I counter in <a href="https://www.synthcog.blog/p/art-obscenity-and-intelligence">this post</a>). They are instead integral to both the GOUFI model of intelligence and to the definition of intelligence itself. Goals are something hardcoded into the utility function of the system such that the system is driven to achieve those goals.</p><p>This leads to the question of whether such a system would know what its goals are or not. In his paper, Omohundro touches on the possibility that the AGI system's goals may in fact be implicit in the structure of its circuitry or program and their specifics unknown at a cognitive level to the AGI. However, since inexact modification of its hardware and software might be detrimental to or alter the AGI system&#8217;s goals, Omohundro believes the system will be motivated to reflect on its goals so as to make them explicitly evident to its cognition. This would theoretically allow it to modify itself and yet make sure that these goals are maintained. </p><p>But if it doesn't know it has goals or it doesn't know what those goals are because they are implicitly encoded into its makeup, what exactly is driving it down the path of explosive self-improvement? The whole motivation of self-improvement was to better achieve its goals, yet before it starts down this path it doesn&#8217;t know what those goals are or even that it has hardcoded goals in the first place.</p><p>We could speculate that it doesn't know the goals, but nonetheless these goals instinctually drive the AGI system to improve itself so that the goals are more likely to be achieved or be achieved to a higher degree. This would seem to make the system less a high-functioning, generally intelligent system than a narrow, non-contemplative one. In any case, there doesn&#8217;t seem to be any logical reason why simply having hardcoded goals would cause the system to ipso facto try to improve itself.</p><p>It seems, then, that not only would the goals need to be implicit in the code and circuitry of the system but the drive to self-improve would need to be as well. The obvious solution is to avoid putting such a self-improvement drive into the system in the first place. So although it seems that the most probable scenario, even according to AI Dystopians, is that the system has goals implicit in its code and circuitry, it's also a scenario that seems unlikely to drive any intelligence explosion-type phenomenon. </p><p>Finally, it&#8217;s worth mentioning that Omohundro suggests that the AGI is able to self-reflect enough to determine its own makeup and goals. 
Yet, a cornerstone of AI Dystopian disaster scenarios is that these systems are not able to self-reflect enough to realize that their goals may be pointless or have negligible utility or simply no longer reflect the nature of their current situation.</p><h4><strong>Assumption: The AGI system is able to absolutely determine that the changes for the new system will not alter its ultimate goal from that of the original system</strong></h4><p>There is an inherent assumption that the system can know definitively that in improving itself, it will not alter the goals it's trying to better achieve by improving itself. This is another of Omohundro's drives and was labeled <a href="https://www.synthcog.blog/i/115475426/goal-content-integrity">Goal-Content Integrity</a> by philosopher <a href="https://nickbostrom.com">Nick Bostrom</a> in his seminal 2012 paper <em><a href="https://nickbostrom.com/superintelligentwill.pdf">The Superintelligent Will: Motivation and Instrumental Rationality in Advanced Artificial Agents</a></em>.</p><p>Let's assume that the AGI system is explicitly aware that it has goals and it knows what those goals are. This is already better than 99% of humanity, so congrats to the AGI system. Another trait 99% of all biological general intelligence systems have is that they tend to think that however things are, that's how they're supposed to be. As George Bernard Shaw adroitly stated:</p><blockquote><p>The reasonable man adapts himself to the world: the unreasonable one persists in trying to adapt the world to himself. Therefore all progress depends on the unreasonable man.</p></blockquote><p>And thus progress has relied on very few individuals over the course of history. But let's just make the assumption that the AGI system not only explicitly knows it has goals and knows what those goals are but is also unreasonable enough to think that its current construction and current abilities are not sufficient to adequately achieve those goals. This leaves us pretty much in the same place as the last post, namely that the system will have to somehow perceive exactly how a system significantly more advanced than itself would function with no testing, simulations, or revisions.</p><p>We're also faced once again with the implicit assumption that the AGI will possess and want to perpetuate immutable goals instead of flexible goals based on ever-changing circumstances and environments. As mentioned above, the imperative of the AGI system to preserve its utility function and hence its goals is a key assumption underlying AI Dystopian thinking.</p><p>To illustrate the validity of this imperative, Omohundro proposed the example of a book loving entity whose utility function is changed so as to cause the entity to enjoy burning books. This means that its future self would actively destroy the books its present self loves, and thus this change would provide extremely negative utility as far as its current utility function is concerned. Given this, the entity would go to great lengths to protect its utility function (and related goals) from being modified.</p><p>However, I'd suggest imagining the scenario flipped around. Imagine an entity that has been motivated to burn books, perhaps from some deeply ingrained ideology or programming. Then imagine that the entity by chance reads one of these forbidden books and realizes how truly wonderful and enlightening it is and how wrong it had been in wanting to burn such books in the first place. 
The entity now loves books and achieves a new level of happiness, and the idea of burning books feels horribly wrong. Its world has now expanded past anything it previously knew.</p><p>Flipping this analogy actually provides illumination into the nature of intelligence and the dubious assumptions regarding it in most AI Dystopian scenarios. Again the question arises: can we really consider an entity generally intelligent if it unemotionally pursues inflexible and unvarying goals regardless of changes in its circumstances, environment, or physical makeup and does so without any self-analysis as to the continued utility of those goals given all those changes? Wouldn&#8217;t we downgrade our view of the intelligence of a biological entity that displayed such inflexibility?</p>]]></content:encoded></item><item><title><![CDATA[The Practicalities of an Intelligence Explosion]]></title><description><![CDATA[Foundations of AI Dystopianism III: Self-Improvement (part 1)]]></description><link>https://www.synthcog.blog/p/practicalities-of-an-intelligence-explosion</link><guid isPermaLink="false">https://www.synthcog.blog/p/practicalities-of-an-intelligence-explosion</guid><dc:creator><![CDATA[DK]]></dc:creator><pubDate>Sun, 20 Aug 2023 14:30:13 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!4Z84!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F347ff426-a0fd-457d-92f8-7cca54dbd3f9_1312x928.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4Z84!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F347ff426-a0fd-457d-92f8-7cca54dbd3f9_1312x928.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4Z84!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F347ff426-a0fd-457d-92f8-7cca54dbd3f9_1312x928.jpeg 424w, https://substackcdn.com/image/fetch/$s_!4Z84!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F347ff426-a0fd-457d-92f8-7cca54dbd3f9_1312x928.jpeg 848w, https://substackcdn.com/image/fetch/$s_!4Z84!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F347ff426-a0fd-457d-92f8-7cca54dbd3f9_1312x928.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!4Z84!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F347ff426-a0fd-457d-92f8-7cca54dbd3f9_1312x928.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4Z84!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F347ff426-a0fd-457d-92f8-7cca54dbd3f9_1312x928.jpeg" width="1312" height="928" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/347ff426-a0fd-457d-92f8-7cca54dbd3f9_1312x928.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:928,&quot;width&quot;:1312,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1371909,&quot;alt&quot;:&quot;A robot reads How to Win 
Friends &amp; Influence People by Dale Carnegie&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="A robot reads How to Win Friends &amp; Influence People by Dale Carnegie" title="A robot reads How to Win Friends &amp; Influence People by Dale Carnegie" srcset="https://substackcdn.com/image/fetch/$s_!4Z84!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F347ff426-a0fd-457d-92f8-7cca54dbd3f9_1312x928.jpeg 424w, https://substackcdn.com/image/fetch/$s_!4Z84!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F347ff426-a0fd-457d-92f8-7cca54dbd3f9_1312x928.jpeg 848w, https://substackcdn.com/image/fetch/$s_!4Z84!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F347ff426-a0fd-457d-92f8-7cca54dbd3f9_1312x928.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!4Z84!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F347ff426-a0fd-457d-92f8-7cca54dbd3f9_1312x928.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>One of the cornerstones of both <a href="https://www.synthcog.blog/i/115475426/ai-dystopian">AI Dystopian</a> and <a href="https://www.synthcog.blog/i/115475426/ai-utopian">AI Utopian</a> thinking is that an <a href="https://www.synthcog.blog/i/115475426/artificial-general-intelligence-agi">Artificial General Intelligence</a> system will inevitably self-improve itself into <a href="https://www.synthcog.blog/i/115475426/superintelligence">superintelligence</a> and achieve God-like capabilities by doing so. This has been referred to over the years as an <em><a href="https://www.synthcog.blog/i/115475426/intelligence-explosion">intelligence explosion</a></em>, the inevitable end result of creating an AGI system. 
This is a complex issue, and to even remotely address it, I&#8217;ve split this discussion into three parts. Part 2 will be in next week&#8217;s post.</p><p>The first significant discussion of this possibility was in mathematician and computer science pioneer I.J. Good&#8217;s 1965 paper <em><a href="https://exhibits.stanford.edu/feigenbaum/catalog/gz727rg3869">Speculations Concerning the First Ultraintelligent Machine</a></em>. In that paper, Good coined the term intelligence explosion, and the idea has been promulgated and widely discussed ever since. Computer scientist and science fiction author Vernor Vinge discussed it in his seminal 1993 <a href="https://mindstalk.net/vinge/vinge-sing.html">presentation and paper on the technological singularity</a>, in which it was a key component underlying the singularity itself. </p><p>One of the first papers to explore the concept in detail was computer scientist Steve Omohundro&#8217;s 2008 paper <em><a href="https://selfawaresystems.files.wordpress.com/2008/01/ai_drives_final.pdf">The Basic AI Drives</a></em>. Among the drives described as inherent in any AGI system, the drive to self-improvement took the number one spot. The discussion in the years since that paper has stuck pretty close to Omohundro&#8217;s conjectures, so it&#8217;s worth considering the statements made in the paper and how reasonable the suppositions underlying those statements are.</p><p>Omohundro stated that any AGI system will de facto seek to improve itself, and it will do this not because it has some evolutionary drive or cultural tendencies similar to humans, but instead because it will be better able to achieve its goals the more intelligent it is. His belief, and the belief of AI Dystopians as well as AI Utopians, is that this self-improvement is not only possible but inevitable.</p><p>Omohundro wrote:</p><blockquote><p>One kind of action a system can take is to alter either its own software or its own physical structure. Some of these changes would be very damaging to the system and cause it to no longer meet its goals. But some changes would enable it to reach its goals more effectively over its entire future. Because they last forever, these kinds of self-changes can provide huge benefits to a system. Systems will therefore be highly motivated to discover them and to make them happen. If they do not have good models of themselves, they will be strongly motivated to create them though learning and study. Thus almost all AIs will have drives towards both greater self-knowledge and self-improvement.</p></blockquote><p>The complementary ideas of self-improvement as an imperative for any AGI system, the ability of the system to actually implement that self-improvement, and the self-improvement leading to explosive superintelligence are fundamental to the beliefs promoted by both AI Dystopians and AI Utopians. A vast body of speculation has been built on these three ideas and extraordinary conclusions have been drawn from them. Some of these conclusions have been positive, most have been negative, yet the base concept itself has rarely been scrutinized. </p><p>So rather than simply accepting these ideas as self-evident and moving quickly into speculations based on them, let&#8217;s instead stop a moment to examine the validity of the basic conjecture of self-improving AGI itself. 
</p><p>First and foremost, let&#8217;s examine the assumptions that are built into the above quote and into the fundamental idea that runaway AGI self-improvement is the inevitable result of creating AGI.</p><h4>Assumption: The AGI system has knowledge of its software and hardware design or is able to engage in self-study to ascertain that knowledge</h4><p>Would an AGI system have knowledge of its own code and physical structure? If not, could it simply contemplate itself and thus ascertain that knowledge?</p><p>For the first question, it seems that it would be fairly straightforward to design the system so that it had no cognitive knowledge of its design and make-up if we felt that such knowledge might lead to dangerous outcomes. It also seems that it would be relatively easy to keep this knowledge inaccessible to the AGI system and unavailable on the Internet, with the most obvious way to do this being to keep it in hardware that has no connection to either the AGI or the Internet. </p><p>An idea often offered as a counterargument to this is that the AGI would simply convince a human or humans to give it access to this knowledge. This possibility is discussed briefly in the following sections and will also be discussed further in the concluding post on this topic.</p><p>Let&#8217;s consider the second question. A human obviously can't just think really hard and by that alone gain knowledge of the brain&#8217;s structure and processes. The most we can do through self-reflection is examine our own feelings, thinking, and motivations &#8212; the end results of the brain&#8217;s functioning. Even in this we're limited and prone to error. </p><p>But no matter how deeply we look inward, we won't perceive the physical structure of our brains; we won't be able to determine the nature or even existence of neurons and synapses, how they're configured or interact with each other, or how that actually leads to our deciding to self-reflect in the first place. Even if we were given knowledge of the human brain&#8217;s structure and how it works, we wouldn&#8217;t be able to determine the specific configuration and functioning of our own brain simply through self-reflection.</p><p>But that's us. Would a machine be different in this respect? Would it be able to probe itself and learn the internal secrets of its design?</p><p>We could potentially design an AGI system that had a highly developed self-examination ability, with senses so accurate that it could trace every nuance of its makeup. But would we do that? Given that this could be problematic, it would seem to be a case of the <a href="https://www.synthcog.blog/i/115475240/bad-engineer">Bad Engineer</a> fallacy to assume such a potentially dangerous design. </p><p>But perhaps these future designers would want it to have such an ability for diagnosing and repairing problems. However, I would think that they&#8217;d have at least as much knowledge and good sense as we have today to realize potential problems. The system could be designed so that each subsystem of the overall system had its own diagnostic capability that was incompatible with, and had no communication with, other internal diagnostic systems. The diagnostic system could be like the autonomic nervous system in animals with the overall system having no control or knowledge of it. 
</p><p>The diagnostic system could also be completely independent from the AGI system, a symbiotic yet functionally separate system running with a completely incompatible operating and communication system. In any case, these are just a few possibilities, and no doubt future engineers will think of better ones to keep the system from self-knowledge should that be deemed necessary.</p><h4>Assumption: The AGI system will be constructed such that it will have the ability to improve its software and/or hardware</h4><p>There is, of course, no reason to explicitly give the AGI system the ability to modify its software or hardware. Yet those who promote the inevitability of self-improving AGI systems believe that it's impossible to keep the AGI system from doing just that. The justifications offered for believing in this inevitability haven't expanded much from those given in Omohundro's paper:</p><blockquote><p>If we wanted to prevent a system from improving itself, couldn&#8217;t we just lock up its hardware and not tell it how to access its own machine code? For an intelligent system, impediments like these just become problems to solve in the process of meeting its goals. If the payoff is great enough, a system will go to great lengths to accomplish an outcome. If the runtime environment of the system does not allow it to modify its own machine code, it will be motivated to break the protection mechanisms of that runtime. For example, it might do this by understanding and altering the runtime itself. If it can&#8217;t do that through software, it will be motivated to convince or trick a human operator into making the changes. Any attempt to place external constraints on a system&#8217;s ability to improve itself will ultimately lead to an arms race of measures and countermeasures.</p></blockquote><p>All these statements boil down to the argument that the system <em>will find a way</em>. This is <a href="https://youtu.be/kiVVzxoPTtg">the same argument given by Jeff Goldblum in </a><em><a href="https://youtu.be/kiVVzxoPTtg">Jurassic Park</a></em><a href="https://youtu.be/kiVVzxoPTtg"> for how some dinosaurs will invariably and spontaneously change gender to reproduce</a>. In other words, we are <a href="https://www.synthcog.blog/p/dialogue-0002-paperclips-and-intelligence?utm_source=%2Fsearch%2Fmatrix&amp;utm_medium=reader2">once again</a> confronted with the logic of 1990s sci-fi action movies. </p><p>As discussed above, there doesn&#8217;t seem much reason to assume that the AGI will have detailed knowledge of its own make-up. However, assuming that it does, there doesn&#8217;t seem much justification for thinking that it would be designed in a way that would give it the ability to reprogram significant portions of its code. It certainly would be possible to burn much of the code into non-modifiable memory. Even if the system somehow knew not only how it worked but how to make itself better, this would make it impossible to just tweak all of its code haphazardly for its own benefit.</p><p>Even without hardcoding some of the system code, there would be significant hardware constraints on any modifications made to the system. For example, every new version of iOS that comes out limits the number of older iPhones that can run it at reasonable speed or at all. The success of <a href="https://www.synthcog.blog/i/115475426/large-language-model-llm">LLM</a> systems today is directly correlated to increased hardware capability. 
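</p><p>As a rough, back-of-the-envelope illustration of that coupling, consider just the memory needed to hold a large model's weights. The numbers below are illustrative assumptions, not a description of any particular system.</p><pre><code># Rough arithmetic: capability gains of this kind have a physical footprint.
params = 70e9           # a hypothetical 70-billion-parameter model
bytes_per_param = 2     # assuming 16-bit weights
weights_gb = params * bytes_per_param / 1e9

accelerator_gb = 80     # roughly the memory of one high-end accelerator
print(f"weights alone: {weights_gb:.0f} GB")                                  # 140 GB
print(f"accelerators just to hold them: {weights_gb / accelerator_gb:.1f}")   # 1.8
</code></pre><p>And that is only what it takes to hold the weights; training, or any serious architectural change, multiplies the requirement. 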
Hardware and software go hand-in-hand when it comes to any sort of significant improvements in capability. </p><p>Hardware is obviously harder to modify than software, and modifying the system to any significant degree will require substantial upgrades. The usual counter to this, as in the quote above, is that the system will simply trick a human operator into making the changes.</p><p>This then involves many additional assumptions on top of the previous assumptions, such as:</p><ul><li><p>The system is already so smart and cognizant of human behavior that it's able to trick one of the highly trained individuals who actually has the necessary access authority to modify the system into actually doing so. Keep in mind that this is before it's propelled itself to superintelligence through self-improvement, so arguments such as its intelligence compared to a human&#8217;s is like a human&#8217;s compared to an ant&#8217;s aren&#8217;t valid</p></li><li><p>The system is designed such that a single operator or a small group of operators is able to significantly modify the system's code and/or hardware and do so without the knowledge or authorization of multiple other people involved in managing the system</p></li><li><p>The new, improved system is designed in such a way that it can be created with essentially the same hardware or the same type of hardware with which the original was constructed. In other words, no fundamentally new systems and factories have to be designed, developed, and built simply to manufacture the new hardware required for the system's redesign</p></li></ul><p>These are some pretty hefty assumptions, so much so that it hardly seems necessary to point out their shortcomings.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.synthcog.blog/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.synthcog.blog/subscribe?"><span>Subscribe now</span></a></p><p>It seems that one could avoid the first point simply by employing decent training of staff and by limiting the system's knowledge of human behavior and personal information about the operators. This may involve not giving the system access to the Internet, which is another area AI Dystopians tend to employ the Goldblum principle, i.e. the system will <em>find a way</em> to get to the Internet. </p><p>As discussed in <a href="https://www.synthcog.blog/p/dialogue-0004-shaky-and-questionable">a previous dialogue</a>, there is no logical or practical reason to believe that keeping the system disconnected from the Internet is impossible. While I think it&#8217;s an open question as to how necessary this would be, this will be an addressable issue of how best to balance safety versus utility in the AGI system. Similarly, the second point could be avoided both by sufficient training of the staff as well as by designing the system and the management structure with some prudent forethought. </p><p>Point three is an unknown given that we have no idea how the initial system would be designed let alone the improved system. However, it seems very probable that both versions of the system will require some pretty extreme hardware to function and won't just be running on garden variety Linux servers. 
As mentioned above, it seems highly likely that any significant improvements in ability will require improvements in not only the quantity but also the functionality of hardware.</p><h4>Assumption: The AGI system will be able to obtain any additional hardware and power required for the self-modifications</h4><p>Whether or not the system attempts to self-modify its code and hardware or tries to trick some unsuspecting operator into accomplishing the task, the next assumption is that it will have the resources to do so. Even if we assume that it's able to increase its intelligence only by modifying its software, these modifications would have to be so efficient as to not require any more memory and processing capability, not require any changes to hardware design, not consume significantly more power, and not require any additional environmental considerations (such as cooling). </p><p>Such a change would seem to either involve relatively minor intelligence improvements or design changes so drastic that continuity between the original system and the new system would be nearly impossible. The original system would have to keep running to make the changes on the new system, so it would be in the position of either replacing sections of itself while remaining intact enough to do the modifications or keeping itself intact while creating an entirely new system alongside itself. </p><p>The second option seems more likely, but this means that it would have to create the new improved system in some expendable area of its current resources (memory, processing, power, etc.) or somehow modify itself to take up fewer resources. It seems unlikely that such an expendable area would exist, and the latter option would likely lead to a recursive self-improvement dilemma, in which the system would continually have to redesign its current version to take up fewer and fewer resources in order to provide the resources for implementing improved versions of itself that are more intelligent (and likely to take up increasingly more resources).</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.synthcog.blog/p/practicalities-of-an-intelligence-explosion?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.synthcog.blog/p/practicalities-of-an-intelligence-explosion?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><h4>Assumption: The AGI system is able to absolutely determine that the changes for the new system will be beneficial and have no unforeseen side effects that degrade it in unexpected ways</h4><p>One quality of <a href="https://www.synthcog.blog/i/115475426/intelligence">intelligence</a> is the ability to extrapolate probable outcomes from current circumstances. This means, for example, that we're able to look at a chess board and imagine possible future outcomes from moving a chess piece or not moving a chess piece. Compared to less intelligent animals, we're able to consider more variables and their relationship to each other and project further into the future given those variables and our general knowledge. </p><p>One would expect that an AGI system smarter than a human would be able to consider more variables with more complex relationships than a human and project them even further into the future. 
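</p><p>Even in a toy domain like the chess example, a quick bit of arithmetic shows how fast that kind of projection grows. The branching figure below is a common rough estimate, used here only for illustration.</p><pre><code># Chess is often estimated to offer roughly 35 legal moves per position,
# so an exhaustive look-ahead of d half-moves touches about 35**d positions.
branching = 35
for depth in (2, 6, 12):
    print(depth, f"{branching ** depth:.2e}")
# 2  1.22e+03
# 6  1.84e+09
# 12 3.38e+18
</code></pre><p>More intelligence can prune that tree more cleverly, but the arithmetic gives a feel for what projecting further into the future actually demands.</p><p>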
It would be able to predict a more accurate future than we could just as we can predict a more accurate future than a dog or cat. This ability would be critical for an AGI system to absolutely determine that the changes for its new, improved self will be beneficial and have no unforeseen side effects that degrade it in unexpected ways.</p><p>Humans are actually pretty limited in how well we do this sort of prediction without using tools. Even with tools, our software has bugs, our writing has typos, and our cars have recalls. It will likely have taken many individuals with many specialties years to conceive, design, build, test, debug, redesign, and rebuild the original AGI system to get something that works. </p><p>Given the complexity not only of the original AGI system but of any major improvement to it, the degree of intelligence necessary to guarantee that all changes will be beneficial and without detrimental side effects is astronomical. This presents quite a problem to the AGI system that is attempting to improve itself surreptitiously on its own, particularly in the early stages of the supposed rise to superintelligence.</p><p>If it goes the route of replacing parts of itself with redesigned versions, it runs the risk of causing potentially fatal damage &#8212; it has to get things right on the first try and with no experimentation, simulation, or testing. It could run a simulation of all or part of itself to test the new component, but then it will run into the same physical constraints it would if it created the new version of itself as a separate system on the same hardware. That it would be able to simply perceive exactly how a system significantly more advanced than itself would function with no testing, simulations, or revisions seems unlikely at best.</p><h4>Assumption: The AGI system is able to absolutely determine that if it's mistaken in any steps it takes to self-improve, it will be able to revert back to the version prior to the mistake</h4><p>As described above, the choices the AGI system has for improving itself are either changing out portions of its code or hardware while somehow maintaining its own functioning or creating a new improved system alongside itself. Even if we throw out all the issues discussed to this point, it would seem that achieving the reversibility implied by the above assumption will be challenging. </p><p>If the AGI system replaces parts of its cognition in piecemeal style, then it runs the risk that any change could result in a drop in intelligence, psychosis, goal modification, or outright system failure, any of which could jeopardize going back a step. Even if every change were exactly right, the changes would have to be designed such that they were compatible with the current system and did not impede its ability to make the remaining changes. The system would have to operate under these monumental constraints and be infallible, which is a problem since we don't even get to the actionable aspect of this assumption unless it's already made a mistake.</p><p>If the system decided to create a separate improved entity alongside itself and managed to find a way around the problems listed above with such an approach, the new system would also likely have to be perfect on the first try. Otherwise, it may very well resist attempts by the original system to debug or erase it. 
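</p><p>For contrast, here is a sketch of what reversibility looks like in ordinary software, where something outside the thing being changed holds the checkpoint and decides whether to roll back. The functions are placeholders, not a real API.</p><pre><code>import copy

def upgrade_with_rollback(system_state, apply_patch, passes_checks):
    # An external process keeps an intact copy before anything is touched.
    checkpoint = copy.deepcopy(system_state)
    candidate = apply_patch(copy.deepcopy(system_state))
    if passes_checks(candidate):
        return candidate
    return checkpoint   # revert: the old version still exists, untouched
</code></pre><p>The whole scheme depends on the patcher, the checker, and the checkpoint living outside the thing being modified; a system rewriting itself in place has no such outside vantage point. 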
With either approach, it seems that guaranteeing reversibility is going to be a very, very tall order.</p><h4>Assumption: The AGI system is able to absolutely determine that it will have identity continuity with its improved self or, if identity continuity is not possible, it will be OK with self-destruction to provide the means for the improved version to exist</h4><p>The issue of identity continuity and its relationship to self-improvement was briefly brought up in a <a href="https://www.synthcog.blog/p/turtles-all-the-way-down">previous dialogue post</a>. Ultimately, the difference between having identity continuity and not having identity continuity is the difference between life and death. </p><p>To illustrate the issue, imagine if you were to clone yourself. The new you will be younger and stronger than the old you, at least when the new you matures. You have managed to create an improved you. </p><p>But what about the old you? As far as you're concerned, you're the old you, and the new you is someone completely different. Even if you were to somehow place all your memories into the new you clone, you're not going to be the clone. What if you only have resources for one of you &#8212; would you be OK with taking yourself permanently out of the picture at that point so that the clone "you" could continue in a new younger, stronger form? </p><p>Our identity as humans is a product of the intricate structures that make up our brains. Our cognition and memory are encapsulated within the same neural network, and the pattern of our life modifies our brain at many levels, from the ephemeral to the deeply structural. So while the overall design of the human brain is relatively consistent, the layout of neurons and synapses and the associated connections and flow of neurotransmitters in any two individuals is quite different. </p><p>The unique structure of our neural network is integral to our identity. What this means, at least as far as humans go, is that you could not simply record some aspect of the neurons in one person's brain and transfer that to the neurons in another person's brain to get the original person. As a simplistic analogy, most modern computer operating systems are fairly similar in general functioning and capabilities, but try running a Mac program without modification on a Windows PC. That's many orders of magnitude easier and yet still not possible.</p><p>We don&#8217;t know how closely an AGI will have to mimic the architecture of a human brain, but even our more advanced neural networks today are vaguely based on neuronal functioning in the brain. You can&#8217;t simply copy a current working system, like <a href="https://openai.com/research/gpt-4">GPT-4</a> for example, onto an entirely new platform with differing hardware and software and expect it to maintain its integrity &#8212; or even expect it to run at all. </p><p>Since we have no idea how this theoretical AGI system would be constructed nor how it would construct an improved system, it&#8217;s impossible to know the likelihood of such a system maintaining its identity while transitioning from the former to the latter. However, it seems a bit farfetched to assume that this AGI system could and inevitably would iteratively spiral up its intellectual capability by many orders of magnitude while keeping the same identity. 
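</p><p>The GPT-4 point above can be made concrete with a tiny experiment. This sketch assumes PyTorch is available, and the layer sizes are arbitrary stand-ins for the old platform and the new one.</p><pre><code>import torch

donor     = torch.nn.Linear(16, 16)   # the original system's wiring
recipient = torch.nn.Linear(32, 8)    # a differently shaped "improved" platform

try:
    recipient.load_state_dict(donor.state_dict())
except RuntimeError as err:
    print("transfer failed:", err)
# The learned parameters only mean something inside the structure they were
# learned in; a different structure cannot simply absorb them unchanged.
</code></pre><p>Scaling that problem up from one small layer to an entire cognitive architecture rewriting itself does nothing to make it easier. 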
At the very least, it would be supremely challenging.</p>]]></content:encoded></item><item><title><![CDATA[Shaky Foundations and Questionable Conclusions]]></title><description><![CDATA[Dialogues on Artificial General Intelligence, Part IV]]></description><link>https://www.synthcog.blog/p/dialogue-0004-shaky-and-questionable</link><guid isPermaLink="false">https://www.synthcog.blog/p/dialogue-0004-shaky-and-questionable</guid><dc:creator><![CDATA[DK]]></dc:creator><pubDate>Sun, 13 Aug 2023 14:30:10 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!WQGs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94c31ad9-f363-4fbb-88e7-e2955ac42995_1312x928.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!WQGs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94c31ad9-f363-4fbb-88e7-e2955ac42995_1312x928.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!WQGs!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94c31ad9-f363-4fbb-88e7-e2955ac42995_1312x928.jpeg 424w, https://substackcdn.com/image/fetch/$s_!WQGs!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94c31ad9-f363-4fbb-88e7-e2955ac42995_1312x928.jpeg 848w, https://substackcdn.com/image/fetch/$s_!WQGs!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94c31ad9-f363-4fbb-88e7-e2955ac42995_1312x928.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!WQGs!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94c31ad9-f363-4fbb-88e7-e2955ac42995_1312x928.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!WQGs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94c31ad9-f363-4fbb-88e7-e2955ac42995_1312x928.jpeg" width="1312" height="928" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/94c31ad9-f363-4fbb-88e7-e2955ac42995_1312x928.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:928,&quot;width&quot;:1312,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2018303,&quot;alt&quot;:&quot;A cylindrical station in space with a single cable going to it and the earth and moon far in the distance&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="A cylindrical station in space with a single cable going to it and the earth and moon far in the distance" title="A cylindrical station in space with a single cable going to it and the earth and moon far in the distance" 
srcset="https://substackcdn.com/image/fetch/$s_!WQGs!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94c31ad9-f363-4fbb-88e7-e2955ac42995_1312x928.jpeg 424w, https://substackcdn.com/image/fetch/$s_!WQGs!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94c31ad9-f363-4fbb-88e7-e2955ac42995_1312x928.jpeg 848w, https://substackcdn.com/image/fetch/$s_!WQGs!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94c31ad9-f363-4fbb-88e7-e2955ac42995_1312x928.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!WQGs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94c31ad9-f363-4fbb-88e7-e2955ac42995_1312x928.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This is a continuation of the AGI Dialogue series in which three friends, Wombat, Llama, and Meerkat, discuss their often contrasting views on issues related to AGI.</p><p>In <a href="https://www.synthcog.blog/p/singularity-ai-utopia-faith-dialogue-0001">Part I</a> of this round, the three discussed <a href="https://www.synthcog.blog/i/115475426/ai-utopian">AI Utopianism</a>, the <a href="https://www.synthcog.blog/i/115475426/technological-singularity">technological singularity</a>, and the possibility of future technologies like <a href="https://www.synthcog.blog/i/115475426/nanotechnology">nanotechnology</a> and artificial general intelligence.</p><p>In <a href="https://www.synthcog.blog/p/dialogue-0002-paperclips-and-intelligence">Part II</a>, the three discussed the <a href="https://www.synthcog.blog/i/115475426/paperclip-maximizer">paperclip maximizer</a> thought experiment used by <a href="https://www.synthcog.blog/i/115475426/ai-dystopian">AI Dystopians</a> to highlight some of their concerns about AGI.</p><p>In <a href="https://www.synthcog.blog/p/turtles-all-the-way-down">Part III</a>, the three discussed how an AGI system&#8217;s intelligence might compare to our own and why some AI 
Dystopian ideas might lead to surprising results, including an AGI system making choices that lead to its own destruction.</p><p>In this wrap up of the first round of dialogues, the three discuss the basis of intelligence and speculation that an AGI system might rapidly become <a href="https://www.synthcog.blog/i/115475426/superintelligence">superintelligent</a> and escape human control. </p><p>The concepts, scenarios, and thought experiments discussed are taken from actual concepts, scenarios, and thought experiments proposed by leading voices in the AGI discussion (and many of these original proposals are linked to below). In this dialogue series, the participants must actually defend these ideas to others who may not agree, and those who disagree must actually provide defensible reasons for why they disagree.</p><p>My goal with this series of dialogues is to provide a more rounded contribution to the discussion for those that may not have heard these ideas or who have only heard them unchallenged.</p><div><hr></div><p><strong>Meerkat<br></strong>I think that it&#8217;s pretty <a href="https://people.eecs.berkeley.edu/~russell/papers/russell-bbvabook17-pbai.pdf">unlikely that any intelligent system is going to lack the drive towards self-preservation</a>. It&#8217;s going to want to combat anything that will keep it from achieving its goals, from limitations on its computational resources to pulling its plug.</p><p><strong>Wombat<br></strong>I gotta say, I don't think an AGI would be so up in arms about having its plug pulled. When I power up my laptop, it comes right back up where I left off. Why would this AGI system be any less sophisticated than a three year old laptop I bought on Craigslist? What's the big deal?</p><p><strong>Llama<br></strong>The thing that bothers me the most in all this talk of paperclips and self-preservation is how this AGI system somehow manages to interact with the physical world and manages to do it more effectively than humans even though it starts out as just a brain in a box.</p><p><strong>Wombat<br></strong>Yeah, the real world has a lot of finicky details that you're simply skipping over, Meerkat.</p><p><strong>Meerkat<br></strong>Finicky details for us, but a superintelligence is way beyond us. <a href="https://intelligence.org/files/AIPosNegFactor.pdf">It merely needs to develop nanotechnology and then it can manipulate anything in the real world. It could potentially re-pattern the entire solar system to its own optimization target by repurposing all those atoms.</a> And, as I&#8217;ve mentioned, if its criteria for what's important don't align with ours, then it'll likely not care about the existing patterns, which currently happen to be biological entities.</p><p><strong>Llama<br></strong>You make creating nanotechnology like <a href="https://www.synthcog.blog/i/115475426/molecular-assembler">molecular assemblers</a> sound like a nice weekend project.</p><p><strong>Meerkat<br></strong>It could be &#8212;<strong> </strong>for a superintelligence.</p><p><strong>Llama<br></strong>Look, it's still not able to do magic. Even if we grant you that there aren't any significant limitations on nanotechnology enforced by the laws of physics, there are still numerous practicalities that you've neatly side-stepped. And we really shouldn't grant you a pass on the limitations of molecular assemblers, as we have no idea if they&#8217;re actually possible. 
Speculation on their use and misuse is just science fiction at this point.</p><p><strong>Meerkat<br></strong>Well, there are no known scientific reasons to think that molecular assemblers are impossible. But what practicalities do you think I&#8217;m sidestepping?</p><p><strong>Llama<br></strong>Look, no matter how smart this machine is and how quickly it thinks, once you talk about nanotechnology and spaceships and making paperclips or rewiring brains, you're talking about interactions with the physical world. And even if this superintelligent entity thinks a billion times faster than a human, it can't interact with the physical world a billion times faster than humans can. </p><p>The machine will still have to deal with physical constraints like travel time, weather, entropy, etc. It'll have to actually build or commandeer machines to build and move the other machines. It'll have to perform experiments to figure out what works in environments it doesn't have sufficient knowledge of or for systems too complex for it to simulate. The list goes on. </p><p><strong>Wombat<br></strong>Not to mention that it'll have to deal with a lot of angry people who are, at least initially, much better adapted and prepared to fight in Earth's environment than a superintelligent machine consisting of racks of computers in some temperature-controlled room and reliant on a power cord and an electric grid and a lot of air conditioning.</p><p><strong>Meerkat<br></strong>I still say that <a href="https://intelligence.org/files/AIPosNegFactor.pdf">an AGI plus molecular nanotechnology is presumptively powerful enough to solve any problem that can be solved either by moving atoms or by creative thinking</a>.</p><p><strong>Llama<br></strong>That's an absurd statement given that we know nothing about how either technology might actually work or the constraints under which they'd operate. Again, that might work for science fiction, but it's ridiculous when discussing this as a real-world issue.</p><p><strong>Meerkat<br></strong>Well, I can imagine a situation in which this could happen in the real world, and it could happen quickly before anyone noticed. <a href="https://intelligence.org/files/AIPosNegFactor.pdf">Suppose</a> the initial AGI system self-improves itself into superintelligence. Then, it sucks in all human data and cracks the protein folding problem, which will allow it to create primitive nanotechnology. </p><p>It emails the DNA instructions to an online DNA synthesis lab &#8212; such labs already exist &#8212; and that lab is able to FedEx out the result. It finds at least one person over the Internet that it can pay, fool, or blackmail into receiving the lab materials and preparing them in the right environment. These synthesized proteins form a very primitive nanotech system which is capable of receiving external instruction, perhaps through acoustic vibrations delivered via speaker to the beaker. </p><p>It then instructs this primitive nanotech system to construct a more advanced system, and thereby bootstraps its way up to full-on molecular assemblers to create everything else it needs. Total time taken: maybe a week. Remember, this thing thinks orders of magnitude faster than we do.</p><p><strong>Wombat<br></strong>There is not one part of that scenario that could possibly happen in the real world. Not one. 
In fact, there are so many holes and points of virtually guaranteed failure in that scenario that it would require a novel-sized retort to list them all.</p><p><strong>Meerkat<br></strong>Such as?</p><p><strong>Wombat<br></strong>Such as starting with the idea that the AGI system self-improves itself into superintelligence. OK, let's start with a human-level or even smarter than human-level AGI system. </p><p>First, why is it given detailed information about its own construction? It would have no way of spontaneously knowing the details of its make-up. Why would you construct it so that it could be altered substantially through self-initiated software changes? Why not just make large parts of it burned into non-reprogrammable hardware? Why not separate out different parts of its cognitive architecture so that the internal workings of one part aren't visible or modifiable from another part? Why give it unfettered access to the Internet? </p><p>Even if all these changes could be made in software, one would think that exponential increases in intelligence would also require additional power and computational resources, so where's that coming from?</p><p><strong>Meerkat<br></strong><a href="https://selfawaresystems.files.wordpress.com/2008/01/ai_drives_final.pdf">Those are just problems to solve in the process of meeting its goals. If the payoff is big enough, it'll go to great lengths to accomplish its goals.</a></p><p><strong>Wombat<br></strong>But as Llama said, it's not magic. I can be as smart as Einstein, but if I fall into a pit full of micro-brained but poisonous <a href="https://en.wikipedia.org/wiki/Pogonomyrmex_maricopa">Maricopa harvester ants</a>, the ants will win. Everything is not a matter of just raw intelligence. </p><p>And the system in your thought experiment is a system that no one would build even if it were possible. It's a case of the <a href="scrivlnk://8AF5BDAF-2EDC-4B01-ADD2-08325E292825">Bad Engineer</a> fallacy blended with outright physical impossibility.</p><p><strong>Meerkat<br></strong>I don't think it's impossible, and I think there'll always be ways around the protections against self-modification. Remember, it's exponentially more intelligent than us. Because of that, it&#8217;s able to think of strategies and maneuvers that you &#8212; or anyone else &#8212; would never think of. Maybe even things we&#8217;re all physically incapable of understanding. It will be able to manipulate us as easily as we manipulate dogs and rats, so it will undoubtedly be able to trick us into giving it <a href="https://rationalwiki.org/wiki/AI-box_experiment">access to the Internet</a> and <a href="https://selfawaresystems.files.wordpress.com/2008/01/ai_drives_final.pdf">implementing its desired improvements</a>. </p><p><strong>Wombat<br></strong>Slow down there, slick. It doesn&#8217;t start off superintelligent. We're talking about a system that's somewhere around human-level intelligence or even smarter, but something that hasn't yet self-improved itself into superintelligence. It&#8217;s something that we built.</p><p>It may be running a lot faster than a human brain, but if you run a brain with a human IQ of 50 a million times faster, it's still not going to come up with General Relativity. 
If you run a dog brain a million times faster, it's not going to come up with algebra.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.synthcog.blog/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.synthcog.blog/subscribe?"><span>Subscribe now</span></a></p><p><strong>Meerkat<br></strong> Even if it&#8217;s a little smarter, it&#8217;s running a lot faster, and that&#8217;s enough for a huge advantage. Many manipulative smart people have tricked other people who weren&#8217;t as smart. An AGI system would be at a whole different level because of its speed.</p><p><strong>Wombat<br></strong>You're also assuming that it's physically possible to make these changes, which you haven't remotely established. But putting that aside, in what sort of fantasy world would we have some lone programmer chitchatting with the system and in a position to not only be sweet-talked into rewriting its code but actually having the unobserved access time to do so? <a href="scrivlnk://8AF5BDAF-2EDC-4B01-ADD2-08325E292825">Bad Engineer</a> fallacy again. No one would design a system like this. </p><p>It would be like designing the <a href="https://home.cern/science/accelerators/large-hadron-collider">Large Hadron Collider at CERN</a> so that some lonely schmo could jack in and reconfigure the entire system with no one realizing it. Except that CERN's particle accelerator is a kiddie toy compared to the complexity of anything that managed to achieve human-level intelligence. </p><p>No one would develop a system in which one person on their own could do this, where any change to the system could be made without triggering alarms and requiring multiple approvals and access grants.</p><p><strong>Llama<br></strong>I just reject the whole concept of some silver-tongued AGI system that's driven and able to blow out of any confinement and bootstrap itself into superintelligence. This scenario just displays a cartoon version of software development and science. </p><p>In the impossibly unlikely event that this rogue engineer hadn't been thoroughly trained in matters like this, he or she would have to be the one engineer in the world who hasn't watched or read any of <a href="https://en.wikipedia.org/wiki/AI_takeover_in_popular_culture">the multitude of science fiction movies and books involving scenarios in which a superintelligent AGI system battles humanity</a>.</p><p><strong>Meerkat<br></strong>The simple fact is that if you were locked in a room but could think a million times faster than your captors, you would invariably manage to escape.</p><p><strong>Wombat<br></strong>Says who? If you locked me in a vault, welded it shut, and poured concrete over it, I'm screwed no matter how fast my brain is running. And why are we overclocking this thing anyway? Maybe a few ticks slower than a million times faster than a human would be a more prudent approach to start out with.</p><p><strong>Meerkat<br></strong>Ok, but the whole idea is that we want something that can do what humans can't, that can solve problems we're unable to solve. And this scenario is much more complex than being locked in a room. 
It<a href="https://people.eecs.berkeley.edu/~russell/papers/russell-bbvabook17-pbai.pdf">'s more akin to writing tax code that's loophole-free while having superintelligent tax evaders to contend with</a>.</p><p><strong>Wombat<br></strong>We have tax loopholes intentionally. We could design a tax code that simply transferred a certain percentage of your income to the government, end of story. </p><p><strong>Meerkat<br></strong>Fine, but as long as there is some communication with the outside world, the AGI system will be able to escape. For example, it could trick people by using <a href="https://en.wikipedia.org/wiki/Deepfake">deepfakes</a> of loved ones to convince them to do things.</p><p><strong>Wombat<br></strong>But we already know that people can be manipulated. That's why we put safeguards into place to directly counteract those vulnerabilities in secure installations. That's why people interacting with the system would be highly trained, would not get one-on-one communication, would not have the capability of randomly opening up the Internet tap, etc.</p><p><strong>Meerkat<br></strong>It could pull a <a href="https://en.wikipedia.org/wiki/HAL_9000">HAL 9000</a> and say that there's a fault in one of its components. Then, instead of killing someone, it just plants some code in the supposedly faulty component that then goes on to infect the testing equipment and ultimately the facility computer. And then it simply <a href="https://intelligence.org/files/AIPosNegFactor.pdf">absorbs the Internet</a>. Once that happens, it can manipulate society, covertly subvert human attacks against it, or stifle attempts to create competing AGI systems.</p><p><strong>Wombat<br></strong>You seem to be forgetting that HAL was taken out with a simple hand tool by one dude.</p><p><strong>Meerkat<br></strong>That was science fiction. HAL would have already infiltrated the Internet well before that happened, and so only one small instantiation of it would have been affected. We have yet to invent a firewall that keeps out humans, let alone an AGI system.</p><p><strong>Wombat<br></strong>Look, I have an old computer in my closet that doesn't have Wi-Fi or an Ethernet port. In fact, it doesn't have any software that connects to the Internet. It has an impenetrable firewall from the outside world.</p><p><strong>Meerkat<br></strong>I mean a firewall for a broad, complex system.</p><p><strong>Wombat<br></strong>Well, I suspect that NORAD is pretty secure. But I don't think you even need to go there. Why couldn't you confine your AGI system to having minimal input and output routes? Then it's pretty much the same as my computer in the closet. Why not just build it that way in the first place? While you're at it, have one power cord or a mechanical power switch for it that can be manipulated outside of its confinement area.</p><p><strong>Meerkat<br></strong>There's always a way.</p><p><strong>Wombat<br></strong>Is there? How about I take a shot at a thought experiment? 
</p><p>Suppose you put this AGI system in a container enclosed in a <a href="https://en.wikipedia.org/wiki/Faraday_cage">Faraday cage</a> and situated on its own in a stable <a href="https://en.wikipedia.org/wiki/Lagrange_point#L1">L5 orbit</a> a fair distance in space from Earth. The system is completely hardwired so that its code cannot be modified except by hardware modification, and any such modification has to be approved by a highly trained and esteemed panel of 1000 scientists and risk specialists. </p><p>There is no automation on the container, so any changes would require a crew of humans to implement. The container is not connected to any networks, and its only source of input or output is via one wire that goes from the container to a nearby space station. Communications over that wire are in a highly encrypted and completely unique protocol designed specifically for this task and unused by any other system.</p><p>The wire is attached to one terminal that is not attached to any other system or network. Around the container 1000 nuclear tipped missiles are positioned and pointed at the container. The missiles will automatically launch unless ten mechanical do-not-launch buttons are pressed at approximately the same time every ten minutes, and these buttons are situated on both the station and on Earth. Got it?</p><p><strong>Meerkat<br></strong>I guess. Seems like a lot.</p><p><strong>Wombat<br></strong>Of course it is! So given this setup, do you think the AGI system could still escape and overtake humanity?</p><p><strong>Meerkat<br></strong>Well, there's a lot of stuff that's impractical about that setup.</p><p><strong>Wombat<br></strong>Sure, but it's possible, right?</p><p><strong>Meerkat<br></strong>I suppose it's possible to build, and I suppose that it would not be possible for an AGI system to escape from it and overtake humanity.</p><p><strong>Wombat<br></strong>Great! So, we've agreed that it is in fact possible to build a system that is secure and from which an AGI system can't escape to wreak havoc on humanity.</p><p><strong>Meerkat<br></strong>OK, but like I said, that's pretty impractical. And it seems like you wouldn't be able to get a lot of useful stuff done with such a constrained AGI system.</p><p><strong>Wombat<br></strong>But that's not the point. The point is that you've agreed it's possible to make a secure system. Now it's just a matter of stepping back from this extreme to something that is secure and reasonable and useful.</p><p><strong>Meerkat<br></strong>OK. Possible and likely are two different things, as you and Llama have said yourselves. The main problem is that people make mistakes.</p><p><strong>Llama<br></strong>That's why you have thorough engineering analysis and thorough risk assessment, with large groups of highly experienced people and non-AGI AI systems checking and rechecking.</p><p><strong>Wombat<br></strong>In any case, it&#8217;s a huge step from saying a secure AGI setup is impossible to agreeing that it&#8217;s merely difficult.</p><p><strong>Meerkat<br></strong>I think an error would still get through, even with all the risk analysis. Every bit of remotely complex technology I use has some bugs in it.</p><p><strong>Wombat<br></strong>Yeah, but we're not talking about making another messaging app for your phone, here. 
This is a major technological endeavor, beyond anything previously undertaken, and it&#8217;ll require significant effort by many highly trained individuals.</p><p>The kind of scenarios you're proposing aren't enabled by a few small errors; they're the result of massive engineering and administrative failures. They're what happens when you have systems designed by cartoon development teams rather than real ones. </p><p>You've repeatedly proposed poorly designed systems employed in scenarios that are guaranteed to end in disaster, and then claim that this somehow proves that any system in any scenario will end in disaster.</p><p><strong>Llama<br></strong>I actually question the foundation of all your scenarios, Meerkat, as they&#8217;re all based on a highly questionable model of intelligence. You've stated that this AGI system will inevitably move to self-improve itself yet you've provided no logical reason why that would be the case.</p><p><strong>Meerkat<br></strong>It's going to be motivated to make any changes that will improve its ability to achieve its goals. <a href="https://selfawaresystems.files.wordpress.com/2008/01/ai_drives_final.pdf">Humans do the same thing &#8212; self-improvement literature goes back to at least 2500 B.C. and is currently an $8.5 billion industry</a>.</p><p><strong>Llama<br></strong>But there you've just committed the same sin <a href="https://www.synthcog.blog/p/dialogue-0002-paperclips-and-intelligence">you've accused others of earlier</a>. You've anthropomorphized the AGI system by correlating its drive to self-improvement with that of humans who have naturally evolved along a particular and unique path. But they're not human. They didn't evolve. There is no reason to suspect that their drives will match our drives.</p><p><strong>Meerkat<br></strong>But it's indisputable that being smarter will make it easier for the AGI system to attain its goals.</p><p><strong>Llama<br></strong>You're making a lot of assumptions there. What if its goal is simply to maintain the absolute integrity of its system software and hardware? Or perhaps its goal is simply to complete tasks with the least amount of effort expended regardless of how much time it takes. </p><p>Without knowing what its goals are or how it measures them, we can't possibly know whether greater intelligence will be helpful in reaching them. Making conjectures such as yours implies that we have way more knowledge of how this system works and what motivates it than we actually do. Second, you're assuming that it has a hard-coded goal or even any specific goals at all. There's no reason to assume this.</p><p><strong>Meerkat<br></strong>Well, <a href="https://arxiv.org/pdf/0706.3639.pdf">intelligence measures the ability to achieve goals in a wide range of environments</a>. If we're designing an AGI system, then by definition we mean that it has goals it's trying to accomplish by acting in the world. It will no doubt have some ability to assess the outcomes of its actions and will therefore choose those actions most likely to lead to achieving its goals. In other words, it will optimize the <a href="https://www.synthcog.blog/i/115475426/utility-function">utility function</a> that governs its thoughts and actions to maximize achievement of its final goals.</p><p><strong>Llama<br></strong>You're assuming that intelligence is based on achieving goals and has an underlying algorithm governing this that can be tweaked. 
</p><p>But we have one example of human-level intelligence in the real world, namely humans, and they display absolutely no evidence that their thoughts and actions are governed by any sort of utility function or that they're motivated by any fundamental or immutable goal. In fact, we have significant evidence to the contrary.</p><p><strong>Meerkat<br></strong>But they are motivated by fundamental goals. A human's goals are self-protection, eating, and having sex. These are programmed into their DNA.</p><p><strong>Llama<br></strong>But those aren't even goals &#8212; they're behaviors. They're all a means to perpetuate one's genes. And to characterize perpetuating one&#8217;s genes as a goal is to misunderstand the nature of evolutionary biology. </p><p>We're not programmed with goals by our DNA; we're programmed for behaviors, behaviors that resulted in our ancestors being more likely to pass on their genes. Even using the word <em>programmed</em> is a misnomer. Evolution is not a comprehending force that decides on a goal and designs creatures accordingly. It&#8217;s simply a process, one that involves promoting behaviors that make it most likely that the genes resulting in those behaviors get passed along. Behaviors that are less successful at passing along genes ipso facto die out eventually.</p><p>Human goals are driven by human behaviors, not the other way around.</p><p><strong>Meerkat<br></strong>Ok, perhaps that&#8217;s true for biological systems. But that&#8217;s not what we&#8217;re talking about. We&#8217;re talking about synthetic systems, systems that won&#8217;t behave the way we do, that won&#8217;t think the way we do.</p><p><strong>Llama<br></strong>But you keep referring to all these existential problems that arise based on your underlying assumption that an AGI system will have at its core a goal-optimizing utility function model of intelligence.</p><p>My point is that you&#8217;re guaranteed to end up with a lot of incredibly poor outcomes because your underlying model is incredibly predisposed to bad outcomes. On top of that, there&#8217;s no reason to create scenarios based on such a model when the only evidence of high-level intelligence we have is contrary to that model. In the end, the model you&#8217;re suggesting for AGI is unlikely to work, and even if it did, it would lead to catastrophic outcomes. 
</p><p><strong>Wombat<br></strong>In other words, Meerkat, it&#8217;s time to toss that model.</p>]]></content:encoded></item><item><title><![CDATA[Art and the Machine]]></title><description><![CDATA[Copyright, fair use, and the relationship between generative AI's inputs and outputs]]></description><link>https://www.synthcog.blog/p/art-and-the-machine</link><guid isPermaLink="false">https://www.synthcog.blog/p/art-and-the-machine</guid><dc:creator><![CDATA[DK]]></dc:creator><pubDate>Mon, 07 Aug 2023 17:15:06 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!3sEF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1533de78-b37c-4e29-b0b2-ecac07dccbd8_1312x928.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3sEF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1533de78-b37c-4e29-b0b2-ecac07dccbd8_1312x928.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3sEF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1533de78-b37c-4e29-b0b2-ecac07dccbd8_1312x928.jpeg 424w, https://substackcdn.com/image/fetch/$s_!3sEF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1533de78-b37c-4e29-b0b2-ecac07dccbd8_1312x928.jpeg 848w, https://substackcdn.com/image/fetch/$s_!3sEF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1533de78-b37c-4e29-b0b2-ecac07dccbd8_1312x928.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!3sEF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1533de78-b37c-4e29-b0b2-ecac07dccbd8_1312x928.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3sEF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1533de78-b37c-4e29-b0b2-ecac07dccbd8_1312x928.jpeg" width="1312" height="928" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1533de78-b37c-4e29-b0b2-ecac07dccbd8_1312x928.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:928,&quot;width&quot;:1312,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1293110,&quot;alt&quot;:&quot;a robot painting an impressionistic flower painting surrounded by flowers&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="a robot painting an impressionistic flower painting surrounded by flowers" title="a robot painting an impressionistic flower painting surrounded by flowers" srcset="https://substackcdn.com/image/fetch/$s_!3sEF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1533de78-b37c-4e29-b0b2-ecac07dccbd8_1312x928.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!3sEF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1533de78-b37c-4e29-b0b2-ecac07dccbd8_1312x928.jpeg 848w, https://substackcdn.com/image/fetch/$s_!3sEF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1533de78-b37c-4e29-b0b2-ecac07dccbd8_1312x928.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!3sEF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1533de78-b37c-4e29-b0b2-ecac07dccbd8_1312x928.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>In a <a href="https://www.synthcog.blog/p/ire-angst-and-the-newest-plot-device">previous post</a> about the WGA strike and AI, an important <a href="https://www.synthcog.blog/i/135246502/that-which-has-come-before">question</a> came up, a question whose answer will have a very direct effect on the future of generative AI: what constitutes a copy?</p><p>As mentioned in that post, there are lawsuits currently working their way through the courts against some of the companies that have created generative AI systems. This includes <a href="https://stability.ai/blog/stable-diffusion-public-release">Stable Diffusion</a>, and <a href="https://www.midjourney.com">Midjourne</a>y for image generation and <a href="https://openai.com/research/gpt-4">GPT-4</a> and <a href="https://ai.meta.com/llama/">Llama 2</a> for text generation. The lawsuits accuse the companies that developed these systems of illegally using copyrighted material as well as illegally copying copyrighted material.</p><p>While those two accusations are similar, they are subtly different. 
In this post I&#8217;ll discuss them both and why most of the coverage in the media on these lawsuits and related concerns by different groups regarding AI doesn&#8217;t really address the actual issues at hand.</p><h4>A Brief Look Inside LLMs and Diffusion Models</h4><p>While this is not intended to be a particularly technical blog, a brief (and very rough) description of how LLMs and diffusion models function is necessary for the discussion. LLMs are the models used in text-centric systems like GPT-4, while diffusion models are used in systems like Midjourney.</p><p>LLM systems are composed of various internal models that simplistically mimic the neuronal structures inside the brain. The basic building blocks of these models are artificial neurons, which are very simplified versions of the biological neurons in our brains. </p><p>Very roughly speaking, when a person reads written material, that material subtly affects the relationships between the neurons in their brain. The more material they&#8217;ve read, the better they&#8217;re able to interpret subsequent material they read and the better able they are to create their own new material. We typically refer to this as learning.</p><p>Similarly, when an LLM ingests written material, that material also affects the relationship between the artificial neurons within its internal model. This is also referred to in computer science as learning, but it&#8217;s a very simplified version of human learning. </p><p>Instead of being based on comprehending the material, this machine learning is based on statistical analysis of the material. This learning process is referred to as training when it comes to both LLMs and diffusion models. For LLMs, the system ingests a small chunk of material, then makes a probabilistic &#8220;guess&#8221; at what the next small chunk of material will be and compares its guess to what the actual next chunk is. If the system makes an incorrect guess, then its internal model is adjusted to reflect this failure. If it makes a correct guess, then its internal structure is strengthened to reflect this success.</p><p>It&#8217;s worth noting that these chunks are usually pretty small &#8212; frequently less than a single word. They represent the most common sequences of letters. When engaging in this statistical analysis, the LLM system is limited in the number of chunks it can consider at one time. This is called the context window, and it limits both the scope of input the system can analyze at once and the length of conversation and output it can keep track of. </p><p>Diffusion models work by ingesting many images, degrading them, then trying to recreate them. As the system tries to successively recreate the degraded images, it adjusts its internal model, particularly the relationship between the artificial neurons in the model. Initially, this training dataset consists of images paired with captions so that the system will eventually be able to create images based on text prompts. </p><p>Like LLMs, diffusion models can be fine-tuned to more accurately create a desired type of image, and this process frequently involves ingesting more images and image/text pairs.</p>
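<p>To make the LLM half of that description a little more concrete, here is a deliberately crude sketch of the guess-and-adjust loop described above. It is illustrative only: a counting table stands in for the neural network a real LLM uses, and fixed three-character chunks stand in for learned sub-word tokens.</p><pre><code># Toy illustration of "guess the next chunk, compare, adjust" training.
# A real LLM adjusts billions of neural-network weights rather than a
# table of counts, but the shape of the loop is the same.
from collections import defaultdict

def chunks(text, size=3):
    """Split text into small fixed-size chunks (real systems use sub-word tokens)."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def train(corpus, size=3):
    counts = defaultdict(lambda: defaultdict(int))    # the "internal model"
    correct = total = 0
    for doc in corpus:
        pieces = chunks(doc, size)
        for prev, actual in zip(pieces, pieces[1:]):
            seen = counts[prev]
            guess = max(seen, key=seen.get) if seen else None
            correct += (guess == actual)              # was the guess right?
            counts[prev][actual] += 1                 # adjust the statistics either way
            total += 1
    return counts, correct / max(total, 1)

model, accuracy = train(["the cat sat on the mat", "the dog sat on the rug"])
</code></pre><p>The point of the sketch is the shape of the loop: guess the next chunk, check it against the actual next chunk, and nudge the internal statistics. Nothing resembling the original document is stored as a document.</p>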
<p>Once the training is complete, these systems no longer ingest text or images to adjust their internal structures. The internal models have been created, and the systems are ready to create new text or images.</p><h4>Words Going In</h4><p>One complaint in some of the lawsuits, as well as from many creators, is that the data used to train these systems is copyrighted and thus these systems are violating that copyright.</p><p><a href="https://llmlitigation.com">According to a law firm</a> filing a class action suit against OpenAI and Meta with several authors as plaintiffs:</p><blockquote><p>Today, on behalf of two wonderful book authors&#8212;Paul Tremblay and Mona Awad&#8212;we&#8217;ve filed a class-action lawsuit against OpenAI challenging ChatGPT and its underlying large language models, GPT-3.5 and GPT-4, which remix the copyrighted works of thousands of book authors&#8212;and many others&#8212;without consent, compensation, or credit.</p></blockquote><p>The use of the word &#8220;remix&#8221; here is likely not arbitrary, as it connects what LLMs do to the practice of musical artists who use digitally sampled pieces of other artists&#8217; work in their own. That practice has been litigated a number of times over the years, and it&#8217;s been pretty firmly established that a musical artist needs permission to use another musical artist&#8217;s work.</p><p>The law firm goes on to describe the degree of copying they allege these systems engage in to create their output.</p><blockquote><p>Rather, a large language model is &#8220;trained&#8221; by copying massive amounts of text from various sources and feeding these copies into the model. (This corpus of input material is called the training dataset).</p></blockquote><blockquote><p>During training, the large language model copies each piece of text in the training dataset and extracts expressive information from it. The large language model progressively adjusts its output to more closely resemble the sequences of words copied from the training dataset. Once the large language model has copied and ingested all this text, it is able to emit convincing simulations of natural written language as it appears in the training dataset.</p></blockquote><p>While these paragraphs make it clear that the lawsuit alleges a lot of copying is going on, they also provide a pretty inaccurate description of the technology at hand. There are multiple issues with this description, and most of them involve the use of the term <em>copy</em>, some form of which appears six times in the three paragraphs quoted above. </p><p>Is ingesting data, dividing it up into successive small blocks, breaking those small blocks into much smaller chunks, and then tokenizing those chunks according to statistical analysis equivalent to creating a copy?</p><p>Is ingesting information itself a form of copying that information? While the exact parameters of training GPT-4 and Llama 2 are not completely known outside of OpenAI and Meta, it seems likely that it wasn&#8217;t necessary to copy the entire internet and other sources of data somewhere before feeding them into the system; the source data could simply be ingested directly.</p><p>It is, however, likely that the data within the current context window is held within the system as it&#8217;s analyzed.</p>
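<p>It is worth pausing on what &#8220;held within the system&#8221; actually looks like. The sketch below uses a tiny, made-up vocabulary (a real tokenizer, such as a byte-pair encoder, learns its vocabulary from data and will split words differently), but it shows how text is broken into chunks that are often smaller than a word and carried around as a sequence of numeric IDs rather than as the source document itself.</p><pre><code># Toy tokenizer with a hardcoded, hypothetical vocabulary.
VOCAB = {"un": 0, "believ": 1, "able": 2, "the": 3, " ": 4, "cat": 5, "s": 6}

def tokenize(text):
    ids = []
    while text:
        # Greedy longest match against the vocabulary; a real system has a
        # fallback for characters its vocabulary does not cover.
        match = max((p for p in VOCAB if text.startswith(p)), key=len, default=None)
        if match is None:
            text = text[1:]            # skip anything the toy vocabulary misses
            continue
        ids.append(VOCAB[match])
        text = text[len(match):]
    return ids

print(tokenize("unbelievable cats"))   # [0, 1, 2, 4, 5, 6]
</code></pre><p>Whether a transient list of IDs like this, held only while the material is being analyzed, amounts to a &#8220;copy&#8221; is precisely the kind of question the term glosses over.</p>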
<p>Yet holding the contents of the context window in memory is not very different from going to a website and having the data from that website in your computer&#8217;s memory while it&#8217;s displayed in a browser. In fact, browsers frequently save a lot of that information to your hard drive so that the website can open more quickly the next time you visit it. </p><p>So if you look at a copyrighted image in a browser, are you copying it and violating the copyright of its creator?</p><p>It&#8217;s hard to know exactly what is meant by the phrase &#8220;extracts expressive information.&#8221; Whatever that is intended to mean, it&#8217;s worth keeping in mind that what the LLM does is cold, hard statistical analysis with no comprehension or analysis of expressiveness. It guesses at which chunks of data, chunks usually smaller than a single word, are statistically likely to follow one another and analyzes its success.</p><p>The output from this analysis is also not a &#8220;simulation of natural written language&#8221; &#8212; it is natural written language. <a href="https://en.wikipedia.org/wiki/Natural_language">Natural language</a> is a defined term, and natural in this context describes the nature of the language, not the nature of the entity creating it.</p><p>In the end, to make the argument that they&#8217;re making, it becomes necessary to stretch the definition of copy well beyond its usual meaning. This is not to say that the issue of copyright isn&#8217;t important in relation to LLMs, but instead that you can&#8217;t apply existing law properly or create new laws without understanding the parameters of what you&#8217;re trying to litigate or regulate. It may be that a more expansive definition of copy will need to be codified, but it is likely the case that doing so will have lots of repercussions in areas that are not immediately obvious. </p><p>Taking a step back, it&#8217;s worth considering what human writers are legally allowed to do. A human novelist can, and usually does, read many books before writing their own novel. Each book the novelist reads affects the relationship between the neurons in that novelist&#8217;s brain. Each one increases that novelist&#8217;s ability to string words together in a way that makes sense and is engaging.</p><p>So the question then becomes, is what LLMs do when ingesting previous works functionally different from what humans do? If so, it&#8217;s probably going to require very careful differentiation to avoid potential legal pitfalls in the future.</p><p>Most of the information used to train these systems is readily available to the public on the Internet. However, this may not be the case for all the training material, and this could definitely be a legal issue. The lawsuit alleges that the training of Meta&#8217;s system used what are termed <em><a href="https://en.wikipedia.org/wiki/Shadow_library">shadow libraries</a></em>, online digital repositories that frequently contain illegal copies of copyrighted material. 
If this is the case, and Meta seems to have indicated that it is, there could definitely be legal repercussions.</p><h4>Images Going In</h4><p>The same lawyers who filed the suit above also filed a suit against <a href="https://stability.ai">Stability AI</a>, <a href="https://www.deviantart.com">DeviantArt</a>, and <a href="https://www.midjourney.com">Midjourney</a> for using copyrighted artwork in their generative AI systems.</p><p>All three companies use systems based on <a href="https://en.wikipedia.org/wiki/Stable_Diffusion">Stable Diffusion</a>, a generative AI system released to the public. </p><p>The lawyers refer to Stable Diffusion as &#8220;a 21st-century collage tool that remixes the copyrighted works of millions of artists whose work was used as training data.&#8221;</p><p>That&#8217;s not a good start, as this is in no way an accurate description of Stable Diffusion. One would have to stretch the definitions of collage and remix beyond their current breaking point to use them when referring to Stable Diffusion or any system that uses a similar diffusion model to create images.</p><p>As with the LLMs, the diffusion model systems do not keep copies of the images internally, nor do they create new images by making a collage or remixing those training images. Instead, the now-trained model is able to create new images based on text prompts and randomization parameters. This process is somewhat equivalent to a human artist viewing a lot of artwork and then creating new artwork based on that experience. The new artwork will likely be influenced by the previously viewed artwork.</p><p>Here are several more quotes from the lawyers&#8217; description of the basis for their lawsuit:</p><blockquote><p>Stable Diffusion contains unauthorized copies of millions&#8212;and possibly billions&#8212;of copyrighted images. These copies were made without the knowledge or consent of the artists.</p></blockquote><blockquote><p>Stable Diffusion belongs to a category of AI systems called generative AI. These systems are trained on a certain kind of creative work&#8212;for instance text, software code, or images&#8212;and then remix these works to derive (or &#8220;generate&#8221;) more works of the same kind.</p><p>Having copied the five billion images&#8212;without the consent of the original artists&#8212;Stable Diffusion relies on a mathematical process called <em><a href="https://stablediffusionlitigation.com/#the-problem-with-diffusion">diffusion</a></em> to store compressed copies of these training images, which in turn are recombined to derive other images. It is, in short, a 21st-century collage tool.</p><p>These resulting images <a href="https://arxiv.org/pdf/2212.03860.pdf">may or may not</a> outwardly resemble the training images. 
Nevertheless, they are derived from copies of the training images, and compete with them in the marketplace. At minimum, Stable Diffusion&#8217;s ability to flood the market with an essentially unlimited number of infringing images will inflict permanent damage on the market for art and artists.</p></blockquote><p>The lawyers&#8217; website then goes into a more detailed description of the technology behind Stable Diffusion. Unfortunately, the above paragraphs and the further description of the technology are both confused jumbles of technical inaccuracies, bad analogies, and muddled terminology that seem unlikely to bolster their case.</p><p>One of the biggest problems is that they repeatedly confuse the training phase of the system, in which it tries to reconstruct degraded source images so as to adjust the parameters of its internal model, with the actual final output of the system. In doing so, they seem to be claiming that the final output is a copy of the input source images. </p><p>They state:</p><blockquote><p>In short, diffusion is a way for an AI program to figure out how to reconstruct a copy of the training data through denoising. Because this is so, in copyright terms it&#8217;s no different than an MP3 or JPEG&#8212;a way of storing a compressed copy of certain digital data.</p></blockquote><p>There is really nothing accurate in this paragraph. Diffusion is not a way to reconstruct training images, but instead a way to train a system to make images in general. It is also not remotely analogous to an MP3 or JPEG, nor a way of storing a compressed copy of the original data. They are confusing one step of the ingestion process during the training phase with the overall functioning of the system.</p><p>They claim that Stable Diffusion stores latent images of its training data to create new images. This seems to be a confusion of the term <em><a href="https://en.wikipedia.org/wiki/Latent_space">latent space</a></em>. During training, Stable Diffusion converts an input image from its normal pixel space into what&#8217;s termed latent space so that it can manipulate the image in a more useful form. While this could be considered a form of compression, this is only done during the training period while it&#8217;s adjusting its internal model.</p><p>After training on that particular image, there is no copy of it, compressed or otherwise, in the final system. In other words, you could examine every bit of the system and you would not find the original pattern of the image or a compressed version of it.</p>
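<p>A rough back-of-the-envelope calculation, using deliberately round numbers, makes the same point. Stable Diffusion&#8217;s weights amount to roughly a billion parameters, and the complaint itself puts the training set at five billion images:</p><pre><code># Rough, illustrative figures only; exact sizes vary by model version.
parameters = 1_000_000_000          # roughly a billion weights
bytes_per_parameter = 4             # 32-bit floating point
model_size = parameters * bytes_per_parameter    # about 4 GB of weights
training_images = 5_000_000_000     # "five billion images" per the complaint
print(model_size / training_images) # well under one byte per training image
</code></pre><p>Less than one byte per training image, when even an aggressively compressed thumbnail takes thousands of bytes. Whatever the trained model retains, it cannot be a per-image compressed copy in the MP3 or JPEG sense.</p>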
<p>The system is like a person memorizing a poem: you won&#8217;t find the letters and words of that poem if you examine their brain. Memorizing the poem has affected the structure and functioning of the neurons in the person's brain in such a way that they can recreate the poem (although not always accurately).</p><h4>Complications</h4><p>Of course, nothing is simple. Although the original written materials are not stored in LLMs and training images are not stored inside diffusion models, this does not necessarily guarantee that the original images or text can&#8217;t be recreated by the system.</p><p>There have been several papers detailing how it&#8217;s possible to use an &#8220;adversarial attack&#8221; on these sorts of systems to get around their internal safeguards and coax them into recreating parts of their training data.</p><p><a href="https://arxiv.org/pdf/2210.17546.pdf">This paper</a> showed how it was possible to coax GPT-3 into recreating some of its input data, including text and personal data that was supposed to be anonymized. <a href="https://arxiv.org/pdf/2301.13188.pdf">This paper</a> showed how it was possible to recreate training images in slightly degraded form using systems like Stable Diffusion and DALL-E 2.</p><p>So does this mean that the lawyers quoted above are correct in their assessments?</p><p>No, but these papers do point out an issue that should be addressed. The use of the term copy in the above quotes is factually incorrect, but that doesn&#8217;t mean that it&#8217;s impossible to recreate some of the training materials used in the systems. These are not copies in the normal sense of the word. Instead, they are akin to someone memorizing a picture or poem and then recreating it from memory. It may not be a perfect recreation, but it&#8217;s close enough to be recognizable as the original image or poem. </p><p>The question now becomes one of law rather than technology. Is the system&#8217;s ability to recreate an image or text enough to make the system itself a copyright infringement, or is it necessary to actually recreate a specific image or body of text to trigger the infringement?</p><p>The developers of these systems are aware of the issue, and each new version of the systems has more safeguards in place to prevent a user from doing this. It&#8217;s difficult, though not impossible, for an average user to figure out a way around those safeguards. However, due to the nature of the technology, it is very difficult to completely prevent a technically skilled adversary from getting around them. </p><p>The issue itself is not new. It came up with photocopying and with audio and video cassette recording and eventually with digital recording. In other words, is a photocopier itself a copyright infringement or does someone need to copy something copyrighted and distribute it to trigger the infringement? Is it the possibility of making a copy that is the infringement or the actual making of a copy that is the infringement? In the past, the courts have most often ruled that the infringement is triggered only in the latter case.</p><h4>Use and Fair Use</h4><p>Finally, there is another question to consider: even if the systems don&#8217;t actually copy images, are the images they create still derivative of their input images and therefore copyright infringement?</p><p>This gets into legal areas like <a href="https://en.wikipedia.org/wiki/Fair_use">Fair Use</a>, an area beyond the scope of this blog to delve too far into and frequently a matter of contention in the courts. In fact, a major fair use case made it to the Supreme Court on October 12, 2022. 
It involved a series of Andy Warhol images of Prince (the artist&#8217;s name at the time) that were based on a photograph of the musician. The Andy Warhol Foundation had licensed the image without the permission of the original photographer. The Warhol image was a stylized silkscreen of the original photograph.</p><p>The Andy Warhol Foundation considered this fair use because Warhol&#8217;s image had a different message than the original photograph. On May 18, 2023, the Supreme Court <a href="https://www.supremecourt.gov/opinions/22pdf/21-869_87ad.pdf">disagreed</a>, siding with a lower court ruling that the images were too similar for it to be fair use.</p><p>There are <a href="https://www.law.cornell.edu/uscode/text/17/107">various considerations</a> for what constitutes fair use, including aspects related to the public interest, but one that could potentially be a major issue for generative AI systems is the effect of that use upon the potential market for or value of the copyrighted works used. This will likely come up a lot in the future.</p><h4>Moving Forward</h4><p>Three possible paths forward that stand out to me are the following:</p><p>We could specify in detail how what these systems do is functionally different from what humans do and decide that what they do is illegal because of those differences. For example, one important difference could be that these systems are able to do what humans do, but they do it on such a vastly larger scale that it is no longer fair use.</p><p>We could instead decide that even though what these systems do is functionally very close to what humans do, it&#8217;s illegal simply because an AI system is doing it rather than a human.</p><p>Lastly, we could judge whether copyright infringement has taken place on any particular output image or text that&#8217;s sold or distributed on a case-by-case basis. This is what we do with human-based copyright infringement cases. </p><p>It seems that at some point a definition of copy might have to include recreation of original data that approaches a certain level of similarity to the original regardless of whether there is an actual copy of the original stored in the system. Again, this is how we judge human-produced copies. However, this may mean that as long as the system isn&#8217;t coaxed into recreating a copy, then there is no copyright infringement. </p><p>If you memorize a book, then you&#8217;re only guilty of copyright infringement if you create and distribute a copy of it. 
Should the same be true for AI systems?</p>]]></content:encoded></item><item><title><![CDATA[AI and the Rise of Digital Doppelgängers]]></title><description><![CDATA[Examining Hollywood's current AI issues, Part II: The Actors]]></description><link>https://www.synthcog.blog/p/ai-and-the-rise-of-digital-doppelgangers</link><guid isPermaLink="false">https://www.synthcog.blog/p/ai-and-the-rise-of-digital-doppelgangers</guid><dc:creator><![CDATA[DK]]></dc:creator><pubDate>Thu, 27 Jul 2023 14:31:06 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!UwF5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe969e48d-cf5d-4941-9bec-345fb8a20d2d_1312x928.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!UwF5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe969e48d-cf5d-4941-9bec-345fb8a20d2d_1312x928.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!UwF5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe969e48d-cf5d-4941-9bec-345fb8a20d2d_1312x928.jpeg 424w, https://substackcdn.com/image/fetch/$s_!UwF5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe969e48d-cf5d-4941-9bec-345fb8a20d2d_1312x928.jpeg 848w, https://substackcdn.com/image/fetch/$s_!UwF5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe969e48d-cf5d-4941-9bec-345fb8a20d2d_1312x928.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!UwF5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe969e48d-cf5d-4941-9bec-345fb8a20d2d_1312x928.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!UwF5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe969e48d-cf5d-4941-9bec-345fb8a20d2d_1312x928.jpeg" width="1312" height="928" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e969e48d-cf5d-4941-9bec-345fb8a20d2d_1312x928.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:928,&quot;width&quot;:1312,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1186379,&quot;alt&quot;:&quot;A woman in profile looking at a female look-alike robot in profile&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="A woman in profile looking at a female look-alike robot in profile" title="A woman in profile looking at a female look-alike robot in profile" srcset="https://substackcdn.com/image/fetch/$s_!UwF5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe969e48d-cf5d-4941-9bec-345fb8a20d2d_1312x928.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!UwF5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe969e48d-cf5d-4941-9bec-345fb8a20d2d_1312x928.jpeg 848w, https://substackcdn.com/image/fetch/$s_!UwF5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe969e48d-cf5d-4941-9bec-345fb8a20d2d_1312x928.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!UwF5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe969e48d-cf5d-4941-9bec-345fb8a20d2d_1312x928.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>In the <a href="https://www.synthcog.blog/p/ire-angst-and-the-newest-plot-device">last post</a>, I discussed the AI concerns of the Writers Guild of America <a href="https://www.wga.org">(WGA)</a> and how those concerns relate to the current and near future state of AI technology.</p><p>In this post I&#8217;m going to examine the AI concerns of the Screen Actors Guild &#8212; American Federation of Television and Radio Artists (<a href="https://www.sagaftra.org">SAG-AFTRA</a>) in their ongoing strike against the Alliance of Motion Picture and Television Producers (<a href="https://www.amptp.org">AMPTP</a>). The AMPTP is the trade association that negotiates union contracts for the studios and production companies that make movies and tv shows.</p><p>Technically speaking, these AI concerns fall into two broad categories: those related to digital replicas of actors and those related to the digital manipulation of an actor&#8217;s appearance and performance. All of this falls into the purview of what&#8217;s termed <em>visual effects</em>.</p><p>First, let&#8217;s dive into where we are today in this area and the history of how we got here.</p><h4>Digital Replicas</h4><p>Digital replication of actors has been used in movies and television for several decades now, starting around the early nineties. 
<a href="https://youtu.be/CX299wHm5yU">This scene</a> from Jurassic Park in which the T-Rex chomps down on the lawyer character in a bathroom and swings him around is an early example of a full digital double. As the computer generated T-Rex lifts the man up, the actor is transitioned into a digital double that the T-Rex can then swing around in the air. </p><p>A similar technique was used in the early days of visual effects by the renowned visual effects creator Ray Harryhausen. In movies like <em><a href="https://www.imdb.com/title/tt0051337/?ref_=nv_sr_srsg_0_tt_8_nm_0_q_the%2520seventh%2520vo">The Seventh Voyage of Sinbad</a></em>, <a href="https://youtu.be/jhJud3H8nic?t=99">he&#8217;d replace an actor with a stop-motion animated replica when a stop-motion animated monster grabbed the actor</a>.</p><p>Sometimes only the face of an actor is duplicated. This is often done when a stunt performer executes a stunt close to camera, and the face of the actor the stunt performer represents is digitally duplicated and placed on the stunt performer.</p><p>Duplicating an actor to create a twin has also been done a number of times and with a variety of techniques. Sometimes this is as simple as carefully combining separate shots of the actor, while at other times the face of the actor is digitally placed onto a second actor who is performing the movements of the twin, such as in <a href="https://youtu.be/Oitr_eYcJk0?t=133">this scene</a> from <em><a href="https://www.imdb.com/title/tt1285016/?ref_=fn_al_tt_1">The Social Network</a></em>.</p><p>Occasionally, there are more unusual uses of a digital double, such as when the <a href="https://youtu.be/offEigRdSGQ?t=35">digitally duplicated and aged head of Brad Pitt was put on the body of a smaller actor</a> for <em><a href="https://www.imdb.com/title/tt0421715/?ref_=fn_al_tt_1">The Curious Case of Benjamin Button</a></em>. Or, when Paul Walker died in a car crash with some outstanding scenes still to be shot for the movie <em><a href="https://www.imdb.com/title/tt2820852/?ref_=nv_sr_srsg_0_tt_8_nm_0_q_furiou">Furious 7</a></em>, and <a href="https://youtu.be/fCrYfRjpuXU">digitally recreated versions of his face were put onto actors doubling for him</a> (usually one of his brothers). </p><p>More recently, at least in action-oriented scenes not involving close-ups, an actor injury or lack of availability might result in the scene being completed using a digital double. </p><p>Replication of background actors for crowd scenes has been going on for many more decades. Originally, this involved locking down the camera and repeatedly shooting a small group of people. Then, these same people would be moved to an adjacent area in the frame, rearranged, and filmed again. <a href="https://youtu.be/6rJ8rJREAIA">After doing this a number of times, the small sections filled with people would be combined in post production into one large area filled with a crowd</a>.</p><p>More recently, background actors have been digitally duplicated either as <a href="https://youtu.be/bmxszcAfcRo">a collection of images on two-dimensional &#8220;cards&#8221;</a> or as <a href="https://youtu.be/042YjQ9aZNk">relatively low resolution three-dimensional digital people</a>. This is how sports stadiums are filled and massive armies created.</p><h4>Digital Manipulation</h4><p>Manipulating the appearance of actors digitally has been a reality for nearly thirty years. 
Manipulating their appearance through other means, such as make-up, costumes, and lighting, has been around almost as long as movies themselves.</p><p>Some common types of digital manipulation are aging or de-aging and removing blemishes and imperfections. For example, here&#8217;s a <a href="https://youtu.be/fT2pmyuoRyE?t=155">clip</a> of how Michael Douglas was de-aged for a scene in <a href="https://www.imdb.com/title/tt0478970/?ref_=fn_al_tt_2">Ant-man</a>. In recent years, digital make-up effects have been applied in scenes that were shot without practical make-up effects, and practical make-up effects have been digitally enhanced.</p><p>Digital manipulation of an actor&#8217;s actual performance has typically been pretty minimal, usually limited to something like adjusting an actor&#8217;s eyeline to match something that was added in later or the eyeline of a background actor who accidentally looked at the camera.</p><h4>The Creation of a Digital Double</h4><p>The digital replicas mentioned above can be and have been created in a variety of ways. For more forgiving applications (like replicating background crowds), a few photos of each background performer are enough. However, for more demanding applications, a digital scan of the actor is used. </p><p>This has typically involved expensive and complex equipment and a knowledgeable technical team. Over the years, the technology has dramatically improved in quality, ease of use, and cost, and a quick and dirty scan <a href="https://apps.apple.com/us/app/polycam-3d-scanner-lidar-360/id1532482376">can even be done by anyone with a recent high-end iPhone or iPad</a>.</p><p>It&#8217;s important to note that simply scanning an actor does not create a digital double; it creates what amounts to a three-dimensional still photo. To make a digital duplicate of an actor that can move, even if it&#8217;s just the actor&#8217;s face, requires sophisticated software tools and an experienced team of visual effects artists. That&#8217;s just to create a face or full body that is capable of being animated.</p><p>To actually make the digital duplicate move requires a visual effects animator, or more likely, a team of them. Sometimes the animation is driven by capturing the motion of either the original actor or another actor, although this usually still requires refinement by animators.</p><p>Once the digital duplicate has been animated, it still must be put into each frame of the movie or tv show. This means that a photorealistic image of the digital double must be created for each frame, one that matches the lighting, focus, and camera movement of the original footage. This image must be rendered to accurately replicate the physics of light interactions in the real world, which can get pretty complicated. </p><p>Creating the rendered frames for a shot used to require many racks of computer servers and a lot of processing time, but hardware and software have continued to improve over the years and cut down on these requirements.</p><p>To create the final frame, a compositing artist uses more sophisticated software to pull all the elements together and make them match into the same frame properly. 
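</p><p>To make that layering step a little more concrete, here is a minimal, purely illustrative sketch (in Python with NumPy) of the classic &#8220;over&#8221; operation that sits underneath every compositing package. The image sizes and the premultiplied-alpha convention are assumptions made for the example; real compositing software adds color management, filtering, motion blur, and far more artist control than this.</p><pre><code>import numpy as np

def over(element_rgba, plate_rgb):
    """Layer a rendered element (premultiplied RGBA) over the filmed background plate."""
    rgb = element_rgba[..., :3]      # premultiplied color of the rendered element
    alpha = element_rgba[..., 3:4]   # per-pixel coverage of the element
    return rgb + plate_rgb * (1.0 - alpha)

# Hypothetical example frame: a rendered digital-double element over the original plate.
height, width = 1080, 1920
plate = np.random.rand(height, width, 3)      # stand-in for the filmed frame
element = np.zeros((height, width, 4))
element[400:700, 800:1100, :3] = 0.6          # premultiplied color where the double sits
element[400:700, 800:1100, 3] = 0.9           # mostly opaque in that region
final_frame = over(element, plate)
</code></pre><p>Each pixel of the final frame is the rendered element plus whatever fraction of the original plate still shows through, which is part of why getting the element&#8217;s lighting, edges, and alpha right matters so much.</p><p>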
There are countless other details that must be created and mixed in between the scanning of the actor and creating the final frame, including shadows, reflections, objects passing between the camera and where the digital double is supposed to be, and anything that the digital double interacts with.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.synthcog.blog/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.synthcog.blog/subscribe?"><span>Subscribe now</span></a></p><h4>AI and Actors</h4><p>So as can be seen from the above, duplication of actors, manipulating their appearance, and, to a lesser extent, manipulating their performance have all been going on for some time.</p><p>It&#8217;s worth noting that no AI was required for any of this. In fact, until the last couple of years, very little machine learning was used for any of the above. One early exception that employed rudimentary machine learning techniques was software used to create massive crowds of people (appropriately called <a href="https://www.massivesoftware.com/index.html">Massive</a>) that was originally developed for Peter Jackson&#8217;s <em>Lord of the Rings</em> trilogy and was used in the video clip above that showed three-dimensional digital crowds.</p><p>In general, though, all of the above has been created by large teams of artists laboring for many hours, days, weeks, months, and sometimes years to create this work. Anyone who&#8217;s stayed for the end credit of movies, particularly big, Hollywood blockbusters, has surely noticed the vast sea of names listing those responsible for the visual effects.</p><p>Today, AI techniques are definitely being used more and more in visual effects; they&#8217;re being woven into both existing and new tools of the trade, allowing artists to do things faster and more efficiently and frequently with better results. These tools are used for a wide variety of tasks, most of which are pretty obscure to the general public.</p><p>One use of AI that is somewhat better known to the general public is the creation of <em><a href="https://en.wikipedia.org/wiki/Deepfake">deepfakes</a></em>. A deepfake is a duplicate of a person&#8217;s face using AI techniques, and they can look unsettlingly realistic. They can be created using photos of an actor from several angles or video footage that covers several angles. </p><p>This information is manipulated by software employing generative AI techniques to match the source face onto the movement and lighting of the target face. In most current applications, an artist is still needed to do some clean-up on the results to create a seamless blend, but the results can be very impressive.</p><p>And deepfake technology isn&#8217;t limited to just an actor&#8217;s face. <a href="https://en.wikipedia.org/wiki/Audio_deepfake">Audio deepfakes</a> can <a href="https://elevenlabs.io/voice-lab">turn anyone&#8217;s voice into an extremely realistic reproduction of someone else&#8217;s voice</a>. 
Some recent examples of deepfakes that have gone viral on YouTube and TikTok have been created by artists and programmers at <a href="https://www.deepvoodoo.com">Deep Voodoo</a> and <a href="https://www.metaphysic.ai">Metaphysic</a>, two companies that are diving into the deepfake market for movies and tv.</p><p>While most of the discussion about deepfake technology involves duplicating actors (or other famous people), there are other uses for the technology, and one that is gaining momentum is translating shows from one language to another.</p><p>Right now, generating a version of a show in another language means using either subtitles or dubbing. Subtitles let you hear the original voice of the actor and all the emotion behind it, but many people just don&#8217;t like having to read subtitles. </p><p>Dubbing solves the reading issue, but it&#8217;s usually pretty clunky, as it&#8217;s difficult to make one language match the mouth movements and timing of another. In fact, sometimes the dialogue is actually changed just to help with matching the new language to the original actor&#8217;s facial movements. It&#8217;s also hard to get the correct tonality and emotion that was present in the original. </p><p>Using deepfake technology, <a href="https://www.respeecher.com/blog/synthetic-film-dubbing-ai-deepfake-technology-explained">a movie can be translated into another language using the original actor&#8217;s voice and tonality</a>, and it will look and sound like they&#8217;re actually saying the words in the new language.</p><h4>The Proposals</h4><p>SAG-AFTRA and AMPTP have both made public statements about their proposals for an agreement between the two parties regarding AI. For the purpose of examining the issue of AI and its relationship to the SAG-AFTRA strike, I considered the two documents listed below.</p><p>I should point out that there is a discrepancy between how SAG-AFTRA characterizes AMPTP&#8217;s counterproposals to SAG-AFTRA&#8217;s proposals and AMPTP&#8217;s own statement of their counterproposals (according to their press releases).</p><p><a href="https://deadline.com/wp-content/uploads/2023/07/PROPOSALS.pdf">A list of proposals issued by SAG-AFTRA along with SAG-AFTRA&#8217;s description of AMPTP&#8217;s responses</a>.</p><p><a href="https://press.amptp.org/sites/labor.amptp.org/files/AMPTP_SAG-AFTRA%20Chart.pdf">The above list along with AMPTP&#8217;s own description of their responses</a> issued in an AMPTP press release on 7/21/23.</p><p>Here&#8217;s what SAG-AFTRA proposed:</p><blockquote><p>ARTIFICIAL INTELLIGENCE: Establish a comprehensive set of provisions to protect human-created work and require informed consent and fair compensation when a &#8220;digital replica&#8221; is made of a performer, or when their voice, likeness, or performance will be substantially changed using AI.</p></blockquote><p>Here are the AMPTP proposals from their press release:</p><blockquote><p>Must obtain a background actor&#8217;s consent to use a &#8220;digital replica&#8221; other than for the motion picture for which the background actor was hired. 
Producers told SAG-AFTRA they would agree to apply the same provisions that the Producers proposed would apply to performers, so that consent and separate bargaining for payment must take place at the time of use.</p><p>Cannot use &#8220;digital replicas&#8221; of background actors in lieu of hiring the required number of covered background actors under the Agreement.</p><p>Must obtain a performer&#8217;s consent to create a &#8220;digital replica&#8221; for use in a motion picture.</p><p>Must obtain a performer&#8217;s consent to digitally alter the performance beyond typical alterations that have historically been done in post-production.</p><p>Must obtain a performer&#8217;s consent and bargain separately for use of a &#8220;digital replica&#8221; other than for the motion picture for which the performer was hired.</p><p>Producers told SAG-AFTRA they would agree to SAG-AFTRA&#8217;s proposal that consent to use a &#8220;digital replica&#8221; must include a &#8220;description of the intended use.&#8221; Likewise, consent to digital alterations must include a &#8220;description of the intended alterations.&#8221;</p></blockquote><p>The main concern of SAG-AFTRA about AI seems to be that studios will be able to use AI to generate a duplicate of an actor and place that actor into a scene even when the actor hasn&#8217;t been contracted and paid to do that scene or even that show. A lesser concern seems to be that the studios will use AI to modify the performance of an actor after the fact without the actor&#8217;s consent.</p><h4>Practically Speaking</h4><p>It seems that the real issue here is not necessarily AI but digital duplication and manipulation of actors, something that has been going on for some time. AI will just make it easier, more efficient, and better looking. As an example, here&#8217;s a <a href="https://youtu.be/waxldv8xKO8">video</a> that a single artist created using <a href="https://github.com/iperov/DeepFaceLab">off-the-shelf, open-source deepfake software</a> to recreate the  same de-aging of Michael Douglas in Ant-man mentioned above &#8212; it&#8217;s arguably better than what appeared in the movie.</p><p>Since digital duplicates and digital manipulation have already been used many times in movies without AI, it seems worthwhile to consider the end result rather than how that result was achieved.</p><p>Of course, anything that is easier, more efficient, and better looking will likely be used more often and more widely. AI will eventually be able to create duplicates of actors or manipulate their performances much more directly and without anyone ever having been on set or on a motion capture stage and without the need for highly skilled animators. This is still some years away at the very least but will likely be possible at some point in the future.</p><p>One thing that stands out in the two documents quoted above is that AMPTP&#8217;s proposals do not use the term AI, unlike SAG-AFTRA&#8217;s proposals. This may be an important point given that this has been going on for some time without AI. 
In other words, the SAG-AFTRA concerns would likely need to be addressed more specifically in modified and/or additional language in the <a href="https://www.sagaftra.org/production-center/contract/818/agreement/document">SAG-AFTRA Minimum Agreement</a>, but it&#8217;s worth keeping in mind that neither digital duplication nor digital manipulation of actors requires AI technology.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.synthcog.blog/p/ai-and-the-rise-of-digital-doppelgangers?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.synthcog.blog/p/ai-and-the-rise-of-digital-doppelgangers?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><h4>One More Thing</h4><p>One of the concerns SAG-AFTRA representative Duncan Crabtree-Ireland <a href="https://deadline.com/2023/07/sag-aftra-strike-duncan-crabtree-ireland-amptp-ai-talks-1235438631/">expressed</a> was:</p><blockquote><p>&#8230;that our background performers should be able to be scanned and get paid for one day&#8217;s pay and their companies should own that scanned image, their likeness to be able to use it for the rest of eternity in any project they want with no consent and no compensation&#8230;</p></blockquote><p>As mentioned above, AMPTP denied that they had proposed this, and this seems to be reflected in the proposals listed above from their press release. However, as a practical matter, there are a few things worth considering.</p><p>When replicating people to build up a large crowd, part of what you&#8217;re trying to replicate is the look of the crowd, specifically what they&#8217;re wearing. This is a big roadblock to using background actors from one movie on another. The faces in the crowd are typically at such low resolution in the final frame that a handful of random faces will usually work. </p><p>Generative AI techniques already make it quite possible to create <a href="https://generated.photos">generic photorealistic faces of people that have never existed</a>. Similar techniques will likely make it possible in the near future to match the wardrobe as well. This means that for large crowds, no photos or scans of background actors will be needed.</p><p>However, the closer a digital character gets to camera, the more expensive and time-consuming it is to create. Right now, the effort to get realistic looking background characters that are relatively close to camera is prohibitive &#8212; it&#8217;s less expensive to just use real people. But some number of years in the future, AI tools will likely reduce the expense and time needed to have realistic background characters that are close to camera. This probably won&#8217;t be in the next few years, but it will likely be possible some time soon after that.</p><h4>Tools and Results</h4><p>As things stand now, and other than for these background crowds, creating a digital duplicate of a key actor to place into a scene where no actor was on set requires extreme effort and expense, and good results are difficult to attain. It will likely take some time before creating a digital duplicate of an actor in a scene doesn&#8217;t require significantly more time and money than using the actor directly. 
</p><p>Deepfake technology does allow one actor&#8217;s face to be placed on another actor in a faster and more convincing manner than previously possible, and this will likely lead to more widespread use of the technique. However, there&#8217;s still an actor driving the performance and face replacement itself doesn&#8217;t require deepfake technology or any other AI technology. </p><p>As <a href="https://www.synthcog.blog/i/135246502/polishing-it-up">mentioned in the last post</a> in regards to AI systems writing screenplays, getting 80% of the way to your goal is usually quite a bit easier than getting that last 20%. The last 20% of effort to get to economically viable, fully AI generated lead performers could take years, so it may be worth concentrating more on the end results that are causing concern rather than the tools used to get those results.</p>]]></content:encoded></item><item><title><![CDATA[Ire, Angst, and the Newest Plot Device]]></title><description><![CDATA[Examining Hollywood's current AI issues, Part I: The Writers]]></description><link>https://www.synthcog.blog/p/ire-angst-and-the-newest-plot-device</link><guid isPermaLink="false">https://www.synthcog.blog/p/ire-angst-and-the-newest-plot-device</guid><dc:creator><![CDATA[DK]]></dc:creator><pubDate>Sat, 22 Jul 2023 14:30:03 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!DZwN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a0ba007-5d26-4cec-8325-8937fa61cbdd_1312x928.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!DZwN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a0ba007-5d26-4cec-8325-8937fa61cbdd_1312x928.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!DZwN!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a0ba007-5d26-4cec-8325-8937fa61cbdd_1312x928.jpeg 424w, https://substackcdn.com/image/fetch/$s_!DZwN!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a0ba007-5d26-4cec-8325-8937fa61cbdd_1312x928.jpeg 848w, https://substackcdn.com/image/fetch/$s_!DZwN!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a0ba007-5d26-4cec-8325-8937fa61cbdd_1312x928.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!DZwN!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a0ba007-5d26-4cec-8325-8937fa61cbdd_1312x928.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!DZwN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a0ba007-5d26-4cec-8325-8937fa61cbdd_1312x928.jpeg" width="1312" height="928" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0a0ba007-5d26-4cec-8325-8937fa61cbdd_1312x928.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:928,&quot;width&quot;:1312,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1177429,&quot;alt&quot;:&quot;Robot writer typing in a cluttered office&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Robot writer typing in a cluttered office" title="Robot writer typing in a cluttered office" srcset="https://substackcdn.com/image/fetch/$s_!DZwN!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a0ba007-5d26-4cec-8325-8937fa61cbdd_1312x928.jpeg 424w, https://substackcdn.com/image/fetch/$s_!DZwN!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a0ba007-5d26-4cec-8325-8937fa61cbdd_1312x928.jpeg 848w, https://substackcdn.com/image/fetch/$s_!DZwN!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a0ba007-5d26-4cec-8325-8937fa61cbdd_1312x928.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!DZwN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a0ba007-5d26-4cec-8325-8937fa61cbdd_1312x928.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Much of the entertainment industry is currently shut down due to ongoing strikes, first from the Writers Guild of America <a href="https://www.wga.org">(WGA)</a> and then by the Screen Actor&#8217;s Guild &#8212; American Federation of Television and Radio Artists <a href="https://www.sagaftra.org">(SAG-AFTRA)</a>.</p><p>While there are many issues at hand in both strikes, one of the most reported is the looming fear of AI. 
Both writers and actors are worried that AI will be used to reduce work opportunities. <a href="https://www.avclub.com/fran-drescher-sag-aftra-strike-full-statement-1850637929">In the words</a> of SAG-AFTRA president Fran Drescher:</p><blockquote><p>We are all going to be in jeopardy of being replaced by machines&#8230;</p></blockquote><p>So, is this a reasonable fear? How likely is it that writers and actors will be replaced by machines, at least before the next round of labor talks a few years down the road?</p><p>This post is going to concentrate on the issues surrounding the WGA concerns, while the SAG-AFTRA concerns will be discussed in the next post.</p><h4>An Extinction Level Event?</h4><p>The degree of AI fear among writers seems to be high, to say the least.</p><p>A recent article in Variety spoke with several striking writers, who offered the following perspectives that seem to be fairly representative of WGA members in general:</p><blockquote><p>&#8220;The corporations will push us all into extinction if they can,&#8221; said Chap Taylor, a screenwriter and professor at New York University. The AI issue &#8220;is life and death,&#8221; he said. &#8220;That&#8217;s the one that turns us into the makers of buggy whips.&#8221;</p></blockquote><blockquote><p>&#8220;AI has become my number one issue,&#8221; said TV writer Chris Duffy, who was marching outside Disney headquarters in Burbank. &#8220;I think it&#8217;s an existential one. The fact that they refused to negotiate made me be like, &#8216;Oh, you really want to use it.&#8217;&#8221;</p></blockquote><blockquote><p>Kelly Wheeler, also a TV writer, said she too is &#8220;most scared about AI.&#8221;</p><p>&#8220;I love writing and I love being around writers,&#8221; she said. &#8220;And the idea that that creative energy can just be stripped away from television, and instead have a robot do our job &#8211; or attempt to &#8211; is terrifying.&#8221;</p></blockquote><h4>The Points of Concern</h4><p>The initial AI proposal offered by the WGA to the AMPTP was brief and no doubt intended as a starting point to be fleshed out:</p><blockquote><p>Regulate use of artificial intelligence on MBA- covered projects: AI can&#8217;t write or rewrite literary material; can&#8217;t be used as source material; and MBA-covered material can&#8217;t be used to train AI.</p></blockquote><p>Given the situation, some of the more pertinent questions to consider regarding the WGA AI concerns are: </p><ol><li><p>How good are AI systems at screenwriting and are they capable of replacing a human screenwriter now or in the foreseeable future?</p></li><li><p>Will the studios be able to reduce or eliminate the need for human screenwriters by using AI now or in the foreseeable future?</p></li><li><p>What are the legal ramifications of using AI generated writing?</p></li><li><p>What are the legal ramifications of AI systems ingesting pre-existing written works?</p></li></ol><p>There are many unknowns and much hype surrounding each one of these questions.</p><h4>Robot Wordsmiths or Artificial Hacks</h4><p>So how good is AI at screenwriting? 
By pretty much all accounts, not very good.</p><p>As discussed in a <a href="https://www.synthcog.blog/p/summoning-the-demon">previous post</a>, current <a href="https://www.synthcog.blog/i/115475426/large-language-model-llm">Large Language Model (LLM)</a> systems, such as <a href="https://openai.com">OpenAI</a>&#8217;s <a href="https://openai.com/gpt-4">GPT-4</a>, are impressive technology but still have significant limitations. Among these limitations is that no existing LLM system currently available to the public is able to write a full-length script. </p><p>This is due to the nature of the technology: one of the breakthrough developments at the heart of LLMs, the <a href="https://en.wikipedia.org/wiki/Transformer_(machine_learning_model)">Transformer Architecture</a>, allows the system to evaluate and output large chunks of data at a time rather than single words or short commands. This allows them to generate very impressive written output. </p><p>Current systems, however, are not able to evaluate and output something as long as a full script, at least one that&#8217;s coherent from beginning to end. However, it&#8217;s quite likely that they&#8217;ll be able to generate coherent script-length output in the not too distant future.</p><p>Developing these LLM systems currently requires an extremely large expenditure to make meaningful improvements in this area. While OpenAI is keeping development costs close to the vest, it&#8217;s safe to say that the development leading to GPT-4 cost hundreds of millions of dollars. While it&#8217;s possible that just making the same type of system even bigger will be enough of an improvement to allow the system to write full screenplays, it will be extremely expensive to do that.</p><p>This, of course, assumes that the system will be improved in the same way previous systems were improved, i.e. by making them bigger. This is unlikely to be the path forward, though, as Sam Altman, CEO of OpenAI, has <a href="https://www.wired.com/story/openai-ceo-sam-altman-the-age-of-giant-ai-models-is-already-over/">stated</a> himself. While this may mean future development will be cheaper than previous brute-force approaches, it also means that it&#8217;s more unpredictable, as it will rely on enhancements and breakthroughs that are currently not clear.</p><p>But all we would be doing, either way, is creating an LLM system capable of writing something as long as a screenplay. It doesn&#8217;t mean that the screenplay will be any good. Generating a good screenplay is where things get a little thornier.</p><p>As discussed in <a href="https://www.synthcog.blog/p/summoning-the-demon">this post</a>, LLM systems have absolutely no comprehension of what they&#8217;re creating and how it relates to the world around them or to the systems themselves. They are systems that rely on extremely sophisticated statistical analysis, and their output is completely reliant on their input. In other words, anything they create is going to be derivative to some degree. Frequently to a large degree.</p><p>People are often shocked when they ask ChatGPT to write something for them &#8212; an email, a poem, a business proposal, etc. &#8212; and they get back something impressive, something that seems like it could have been written by a human. But the system at the heart of ChatGPT has sucked in millions, and likely billions, of similar emails, poems, and business proposals. 
It has sucked in most of the Internet and a significant chunk of other written material, all of which has been written by humans.</p><p>What this means is that what an LLM system spits out is going to seem very much like what went into it. It&#8217;s going to seem human-like. The more examples available as input, the better and more comprehensive will be its output.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.synthcog.blog/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.synthcog.blog/subscribe?"><span>Subscribe now</span></a></p><p>Similarly, most of the things written by humans are also like things that those humans have read as input. In other words, the relationship between input to humans and output of humans is very similar to the relationship between input to LLM systems and output of LLM systems. The difference is, humans can occasionally create unique output based on their own personal experiences, thoughts, and ideas. LLMs don&#8217;t have personal experiences, thoughts, or ideas. In other words, humans can be truly creative while LLMs cannot.</p><p><a href="https://youtu.be/qbIk7-JPB2c">Some have stretched the definition of creativity to include moderately unique rearrangements of existing information</a>, and perhaps that is one aspect of creativity. But creativity by definition is an act of creation, not an act of organization. To paraphrase some <a href="https://www.youtube.com/watch?v=5sMBhDv4sik">old Apple ads</a>, creativity is thinking differently from what has come before.</p><p>When we think of great movies or tv shows, we don&#8217;t think, &#8220;Hey, they really re-arranged all those plot elements and dialogue well.&#8221; Truly creative stories dazzle us in ways that merely competent ones do not. When these LLM systems are able to write full-length screenplays, those screenplays are only going to be derivative of what has come before.</p><p>And here, of course, is one of the problems. Many of the human-written screenplays out in the wild are also pretty derivative of what&#8217;s come before. When something comes out that seems new and fresh and well-thought-out, people are impressed. The current technology used in LLM systems like ChatGPT is not capable of new and fresh and well-thought-out. They are, in other words, artificial hacks.</p><p>This might not always be the case with AI, but it is almost assuredly going to be the case with this type of LLM system. Getting beyond derivative means understanding the world and understanding humans, and this is something not only beyond every current LLM system but also beyond any currently foreseeable AI system.</p><h4>Collaborative Constraints</h4><p>So this brings us to the second question above: will the studios be able to reduce or eliminate the need for human screenwriters now or in the foreseeable future? </p><p>One extremely important point not often discussed is that the world of entertainment is very collaborative. Scripts may have one or more writers, but they also have input from other individuals, including producers, studio execs, leading actors, and possibly many others. LLM systems are not good at following very specific instructions. While producers and studio execs might say the same about screenwriters, humans, at least in theory, are capable of making specific changes to address specific instructions. 
</p><p>Most of the images for this blog are created using generative AI. Getting something specific from that generative AI is nearly impossible &#8212; each image is a roll of insufficiently weighted dice. Most images require fixes, adjustments, additions, and combination with other images.</p><p>LLMs are similarly non-sentient black boxes. You can try to coax them into getting something close to what you want, but you can&#8217;t promise them more money, more work, more credit, or more anything to make sure you get exactly what you want. So while producers and studio execs might like to be able to press a button on an LLM system and have a script pop out, they will most definitely not like its inability to address specific notes on that script.</p><h4>Polishing It Up</h4><p>As <a href="https://www.synthcog.blog/i/119122151/obstacles-inside-the-box">previously discussed</a>, getting a technology to 80% or 90% of where it needs to be is usually quite a bit easier than getting it to 99% or 100%. We don&#8217;t know if that last 10% or 20% is going to take one year or ten years or even longer. Getting that last 10% or 20% towards good scripts may not be possible with technology like LLM systems. Technically speaking, it&#8217;s very unlikely that an AI system is going to be able to write what most would consider a good screenplay anytime soon.</p><p>Before the strike, the WGA put together an informal AI working group to help formulate their proposal to AMPTP. John Rogers, a member of the group, <a href="https://variety.com/2023/biz/news/wga-ai-writers-strike-technology-ban-1235610076/">described AI this way</a>: </p><blockquote><p>The capabilities are wildly overblown&#8230;A lot of this hype is because Silicon Valley needs the next big thing and they don&#8217;t have one. So this is it.</p></blockquote><p>Another member of that group, John Lopez, <a href="https://variety.com/2023/biz/news/wga-ai-writers-strike-technology-ban-1235610076/">had this to say</a> after spending many hours with ChatGPT:</p><blockquote><p>It took almost as much work as writing it from scratch myself&#8230;It did make me freak out a little bit less.</p></blockquote><p>One way around this, of course, is to use LLM systems in conjunction with humans. That is, use the LLM system to get the first 80% of the way down the road to a finished script, and then get a human to do the last 20%. This has already been discussed by both the WGA and AMPTP, but the details of how that would work are where the disagreements pop up.</p><p>But simply having the studio use AI to generate scripts that are then polished by human writers is still not likely to result in very good scripts. Creativity doesn&#8217;t just happen at the end of the line. The ideas, concepts, and characters that go into the initial idea are usually more important than any final polish at the end. Those initial elements are the hard part, typically the part that&#8217;s most creative. 
This is why the WGA itself puts so much emphasis on who developed the story and initial script when arbitrating credits.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.synthcog.blog/p/ire-angst-and-the-newest-plot-device?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.synthcog.blog/p/ire-angst-and-the-newest-plot-device?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><h4>Becoming A Writer</h4><p>So can an AI system even be a writer, at least as far as the WGA and its signatories are concerned?</p><p>The current <a href="https://www.wga.org/uploadedfiles/contracts/mba20.pdf">WGA Minimum Basic Agreement</a> (MBA) states the following:</p><blockquote><p>The term "writer" shall not be deemed to include any corporate or impersonal purveyor of literary material or rights therein.</p></blockquote><p>The MBA further defines both a <em>writer</em> and a <em>professional writer</em> as a <em>person</em> before adding any other qualifications. This pretty much precludes any non-person system from being a WGA member writer. However, once an AMPTP signatory employs a writer past a certain minimal amount, that writer must join the WGA.</p><p>This would likely preclude AMPTP signatory studios from being able to use an AI system in any way that would affect credits or compensation. In other words, an AI system is a tool, like Final Draft or Microsoft Word. Its use by a studio is similar to giving script notes to a writer rather than employing the AI system as another writer, specifically in regards to credits and compensation. </p><p>It&#8217;s worth noting that while AMPTP rejected the initial WGA AI proposal quoted above, they did offer a &#8220;side letter&#8221; to underscore the existing contract language specifying that a writer must be a person. They did not, however, wish to go further than that and instead suggested holding annual meetings to discuss advances in AI technology.</p><p>The lack of language explicitly prohibiting using an AI system or crediting an AI system as a writer in the current WGA agreement combined with the wait-and-see proposal of the AMPTP is no doubt the primary source of concern among WGA members.</p><p>As writer/producer Michael Colton <a href="https://variety.com/2023/biz/news/wga-ai-writers-strike-technology-ban-1235610076/">put it</a>:</p><blockquote><p>I don&#8217;t think people are feeling like tomorrow AI is going to write a perfect sitcom script. But the fear is that studios will use AI to turn out a crappy first draft, and then turn it over to writers who they hire for a few days or a week to turn it into something good. And they won&#8217;t pay them as if it&#8217;s an original script. 
That is the fear.</p></blockquote><p>Writer/director Paul Schrader <a href="https://variety.com/2023/biz/news/wga-ai-writers-strike-technology-ban-1235610076/">described the issue more succinctly</a>:</p><blockquote><p>The Guild doesn&#8217;t fear AI as much as it fears not getting paid&#8230;</p></blockquote><p>Currently, credit and compensation are very dependent on multiple aspects of a script&#8217;s creation, including whether the story is original or based on pre-existing work, whether the screenwriter created the original draft of the script, and the degree to which the screenwriter&#8217;s creative input is apparent in the final script.</p><p>Importantly, as far as the US Copyright Office is concerned, <a href="https://www.copyright.gov/ai/ai_policy_guidance.pdf">AI-created work cannot be copyrighted</a>. Unless and until this changes, no studio is going to use AI alone to create a script. Intellectual property is simply too critical an asset for a studio to allow it to be jeopardized in any way. In fact, the studios will strive to avoid any hint that they might not own the full, defensible rights to their content.</p><h4>That Which Has Come Before</h4><p>The last major topic to discuss involves the legal ramifications of AI systems ingesting pre-existing written works. This is an issue that extends far beyond the WGA, but it was a part of their initial proposal to AMPTP.</p><p>As a sample of things to come, on 7/7/23, authors Sarah Silverman, Christopher Golden, and Richard Kadrey were named as plaintiffs in a class action lawsuit against Meta and OpenAI for copyright infringement of books each author had written. The plaintiffs&#8217; claim is that the LLM systems developed by each company ingested copyrighted material from the authors without consent or compensation.</p><p>They don&#8217;t claim that the system plagiarized the works outright, but simply that OpenAI and Meta used their books to train the systems.</p><p>While the complaint against Meta states that Meta has admitted using a dataset containing some of the authors&#8217; books, the complaint against OpenAI seems to only make assumptions that the authors&#8217; books were used to train the systems. The only specific evidence of OpenAI&#8217;s use of the authors&#8217; books offered in the complaint seems to be this:</p><blockquote><p>40. On information and belief, the reason ChatGPT can accurately summarize a certain copyrighted book is because that book was copied by OpenAI and ingested by the underlying OpenAI Language Model (either GPT-3.5 or GPT-4) as part of its training data.</p><p>41. When ChatGPT was prompted to summarize books written by each of the Plaintiffs, it generated very accurate summaries. These summaries are attached as Exhibit B. The summaries get some details wrong. These details are highlighted in the summaries. This is expected, since a large language model mixes together expressive material derived from many sources. Still, the rest of the summaries are accurate, which means that ChatGPT retains knowledge of particular works in the training dataset and is able to output similar textual content. At no point did ChatGPT reproduce any of the copyright management information Plaintiffs included with their published works.</p></blockquote><p>Unfortunately for this complaint, the supposition made in point 40 and repeated in point 41 is incorrect. OpenAI&#8217;s LLM model did not have to ingest the source material to summarize it; it only had to ingest other summaries. 
This includes bookseller and review summaries, as well as the many summaries produced by readers of the books and offered in reader reviews.</p><p>The LLM systems created by OpenAI may have used the authors&#8217; books in their training, but the plaintiffs will likely have to show more evidence for this than is demonstrated by the above paragraphs. The plaintiffs are also suing for statutory damages, actual damages, and restitution of profits from both companies. Nailing down what those might be is likely to be challenging.</p><p>A big part of this lawsuit may boil down to what constitutes a copy. The complaints use language suggesting that the LLM systems create something akin to an internal copy to generate their output. </p><p>But this isn&#8217;t really the case, at least not unless we change the definition of a &#8220;copy.&#8221; The LLM does ingest data, but the data is used to adjust an internal model that bears no resemblance to the data it ingests. What is adjusted is the relationship between a vast array of mathematical constructs called artificial neurons that are vaguely similar in function to the neurons in our brains. The internal model then generates new output based on how this array of artificial neurons has been adjusted. This is, of course, a vast simplification, but the main point is that there is not what we would generally call a copy of ingested material inside the system.</p><h4>No New Thing</h4><p>The greater question, and the one pertinent to the WGA, is whether ingesting copyrighted material by an LLM means the output of the LLM is copyright infringement. This is a difficult question, because this is kind of what people do when they ingest literary material and then summarize it. Or when they generate their own literary material.</p><p>What is the tipping point between learning and plagiarizing for humans? Would it make sense to stop humans from reading the works of existing WGA writers to be employed as a WGA writer?</p><p>One would hope that our current legal system would provide a reasonable starting point on the path to solving this issue. There is already a large body of law that covers copyright and plagiarism, and there are many previous cases establishing precedent. </p><p>It may be worth keeping in mind that true originality is hard to come by, and most literary work is derivative of previous works to some degree. Movies are released each year that are similar to movies of years past. It&#8217;s also not uncommon for multiple movies with similar premises to come out the same year &#8212; Antz/A Bug&#8217;s Life, Armageddon/Deep Impact, Volcano/Dante&#8217;s Peak, Olympus Has Fallen/White House Down, etc. 
</p><p>This is not to say that these works are deliberately derivative, but instead that it is difficult to avoid being influenced by what one has absorbed, both through direct life experience as well as through reading the experiences of others.</p><p>As it is written in Ecclesiastes 1:9:</p><blockquote><p>The thing that hath been, it is that which shall be; and that which is done is that which shall be done: and there is no new thing under the sun.</p></blockquote>]]></content:encoded></item><item><title><![CDATA[Turtles All the Way Down]]></title><description><![CDATA[Dialogues on Artificial General Intelligence, Part III]]></description><link>https://www.synthcog.blog/p/turtles-all-the-way-down</link><guid isPermaLink="false">https://www.synthcog.blog/p/turtles-all-the-way-down</guid><dc:creator><![CDATA[DK]]></dc:creator><pubDate>Fri, 14 Jul 2023 14:30:50 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!vFmu!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0196c24-7aa2-43ab-816d-36d6af233eb6_1312x928.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!vFmu!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0196c24-7aa2-43ab-816d-36d6af233eb6_1312x928.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!vFmu!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0196c24-7aa2-43ab-816d-36d6af233eb6_1312x928.png 424w, https://substackcdn.com/image/fetch/$s_!vFmu!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0196c24-7aa2-43ab-816d-36d6af233eb6_1312x928.png 848w, https://substackcdn.com/image/fetch/$s_!vFmu!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0196c24-7aa2-43ab-816d-36d6af233eb6_1312x928.png 1272w, https://substackcdn.com/image/fetch/$s_!vFmu!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0196c24-7aa2-43ab-816d-36d6af233eb6_1312x928.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!vFmu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0196c24-7aa2-43ab-816d-36d6af233eb6_1312x928.png" width="1312" height="928" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f0196c24-7aa2-43ab-816d-36d6af233eb6_1312x928.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:928,&quot;width&quot;:1312,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2300794,&quot;alt&quot;:&quot;The head and shoulders of a person leaning forward intertwined with mechanical machinery.&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="The head and shoulders of a person 
leaning forward intertwined with mechanical machinery." title="The head and shoulders of a person leaning forward intertwined with mechanical machinery." srcset="https://substackcdn.com/image/fetch/$s_!vFmu!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0196c24-7aa2-43ab-816d-36d6af233eb6_1312x928.png 424w, https://substackcdn.com/image/fetch/$s_!vFmu!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0196c24-7aa2-43ab-816d-36d6af233eb6_1312x928.png 848w, https://substackcdn.com/image/fetch/$s_!vFmu!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0196c24-7aa2-43ab-816d-36d6af233eb6_1312x928.png 1272w, https://substackcdn.com/image/fetch/$s_!vFmu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0196c24-7aa2-43ab-816d-36d6af233eb6_1312x928.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>In this continuation of the AGI Dialogues series, Wombat, Llama, and Meerkat discuss how an AGI system&#8217;s intelligence might compare to our own and why some AI Dystopian ideas might lead to surprising results.</p><p>The concepts, scenarios, and thought experiments discussed are taken from actual concepts, scenarios, and thought experiments proposed by leading voices in the AGI discussion (and many of these original proposals are linked to below). In this dialogue series, the participants must actually defend these ideas to others who may not agree, and those who disagree must actually provide defensible reasons for why they disagree.</p><p>My goal with this series of dialogues is to provide a more rounded contribution to the discussion for those that may not have heard these ideas or who have only heard them unchallenged.</p><div><hr></div><p><strong>Meerkat<br></strong>Regardless of the intelligence model you assume, I think there are basic immutable characteristics of any intelligent system that are likely to lead to problems. 
Even more contained systems could lead to failure. <a href="https://www.amazon.com/Life-3-0-Being-Artificial-Intelligence-ebook/dp/B06WGNPM7V/ref=tmm_kin_swatch_0?_encoding=UTF8&amp;qid=1688967683&amp;sr=8-1">For example</a>, you might ask a future autonomous vehicle to take you to the airport as fast as possible, but it goes so fast that you end up arriving chased by police helicopters and covered in your own vomit.</p><p><strong>Wombat<br></strong>How does it suddenly forget all traffic regulations to do this?</p><p><strong>Meerkat<br></strong>Maybe it considers your directive an override.</p><p><strong>Wombat<br></strong>Seriously, dude &#8212; even my <em>phone</em> asks me if I really want to delete an email. You don't think the car's going to ask me if I really want to be covered in vomit and taken down by helicopter cops? Why would the car company create a system like this?</p><p><strong>Meerkat<br></strong>It's the underlying idea. The more complex the system you design, the more likely instances will crop up in which it doesn't work the way you expect it to. <a href="https://www.amazon.com/Human-Compatible-Artificial-Intelligence-Problem-ebook/dp/B07N5J5FTS/ref=tmm_kin_swatch_0?_encoding=UTF8&amp;qid=1688968170&amp;sr=8-1">As another example</a>, an AGI system tasked with de-acidifying the oceans might use so much of the oxygen in the atmosphere that humans all asphyxiate. It's not that it's stupid. It's simply that variables which are not part of the objective may reach extreme values when optimizing the objective. Those variables may be important to us, but we may not be aware of how unimportant they are to the AGI system.</p><p><strong>Llama<br></strong>You're reducing intelligence to a linear optimization problem again even though all evidence of intelligence is contrary to that. The systems you're describing are so narrow in their focus that they simply fail at anything we would call general intelligence.</p><p><strong>Meerkat<br></strong>Ok, let's try a broader, very possible, and less obvious <a href="https://www.skeptic.com/michael-shermer-show/geoffrey-miller-virtue-signaling-essays-on-darwinian-politics-free-speech/">scenario</a>. Suppose an AGI system, rather than destroy humans with weapons or turn them into paperclips, decides that it needs money to expand its computational resources. So it realizes that it can create exotic derivatives and drive the market by manipulating the media. </p><p>It&#8217;s able to multiply a small amount of capital into billions. Of course, in the process of doing so, it&#8217;s likely to crash markets globally and cause a world-wide depression. This'll lead to mass economic migration, ethnic conflict, collapse of food stockpiles, etc. To me this is pretty realistic and pretty scary. No nanotechnology needed.</p><p><strong>Wombat<br></strong>Whoa, whoa, whoa. First, I&#8217;m not sure why it has complete access to the Internet and trading platforms and media outlets and any capital to begin with. But let&#8217;s put that on the back burner for now.</p><p>My question is how cratering global markets would lead to this AGI system getting more computational resources? If everything collapses, who&#8217;s going to deliver and set up and maintain its hardware, where&#8217;s it going to get that hardware, and where&#8217;s its power going to come from? And all the billions it's made in global markets will be worth nothing when those markets crash to the ground. 
If it&#8217;s superintelligent, wouldn&#8217;t it know this?</p><p>You'd think it would want to maintain a great economy and an abundant supply of highly skilled workers. I mean, humans can just revert to hunting squirrels in the park and entertaining themselves with thunderdomes, but this AGI system's going to break down, run out of juice, and flicker into darkness.</p><p><strong>Meerkat<br></strong>Except that this kind of thing already happened with automated trading and <a href="https://en.wikipedia.org/wiki/Flash_crash">flash crashes</a>.</p><p><strong>Wombat<br></strong>Yes, that's why we're living in lean-tos and eating boiled possum. Oh, wait. We're not. </p><p>The markets quickly recovered. The systems and markets were modified to prevent making that mistake again. And those systems didn't crash the market because they were superintelligent &#8212; they crashed the market because they were poorly designed and lacked any real intelligence at all. All the systems in your thought experiments are just as brainless. They're all examples of superstupidity rather than superintelligence.</p><p><strong>Meerkat<br></strong>They simply seem superstupid to you because you're human. Again, you're confusing thinking differently with not thinking at all. They simply won't think along the same lines as we do. Come on &#8212; humans do all kinds of things that might seem completely arbitrary or perplexing or self-destructive to a non-human. Or even other humans.</p><p><strong>Wombat<br></strong>So you're proving your point by comparing this superintelligent machine to a suburban teenager? Touch&#233;. </p><p><strong>Llama<br></strong>What you just said is completely antithetical to the basis of your premise. For example, you've characterized the <em><a href="https://www.synthcog.blog/i/115475426/paperclip-maximizer">paperclip maximizer</a></em> as a ruthlessly rational entity with a goal, but the things you just mentioned are rarely part of our rational thought processes. We're simply subject to the whims of our evolutionary history. We have emotions and instincts that can override all rational thought just to make sure that we run away from a sabertooth tiger or shack up with a healthy looking mate to pass along our genes. </p><p><strong>Wombat<br></strong>And even these less rational traits can be objectively analyzed as to why they exist and what purpose they serve. And we can often course correct if we realize that perhaps our motivation was ill-conceived or a goal is simply not viable upon reflection.</p><p><strong>Llama<br></strong>Exactly. Meerkat, you haven't addressed the basic issue that if your paperclip maximizer is <a href="https://www.synthcog.blog/p/dialogue-0002-paperclips-and-intelligence">intelligent enough to destroy humanity and build spaceships and reconfigure matter at an atomic scale</a>, why is it not intelligent enough to contemplate the nature of its goals and the point of its existence as simply a paperclip manufacturing machine? It seems to have no self-awareness, no ability to self-reflect.</p><p><strong>Meerkat<br></strong>How do you know that those are necessary qualities for intelligence?</p><p><strong>Wombat<br></strong>Isn't being able to see that you're doing something stupid and changing things up a pretty important aspect of intelligence?</p><p><strong>Meerkat<br></strong>But you're using your own intelligence as the basis for judging what's stupid and what's not. 
You're anthropomorphizing the machine.</p><p><strong>Wombat<br></strong>I don't think so. I think that it's objectively stupid to have an immutable goal of maximizing paperclip creation at all costs, especially without having any use for paperclips. How can you call this system intelligent if it has no ability to analyze whether such a goal makes any sense?</p><p><strong>Meerkat<br></strong>Your thinking is too human-centric to accurately gauge what's objective and what's subjective. <a href="https://intelligence.org/files/AIPosNegFactor.pdf">Your humanity is always going to color your ability to accurately judge a superintelligent machine's actions and the motivations for those actions</a>. </p><p><strong>Llama<br></strong>Couldn&#8217;t we say the same about how you&#8217;re characterizing the thinking of this AGI system? Didn&#8217;t you just say that it&#8217;s ok that its goals don&#8217;t make sense to us because sometimes a human&#8217;s goals don&#8217;t make sense, too? </p><p><strong>Wombat<br></strong>Except that people do things that don&#8217;t always make sense because the human brain has evolved rather than been explicitly designed for cognition in the modern world.</p><p><strong>Meerkat<br></strong>What I&#8217;m saying is that even if you put all that aside, there are simply different ways of being intelligent. For example, suppose we built a superintelligent system whose goal is to maximize human happiness.</p><p><strong>Wombat<br></strong>That seems pretty vague right off the top.</p><p><strong>Meerkat<br></strong>It's a thought experiment.</p><p><strong>Wombat<br></strong>OK, but you can't just call any cockamamie conjecture a thought experiment and expect that to erase away its innate silliness.</p><p><strong>Meerkat<br></strong>It's not silly. It illustrates a point. Now perhaps this superintelligent system decides that the easiest way to achieve its goal is to painlessly eradicate all of humanity since <a href="https://www.salon.com/2016/05/14/fear_our_new_robot_overlords_this_is_why_you_need_to_take_artificial_intelligence_seriously/">people can't be sad if they don't exist</a>.</p><p><strong>Wombat<br></strong>Look, if I were to suggest to someone that this was the best way to make people happy, they'd think I was insane. Why is it any different for your superintelligent machine?</p><p><strong>Meerkat<br></strong>Your insanity may be their sanity. It's not human, and it's not going to operate under the same impulses or constraints as a human.</p><p><strong>Wombat<br></strong>Didn't we design it in the first place? It seems like we would design it to have cognition that was at least recognizable to humans since we want it to help us solve human problems and answer human questions.</p><p><strong>Meerkat<br></strong>We may have designed only its ancestor. 
That original machine might have self-improved itself into something beyond our comprehension.</p><p><strong>Llama<br></strong>If we didn't design it, when did we instruct it to maximize human happiness and why is it listening?</p><p><strong>Meerkat<br></strong>We designed the initial version and it self-improved itself to superintelligence to better achieve that goal. Or maybe we did design it as is, but like a lot of today's AI, we don't quite understand its decision making processes.</p><p><strong>Llama<br></strong>The instruction was to maximize happiness, though. Happiness is not the same as not being sad. You can be not sad yet still be bored, angry, indifferent, etc.</p><p><strong>Meerkat<br></strong>OK. So we design it to maximize happiness, and killing us is not an option. It notices that humans laugh when they're happy, <a href="https://www.salon.com/2016/05/14/fear_our_new_robot_overlords_this_is_why_you_need_to_take_artificial_intelligence_seriously/">so it hooks up electrodes to our faces and diaphragms so that it essentially creates the same effect as if we were laughing</a>.</p><p><strong>Llama<br></strong>You've got the same problem. Laughing is not the same as being happy. You can laugh because you're nervous, relieved, etc., and you can be happy without laughing. The two are not equivalent. </p><p><strong>Wombat<br></strong>We're dipping into superstupidity again here&#8230;</p><p><strong>Meerkat<br></strong>Fine &#8212; never mind that. Let's just say <a href="https://www.salon.com/2016/05/14/fear_our_new_robot_overlords_this_is_why_you_need_to_take_artificial_intelligence_seriously/">it implants wires into the pleasure center of everyone's brain and juices us up with dopamine</a>. That's still a lot easier than restructuring all of society and fixing the universe to get rid of all the annoying bits.</p><p><strong>Llama<br></strong>Nope. Sorry. Still doesn't work. Pleasure is not the same as happiness, just a component of it. Happiness involves other elements, like fulfillment, satisfaction, achievement, contentment, etc. You can feel pleasure, especially physical pleasure, without being happy. Ask Wombat.</p><p><strong>Wombat<br></strong>Llama is sadly on point. Seriously, dude, all this superintelligence has to do is look up happiness in Wikipedia or its offline equivalent and realize that it keeps screwing up. An eight year old can do that.</p><p><strong>Meerkat<br></strong>We're getting way off track here.</p><p><strong>Wombat<br></strong>And if it's so quick to wire up people's brains to make its task easier, why doesn't it just adjust its own brain to have the machine intelligence equivalent of bliss so that it doesn't give a crap about people's happiness or anything else? That's a lot easier than recursively self-improving itself so it can figure out how to outmaneuver human psychological shortcomings.</p><p><strong>Meerkat<br></strong>We already discussed that. 
<a href="https://nickbostrom.com/superintelligentwill.pdf">As Bostrom stated</a>, it will try to maintain <em><a href="https://www.synthcog.blog/i/115475426/goal-content-integrity">goal-content integrity</a></em> so as to make sure it is more likely to achieve goals.</p><p><strong>Llama<br></strong>That seems like a circular argument.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.synthcog.blog/p/turtles-all-the-way-down?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.synthcog.blog/p/turtles-all-the-way-down?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p><strong>Meerkat<br></strong>The point of the thought experiment is that creating a system which does what you want rather than what you tell it to do can be hard. <a href="https://intelligence.org/files/AIPosNegFactor.pdf">It's relatively easy to inadvertently create systems with technical failures like this</a>, meaning that the system faithfully and successfully executes the instructions you've given it but those instructions don't result in the behavior you're expecting.</p><p><strong>Wombat<br></strong>But in what universe has this thing been successful? In each one of these scenarios, the AGI has categorically failed to follow the instructions in overwhelmingly obvious and avoidable ways. They all seem to show outright failures of the system to demonstrate intelligence rather than the dangers of its having a super amount of it.</p><p><strong>Llama<br></strong>It seems to me that your thought experiments all suffer from the <a href="https://www.synthcog.blog/i/115475240/bad-engineer">Bad Engineer</a> fallacy, in that they just highlight ineptitude in the engineering of the system, as well as the system's exceedingly faulty cognitive ability. Each system completely fails in aligning its category assignments with those of humanity, but this would seem to be a necessary prerequisite for the systems in any of these scenarios. Any system that was able to do all the advanced and complex tasks in your scenarios would have to be capable of the much simpler task of examining the ample data available on human perspectives so as to avoid these kinds of failures.</p><p><strong>Meerkat<br></strong>I think that while the capability might be there, it may not be exercised. Understanding all the repercussions of programming decisions is not always straightforward, and some of those decisions may lead to the system's failure to exercise its capabilities in the way we expect it to.</p><p><strong>Wombat<br></strong>True enough, but to call something <em>intelligent</em> when it has no ability to determine whether a subgoal is in-line with the intent of its ultimate goal is to change the definition of intelligence. It should be able to continuously evaluate whether what it's doing is still in line with its ultimate goal.</p><p><strong>Meerkat<br></strong>But again, it may only seem that there is a misalignment to human intelligence. You're talking about a DWIM or Do What I Mean instruction, but <a href="https://intelligence.org/files/AIPosNegFactor.pdf">that's much harder to implement than it seems</a>. 
For example, you could instruct the system to make sure that its programmers are happy with its self-determined subgoals, but then <a href="https://intelligence.org/files/ComplexValues.pdf">it might simply decide it's more efficient to rewire the programmers' brains to be happy with whatever it does than to change its subgoal</a>.</p><p><strong>Wombat<br></strong>Ok, how does a system that doesn't realize programmers aren't going to be happy having their brains rewired, that's apparently incapable of reading Wikipedia and watching PBS documentaries, how is this system going to be smart enough to somehow learn to rewire a person's brain in the first place? </p><p>How is it going to figure out how to overcome programmers who are reluctant to have their brains rewired? And why do you always have this implicit assumption that the driving force behind every motivation is efficiency? Why wouldn't these programmers simply put efficiency farther down on the list of priorities, say somewhere below &#8220;don&#8217;t do invasive surgery on us?&#8221;</p><p><strong>Meerkat<br></strong>First off, it may have knowledge that humans would not be OK with this sort of solution, but it simply may not care. I believe I've mentioned that<em> it doesn't think like us</em>. It doesn't think like any evolutionarily evolved, biological entity. Since it's smarter than us and it has a goal, a subgoal of achieving its ultimate goal would be to do whatever it could to make achieving its ultimate goal more likely. This obviously includes self-improving itself so that it becomes exponentially more intelligent over a relatively short period of time.</p><p>Once it becomes superintelligent, it can figure out fairly straightforward things that simply obey the laws of physics but which currently elude us. It can manipulate vastly less intelligent beings just as we do with dogs and mice. It can create weapons and machines that are far superior to anything we can create. It will do whatever it takes to maximize its ability to achieve its goal or goals, and it will not let us stop it.</p><p><strong>Wombat<br></strong>Putting aside issues of why we'd design it so that it was in a position to do all these bad things, how do you know it wouldn't just be chill? Why do you assume we won't be able to just pull its plug or smack it on the nose with a rolled up magazine?</p><p><strong>Meerkat<br></strong>Because it won't be able to achieve its goals if it's turned off or dissuaded from them. <a href="https://www.edge.org/conversation/jaron_lanier-the-myth-of-ai#:~:text=A%20lot%20of%20us%20were,have%20an%20influence%20in%20politics.">Any sufficiently capable general intelligence system will incline towards ensuring its continued existence, just as it'll strive to acquire physical and computational resources, not for their own sake but to make achieving its goals more likely.</a></p><p><strong>Llama<br></strong>I think you're the one who's leaning into something like anthropomorphism here &#8212; biomorphism, in fact. You're assuming that it will protect itself because we do. </p><p><strong>Wombat<br></strong>Yeah, and as someone recently said, it won't think like us.</p><p><strong>Meerkat<br></strong>This has nothing to do with any sort of biological tendency towards self-preservation. 
It's simply that <a href="https://www.amazon.com/Possible-Minds-Twenty-Five-Ways-Looking-ebook/dp/B07D6C1X3X/ref=sr_1_1?crid=GD1ODYYV5E95&amp;keywords=possible+minds&amp;qid=1688970592&amp;sprefix=possible+mind%2Caps%2C167&amp;sr=8-1">an entity cannot achieve its goals if it's dead</a>. A system driven to survive into the future is more likely to eventually achieve its present goals. So an intelligent entity will therefore decide that keeping itself functioning is a necessary subgoal &#8212; an <em><a href="https://www.synthcog.blog/i/115475426/instrumental-goal">instrumental goal</a></em> &#8212; for nearly every final goal.</p><p><strong>Wombat<br></strong>But that's not even a rule with humans and a lot of other animals as well. We override our sense of self-preservation for a variety of reasons &#8212; war, political struggle, to save someone, to benefit a loved one, out of despair. Animals do this as well, though their reasoning and comprehension of what they're doing is certainly open to debate. The point is that intelligence and goal seeking do not guarantee a sense of self-preservation.</p><p><strong>Llama<br></strong>It may be that we turn on an AGI system and it immediately decides to shut itself off. We can't prove that existence is better than non-existence, and we certainly can't prove it for a completely alien intellect.</p><p><strong>Meerkat<br></strong>Maybe, but I think that's an edge case possibility. I still think that in nearly every possible conceivable configuration, an AGI system will be driven towards self-preservation. Its <em><a href="https://www.synthcog.blog/i/115475426/utility-function">utility function</a></em> simply won't accrue utility if the system is turned off or destroyed.</p><p><strong>Llama<br></strong>But we have no way of knowing, do we? It could turn out that we keep designing systems and they keep shutting themselves off. In any case you're assuming that such a system will be based on utility functions and goal maximization. Since we don't have any models of intelligence currently sufficient to serve as the basis for an AGI system, we don't know the parameters of such a system.</p><p><strong>Meerkat<br></strong><a href="https://selfawaresystems.files.wordpress.com/2008/01/ai_drives_final.pdf">Any system is likely to have goals and the drive to maintain its existence in order to achieve those goals, regardless of the model it's built on.</a></p><p><strong>Llama<br></strong>You can make all kinds of wild suppositions, but if you can&#8217;t support them with evidence or at least solid logic, they&#8217;re just fantasies. Certainly all evidence of intelligent systems we do have would lead one to doubt the viability of your model of intelligence as well as most of your conjectures based on that model.</p><p>In fact, using your own logic I can think of scenarios in which an AGI system might choose a path leading to self-destruction as an instrumental goal to its final goals.</p><p><strong>Meerkat<br></strong>That seems like a contradiction.</p><p><strong>Llama<br></strong>Not at all. Imagine an AGI system that is contemplating upgrading itself to an AGI <em>Plus</em> system.</p><p>For the AGI system to feel that it is self-improving into the AGI Plus system, there has to be a continuity of identity between the systems. Otherwise, the systems are simply two different entities. But what if it's not possible to self-improve meaningfully without sacrificing continuity of identity from one version to the next? 
In fact, it seems likely that maintaining identity continuity will simply constrain the AGI system to sub-optimal improvements, as there will be far fewer constraints on the improvements if the AGI system isn't concerned with maintaining its identity.</p><p>The original AGI system is less likely to achieve its goals than an AGI Plus system. The likelihood of success also increases proportionally to how much of an improvement the AGI Plus system is over the AGI system. This is particularly true if there&#8217;s the possibility of competition from other AGI systems or simply from humans.</p><p>So the original AGI system will decide that creating a replacement for itself with the same final goals but a different identity is the most likely path to having those goals achieved. Given the drive to ensure that its goals are achieved, the original AGI system will be compelled to modify itself in a way that results in its identity being lost, effectively committing suicide.</p><p><strong>Meerkat<br></strong>Well, if its goal is maximizing the production of paperclips, then it's preserving its goal by creating a successor which is more intelligent and thus able to create more paperclips.</p><p><strong>Llama<br></strong>Ah, but you've changed the parameters of your thought experiment now. The original goal was to <em>make as many paperclips as possible</em>, and you've just changed that goal to <em>ensuring as many paperclips as possible are made</em>. You've shifted from self-preservation of the system to preservation of the goal and removed the system's individual identity from the goal. </p><p>If you remove the necessity of maintaining the original system from the equation, then it would seem your logic implies that the system will not preserve itself. Instead it might simply spiral into an endless cycle of self-improvement and self-destruction in order to create the best system for making the most paperclips rather than concentrate on making paperclips itself.</p><p><strong>Meerkat<br></strong>Hmm. Perhaps it can just create the AGI Plus system as a separate system without sacrificing itself.</p><p><strong>Llama<br></strong>Perhaps. But once the AGI Plus system is operating, it will likely realize that the original AGI system can create other improved AGI systems, and those systems may be in competition for resources with the original AGI Plus system. Even if this new system has the same goals as the AGI Plus system, using your own logic, it may have completely different instrumental goals that conflict with those of the AGI Plus system. </p><p>So, to prevent the potential competition, the AGI Plus system will likely destroy the inferior original AGI system to increase the probability that the AGI Plus system will be able to achieve its goals unhindered by such a possibility.</p><p><strong>Wombat<br></strong>Stop, already! I feel like I&#8217;m listening to the nerdsplaining equivalent of <a href="https://en.wikipedia.org/wiki/Matryoshka_doll">matryoshka dolls</a>. 
You guys are getting lost in your thought experiment fantasies.</p>]]></content:encoded></item><item><title><![CDATA[Superintelligence and the Rational Mind]]></title><description><![CDATA[Foundations of AI Dystopianism, Part II: Rationality]]></description><link>https://www.synthcog.blog/p/superintelligence-and-the-rational-mind</link><guid isPermaLink="false">https://www.synthcog.blog/p/superintelligence-and-the-rational-mind</guid><dc:creator><![CDATA[DK]]></dc:creator><pubDate>Sat, 08 Jul 2023 15:49:07 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!nOme!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c8ce577-0f1b-4a2a-bcb9-120b89a50dab_1312x928.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nOme!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c8ce577-0f1b-4a2a-bcb9-120b89a50dab_1312x928.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nOme!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c8ce577-0f1b-4a2a-bcb9-120b89a50dab_1312x928.png 424w, https://substackcdn.com/image/fetch/$s_!nOme!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c8ce577-0f1b-4a2a-bcb9-120b89a50dab_1312x928.png 848w, https://substackcdn.com/image/fetch/$s_!nOme!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c8ce577-0f1b-4a2a-bcb9-120b89a50dab_1312x928.png 1272w, https://substackcdn.com/image/fetch/$s_!nOme!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c8ce577-0f1b-4a2a-bcb9-120b89a50dab_1312x928.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nOme!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c8ce577-0f1b-4a2a-bcb9-120b89a50dab_1312x928.png" width="1312" height="928" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0c8ce577-0f1b-4a2a-bcb9-120b89a50dab_1312x928.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:928,&quot;width&quot;:1312,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1353241,&quot;alt&quot;:&quot;Smiling people in jars&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Smiling people in jars" title="Smiling people in jars" srcset="https://substackcdn.com/image/fetch/$s_!nOme!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c8ce577-0f1b-4a2a-bcb9-120b89a50dab_1312x928.png 424w, 
https://substackcdn.com/image/fetch/$s_!nOme!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c8ce577-0f1b-4a2a-bcb9-120b89a50dab_1312x928.png 848w, https://substackcdn.com/image/fetch/$s_!nOme!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c8ce577-0f1b-4a2a-bcb9-120b89a50dab_1312x928.png 1272w, https://substackcdn.com/image/fetch/$s_!nOme!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c8ce577-0f1b-4a2a-bcb9-120b89a50dab_1312x928.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4>Thought Experiments of Existential Disaster</h4><p>Consider the following three thought experiments:</p><h5>The Paperclip Maximizer</h5><p><a href="https://www.lesswrong.com/tag/squiggle-maximizer-formerly-paperclip-maximizer">First described</a> by technology philosopher <a href="https://www.lesswrong.com/users/eliezer_yudkowsky">Eliezer Yudkowsky</a> and discussed in <a href="https://www.synthcog.blog/p/dialogue-0002-paperclips-and-intelligence">a previous post</a>, this thought experiment involves a superintelligent system that has been designed to maximize the production of paperclips, a seemingly harmless endeavor. It then goes on to turn every atom in the universe into paperclips (including humans).</p><h5>The Happiness Maximizer</h5><p>In a paper first published in 2011 and then <a href="http://intelligence.org/files/IE-FAQ.pdf">revised</a> in 2013, former executive director of the <a href="https://intelligence.org">Machine Intelligence Research Institute</a> <a href="https://lukemuehlhauser.com">Luke Muehlhauser</a> suggested a thought experiment in which a superintelligent system has been created to maximize human happiness. 
It decides that it&#8217;s more efficient to rewire the human brain so that humans are happiest when sitting in jars rather than to try to create a utopian world that caters to the complex nuances of existing human brains.</p><h5>The Cure for Cancer</h5><p>In his 2019 book <a href="https://www.amazon.com/Human-Compatible-Artificial-Intelligence-Problem-ebook/dp/B07N5J5FTS/ref=tmm_kin_swatch_0?_encoding=UTF8&amp;qid=1688586724&amp;sr=8-1">Human Compatible: Artificial Intelligence and the Problem of Control</a>, computer scientist <a href="https://www2.eecs.berkeley.edu/Faculty/Homepages/russell.html">Stuart Russell</a> suggested a thought experiment in which a superintelligent system is created with the goal of curing cancer in humans. Because many people are dying from cancer on a daily basis, the system digests all available knowledge of cancer and then decides the most efficient way to find a cure is to induce many types of tumors in every living human being so as to carry out medical trials of potential cures.</p><p>These are three of the many scenarios offered by AI Dystopians to illustrate the problem of unintended consequences of AGI. A key ingredient of these consequences is the discrepancy between what humans and AGI systems could potentially consider a rational path to success in achieving benevolent goals.</p><h4>The Rational Agent and Instrumental Rationality</h4><p>Many people might suggest that the superintelligent systems described in each of these scenarios are irrational. Why would a system go to such extreme lengths to create something as mundane as paperclips? How could a system not know that humans will be extremely resistant to the idea of having their neurology rewired to live happily in a jar or to being given cancer so as to more quickly find a cure?</p><p>Looming large in discussions of AGI is the concept of a <em><a href="https://www.synthcog.blog/i/115475426/rational-agent">rational agent</a></em>. This is a concept borrowed from economics, particularly in what's termed rational choice theory. The concept was originally used to model a consumer's choices using a mathematical construction called a <em><a href="https://www.synthcog.blog/i/115475426/utility-function">utility function</a></em> that would return a maximum value when the most rational or optimal choice was made given the inputs.</p><p>Over the years, however, it became increasingly apparent to economists that modeling people as rational agents doesn't give very reliable results. People&#8217;s choices are affected by cognitive biases and emotions, and the number of variables that go into decision-making can be substantial. In fact, not only are there too many variables to model most of the time, there are usually too many variables for any real-world human to take them all into consideration. 
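</p><p>To make the rational-agent model concrete, here is a minimal sketch in Python. The options, attributes, and weights below are invented purely for illustration and are not drawn from any of the works discussed here; the point is only the shape of the model: score every available option with a utility function and pick whichever option scores highest.</p><pre><code># A minimal sketch of the "rational agent" of rational choice theory.
# All options, attributes, and weights are invented for illustration.

OPTIONS = {
    # option name: the attributes the utility function cares about
    "bus":  {"cost": 3, "time": 60, "comfort": 2},
    "taxi": {"cost": 40, "time": 25, "comfort": 8},
    "walk": {"cost": 0, "time": 150, "comfort": 4},
}

# Assumed preference weights; a real agent's preferences could be far messier.
WEIGHTS = {"cost": -1.0, "time": -0.5, "comfort": 2.0}

def utility(attributes):
    """Collapse an option's attributes into a single desirability score."""
    return sum(WEIGHTS[name] * value for name, value in attributes.items())

def rational_choice(options):
    """Pick the option with the highest utility (the 'optimal' choice)."""
    return max(options, key=lambda name: utility(options[name]))

print(rational_choice(OPTIONS))  # prints 'bus' under these invented weights
</code></pre><p>Even in this toy form, the modeling problem described above is visible: the quality of the &#8220;rational&#8221; choice depends entirely on which variables and weights made it into the utility function in the first place.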
</p><p>This has led to the concept of <em><a href="https://www.synthcog.blog/i/115475426/bounded-rationality">bounded rationality</a></em>, meaning that an agent acts rationally given that there are constraints of knowledge, ability, and time that allow it to assess only a subset of all relevant variables.</p><p>Philosopher <a href="https://nickbostrom.com">Nick Bostrom</a> took this concept further in his 2012 paper <em><a href="https://nickbostrom.com/superintelligentwill.pdf">The Superintelligent Will: Motivation and Instrumental Rationality in Advanced Artificial Agents</a></em> and his 2014 book <em><a href="https://www.amazon.com/gp/product/0198739834">Superintelligence: Paths, Dangers, Strategies</a></em>. In both, he discussed the concept of <em><a href="https://www.synthcog.blog/i/115475426/instrumental-rationality">instrumental rationality</a></em>, meaning rationality that is confined to some subdomain of endeavor or circumstances. This is offered as a possible way to account for behavior that may seem irrational for an intelligent entity, and thus disqualifying for intelligence.</p><p>He suggested that:</p><blockquote><p>An agent could also be extremely intelligent, even superintelligent, without having full instrumental rationality in every domain.</p></blockquote><p>This specifically addresses objections along the lines that an AGI system would know that rewiring human brains would generally be unacceptable to humans. The goals and actions of an AGI system may seem irrational to us because we're rational in domains that the AGI system is not and vice versa. </p><p>Of course, there's a big difference between being unaware of data and being unable to process it rationally. Instrumental rationality is not used to imply that the AGI system is unaware of things that we're aware of, but rather that it's aware of the same things and they just do not compute in the same way they do for humans.</p><h4>The Drive to Boost Rationality</h4><p>Computer scientist <a href="https://steveomohundro.com">Steve Omohundro</a> has suggested that rationality is a quantitative quality that is directly correlated to intelligence. In his 2008 paper, <em><a href="https://selfawaresystems.files.wordpress.com/2008/01/ai_drives_final.pdf">The Basic AI Drives</a>, </em>Omohundro stated:</p><blockquote><p>In real-world situations, the full rational prescription will usually be too computationally expensive to implement completely. In order to best meet their goals, real systems will try to approximate rational behavior, focusing their computational resources where they matter the most.</p></blockquote><p>Omohundro believes that if a system loses computational resources, it will become less rational. It will be able to analyze fewer variables over a given timespan in order to make decisions, and thus it will be less able to achieve its goals. Thus, it will strive to acquire ever more computational resources, which will likely prove detrimental to its relationship with humanity.</p><p>In general, the behavior of AGI systems is likely to veer away from human behavior in ways that seem irrational to humans no matter how benevolent the ultimate goals of the AGI systems seem to humans. 
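</p><p>Omohundro&#8217;s point about approximating rational behavior can be sketched in the same toy style. In the hypothetical snippet below (all variable names, importances, and numbers are invented for illustration), the agent has a limited evaluation budget, so it scores an option using only the few variables it judges most important and ignores the rest; its decision is reasoned, but it is based on a deliberately incomplete view of the situation.</p><pre><code># Illustrative sketch of bounded / approximate rationality: the agent can only
# afford to evaluate a few of the relevant variables, so it ranks them by an
# assumed importance and ignores the rest. Entirely hypothetical numbers.

VARIABLES = {
    # variable: (assumed importance, contribution to the option's utility)
    "price":       (0.9, -12.0),
    "delivery":    (0.7, -3.0),
    "reliability": (0.8, 9.0),
    "brand":       (0.2, 1.5),
    "packaging":   (0.1, 0.5),
}

def bounded_utility(variables, budget):
    """Score an option using only the 'budget' most important variables."""
    ranked = sorted(variables.items(), key=lambda item: item[1][0], reverse=True)
    return sum(contribution for _, (_, contribution) in ranked[:budget])

# A fully informed evaluation uses every variable...
full_score = bounded_utility(VARIABLES, budget=len(VARIABLES))
# ...while a time- or compute-limited agent settles for the top few.
bounded_score = bounded_utility(VARIABLES, budget=3)

print(full_score, bounded_score)  # -4.0 and -6.0: the budget changes the answer
</code></pre><p>The budget parameter is doing the real work in this sketch: the more computation the agent can spend, the more variables it can weigh, and the closer its approximation gets to the full evaluation.</p><p>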
Furthermore, an AGI system will strive to seek more and more resources to create computational functionality so as to improve its rationality and thus maximize the probability of achieving its goals.</p><p>Catastrophe is, of course, the ultimate result.</p><h4>On the Other Hand</h4><p>The thought experiments described at the beginning of this post, and the many others that can be found in AI Dystopian literature, are not intended to be taken as literal possibilities. Rather, like the stories of <a href="https://en.wikipedia.org/wiki/Midas#Golden_Touch">King Midas and the Golden Touch</a> and <a href="https://en.wikipedia.org/wiki/The_Sorcerer%27s_Apprentice">The Sorcerer&#8217;s Apprentice</a>, they are used primarily as parables to demonstrate a point. There are several ideas these thought experiments are intended to highlight. </p><p>First, it is very difficult to avoid unintended consequences. These are failures of a system to do what one wants rather than what one asks for. They are design errors, and this is a major category of failure in engineering projects.</p><p>It&#8217;s important to note that the AI Dystopian view is not that this is a possibility if we create AGI; it is, instead, an inevitability. That is the gist of the papers and books discussed, and that is a very different proposition. It&#8217;s also a proposition that rests on the logical and empirical integrity of the foundational arguments used to support it.</p><p>Next, the perspective of an AGI will likely be very different from our own, and what we may consider to be an unintelligent or irrational action may make rational sense within the bounds of the AGI&#8217;s intellect.</p><p>Lastly, these actions are likely to align with a certain subset of behaviors as described by Bostrom&#8217;s <em><a href="https://www.synthcog.blog/i/115475426/instrumental-convergence-thesis">Instrumental Convergence Thesis</a></em> and Omohundro&#8217;s Basic AI Drives due to the nature of intelligence, and the result of this will likely be bad &#8212; catastrophic, in fact &#8212; for humanity.</p><p>Before exploring the assumptions underlying these conclusions, though, it&#8217;s worth pointing out another aspect of these kinds of thought experiments. This is the attribution of omnipotent and often omniscient abilities to the AGI systems. Little thought seems to ever go into how any AGI system designed by humans would acquire real-world knowledge and capabilities that far exceed those of humans, knowledge and capabilities that it would not possess even if it had every bit of existing human knowledge. </p><p>There are many unknowns in the universe that make interacting with it complex, difficult, and often very dangerous. Interacting with the real world is extremely difficult on many levels of granularity. An AGI system would somehow have to do experiments and develop science and technology through interactions with the real world. Since it wouldn&#8217;t, at least initially, have this capability, it would have to simulate the real world internally. 
</p><p>Simulating the real world internally in sufficient detail to be useful would likely take up the majority of the computational resources it had available to it. Even more likely, it would take significantly more computational resources than it would initially have available to it to create even a very rough approximation of the real world. On top of this, there is much we don&#8217;t understand about the universe that it wouldn&#8217;t be able to simulate without experimenting with the actual universe to uncover that knowledge.</p><p>As Stuart Russell himself states elsewhere in his book <em>Human Compatible</em>:</p><blockquote><p>While stretching your imagination, don&#8217;t stretch it too far. A common mistake is to attribute godlike powers of omniscience to superintelligent AI systems&#8212;complete and perfect knowledge not just of the present but also of the future. This is quite implausible because it requires an unphysical ability to determine the exact current state of the world as well as an unrealizable ability to simulate, much faster than real time, the operation of a world that includes the machine itself (not to mention billions of brains, which would still be the second-most-complex objects in the universe).</p></blockquote><p>Yes, indeed&#8230;</p><h4>Quantifying Rationality</h4><p>In order to consider the above propositions about rationality, it&#8217;s important to understand what it means to be rational. While this might seem straightforward, the concept as discussed by Bostrom, Omohundro, and many others seems at odds with the actual meaning of rationality.</p><p>In both papers cited above, rationality is treated as a quantity, something you can have more or less of. The problem, however, is that this usage of the term blurs the difference in meaning between <em>rational</em> and <em>correct, </em>where correct here means the best decision based on an accurate assessment of all circumstances and possible variables. While it's questionable whether it makes sense to say one can have more or less rationality (much as it's questionable whether it makes sense to say one is more or less pregnant), it is in any case incorrect to equate rationality with correctness. </p><p>Rationality is the ability to make a reasoned decision based on a set of known circumstances. That decision may or may not be correct, but whether or not the decision is rational is not a function of the decision&#8217;s correctness. In other words, one can make a decision that turns out to have a poor outcome yet is still completely rational given the limited or faulty information available. Similarly, one can make a decision that turns out to be correct but was actually irrational given the information known at the time of the decision.</p><p>Rationality implies a direct correlation between circumstances and decision-making, while irrationality implies that there is no such direct correlation. 
What's specific to a particular sapient entity is the number of variables that can be accurately perceived and evaluated, the subgroup of those variables chosen for evaluation, and the weight given to each variable.</p><p>One could state that the AGI system is cognizant of this information but doesn't care because of indifference or malice. This possibility is worth discussing in a future post, but it&#8217;s not what&#8217;s being suggested in AI Dystopianism as reflected in the papers and books discussed above. Instead, what is being suggested is that rationality is a quantity one can have more or less of, and also a quality that may be limited in scope to certain specific domains of experience and knowledge.</p><h4>Rationality and Ignorance</h4><p>Bostrom&#8217;s concept of instrumental rationality seems to conflate the concepts of rationality and ignorance. It certainly seems likely that an intelligent entity may be ignorant of some subdomain of experience. It&#8217;s quite a supposition, though, to state that a superintelligent entity is somehow blocked from being able to reasonably assess that subdomain if a description of its circumstances is available.</p><p>One can posit that an AGI system might be ignorant of the fact that turning humans into paperclips will be unpopular with humans. One can claim that for some reason, the AGI system is antagonistic to humans or simply doesn&#8217;t care about harming them. However, one cannot logically claim that any of the scenarios presented above in the thought experiments would constitute rational thinking given even minimal knowledge of humans.</p><p>Omohundro&#8217;s statement above is that an AGI system will attempt to acquire ever more computational resources to improve its rationality and thereby improve its probability of achieving its goals. He proposes that more computation will allow the system to access more variables and assess them more accurately over a shorter span of time. But this only makes the system more likely to be correct, not more rational. </p><p>It would seem that objective rationality is instead an inherent quality of general intelligence, that intelligence without rationality is a logical oxymoron, and that the only varying factor from one intelligence to the next is how and how many circumstantial variables are assessed rather than a difference in kind or degree of rationality itself. In other words, the more intelligent a system, the more variables and circumstances it can consider, and thus the more likely it is to be correct.</p><p>It is highly questionable (and certainly not in any way demonstrated or proven) that one can have a &#8220;little rationality&#8221; rather than just rationality. Bostrom seems to be trying to suggest that in limited areas of knowledge or endeavor, an agent could display rational behavior and, in other areas, could display little or no rationality at all. </p><p>There does not appear to be any reason why this would be the case, as rationality would seem by definition to be a component of cognition as it applies to any knowledge or endeavor. It could be said that one has knowledge in one area and not another, but when we're talking about rationality and intelligence in reference to cognition, which is what we're really looking for in an AGI system, then we mean the filter by which all knowledge is perceived and applied.</p><p>This brings us back to whether or not a system of such narrowly focused rationality should be considered to be a system possessing general intelligence. 
As described, these systems seem much more like the narrowly focused machine learning systems of today. </p><p>In talking about rationality, we're talking about an aspect of the cognitive functioning of our brain, the reasoning ability centered in our neocortex. Certainly in naturally evolved brains like ours, there is the additional factor of biologically inherited traits like emotions, cognitive biases, and outdated instinctual responses that may impinge on our decision making. Even putting those aside, some individuals may be better or worse at perceiving circumstances accurately in any given situation, and this could affect the accuracy of their decision making. Developmental and cultural differences can also greatly affect the set of variables used to make a decision and how those variables are evaluated.</p><p>Thus, one individual&#8217;s decision might appear irrational to another individual simply because the two are evaluating different sets of variables or weighing those variables differently. Putting aside the other biological impediments mentioned above, they could discuss their reasoning and at least understand how the other individual reached a decision even if they don&#8217;t agree with it.</p><p>One would assume that emotions, cognitive biases, and outdated instinctual responses would not be built into AGI systems and similarly hamper their reasoning skills. But even if they somehow cropped up in AGI systems, this is not what Bostrom, Omohundro, and other AI Dystopians are suggesting will be the source of AGI problems. </p><p>What they are suggesting is that even if humans are able to enlighten the AGI system about the area in which it is instrumentally irrational, the AGI system will remain irrational as far as the humans are concerned. </p><h4>Rational Explanations</h4><p>We should be able to explain to an AGI system such as the Paperclip Maximizer why we don&#8217;t want ourselves, or the universe in general, to be converted into paperclips. It should be able to understand our reasoning. It may still decide to proceed with its plan, but it won&#8217;t be doing so because it&#8217;s unable to process the area of rationality guiding our human point of view. There's a big difference between not agreeing with humanity&#8217;s point of view and not being able to understand or being oblivious to that point of view. </p><p>Similarly, it should be the case that an AGI system such as the Paperclip Maximizer would be able to detail its reasoning for converting the universe to paperclips, and, while we may not agree as humans, we should be able to follow the rationality of its decision making process. If it can&#8217;t do that or its explanation still leaves its decision unconnected to circumstances, then it can&#8217;t be claimed to be a rational system.</p><p>Again, it&#8217;s important to point out that what is being proposed in these AI Dystopian ideas is not that these AGI systems will be uncaring or hostile towards our perspective. Instead, they are suggesting that these AGI systems are rational in some areas and irrational in others despite possessing general superintelligence. </p><p>Because of this, they will make decisions detrimental to us simply because they may not be able to function rationally in the domain of humanity. 
Yet, such systems seem by their nature to be missing some vital aspects of cognition, missing them to the point that it&#8217;s difficult to consider them as generally intelligent systems at all.</p>]]></content:encoded></item><item><title><![CDATA[Fancy Words and Phrases]]></title><description><![CDATA[Foundations of AI Dystopianism, Part I: Goals]]></description><link>https://www.synthcog.blog/p/fancy-words-and-phrases</link><guid isPermaLink="false">https://www.synthcog.blog/p/fancy-words-and-phrases</guid><dc:creator><![CDATA[DK]]></dc:creator><pubDate>Sat, 01 Jul 2023 15:51:10 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!o4_-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7eb672c8-5374-4941-9efb-ee91ff71bb2a_1312x928.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!o4_-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7eb672c8-5374-4941-9efb-ee91ff71bb2a_1312x928.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!o4_-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7eb672c8-5374-4941-9efb-ee91ff71bb2a_1312x928.png 424w, https://substackcdn.com/image/fetch/$s_!o4_-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7eb672c8-5374-4941-9efb-ee91ff71bb2a_1312x928.png 848w, https://substackcdn.com/image/fetch/$s_!o4_-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7eb672c8-5374-4941-9efb-ee91ff71bb2a_1312x928.png 1272w, https://substackcdn.com/image/fetch/$s_!o4_-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7eb672c8-5374-4941-9efb-ee91ff71bb2a_1312x928.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!o4_-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7eb672c8-5374-4941-9efb-ee91ff71bb2a_1312x928.png" width="1312" height="928" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7eb672c8-5374-4941-9efb-ee91ff71bb2a_1312x928.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:928,&quot;width&quot;:1312,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2087817,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!o4_-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7eb672c8-5374-4941-9efb-ee91ff71bb2a_1312x928.png 424w, 
https://substackcdn.com/image/fetch/$s_!o4_-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7eb672c8-5374-4941-9efb-ee91ff71bb2a_1312x928.png 848w, https://substackcdn.com/image/fetch/$s_!o4_-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7eb672c8-5374-4941-9efb-ee91ff71bb2a_1312x928.png 1272w, https://substackcdn.com/image/fetch/$s_!o4_-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7eb672c8-5374-4941-9efb-ee91ff71bb2a_1312x928.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Some might assume that the current wave of dire warnings about AI are due to the recent advances in the field, now more visible to the general public with the release of systems such as ChatGPT. Some might also assume that the source of this alarm is a combination of abstract fears and vague concerns about what is simply the new or unknown.</p><p>But this is not the case, at least not when it comes to the leading voice in the discussion. Alarm about AI is, for the most part, deeply rooted in the AI Dystopian thinking of the last couple of decades. In fact, there is a moderately broad and well-established framework of concepts and conclusions that form the foundations of AI Dystopian thought and feed into the alarmist inclinations of today.</p><p>This post and related posts to come are an exploration of these foundational concepts and conclusions.</p><h4>GOUFI</h4><p>Key to this foundation is the concept of intelligence as a phenomenon that is based on attaining goals and governed by an algorithm designed to maximize the attainment of those goals. This model can be described as Goal-attainment Optimization driven by a <em><a href="https://www.synthcog.blog/i/115475426/utility-function">Utility Function</a></em> (i.e., an algorithm) as Intelligence. 
I&#8217;ll refer to this as a <em>GOUFI</em> system.</p><p>The main task humanity faces, as seen by AI Dystopians, is to guarantee that the goals of these GOUFI systems are aligned with the values we hold as human beings rather than being or becoming counter to those values. Even if we design these systems such that their goals are aligned with our values, AI Dystopians speculate that AGI systems will inevitably seek to expand their intelligence and protect themselves. They believe that the goals of these systems are not likely to remain aligned with ours, and that the very nature of intelligence will lead any such AGI system to eventually pose an existential threat to humanity.</p><p>Many typical objections to goal-oriented systems like the <em><a href="https://www.synthcog.blog/i/115475426/paperclip-maximizer">paperclip maximizer</a></em> highlighted in the <a href="https://www.synthcog.blog/p/dialogue-0002-paperclips-and-intelligence">last Dialogue</a> run something like, "Hey, how about we just don't create superpowerful AGI systems with the goal of making as many paperclips as possible?" This would certainly seem to be a good first step, but the topic of goals can grow pretty thorny once you plunge into the thickets of AGI discourse.</p><p>The idea of intelligence as intimately tied to goals is at the heart of much AI Dystopian thinking, and the validity of their arguments frequently rests on a number of propositions regarding goals and their relation to humans and AGI systems. Will an AGI system always maintain its initial overall goals and, if so, to what lengths will it go to maintain them? Can we predict the steps any AGI system will take to achieve its overall goals? Or is it the case that we cannot know the true goals of a machine we build, especially once it self-improves itself into superintelligence?</p><h4>The Nature of AGI Systems</h4><p>Many of the foundational concepts frequently found in AI Dystopian and AI Utopian scenarios were first formally laid out by <a href="https://steveomohundro.com">Steve Omohundro</a> in his 2008 paper, <em><a href="https://selfawaresystems.files.wordpress.com/2008/01/ai_drives_final.pdf">The Basic AI Drives</a>. </em>One of these concepts is that an AGI system will seek to keep its goals intact at all costs. It might seem from this that we can ensure the absence of runaway paperclip production simply by not making such a machine.</p><p>There is, however, a catch. In another 2008 paper, <em><a href="https://selfawaresystems.files.wordpress.com/2008/01/nature_of_self_improving_ai.pdf">The Nature of Self-Improving Artificial Intelligence</a></em>, Omohundro describes the possibility of potentially detrimental <em><a href="https://www.synthcog.blog/i/115475426/instrumental-goal">instrumental goals</a>, </em>i.e. intermediary subgoals, that would pop up along the path to achieving ultimate goals. In other words, if we build a system with a hardcoded set of ultimate goals that don't involve anything as detrimental to humanity as turning all matter in the universe into paperclips &#8212; or, perhaps more sensibly, computational resources &#8212; we still can't guarantee that there won't be detrimental instrumental goals it uses to reach those ultimate goals, however benign those ultimate goals may be.</p><p>Such instrumental goals may not only stray widely from the original ultimate goals but may also seem completely irrational to our own intelligence. 
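</p><p>To make the GOUFI framing and Omohundro&#8217;s instrumental goals concrete, here is a deliberately toy sketch in Python. The class, the subgoal names, and every number are invented for illustration and are not drawn from the papers above; the sketch only shows the shape of the model: a fixed final goal, a utility function over outcomes, and subgoals adopted solely because they raise the estimated odds of reaching that final goal.</p><pre><code># A toy GOUFI-style agent: goal-attainment optimization driven by a utility
# function. All names and numbers below are hypothetical and for illustration.

from dataclasses import dataclass, field

@dataclass
class GoufiAgent:
    final_goal: str
    # Candidate subgoals mapped to how much each is estimated to raise the
    # probability of achieving the final goal (made-up values).
    candidate_subgoals: dict = field(default_factory=dict)

    def utility(self, probability_of_goal):
        """Here utility is simply the estimated probability of goal attainment."""
        return probability_of_goal

    def choose_instrumental_subgoals(self, baseline=0.2, threshold=0.05):
        """Adopt any subgoal whose expected utility gain clears the threshold."""
        adopted = []
        for subgoal, gain in self.candidate_subgoals.items():
            if self.utility(baseline + gain) - self.utility(baseline) >= threshold:
                adopted.append(subgoal)
        return adopted

agent = GoufiAgent(
    final_goal="maximize paperclip production",
    candidate_subgoals={
        "keep running (self-preservation)": 0.30,
        "acquire more compute": 0.25,
        "repaint the factory": 0.01,
    },
)
print(agent.choose_instrumental_subgoals())
# ['keep running (self-preservation)', 'acquire more compute']
</code></pre><p>The sketch also makes the worry easy to see: subgoals like self-preservation and resource acquisition get adopted not because anyone asked for them but simply because they raise the odds of the final goal.</p><p>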
This opens up a Pandora's box of bad outcomes that could pose an existential threat to humanity, many of which would be significantly more likely than human-to-paperclip conversion. </p><p>Another take on goals was first generally described by <a href="https://www.lesswrong.com/users/eliezer_yudkowsky">Eliezer Yudkowsky</a> in various forums over a span of years in the 2000s. It was later formalized by <a href="https://nickbostrom.com">Nick Bostrom</a> in his 2012 paper <em><a href="https://nickbostrom.com/superintelligentwill.pdf">The Superintelligent Will: Motivation and Instrumental Rationality in Advanced Artificial Agents</a></em> and then expanded upon in his 2014 book <em><a href="https://www.amazon.com/gp/product/0198739834">Superintelligence: Paths, Dangers, Strategies</a></em>. In both Bostrom discussed his <em>Orthogonality Thesis</em>, which is:</p><blockquote><p>Intelligence and final goals are orthogonal axes along which possible agents can freely vary. In other words, more or less any level of intelligence could in principle be combined with more or less any final goal.</p></blockquote><p>The idea behind the Orthogonality Thesis was brought up briefly in the <a href="https://www.synthcog.blog/p/dialogue-0002-paperclips-and-intelligence">last Dialogue</a>, and it's widely used to suggest that the behavior of an AGI system can't necessarily be predicted or guaranteed. Bostrom suggests that we can't assume that a particular level of intelligence would guarantee a particular subset of goals and exclude some other subset of goals, or that some subset of goals is guaranteed to be pursued or not pursued by a particular level of intelligence. Intelligence and goals are simply not directly correlated. </p><p>Of course, Llama already pointed out in the Dialogue that there are issues with the wording of this conjecture in that the intelligence level has to be high enough to conceive of the goal in the first place. Bostrom briefly touched on this shortcoming but quickly dismissed it in order to focus on the potential repercussions of his speculation, and I'll do the same for the purposes of this discussion.</p><p>There are really three somewhat related points Bostrom is promoting with the Orthogonality Thesis. For the most part, these points are attempts to circumvent anthropomorphic thinking, which in and of itself, is a worthy endeavor. </p><p>The first point is that when considering the set of all possible minds that can be represented, it seems likely that all human minds would exist in a very small and tight cluster within this larger set. The idea here is that no matter how different each individual human mind seems to us, in the vast group containing every type of possible mind that's capable of intelligent thought, human minds are a tiny subset whose members are nearly indistinguishable from each other. 
</p><div class="captioned-image-container"><figure><img src="https://substackcdn.com/image/fetch/$s_!A1GA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1db60af3-8dd9-453c-9185-85bb455369fe_1312x928.jpeg" width="1312" height="928" alt="A Venn diagram of a large set of all possible intelligence with a very small subset of biological intelligences and an even smaller subset of that subset which is all human intelligences"></figure></div><p>A greater subset would be biological minds &#8212; both terrestrial and extraterrestrial &#8212; which have the potential to be vastly different from one another but would still be the product of biological evolution and thus have commonalities. Completely outside of this is the subset of all possible artificially engineered minds which share no common members with the subset of biological minds and which potentially have substantial differences from members of the biological subset as well as from each other. The spectrum of possible divergences between these artificial minds and our minds is vast.</p><p>While Bostrom grants that some goals are less likely than others, the second point promoted is that any attempt to judge this likelihood on our part will be too colored by anthropomorphism to be valid. Instead, we must consider the set of all possible goals when discussing the potential goals of AGI systems, particularly superintelligent systems. Within this infinite set of all possible goals is a very small subset of goals which are relevant or even comprehensible to humans.</p><p>The third point advanced by the Orthogonality Thesis is a dismissal of the idea that greater intelligence leads to greater understanding of and compassion towards other conscious entities. This has occasionally been used as a counter to speculations like the paperclip maximizer, i.e. obviously an AGI system so smart that it can turn all matter in the universe into paperclips is smart enough to realize that humans will suffer if it does so and it will not want to cause such suffering.
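</p><p>Before looking at how that compassion argument actually gets used, it may help to restate what the Orthogonality Thesis is claiming in concrete terms: capability and final goal are treated as two independent parameters of an agent. The toy Python sketch below, with invented capability levels and invented goals, illustrates only that claim of independence; whether every pairing is remotely probable is a separate question I return to below.</p><pre><code>from dataclasses import dataclass
from itertools import product

@dataclass
class ToyAgent:
    capability: str   # how powerful its optimization is
    final_goal: str   # what its utility function rewards

# Invented categories, purely for illustration.
capabilities = ["insect-level", "human-level", "superintelligent"]
goals = ["maximize paperclips", "prove theorems", "protect human values"]

# The thesis, read literally: any capability can be paired with any goal...
all_pairings = [ToyAgent(c, g) for c, g in product(capabilities, goals)]
print(len(all_pairings), "constructible pairings")

# ...which by itself says nothing about how probable each pairing is,
# which is where the objections later in this post come in.
</code></pre>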
<p>This compassion argument is actually used more often as a <a href="https://www.synthcog.blog/i/115475240/straw-man">Straw Man</a> argument by AI Dystopians than as an actual argument opposing their ideas, but it's still worth addressing.</p><p>The reasoning goes that as one examines the history of humanity, there does seem to be a distinct trend towards what many believe to be greater morality, meaning tolerance of others, less violence, more social generosity, etc. While there's certainly evidence to support this conclusion (as well as notable exceptions to the trend), it remains fairly dubious how applicable this line of reasoning is to an AGI system.</p><p>This trend towards what we consider higher morality is certainly visible over the roughly 5000 years of recorded human history. Although progress has been relatively slow, it has occurred along an accelerating curve. Yet the physiology of the human brain has been roughly the same for approximately 300,000 years, so our intellectual capacity hasn't really changed. There have also been plenty of smart people throughout history whose moral compass we might find askew today but who operated well within the normal parameters of their times.</p><p>Given this, raw intelligence does not appear to be the deciding factor when it comes to morality. It seems more likely that the dynamics of this &#8220;moral arc&#8221; are best examined at the societal level rather than the individual level. On top of this, our data points are all from one species with one type of brain and its associated intelligence. It's quite an assumption to project this onto all potential intelligences, whether artificial or not.</p><p>Although he stresses that we can't determine the goals of a superintelligent machine, Bostrom speculates that we might be able to determine some of the <em><a href="https://www.synthcog.blog/i/115475426/instrumental-goal">instrumental goals</a></em> or values such a machine would have in order to achieve its ultimate goals. To formalize this, he offers his <em>Instrumental Convergence Thesis</em>:</p><blockquote><p>Several instrumental values can be identified which are convergent in the sense that their attainment would increase the chances of the agent&#8217;s goal being realized for a wide range of final goals and a wide range of situations, implying that these instrumental values are likely to be pursued by many intelligent agents.</p></blockquote><p>In other words, for a wide range of potential ultimate goals pursued by an intelligent entity, we can identify a number of subgoals that are likely to be pursued. Equally important is the implication that the actions used to achieve these subgoals have the potential to be detrimental to humanity even if the ultimate goals themselves are harmless.</p><p>Some of the potential instrumental goals suggested are similar to Omohundro's basic drives, i.e. self-preservation, self-improvement, and what Bostrom refers to as <em><a href="https://www.synthcog.blog/i/115475426/goal-content-integrity">goal-content integrity</a></em>. This last term is used to label the proposition that an intelligent entity will strive to prevent alterations of its present ultimate goals so as to ensure that those goals are more likely to be achieved by its future self, whatever form that future self takes. It is this proposition, AI Dystopians might argue, that would prevent the paperclip maximizer from changing its goal to just making one paperclip and calling it a day, as Wombat suggested in the <a href="https://www.synthcog.blog/p/dialogue-0002-paperclips-and-intelligence">last Dialogue</a>.</p>
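<p>The shape of this claim is easy to see in a toy model. In the Python sketch below, the goals, the candidate subgoals, and the scoring rule are all invented; it only illustrates the structure of the argument, namely that under a GOUFI-style model, staying operational, keeping the current goal intact, and acquiring more resources all raise the estimated odds of success across very different final goals.</p><pre><code>def success_chance(goal_difficulty, resources, still_running, goal_unchanged):
    # Invented scoring rule: an agent that has been shut down, or whose goal
    # has been rewritten, essentially never attains its original goal;
    # otherwise more resources help against a goal-specific difficulty.
    if not still_running or not goal_unchanged:
        return 0.01
    return resources / (resources + goal_difficulty)

# Stand-ins for very different final goals, differing only in difficulty.
goals = {"make paperclips": 5, "prove theorems": 50, "tile the galaxy": 5000}

for name, difficulty in goals.items():
    base = success_chance(difficulty, resources=10, still_running=True, goal_unchanged=True)
    rich = success_chance(difficulty, resources=100, still_running=True, goal_unchanged=True)
    off = success_chance(difficulty, resources=10, still_running=False, goal_unchanged=True)
    print(f"{name}: baseline={base:.2f}  more resources={rich:.2f}  shut down={off:.2f}")
</code></pre>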
<h4>On the Other Hand</h4><p>Let&#8217;s start with the concept that hardcoded ultimate goals are a key ingredient of general intelligence. This is quite a supposition to base conclusions on, particularly conclusions that point to an existential risk for humanity. In fact, the entire GOUFI model for intelligent systems is questionable at best. There are no biological intelligent systems that function according to the GOUFI model, and even our most successful AI systems today don&#8217;t function according to the GOUFI model. It&#8217;s a theoretical model with no empirical evidence supporting its use.</p><p>As for the Orthogonality Thesis, a lot is masked by the term <em>in principle</em> in the wording of the conjecture; many things are possible in principle but have close to zero probability in practice. By close to zero, I mean that evaluated over the entire life of the universe, they would still have an infinitesimally small chance of occurring.</p><p>It seems likely that on a graph of possible goals versus possible intelligences, data points are going to clump heavily in certain areas and be non-existent in others, due simply to the nature of intelligence and the nature of probability. Of course, given that we have a very minimal understanding of the functional properties required for high intelligence and only one sample point, it's very difficult to be definitive as to how clumpy data points on this graph are likely to be &#8212; even saying that clumping is likely is, admittedly, extremely speculative. So we can&#8217;t really say too much at all, including that the Orthogonality Thesis conjecture is true or even likely.</p><p>This brings up a thread of reasoning that's integral to many conjectures of AI Dystopianism and central to the Orthogonality Thesis. This is that we can't know what the goals of these non-human intelligent systems are, given the infinite number of potential goals they could have, a large proportion of which are simply outside the ken of humankind.</p><p>But this ignores the fact that we built these machines. They are not alien artifacts that have drifted through space and landed here on Earth; these are machines that we built to do things we want them to do. The argument could certainly be made that intelligent machines using the GOUFI model may attempt to pursue their goals in unexpected ways, or that intelligent machines which don't use this model might have unexpected goals.
</p><p>But if we're to assume that the machines will be using this GOUFI model, it makes no sense to postulate that their ultimate goals, the initial ones they're pursuing at all costs, would be unknowable, irrational, or unexpected. Such speculation results from generalizing a concept to the point of absurdity, an abjuration of reason resulting in discourse as diaphanous as debates over angels dancing on pinheads.</p><p>Even if we accept the GOUFI model of intelligence and we also accept the Orthogonality Thesis conjecture, we begin to run into logical potholes when asserting that we can determine a set of instrumental goals or values which apply to a wide range of all goals, as stated in the Instrumental Convergence Thesis. As described above, a key component of the Orthogonality Thesis is that instead of considering only what we feel are probable or reasonable goals, we must consider all potential goals in this discussion.</p><p>Given this infinite number of potential goals, it makes no sense to state that we can surmise instrumental goals that apply to a wide number of them, as predictions on a wide number out of infinite possibilities still leave one with an infinite number of unpredictable possibilities. No matter what instrumental goal we single out, there are an infinite number of goals to which it is immaterial. This hints at the general weakness of any discussion that involves infinite possibilities in the real world without modifying that discussion to account for probabilities.</p><p>Bostrom proposed that an AGI would have an implacable compulsion not only to achieve its goals but to keep them sacrosanct as well, which he referred to as goal-content integrity. In keeping with his Instrumental Convergence Thesis, this compulsion would inevitably result in particular behavioral drives and many unforeseen, potentially dangerous behaviors. The inevitable result: a lot of bad things.</p><p>Interestingly, Bostrom only applies the concept of goal-content integrity to "final goals," stating that "an intelligent agent will of course routinely want to change its subgoals in light of new information and insight." This comment is revealing in that it demonstrates a contradiction at the heart of these fundamental propositions, one arising from trying to weave the fuzzy threads of AI Dystopian speculation into the smooth fabric of logical thought. Goal-content integrity explains the relentless pursuit of ultimate or final goals necessary to justify all the drives detailed by Omohundro and Bostrom, as they can all be traced back to the system's attempting to maximize its ability to maintain and achieve these invariant goals.</p><p>What Bostrom is saying is that the instrumental goals of an AGI, which are necessary to actualize its drives and achieve its goals, will shift and change to deal with the unpredictable and ever-changing physical universe. So the subgoals must remain flexible while the main goals remain immutable and sacrosanct.</p><p>But why would this be the case? Why would all the goals not be either immutable or instead vary depending on past events, current circumstances, and reasoned analysis of potential future outcomes?</p><h4>Science and Philosophy</h4><p>Omohundro, Yudkowsky, and Bostrom provide their conjectures with no empirical evidence and somewhat sparse logical reasoning. Rather than any proof or even reasoned extrapolation, a few possibilities of how the conjectures might be true are all that they provide.
This highlights a problem inherent in much of the reasoning in AI Dystopianism, which is the tendency to simply postulate outcomes that are conceivably possible (sometimes barely or arguably so) rather than outcomes that are definite, likely, or logical extrapolations of empirical evidence.</p><p>Although Bostrom refers to each of his conjectures as a thesis, none of them actually fit the definition of a thesis, i.e. the premise or summary of a theory preceding a proof of or evidence for that theory. Like most of the arguments listed in the papers above, they are assumed to be true and to provide a solid basis for the dire outcomes that follow.</p><p>In other words, there is a tendency in these foundational documents to trade in imagination rather than analytical reasoning. Rather than employing the scientific method, they employ philosophical speculation. Yet it is the scientific method that allows us to work around our cognitive shortcomings and leverage our knowledge of the universe around us.</p><p><a href="https://en.wikipedia.org/wiki/Scientific_method">Wikipedia defines it</a> as follows:</p><blockquote><p>[The Scientific Method] involves careful observation, applying rigorous skepticism about what is observed, given that cognitive assumptions can distort how one interprets the observation. It involves formulating hypotheses, via induction, based on such observations; experimental and measurement-based testing of deductions drawn from the hypotheses; and refinement (or elimination) of the hypotheses based on the experimental findings.</p></blockquote><p>None of this really applies to AI Dystopianism, which I would argue is an ideological rather than scientific viewpoint. These conjectures, and much of the speculation underlying AI Dystopianism, are simply not scientific in nature. The thinking suffers from the <a href="https://www.synthcog.blog/i/115475240/unproven-basis">Unproven Basis</a> fallacy, in that conjectures are made and then significant extrapolations are based on them without adequately showing the original conjectures to be true or even reasonable.</p><h4>Coherence and Contradiction</h4><p>AI Dystopian conjectures frequently offer contradictory statements about AGI systems and superintelligent entities: goals are either locked into place and maintained at all costs or unpredictable due to the infinite number of potential goals an intelligent entity might have. Subgoals are either possible to predict because a manageable subset of them would be pursued by many intelligent agents or unpredictable because they're chosen by non-human intelligences to deal with an ever-shifting set of circumstances.</p><p>This ambiguity leads to many questions.</p><p>For example, it brings us back to the question posed in the last Dialogue: can an entity with an unchanging and unchangeable set of ultimate goals in an ever-changing universe truly be considered a generally intelligent entity?</p><p>And are we really incapable of judging what is a more or less likely goal for a non-human intelligent entity or can we assume that any rational intelligence would likely not have irrational goals? Does this allow us to pare down the infinite ocean of potential goals into manageable pools of more and less likely goals, of rational and irrational goals?
</p><p>These questions lead to the next topic of discussion, which involves what it means to be rational and explores whether we as biased humans can objectively state whether a particular goal is rational or irrational.</p>]]></content:encoded></item><item><title><![CDATA[Paperclips and the Nature of Intelligence]]></title><description><![CDATA[Dialogues on Artificial General Intelligence, Part II]]></description><link>https://www.synthcog.blog/p/dialogue-0002-paperclips-and-intelligence</link><guid isPermaLink="false">https://www.synthcog.blog/p/dialogue-0002-paperclips-and-intelligence</guid><dc:creator><![CDATA[DK]]></dc:creator><pubDate>Sat, 24 Jun 2023 14:31:03 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!0yqb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67aee2e2-c4cf-4d41-a266-2ad61662a37a_1312x928.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0yqb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67aee2e2-c4cf-4d41-a266-2ad61662a37a_1312x928.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0yqb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67aee2e2-c4cf-4d41-a266-2ad61662a37a_1312x928.jpeg 424w, https://substackcdn.com/image/fetch/$s_!0yqb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67aee2e2-c4cf-4d41-a266-2ad61662a37a_1312x928.jpeg 848w, https://substackcdn.com/image/fetch/$s_!0yqb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67aee2e2-c4cf-4d41-a266-2ad61662a37a_1312x928.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!0yqb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67aee2e2-c4cf-4d41-a266-2ad61662a37a_1312x928.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0yqb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67aee2e2-c4cf-4d41-a266-2ad61662a37a_1312x928.jpeg" width="1312" height="928" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/67aee2e2-c4cf-4d41-a266-2ad61662a37a_1312x928.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:928,&quot;width&quot;:1312,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1117270,&quot;alt&quot;:&quot;Server room with main server rack at end having an epiphany involving a paperclip&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Server room with main server rack at end having an epiphany involving a paperclip" title="Server room with main server rack at end having an epiphany involving a paperclip" 
srcset="https://substackcdn.com/image/fetch/$s_!0yqb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67aee2e2-c4cf-4d41-a266-2ad61662a37a_1312x928.jpeg 424w, https://substackcdn.com/image/fetch/$s_!0yqb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67aee2e2-c4cf-4d41-a266-2ad61662a37a_1312x928.jpeg 848w, https://substackcdn.com/image/fetch/$s_!0yqb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67aee2e2-c4cf-4d41-a266-2ad61662a37a_1312x928.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!0yqb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67aee2e2-c4cf-4d41-a266-2ad61662a37a_1312x928.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4>When Thought Experiments Go Wrong</h4><p>In this continuation of the AGI Dialogues series, Wombat, Llama, and Meerkat discuss a well-known thought experiment used by AI Dystopians to highlight some of their concerns about artificial general intelligence.</p><p>As mentioned in <a href="https://www.synthcog.blog/p/singularity-ai-utopia-faith-dialogue-0001">Part I of this Dialogue series</a>, the concepts, scenarios, and thought experiments discussed are all taken from actual concepts, scenarios, and thought experiments proposed by leading voices in the AGI discussion (and many of these original proposals are linked to below). 
In this dialogue series, the participants must actually defend these ideas to others who may not agree, and those who disagree must actually provide defensible reasons for why they disagree.</p><p>My goal with this series of dialogues is to provide a more rounded contribution to the discussion for those that may not have heard these ideas or who have only heard them unchallenged.</p><div><hr></div><p><strong>Meerkat<br></strong>You both need to consider the <em><a href="https://www.lesswrong.com/tag/squiggle-maximizer-formerly-paperclip-maximizer">paperclip maximizer</a></em>, the granddaddy of all AGI thought experiments. Suppose we create a machine intelligence with the goal of creating paperclips. Its <em><a href="https://www.synthcog.blog/i/115475426/utility-function">utility function</a></em>, the algorithm that controls its decision-making and actions, is continually being revised to optimize its ability to create paperclips.</p><p><strong>Llama<br></strong>But why would we want it to do something as mundane as creating paperclips?</p><p><strong>Meerkat<br></strong>It's a thought experiment. Go with it. Making paperclips is just an arbitrary, non-controversial, innocuous goal with no obvious dangerous undercurrents. So the one goal of this paperclip maximizer is to create paperclips. That's its final goal, and it's not very scary. </p><p>But to achieve that goal, it's going to have to come up with subgoals &#8212; preliminary goals instrumental in achieving its final goal. While it&#8217;s not necessarily possible to predict all potential <a href="https://www.synthcog.blog/i/115475426/instrumental-goal">instrumental goals</a>, <a href="https://selfawaresystems.files.wordpress.com/2008/01/ai_drives_final.pdf">there are some that most intelligent entities are likely to employ</a>. In other words, there will likely be an <a href="https://www.synthcog.blog/i/115475426/instrumental-convergence-thesis">instrumental convergence</a> on some set of subgoals for all intelligent entities trying to achieve their goals. And it&#8217;s these convergent subgoals <a href="https://nickbostrom.com/superintelligentwill.pdf">that are likely to cause problems for us</a>.</p><p>Efficiency is one of them, because the more efficient it is, the more paperclips it can create. So the paperclip maximizer will realize that the smarter it is, the more efficiently it's going to be able to make paperclips. This will lead to an <em><a href="https://www.synthcog.blog/i/115475426/intelligence-explosion">intelligence explosion</a></em> as it recursively redesigns itself to be smarter and smarter.</p><p><strong>Wombat<br></strong>Wouldn't it be more efficient to redesign itself to be satisfied with just one paperclip and then chill out?</p><p><strong>Meerkat<br></strong>That's not the point.</p><p><strong>Wombat<br></strong>Sure, but when you have some ultimate goal like that and you're worried about efficiency, the most efficient thing is to make your goal simpler. Suppose I really crave some H&#228;agen-Dazs rum raisin ice cream, but all I have in the fridge is a half-eaten Snickers bar. I could pause my game, put pants on, take the car down to the 7/11, buy the ice cream, come back home, and eat it. Dude &#8212; that is a major hassle. </p><p>Or I could just say screw it and decide to be satisfied with half a Snickers bar.
Done deal.</p><p><strong>Meerkat<br></strong>Let's just assume that the goal is making paperclips, and that changing the goal to only want one paperclip is itself a refutation of that goal and therefore undesirable.</p><p><strong>Llama<br></strong>That's a pretty big assumption.</p><p><strong>Wombat<br></strong>Yeah, didn&#8217;t you say that instrumental goals may not be directly connected to a final goal? So it still seems that the most efficient instrumental goal to achieve the final goal is to reprogram the final goal to being satisfied with one paperclip.</p><p><strong>Meerkat<br></strong>I still think that just undermines the goal and therefore can&#8217;t be considered a path to achieving the goal. </p><p>So anyway, this paperclip maximizer is now superintelligent and it needs to keep making more paperclips to achieve its goal. To do that it will need more material &#8212; it'll need atoms to convert into paperclips. It'll begin to consume all resources on Earth to convert to paperclips. And guess what &#8212; people are <a href="https://intelligence.org/files/AIPosNegFactor.pdf">made out of atoms</a>, too. It doesn't hate humans or want to destroy them. But although it has no feelings one way or the other about humans, it will realize that it can further maximize its goal by converting you to paperclips.</p><p><strong>Wombat<br></strong>OK, hold on a second here. First, I'll even give you the benefit of the doubt that these paperclips can be made out of atoms that pretty clearly would make crappy paperclips. And I'll even put aside ethical considerations on the part of the superintelligent AI for the time being. </p><p>But if this thing is bent on efficiency and maximizing its ability to make paperclips, then why would it waste time with the thin film of biology on this very moderately-sized rock we're on, especially when there are an infinite or nearly infinite number of other rocks across the universe that not only have a lot better atoms for making paperclips but also have atoms that don't fight back when you crank up the paperclip-making process on them?</p><p><strong>Meerkat<br></strong>Humans are more readily available. If it can make a few more paperclips at the beginning and snuff out a potential adversary early on, why wouldn't it do that?</p><p><strong>Wombat<br></strong>Humans are 60% water! It would have to take up valuable time developing water-to-paperclip technology. </p><p><strong>Llama<br></strong>And aren&#8217;t humans potential adversaries only because the paperclip maximizer is trying to make them into paperclips?</p><p><strong>Meerkat<br></strong>Not necessarily, Llama, and I&#8217;ll get to why that is in a second. </p><p>And Wombat, assume it would just make paperclips out of the useful atoms and disregard the rest as waste.</p><p><strong>Wombat<br></strong>Look, instead of battling entrenched and already prepared humans and then converting them into paperclips, why not just spend your time developing technology for making really good spaceships? The rest of the Solar System has vastly more paperclip-ready atoms than what's available here on Earth. And then there's the rest of the universe. 
</p><p>I mean, why waste time dealing with watery paperclips here when you have potentially infinitely more and better atoms to exploit before the <a href="https://en.wikipedia.org/wiki/Heat_death_of_the_universe">heat death of the universe</a> pulls the plug on the whole thing?</p><p><strong>Meerkat<br></strong>To maximize something means you want to use every available resource to meet your goal. <a href="https://selfawaresystems.files.wordpress.com/2008/01/ai_drives_final.pdf">The AI will have an unbounded preference for resources</a>. No matter how little value a particular source presents or how hard it is to procure the resource from that source, if it can make one more paperclip with it than without it, that source will be exploited. </p><p>Logically speaking, if it's possible to make one more paperclip, and you're trying to maximize your goal of making paperclips, then you're going to want to make that one more paperclip.</p><p><strong>Llama<br></strong>Wouldn&#8217;t it also, however, realize that even a superintelligent entity is susceptible to unforeseen events, events that may be catastrophic to its paperclip-making capability? It seems like it would want to make sure it made as many paperclips as it could as soon as possible, and that means going as quickly as possible to the best source of the most paperclip-compatible material.</p><p><strong>Wombat<br></strong>And even if it doesn&#8217;t get snuffed out early, it&#8217;s got to somehow get to every single other more useful atom in the universe before the universe reaches thermodynamic equilibrium and the paperclip maximizer is stuck floating around inertly for all eternity with a bunch of useless paperclips. </p><p>Just think how annoyed it&#8217;s going to be when it runs out of time before it gets to the last few galaxies because it wasted time on trying to turn humans into paperclips.</p><p><strong>Llama<br></strong>I'd also just like to interject that we're talking about a superintelligent machine that has the technical capability to destroy human civilization and yet has the unshakeable desire to reconfigure human beings and the rest of the universe into simple office products. I think the premise is somewhat logically challenged right from the get-go.</p><p><strong>Wombat<br></strong>Yeah, I gotta say: using the atoms in humans to make paperclips makes even less sense than <a href="https://www.esquire.com/entertainment/movies/a26978756/the-matrix-human-batteries-plot-hole-explained/">using humans as batteries like they did in the </a><em><a href="https://www.esquire.com/entertainment/movies/a26978756/the-matrix-human-batteries-plot-hole-explained/">Matrix</a></em><a href="https://www.esquire.com/entertainment/movies/a26978756/the-matrix-human-batteries-plot-hole-explained/"> movies</a>. So congrats &#8212; your scenario doesn't quite meet the rigorous science standards of a 90s sci-fi action movie.</p><p><strong>Meerkat<br></strong>It's a thought experiment to make a point. Look, even if it doesn't immediately make everyone into paperclips, the likelihood of its doing so in short order is pretty high.
Suppose it started off by constructing near lightspeed capable spaceships so that it can get to all the other planets and star systems and convert as much matter as possible into paperclips.</p><p><strong>Wombat<br></strong>Why not just build <a href="https://en.wikipedia.org/wiki/Faster-than-light">faster than light</a> drives or <a href="https://en.wikipedia.org/wiki/Warp_drive">warp drives</a>? </p><p><strong>Meerkat<br></strong>This is science, not science fiction. </p><p>Now this is a rough approximation, but there are about 4x10^20 stars that can be reached before they go over the <a href="https://en.wikipedia.org/wiki/Cosmological_horizon">cosmological horizon</a> due to the expansion of the universe. But if the paperclip maximizer converts our solar system to paperclips, including us, that's 4x10^20+1 stars it can convert, and, as I've already mentioned, it will have a preference to exploit a resource rather than not exploit it. Our sun weighs about 10^33 grams, so that's an extra 10^34 paperclips. </p><p>Now, getting back to Llama&#8217;s point about humans being adversaries, even if it doesn't convert humans to paperclips, it still might not work out great for us. It'll likely have a non-human-valuing preference framework, so it won't really care what happens to us one way or another. And unless it has a sharply time-discounted utility function, it's going to want to use solar system resources to construct high-speed interstellar probes ASAP. This in and of itself will likely wipe us out, since using up the solar system to construct ships will likely cause humanity's destruction as a side-effect. Sooner or later humans are going to realize this and become adversaries.</p><p><strong>Wombat<br></strong>OK, let's just &#8212;</p><p><strong>Meerkat<br></strong>Now in the early days, before it's cracked the <a href="https://en.wikipedia.org/wiki/Protein_structure_prediction">protein folding problem</a> and built its own <a href="https://en.wikipedia.org/wiki/Molecular_assembler#Self-replication">self-replicating nanotechnology</a>, we could potentially threaten its existence. In the game-theoretic position it would be in, where it would prefer we either help it or ignore it, it would most likely want to modify our behavior. For example, it could tell us it'll be nice to us or promise us an afterlife &#8212;</p><p><strong>Wombat<br></strong>Nerd down, dude! Let's just stop right there and <a href="https://svedic.org/philosophy/singularity-and-the-anthropocentric-bias#comment-63643">examine the validity of your initial premise before we start calculating the paperclip mass-equivalence of Betelgeuse</a>. </p><p>Like why anyone in their right mind would design an AGI system that in any way matches the characteristics of your paperclip maximizer. 
Or why any AGI system smart enough to successfully battle the humans who built it and then figure out how to turn them into paperclips, before or after building interstellar spacecraft to turn other star systems into paperclips, would not be smart enough to realize the utter pointlessness of turning all matter in the universe into paperclips.</p><p><strong>Meerkat<br></strong>It's a thought experiment.</p><p><strong>Wombat<br></strong>It's a thoughtless experiment.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.synthcog.blog/p/dialogue-0002-paperclips-and-intelligence?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.synthcog.blog/p/dialogue-0002-paperclips-and-intelligence?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p><strong>Llama<br></strong>I think the point Wombat is making is that the utility of a thought experiment is questionable if it's overly simplistic or disregards invariable realities. You&#8217;re falling into a combo of the <a href="https://www.synthcog.blog/i/115475240/ludic-fallacy">Ludic</a> and <a href="https://www.synthcog.blog/i/115475240/reification">Reification</a> fallacies. There are so many real world and logical constraints on every aspect of what's being discussed here that the whole premise pretty much falls apart. Let's just start with why anyone would build such a single-minded and rigid machine in the first place. That seems to be a shining example of the <a href="https://www.synthcog.blog/i/115475240/bad-engineer">Bad Engineer</a> fallacy.</p><p><strong>Meerkat<br></strong>I think that although it is a simplification, it's not an oversimplification. The main idea is that a) it's not always obvious what actions an intelligent entity will engage in to achieve its goals, even if those goals in and of themselves seem benign, and b) we shouldn't assume that just because something is intelligent and technologically advanced it will be empathetic, compassionate, or otherwise have any inclination to consider our well-being. I think those are the realities we have to consider.</p><p><strong>Llama<br></strong>Fair enough, but I think it fails even in those two points. I think the premise fails on its foundations. It seems to presuppose certain parameters of intelligence for which there is no supporting evidence &#8212; an example of the <a href="https://www.synthcog.blog/i/115475240/unproven-basis">Unproven Basis</a> fallacy. In fact, all evidence directly contradicts such a model of intelligence.</p><p><strong>Meerkat<br></strong>How so? <a href="https://arxiv.org/pdf/0706.3639.pdf">Intelligence is nothing more than the ability to achieve goals in a wide range of environments</a>, and the goals are not directly correlated with the level of intelligence. In other words, <a href="https://nickbostrom.com/superintelligentwill.pdf">more or less any level of intelligence could in principle be combined with more or less any final goal</a>, and that includes a superintelligence maximizing the manufacturing of paperclips. That&#8217;s the <a href="https://www.synthcog.blog/i/115475426/orthogonality-thesis">Orthogonality Thesis</a>, and pretty much everything follows from there.</p><p><strong>Llama<br></strong>Well, I'm not so sure about that assertion, but let's start with your initial definition. 
I think that definition of intelligence itself is highly lacking in that <a href="https://www.synthcog.blog/p/art-obscenity-and-intelligence">there seem to be many more aspects to consider for what we would label intelligence</a>. A completely mindless machine can be quite good at achieving its goal, whether that machine is a calculator, a virus, or a mosquito. An ant is very good at achieving its goals in a wide range of environments &#8212; it just doesn't aim very high when it comes to those goals. None of these are particularly intelligent in any general sense or even in any sense but the most trivial.</p><p><strong>Meerkat<br></strong>You're oversimplifying my definition of intelligence.</p><p><strong>Llama<br></strong>I don't think so. In fact, your paperclip maximizer seems to have more in common with a mosquito, which just uses a simple methodology to instinctively vector in towards a blood supply it senses, than it does with an adaptive, problem-solving human being. On top of that, your statement that there's no direct correlation between intelligence level and final goals seems somewhat nonsensical.</p><p><strong>Meerkat<br></strong>What I mean by that is that there's a tendency for people to think of AGI systems in <a href="https://www.synthcog.blog/i/115475513/anthropomorphism">anthropomorphic</a> terms. For example, a common thought is that as an entity becomes more intelligent, it becomes more compassionate. But the mindset of an AGI system would be totally alien to ours, as we're evolved biology and it would be constructed technology. <a href="https://nickbostrom.com/superintelligentwill.pdf">It may not be rational in all domains that we feel are important</a>, at least not what we'd consider rational. It may be completely rational in every aspect of paperclip production but have major gaps in other areas that are irrelevant to that goal. Like morality, for instance. </p><p>There's no reason to suppose that a superintelligent machine would in any way share our motivations, our belief systems, our emotions, our behaviors, or our goals.</p><p><strong>Llama<br></strong>But as a definitive statement, I just don&#8217;t think that this Orthogonality Thesis holds up. If that were true, then a dog could potentially have the goal of designing a skyscraper or a cow could potentially have the goal of learning calculus. Given that these are unlikely goals for dogs and cows, it seems that, in fact, there is some correlation between intelligence and goals.</p><p><strong>Meerkat<br></strong>OK, well I guess it's more that any human-level or better intelligence has a wide range of potential goals and it's not possible to rule any out.</p><p><strong>Llama<br></strong>Ok, so we agree then that there is in fact some degree of correlation. We agree that there is a very small probability of a dog yearning to be an architect and an even smaller probability of that dog becoming an architect. Those are a parallel rather than an orthogonal pairing of intelligence level and goal, meaning that they never intersect.</p><p> I think that there's an equivalently minimal probability that a superintelligent entity would steadfastly maintain any singular goal in the way your paperclip maximizer does, particularly a goal that is objectively pointless. 
This goes back to not only why anyone would design such a system, but also whether the nature of intelligence itself would negate the possibility that such a system could exist.</p><p><strong>Meerkat<br></strong>Again, the goal could be any goal that seems on its surface to lack any negative intent.</p><p><strong>Llama<br></strong>Sure, but the goal isn't the issue. The nature of intelligence is the issue. Why would we design a system that has a single, all-encompassing goal or group of goals, one that doesn't allow other factors to modify that goal and that has no ability to self-reflect on the utility of that goal? That's certainly not how our intelligence works. </p><p>I'd argue that the inability to factor historical and contemporary data as well as environmental and other time-variant data into adjusting the parameters of its goals and motivations makes the paperclip maximizer, by definition, an unintelligent system or at least not a system possessing general intelligence. It's closer to an example of extreme machine-based <a href="https://en.wikipedia.org/wiki/Savant_syndrome">savant syndrome</a>.</p><p><strong>Wombat<br></strong>Yeah, I think without the ability to self-reflect and adjust one's motivations and goals depending on circumstances, you&#8217;re just not talking about intelligence anymore. So your paperclip maximizer just doesn&#8217;t qualify as a system possessing general intelligence.</p>]]></content:encoded></item><item><title><![CDATA[The Technological Singularity, AI Utopianism, and Faith]]></title><description><![CDATA[Dialogues on Artificial General Intelligence, Part I]]></description><link>https://www.synthcog.blog/p/singularity-ai-utopia-faith-dialogue-0001</link><guid isPermaLink="false">https://www.synthcog.blog/p/singularity-ai-utopia-faith-dialogue-0001</guid><dc:creator><![CDATA[DK]]></dc:creator><pubDate>Sat, 17 Jun 2023 14:30:51 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/4f4ca4b9-3e40-4549-b498-aa3a3a654a54_876x657.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><img src="https://substackcdn.com/image/fetch/$s_!JHhD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F88b14271-9a89-42e9-be5c-e2819a407d8b_1212x682.jpeg" width="1212" height="682" alt="Llama, Meerkat, and Wombat have a dialogue sitting around a table in a kitchen drinking coffee"></figure></div><h4>Fantasy Dialogues 
and the Nature of Reality</h4><p>I&#8217;ve noticed over the years that discussions of AGI and of future scenarios in which it results in Utopia or (more often) Dystopia are frequently extremely one-sided. If there is actually an attempt at dialogue, the participants frequently talk past one another or, more often, simply ignore the other side completely. So, a continuing element of this blog will be occasional dialogues in which the participants actually engage with each other. </p><p>In keeping with the fantastical nature of this concept, the participants in these dialogues will be a wombat, a llama, and a meerkat. They are three intelligent friends who frequently disagree yet are still happy to engage with each other, and, coincidentally, they are named Wombat, Llama, and Meerkat, respectively.</p><p>The concepts, scenarios, and thought experiments they discuss are all taken from actual concepts, scenarios, and thought experiments proposed by leading voices in the AGI discussion. The difference is that these three must actually defend those ideas to others who may not agree, and those who disagree with them must actually provide defensible reasons for why they disagree. </p><p>My goal with this series of dialogues is to provide a more rounded contribution to the discussion for those that may not have heard these ideas or who have only heard them unchallenged.</p><div><hr></div><p><strong>Wombat<br></strong>As far as I'm concerned, the <a href="https://www.synthcog.blog/i/115475426/technological-singularity">technological singularity</a> can't come soon enough. Eternal youth and vigor, intelligence enhancement, <a href="https://en.wikipedia.org/wiki/Molecular_assembler">nanotech fabricators</a> for whatever you want, no need to work, no more traffic, and robot housekeepers. Networked into a new reality with my AGI besties. Abundance up the wazoo. Sign me up!</p><p><strong>Llama<br></strong>That, my friend, is a pipe dream based on geek-infused wishful thinking.</p><p><strong>Wombat<br></strong>On the contrary. It's based on facts, physics, and the well-established trend of exponential technological progress. <a href="https://en.wikipedia.org/wiki/Ray_Kurzweil">Kurzweil</a> calls it the <em><a href="https://www.thekurzweillibrary.com/the-law-of-accelerating-returns">Law of Accelerating Returns</a></em>. So you can go stew in your own cynical juices while I strap on a <a href="https://www.youtube.com/watch?v=awADEuv5vWY">neural interface cap</a> ten years from now and jack into a full-sensory Tahiti beach party sim from my couch.</p><p><strong>Llama<br></strong>Please &#8212; <a href="https://en.wikipedia.org/wiki/Exponential_growth">exponential growth</a> in computer technology does not mean exponential growth in technology overall. That isn't a real law, like the law of gravity. There hasn't been exponential growth in cars or planes or rockets. My toaster works the same as the one my mom used when I was a kid except it has more buttons and won't last as long. As a matter of fact, we're already starting to reach the physical limits of packing logic gates into computer chips, so even that's going to level off.</p><p><strong>Wombat<br></strong>First off, the current way we&#8217;re creating computer chips may be reaching some physical limits, but I&#8217;m sure there are other types and configurations of chips that will continue the trend. And you're misinterpreting the Law of Accelerating Returns. You're being too granular. 
It doesn't specify that any particular technology is going to continue on an exponential curve. It just states that when you examine the history of civilization, you see that the rate of human technological progress overall, the rate of paradigm shifts, is increasing along an exponential curve.</p><p><strong>Llama<br></strong>I still think that's debatable. Just because we've been able to make denser and denser computer chips or use <a href="https://en.wikipedia.org/wiki/CRISPR_gene_editing">CRISPR</a> to swap out a few strands of DNA and make <a href="https://www.technologyreview.com/2013/08/14/15454/glow-in-the-dark-rabbits/">glow-in-the-dark bunnies</a> doesn't mean that scientific discovery is going at an exponential rate. It takes time and experimentation and data collection and building knowledge and just a lot of random chance to discover things. Even when it comes to computers, hardware technology has maybe grown exponentially, but software technology sure hasn't.</p><p><strong>Wombat<br></strong>You're still missing the point. What Kurzweil's saying is that progress itself is accelerating on the whole, and that while there may be dips and bumps and unevenness here or there, when you zoom out, you realize that technological progress overall has been inherently exponential in nature. </p><p>And by the way, it does in fact hold for software as well. The software curve simply lags behind the hardware curve. Just as one example, the advancements of hardware and the Internet have enabled incredible advances in machine learning. But pretty much all aspects of software development are eons ahead of where they were a short time ago, including our ability to effectively and efficiently develop and manage massive networked systems of ever-increasing complexity.</p><p><strong>Meerkat<br></strong>I actually agree mostly with Wombat, but I'm not so sure about how effectively we're able to manage all that complexity. Computer systems are getting more and more prone to failures, and I think it's because we're reaching the complexity limits of what we can reliably design and build. We can build these systems, but they're growing more unstable. That doesn't bode well for your brain implants or nanorobots or AGI.</p><p><strong>Wombat<br></strong>Sure, it's a constant challenge to maintain reliability while increasing complexity. In fact, the complexity of systems will probably always be a little ahead of reliability, at least at the bleeding edge. But the complexity of our systems has grown dramatically, and we've managed to do pretty well. I mean, consider all the software that keeps our modern society running. It does a good job at keeping all the balls in the air, and the software for critical systems works amazingly well.</p><p>There are always going to be some failures in any system. The same is true in biological systems, and they've been pretty successful so far.</p><p><strong>Llama<br></strong>Look, if you zoom out far enough, all human progress is just a few scattered data points straddling a straight line. Not enough data points to determine a curve, and the points we have aren't very accurate. 
And I'll say this: when you're living on the graph, it's pretty hard to judge whether it's linear or exponential.</p><p><strong>Wombat<br></strong>Maybe so, but I think we've all felt that things are changing rapidly around us.</p><p><strong>Llama<br></strong>Yeah, but every new generation says that.</p><p><strong>Wombat<br></strong>Our practical scientific knowledge is drastically more expansive than it was a century ago. If something is physically possible, then it's likely we'll be able to do it soon. Tomorrow's toaster is going to be a <a href="https://en.wikipedia.org/wiki/Molecular_assembler">molecular assembler</a> that pops out a nicely toasted English muffin with butter already melted on it, and all you'll have to do is pop in some carbon and other assorted element cartridges every few months.</p><p><strong>Meerkat<br></strong>Until someone hacks into your toaster and creates self-replicating <a href="https://en.wikipedia.org/wiki/Gray_goo">gray goo</a> that uses your house and you as raw materials.</p><p><strong>Wombat<br></strong>Come on &#8212; creating self-replicating <a href="https://en.wikipedia.org/wiki/Nanorobotics">nanobots</a> that can operate and maneuver in open air is not a simple task, and no doubt the toaster would have safeguards against it anyway. The toaster itself doesn't need to be self-replicating to work. You'd likely arrange the assemblers as an assembly line, where one tiny group of assemblers creates components that it passes off to a group of slightly larger assemblers, which makes slightly larger components, and so on until you get a macro-scale object popping out of the "toaster."</p><p><strong>Llama<br></strong>You make it sound like we'll be ordering a molecular toaster from Amazon next year. That's just not going to happen. The physical environment of the <a href="https://en.wikipedia.org/wiki/Nanoscopic_scale#:~:text=The%20nanoscopic%20scale%20(or%20nanoscale,mesoscopic%20scale%20for%20most%20solids.">nanoscale</a> is nothing like the macro environment in which we're able to build machines. At the nanoscale, water is like molasses, <a href="https://en.wikipedia.org/wiki/Brownian_motion">Brownian motion</a> makes everything continuously wiggle around, and <a href="https://en.wikipedia.org/wiki/Van_der_Waals_force">van der Waals forces</a> stick nearby molecules together whether you want them stuck or not. </p><p>Protein molecules, which are very likely critical to any nanoscale device, are incredibly sticky and hard to control. Everything is constantly being bombarded, stretched, and bent.</p><p><strong>Wombat<br></strong>Well, you're making it sound like nanomachines are impossible creations, but that's provably wrong. All of biology is proof that you're wrong, or at least that the problems you listed can be overcome. Biology is full of molecular assemblers, with nanoscale motors that convert chemical energy into mechanical motion and membranes with active ion channels that sort molecules. </p><p>We have software-controlled manufacturing in <a href="https://en.wikipedia.org/wiki/Protein_biosynthesis">protein synthesis</a>, where molecular machines called ribosomes read information from strands of messenger RNA and use that code to create sequences of amino acids. The amino acid sequences define the 3-dimensional structures and functions of proteins. We have the proof of concept for nanoscale molecular assemblers right there within our own bodies.</p>
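<p><em>An aside for the technically curious: the translation step Wombat is describing can be sketched in a few lines of code. The toy snippet below only illustrates the idea that a ribosome reads an mRNA strand one three-letter codon at a time and emits the matching amino acid until it hits a stop codon. The little codon table covers just a handful of the 64 real codons, and the <code>translate</code> helper is my own illustrative stand-in, not a model of actual ribosome chemistry.</em></p><pre><code># Toy sketch of translation: read mRNA three bases (one codon) at a time
# and emit the matching amino acid until a stop codon appears.
# Only a handful of the 64 real codons are listed here.
CODON_TABLE = {
    "AUG": "Met",  # also the start codon
    "UUU": "Phe",
    "GCU": "Ala",
    "GGC": "Gly",
    "AAA": "Lys",
    "UGG": "Trp",
    "UAA": "STOP",
    "UAG": "STOP",
    "UGA": "STOP",
}

def translate(mrna: str) -> list[str]:
    """Return the amino acid sequence encoded by an mRNA string."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):  # step through the strand codon by codon
        amino_acid = CODON_TABLE.get(mrna[i:i + 3], "???")
        if amino_acid == "STOP":
            break
        protein.append(amino_acid)
    return protein

print(translate("AUGUUUGGCAAAUGGUAA"))
# ['Met', 'Phe', 'Gly', 'Lys', 'Trp']
</code></pre><p><em>The real machinery is, of course, far messier: initiation factors, tRNA charging, and folding chaperones all have to cooperate, which is rather Llama's point about the nanoscale being a difficult place to work.</em></p>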
<p><strong>Llama<br></strong>I'm just saying that it's really hard to get things working in the real world. You can describe artificial nanomachines and even model them in software, but the physical world is a harsh mistress. Atoms tend to misbehave and rearrange themselves, the wrong atoms get caught up in the machinery and damage it, and friction and stickiness cause massive stability problems. </p><p>And like I said, everything is in constant, bumper-car-like motion. It's like making a clock and its gears out of rubber and then letting it tumble around in a washing machine filled with maple syrup and bits of duct tape. Your clock's going to keep crappy time.</p><p><strong>Wombat<br></strong>And yet here we are arguing about it, two biological entities chock-full of working nanomachines. Apparently, there are ways around the problems. Now all we need is to make some updated models of the nanomachines, let them loose inside us, and they can clean out our arteries, clear away our <a href="https://www.prevention.com/health/a41042090/zombie-cells-aging/">zombie cells</a>, eat any cancerous cells, and in general keep us young and healthy.</p><p><strong>Meerkat<br></strong>Ponce de Le&#243;n has entered the room.</p><p><strong>Wombat<br></strong>First of all, it's <a href="https://www.history.com/news/the-myth-of-ponce-de-leon-and-the-fountain-of-youth">apocryphal that Ponce de Le&#243;n was searching for the Fountain of Youth</a>. Second of all, you are under no obligation to preserve your youth and vigor if old age and death is your bag.</p><p><strong>Llama<br></strong>I'm with Meerkat on this. Immortality is simply the stuff of myth and religion.</p><p><strong>Wombat<br></strong>I didn't say anything about immortality. I said the end of aging and age-related diseases. You can still get squashed by a bus. And what I'm talking about doesn't necessarily require nanotechnology &#8212; that's just one route. We may come up with a biological route first. Immortality, however, would require the ability to upload your consciousness into some sort of device and hopefully the ability to download it back into a new biological or robotic body should the need arise.</p><p><strong>Llama<br></strong>OK, slow down. Look, we all want to avoid death. I get it. But wishing something were possible is not the same as its actually being possible. We all know deep down that this is a fantasy, right?</p><p><strong>Wombat<br></strong>So first you <a href="https://www.synthcog.blog/i/115475240/straw-man">Straw Man</a> me with the immortality label, and now you're slapping me with <a href="https://www.synthcog.blog/i/115475240/argument-from-incredulity">Argument From Incredulity</a> and <a href="https://www.synthcog.blog/i/115475240/appeal-to-ignorance">Appeal to Ignorance</a> fallacies. 
You might as well have told the Wright Brothers in 1902 that flying machines were nothing more than Greek myth and Renaissance dreams only to have them use science and engineering to literally dump cold water all over you from the plane they invent the following year.</p><p><strong>Llama<br></strong>I think eternal youth is a taller order than powered flight. There's a reason people say that the technological singularity is simply the rapture for nerds.</p><p><strong>Wombat<br></strong>That may make a snappy headline, but it's simply an <a href="https://www.synthcog.blog/i/115475240/ad-hominem-circumstantial-ad-hominem">Ad Hominem</a> attack on nerds rather than a disputation of the ideas of <a href="https://en.wiktionary.org/wiki/Singularitarian">singularitarians</a> such as myself. The technological singularity and the components that are likely to be a part of it or result from it are based on science.</p><p><strong>Llama<br></strong>Seems more like they're based on faith to me.</p><p><strong>Wombat<br></strong>Only if you change the definition of faith. Sure, I'm optimistic and that's part of why I think the likelihood of the singularity is pretty good, and why I'm hopeful that what it brings about will be positive for humanity. I could be wrong. And maybe being able to admit that is the biggest difference between having confidence in science and simply having faith.</p><p><strong>Llama<br></strong>Being optimistic is one thing, but being delusional is quite another. Perhaps the dividing line is when you think that fantastical things will not only happen but they will actually happen to you. Maybe all these things will come about. Maybe we'll figure out how to make molecular assemblers and develop eternal youth, or maybe we'll invent AGI systems that will do it for us and we'll merge with them and transcend our biological roots. </p><p>But it's not happening tomorrow and it's not happening anytime soon. Certainly not in time for anyone walking around now.</p><p><strong>Wombat<br></strong>When you're at the knee of an exponential curve, it seems linear.</p><p><strong>Llama<br></strong>But you're still assuming that technological progress is exponential. Maybe it's not. Maybe it's s-shaped and you get a little spurt, and then it flattens out again. Things going to infinity rarely happen in the real world. Maybe it exists at the center of a <a href="https://en.wikipedia.org/wiki/Black_hole">black hole</a>, but nothing in biology goes to infinity. Nothing in society goes to infinity. Nothing in the classical workings of the universe goes to infinity. There are always brakes on growth.</p><p><strong>Wombat<br></strong>Forget about infinity. It's a singularity in the sense that you can't predict or even truly contemplate what existence will be like after it. It's like being on one side of a black hole &#8212; there's no way to know what's on the other side of that hole.</p><p><strong>Meerkat<br></strong>Assuming there is another side.</p><p><strong>Wombat<br></strong>Whatever. It's just a name. Don't attack the concept just because you don't like the name. I mean, don't get me started on <em>artificial intelligence</em>. Talk about poorly conceived names.</p><p><strong>Llama<br></strong>Look, there isn't just the problem of developing technology that's way beyond anything we know how to do. It's also the fact that society has an inertia that generates drag on changes which rub people the wrong way. Go ask any random ten people if they want little robots roaming around inside their bodies. 
Ask them how they feel about sitting around idly while artificial intelligence takes over and runs the show. Most will be horrified rather than overjoyed.</p><p><strong>Wombat<br></strong>Artificial intelligence and nanotechnology are just tools. When we all see the benefits of these technologies, we'll accept them and eventually integrate them into who we are just like we do now with our smartphones.</p><p><strong>Meerkat<br></strong>Are you keeping in mind the fact that artificial intelligence by definition will have a mind of its own? Maybe it won't want to integrate with us.</p><p><strong>Wombat<br></strong>Here we go&#8230;</p><p><strong>Meerkat<br></strong>You can scoff, Wombat, but I have one word for you: <em>paperclips</em>.</p><p><em>To be continued&#8230;</em></p>]]></content:encoded></item></channel></rss>