A non-technical introduction to AI risk

In the summer of 2008, experts attending the Global Catastrophic Risk Conference assigned a 5% probability to the human species’ going extinct due to “superintelligent AI” by the year 2100. New organizations, like the Centre for the Study of Existential Risk and the Machine Intelligence Research Institute, are springing up to face the challenge of an AI apocalypse. But what is artificial intelligence, and why do people think it’s dangerous?

As it turns out, studying AI risk is useful for gaining a deeper understanding of philosophy of mind and ethics, and a lot of the general theses are accessible to non-experts. So I’ve gathered here a list of short, accessible, informal articles, mostly written by Eliezer Yudkowsky, to serve as a philosophical crash course on the topic. The first half will focus on what makes something intelligent, and what an Artificial General Intelligence is. The second half will focus on what makes such an intelligence ‘friendly’ — that is, safe and useful — and why this matters.


Part I. Building intelligence.

An artificial intelligence is any program or machine that can autonomously and efficiently complete a complex task, such as Google Maps or a Xerox machine. One of the largest obstacles to assessing AI risk is overcoming anthropomorphism, the tendency to treat non-humans as though they were quite human-like. Because AIs have complex goals and behaviors, it’s especially difficult not to think of them as people. Having a better understanding of where human intelligence comes from, and how it differs from other complex processes, is an important first step in approaching this challenge with fresh eyes.

1. Power of Intelligence. Why is intelligence important?

2. Ghosts in the Machine. Is building an intelligence from scratch like talking to a person?

3. Artificial Addition. What can we conclude about the nature of intelligence from the fact that we don’t yet understand it?

4. Adaptation-Executers, not Fitness-Maximizers. How do human goals relate to the ‘goals’ of evolution?

5. The Blue-Minimizing Robot. What are the shortcomings of thinking of things as ‘agents’, ‘intelligences’, or ‘optimizers’ with defined values/goals/preferences?

Part II. Intelligence explosion.

Forecasters are worried about Artificial General Intelligence (AGI), an AI that, like a human, can achieve a wide variety of complex aims. An AGI could think faster than a human, making it better at building new and improved AGI — which would be better still at designing AGI. As this snowballed, AGI would improve itself faster and faster, becoming increasingly unpredictable and powerful as its design changed. The worry is that we’ll figure out how to make self-improving AGI before we figure out how to safety-proof every link in this chain of AGI-built AGIs.
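The feedback-loop argument above can be made concrete with a toy calculation. The numbers here are purely illustrative, not a forecast: the point is only that when a system’s rate of improvement grows with its capability, growth outpaces any fixed-rate exponential.

```python
# Toy model of recursive self-improvement; all numbers are invented
# for illustration, not a model of any real AI system.
capability = 1.0   # self-improving system
fixed = 1.0        # baseline that improves at a constant 10% per step

for step in range(15):
    rate = 0.1 * capability   # more capable systems improve themselves faster
    capability *= 1 + rate
    fixed *= 1.1              # constant improvement rate

print(f"self-improving: {capability:,.0f}   fixed-rate: {fixed:.1f}")
```

Because the growth rate itself grows, the self-improving curve bends upward faster than any fixed exponential — the qualitative shape behind the ‘snowball’ worry.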

6. Optimization and the Singularity. What is optimization? As optimization processes, how do evolution, humans, and self-modifying AGI differ?

7. Efficient Cross-Domain Optimization. What is intelligence?

8. The Design Space of Minds-In-General. What else is universally true of intelligences?

9. Plenty of Room Above Us. Why should we expect self-improving AGI to quickly become superintelligent?

Part III. AI risk.

In the Prisoner’s Dilemma, it’s better for both players to cooperate than for both to defect; and we have a natural disdain for human defectors. But an AGI is not a human; it’s just a process that increases its own ability to produce complex, low-probability situations. It doesn’t necessarily experience joy or suffering, doesn’t necessarily possess consciousness or personhood. When we treat it like a human, we not only unduly weight its own artificial ‘preferences’ over real human preferences, but also mistakenly assume that an AGI is motivated by human-like thoughts and emotions. This makes us reliably underestimate the risk involved in engineering an intelligence explosion.
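For readers who haven’t seen the Prisoner’s Dilemma spelled out, the standard payoff structure (with invented but conventional numbers) can be written down directly:

```python
# Classic Prisoner's Dilemma payoffs (higher is better for the player).
# payoffs[(my_move, their_move)] = (my_payoff, their_payoff)
payoffs = {
    ('C', 'C'): (-1, -1),   # both cooperate: light sentence each
    ('C', 'D'): (-3,  0),   # I cooperate, they defect: I get the worst outcome
    ('D', 'C'): ( 0, -3),
    ('D', 'D'): (-2, -2),   # both defect: worse for both than mutual cooperation
}

# Whatever the other player does, defecting gives me a higher payoff...
assert payoffs[('D', 'C')][0] > payoffs[('C', 'C')][0]
assert payoffs[('D', 'D')][0] > payoffs[('C', 'D')][0]
# ...yet mutual cooperation beats mutual defection for both players.
assert payoffs[('C', 'C')] > payoffs[('D', 'D')]
```

The tension the assertions capture — defection dominates individually, yet mutual defection is worse for everyone — is what makes defectors seem like jerks, and what makes it tempting (and misleading) to model an AGI as one.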

10. The True Prisoner’s Dilemma. What kind of jerk would Defect even knowing the other side Cooperated?

11. Basic AI drives. Why are AGIs dangerous even when they’re indifferent to us?

12. Anthropomorphic Optimism. Why do we think things we hope happen are likelier?

13. The Hidden Complexity of Wishes. How hard is it to directly program an alien intelligence to enact my values?

14. Magical Categories. How hard is it to program an alien intelligence to reconstruct my values from observed patterns?

15. The AI Problem, with Solutions. How hard is it to give AGI predictable values of any sort? More generally, why does AGI risk matter so much?

Part IV. Ends.

A superintelligence has the potential not only to do great harm, but also to greatly benefit humanity. If we want to make sure that whatever AGIs people make respect human values, then we need a better understanding of what those values actually are. Keeping our goals in mind will also make it less likely that we’ll despair of solving the Friendliness problem. The task looks difficult, but we have no way of knowing how hard it will end up being until we’ve invested more resources into safety research. Keeping in mind how much we have to gain, and to lose, advises against both cynicism and complacency.

16. Could Anything Be Right? What do we mean by ‘good’, or ‘valuable’, or ‘moral’?

17. Morality as Fixed Computation. Is it enough to have an AGI improve the fit between my preferences and the world?

18. Serious Stories. What would a true utopia be like?

19. Value is Fragile. If we just sit back and let the universe do its thing, will it still produce value? If we don’t take charge of our future, won’t it still turn out interesting and beautiful on some deeper level?

20. The Gift We Give To Tomorrow. In explaining value, are we explaining it away? Are we making our goals less important?

In conclusion, a summary of the core argument: Five theses, two lemmas, and a couple of strategic implications.


If you’re convinced, MIRI has put together a list of ways you can get involved in promoting AI safety research. You can also share this post and start conversations about it, to put the issue on more people’s radars. If you want to read on, check out the more in-depth articles below.


Further reading


Evolution: Six myths

MYTH 1: Evolution is just a theory, not a fact.

When I say I have a theory, it means that I have a guess, a conjecture. But when a scientist says she has a theory, it means that she has a working explanation for a large set of facts. When we confuse these two senses of ‘theory’, we can misunderstand the scientific standing of the ‘theory’ of evolution.

In science, a theory is one of the most sturdy and well-tested ideas, rather than one of the least. Strictly speaking, a scientific theory can never become a ‘fact’ no matter how well-supported it is, because a theory is an overarching explanation rather than a mere observation. Thus, the idea that matter is made of atoms is still the ‘atomic theory’, and the idea that microorganisms cause disease is still the ‘germ theory’.

Because theories make predictions about what will happen in the future, they can be tested and refined over time. Around the 1930s, Darwin’s original theory was replaced by the modern synthetic theory. This “neo-Darwinian” theory incorporated Gregor Mendel’s account of how offspring inherit traits. Mendelian genetics has helped produce the scientific definition of the actual observed process of evolution — as ‘change in a population’s inherited traits’. A common source of confusion is mixing up the physical process, ‘evolution’, with ‘the theory of evolution’ (which explains the process). The process — which can be seen every time children aren’t exact clones of their parents! — can be called a ‘fact’ in the strict sense, whereas the theory of evolution is only a ‘fact’ in the looser sense of being ‘something we know is true’.

So what does the theory actually tell us?

In biology, evolution is the change in a population’s inheritable traits from generation to generation. It boils down to 4 core ideas:

  1. Heredity. Parents pass on their traits to offspring.
  2. Variation. Offspring differ slightly from their parents, and from each other.
  3. Fitness. Some of these differences are more helpful for reproducing than others.
  4. Selection. Offspring with more helpful traits will in turn have more offspring, making the traits more common in the population.
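The four ingredients above are simple enough to run as a toy simulation (hypothetical fitness values and population size; Python used purely for illustration):

```python
import random

random.seed(0)  # fixed seed so the toy run is reproducible

def generation(pop, fitness, mutation_rate=0.01):
    """One generation: selection, heredity, and variation."""
    # Selection: parents are chosen in proportion to their fitness.
    parents = random.choices(pop, weights=[fitness[t] for t in pop], k=len(pop))
    offspring = []
    for trait in parents:
        # Variation: offspring occasionally differ from their parent.
        if random.random() < mutation_rate:
            trait = 'A' if trait == 'a' else 'a'
        offspring.append(trait)  # Heredity: otherwise the trait is passed on.
    return offspring

# Hypothetical numbers: trait 'A' doubles reproductive success relative to 'a'.
fitness = {'A': 2.0, 'a': 1.0}
pop = ['A'] * 10 + ['a'] * 90    # the helpful trait starts out rare

for _ in range(100):
    pop = generation(pop, fitness)

print(pop.count('A'), "of", len(pop), "individuals now carry the helpful trait")
```

Under these toy numbers, the helpful trait reliably spreads from rare to common — nothing more than heredity, variation, fitness, and selection, iterated.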

Over time, this simple process of small incremental changes can have dramatic results. As traits become more common or rare in the population over millions of years, a species gradually changes, either randomly or by the environment’s selection of certain helpful traits, into a new species — or branches off into several. This process is called speciation.

MYTH 2: Evolution teaches that we should live by ‘survival of the fittest’.

‘Survival of the fit’ is a better characterization of Darwin’s theory of selection than ‘survival of the fittest’, a phrase coined by the social theorist Herbert Spencer. Selection simply states that whatever organisms ‘fit’ their environments will survive. When resources are limited, competition to survive certainly plays a role — but cooperation does too.

The idea of ‘survival of the fittest’ has been used to justify Social ‘Darwinism’, which amounts to a philosophy of ‘every man for himself’. However, a number of biological misunderstandings underlie the notion that pure selfishness helps the group or the individual. First, even if this were true in nature, that would not automatically make it a good thing — germs cause disease in nature, yet that doesn’t mean we should try to make ourselves sick. Do we want to emulate nature’s brutality, or mitigate it?

Second, the ‘fittest’ species are often the best cooperators — symbiotes, colonies of insects and bacteria, schools of fish, flocks of birds, herds and packs of mammals, and of course societies of humans. Moreover, most ‘weaknesses’ are not genetic, nor so severe that they make the individual unable to contribute to society.

‘Fitness’ also changes depending on the environment. There is no context-free measure of fitness, and what is ‘weak’ today may be ‘strong’ tomorrow. Mammals were ‘weak’ when dinosaurs dominated the planet, but strong afterwards. Which brings us to the next myth…

MYTH 3: Evolution is progress.

When we speak of something’s ‘evolving’, we usually mean that it is improving. But in biology, this is not the case. Most evolution is neutral — organisms simply change randomly, by mutation and other processes, without even changing in fitness. And although evolution can never be harmful in the short term, since a harmful trait by definition won’t be selected, the problem is that evolution only cares about the short term. Although in the long run small evolutionary improvements can add up to massive advantages, it’s also possible for short-sighted, immediate benefits to evolve which doom a species in the long run. This is especially common if the environment changes.

The illusion of progress is created because all the evolutionary ‘dead ends’ tend to end up, well, dead — dead as the dodo. But the idea that evolution has any long-term ‘goals’ in mind derives from us not noticing two things. First, we don’t notice how meandering evolution is — an animal might become slightly larger one century, slightly smaller the next. And the reason neither process amounts to ‘evolving backward’ is because neither process is ‘evolving forward’ either — all hereditary change is evolution, regardless of ‘direction’.

Second, although we enjoy thinking of ourselves as the ‘goal’ of evolution, we don’t think about the hundreds of millions of other species that were perfectly happy evolving into organisms radically different from us. Bacteria make up the majority of life’s diversity; if intelligent bipeds were the aim of all evolution, we would expect them to have evolved thousands of times, not just once.

MYTH 4: We evolved from monkeys.

You may have heard the question asked: ‘If humans descended from monkeys, why are there still monkeys?’ Next time you hear this, feel free to reply: ‘If Australians descended from Europeans, why are there still Europeans?’

Biologists have never claimed that humans evolved from monkeys. Biologists do believe that humans and monkeys are related — but as cousins, not parent and child.

But then, biologists also claim that all life is distantly related. This theory, common descent, is the real shocker: You’re not only related to monkeys, but to bananas as well! This is based on the fact that all life shares astonishing molecular and anatomical similarities, and these commonalities seem to ‘cluster’ around otherwise-similar species, like lions and tigers. Just as DNA tests make it possible to determine how closely related two human beings are, so too, by the same principle, do they allow us to test how closely related two different species are.
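The clustering idea can be illustrated with a toy calculation. The ‘sequences’ below are invented, not real genetic data: the point is just that similarity can be scored as the fraction of matching positions, and that otherwise-similar species score higher.

```python
def similarity(a, b):
    """Fraction of positions at which two equal-length sequences match."""
    assert len(a) == len(b)
    return sum(x == y for x, y in zip(a, b)) / len(a)

# Invented toy sequences for three 'species'; no resemblance to real
# genomes is intended.
human  = "ACGTACGTACGTACGTACGT"
chimp  = "ACGTACGTACGAACGTACGT"   # differs from 'human' at 1 position
banana = "ACGTTGCAACGTTGCAACGT"   # differs at many positions

assert similarity(human, chimp) > similarity(human, banana)
```

Real comparisons use far longer sequences and more sophisticated alignment, but the principle is the same: more recent common ancestry leaves more positions unchanged.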

In the case of humans, the molecular evidence suggests, not that we descended from monkeys, but that we shared a common ancestor with them tens of millions of years ago. This ancestor was neither a modern monkey nor a human, but a now-extinct primate. Humans and monkeys have both evolved a great deal since that time. We’ve just evolved in very different ways.

It should also be noted that we are much closer to the other apes than to true ‘monkeys’ (which have longer tails). Humans are classified in the great ape family, the hominids, alongside the chimpanzee, gorilla, and orangutan; the gibbons, or ‘lesser apes’, are slightly more distant cousins. But we didn’t evolve from any of them, any more than they evolved from us.

MYTH 5: Evolution is random.

It’s sometimes suggested that evolution is too ‘blind’ and ‘random’ to result in complicated structures. But natural selection is only ‘random’ in the sense that physical processes like gravity are ‘random’. Although the genetic differences between organisms derive in part from random mutations, natural selection nonrandomly ‘filters’ those differences based on how well they help the organism survive and reproduce in its environment. The overall process of evolution, therefore, isn’t simply random: Species change in particular ways for particular reasons, such as because of a new predator in the region, or for that matter because of the absence of a predator.

A related myth is the notion that evolutionary theory claims ‘life arose by chance’. This is not an aspect of the theory of evolution, which only describes how life changes after it has already originated. Instead, this is relevant to the study of abiogenesis, the origin of life.

MYTH 6: We still haven’t found a ‘missing link’.

It’s not always clear what this fabled ‘missing link’ is supposed to be, as thousands of fossils of early humans and hominids have been discovered. The problem is that every time a new fossil is found that fills a ‘gap’ in the evolutionary record, it just creates two new gaps — one right before the fossil species, and one right after.

Transitional fossils linking major groups of organisms are also abundant in the record. One of the most famous dates back to Darwin’s day: Archaeopteryx, a proto-bird with feathers, teeth, and clawed fingers. More recent examples include Tiktaalik, a fish with primitive limbs (including wrists) and a neck, predating the amphibians.

More to the point, the central lesson evolution has to teach us is that every organism is a ‘link’ — all life is connected, and every organism that has offspring is equally ‘transitional’, because life is constantly changing. The change is gradual, certainly, and seems minute on a human time scale — but one of the profoundest lessons science has to offer is that drops of water, given time, can hollow a stone.

For more information, see Talk.Origins.

What can we reasonably concede to unreason?

This post first appeared on the Secular Alliance at Indiana University blog.

In October, SAIU members headed up to Indianapolis for the Center for Inquiry’s “Defending Science: Challenges and Strategies” workshop. Massimo Pigliucci and Julia Galef, co-hosts of the podcast Rationally Speaking, spoke about natural deficits in reasoning, while Jason Rodriguez and John Shook focused on deliberate attempts to restrict scientific inquiry.

Julia Galef drew our attention to the common assumption that being rational means abandoning all intuition and emotion, an assumption she dismissed as a flimsy Hollywood straw man, or “Straw Vulcan”. True rationality, Julia suggested, is about the skillful integration of intuitive and deliberative thought. As she noted in a similar talk at the Singularity Summit, these skills demand constant cultivation and vigilance. In their absence, we all predictably fall victim to an array of cognitive biases.

To that end, Galef spoke of suites of indispensable “rationality skills”:

  • Know when to override an intuitive judgment with a reasoned one. Recognize cases where your intuition reliably fails, but also cases where intuition tends to perform better than reason.
  • Learn how to query your intuitive brain. For instance, to gauge how you really feel about a possibility, visualize it concretely, and perform thought experiments to test how different parameters and framing effects are influencing you.
  • Persuade your intuitive system of what your reason already knows. For example: Anna Salamon knew intellectually that wire-guided sky jumps are safe, but was having trouble psyching herself up. So she made her knowledge of statistics concrete, imagining thousands of people jumping before her eyes. This helped trick her affective response into better aligning with her factual knowledge.

Massimo Pigliucci’s talk, “A Very Short Course in Intellectual Self-Defense”, was in a similar vein. Pigliucci drew our attention to common formal and informal fallacies, and to the limits of deductive, inductive, and mathematical thought. Dissenting from Thomas Huxley’s view that ordinary reasoning is a great deal like science, Pigliucci argued that science is cognitively unnatural. This is why untrained reasoners routinely fail to properly amass and evaluate data.

While it’s certainly important to keep in mind how much hard work empirical rigor demands, I think we should retain a qualified version of Huxley’s view. It’s worth emphasizing that careful thought is not the exclusive property of professional academics, that the basic assumptions of science are refined versions of many of the intuitions we use in navigating our everyday environments. Science’s methods are rarefied, but not exotic or parochial. If we forget this, we risk giving too much credence to presuppositionalist apologetics.

Next, Jason Rodriguez discussed the tactics and goals of science organizations seeking to appease, work with, or reach out to the religious. Surveying a number of different views on the creation-evolution debate, Rodriguez questioned when it is more valuable to attack religious doctrines head-on, and when it is more productive to avoid conflict or make concessions.

This led into John Shook’s vigorous talk, “Science Must Never Compromise With Religion, No Matter the Metaphysical or Theological Temptations”, and a follow-up Rationally Speaking podcast with Galef and Pigliucci. As you probably guessed, it focused on attacking metaphysicians and theologians who seek to limit the scope or undermine the credibility of scientific inquiry. Shook’s basic concern was that intellectuals are undermining the authority of science when they deem some facts ‘scientific’ and others ‘unscientific’. This puts undue constraints on scientific practice. Moreover, it gives undue legitimacy to those philosophical and religious thinkers who think abstract thought or divine revelation grant us access to a special domain of Hidden Truths.

Shook’s strongest argument was against attempts to restrict science to ‘the natural’. If we define ‘Nature’ in terms of what is scientifically knowable, then this is an empty and useless constraint. But defining the natural instead as the physical, or the spatiotemporal, or the unmiraculous, deprives us of any principled reason to call our research programs ‘methodologically naturalistic’. We could imagine acquiring good empirical evidence for magic, for miracles, even for causes beyond our universe. So science’s skepticism about such phenomena is a powerful empirical conclusion. It is not an unargued assumption or prejudice on the part of scientists.

Shook also argued that metaphysics does not provide a special, unscientific source of knowledge; the claims of metaphysicians are pure and abject speculation. I found this part of the talk puzzling. Metaphysics, as the study of the basic features of reality, does not seem radically divorced from theoretical physics and mathematics, which make similar claims to expand at least our pool of conditional knowledge, knowledge of the implications of various models. Yet Shook argued, not for embracing metaphysics as a scientific field, but for dismissing it as fruitless hand-waving.

Perhaps the confusion stemmed from a rival conception of ‘metaphysics’, not as a specific academic field, but as the general practice of drawing firm conclusions about ultimate reality from introspection alone — what some might call ‘armchair philosophy’ or ‘neoscholasticism’. Philosophers of all fields — and, for that matter, scientists — would do well to more fully internalize the dangers of excessive armchair speculation. But the criticism is only useful if it is carefully aimed. If we fixate on ‘metaphysics’ and ‘theology’ as the sole targets of our opprobrium, we risk neglecting the same arrogance in other guises, while maligning useful exploration into the contents, bases, and consequences of our conceptual frameworks. And if we restrict knowledge to science, we risk not only delegitimizing fields like logic and mathematics, but also putting undue constraints on science itself. For picking out a special domain of purported facts as ‘metaphysical’, and therefore unscientific, has exactly the same risks as picking out a special domain as ‘non-natural’ or ‘supernatural’.

To defend science effectively, we have to pick our battles with care. This clearly holds true in public policy and education, where it is most useful in some cases to go for the throat, in other cases to make compromises and concessions. But it also applies to our own personal struggles to become more rational, where we must carefully weigh the costs of overriding our unreasoned intuitions, taking a balanced and long-term approach. And it also holds in disputes over the philosophical foundations and limits of scientific knowledge, where the cost of committing ourselves to unusual conceptions of ‘science’ or ‘knowledge’ or ‘metaphysics’ must be weighed against any argumentative and pedagogical benefits.

This workshop continues to stimulate my thought, and continues to fuel my drive to improve science education. The central insight the speakers shared was that the practices we group together as ‘science’ cannot be defended or promoted in a vacuum. We must bring to light the psychological and philosophical underpinnings of science, or we will risk losing sight of the real object of our hope and concern.