Effective altruists have been discussing animal welfare rather a lot lately, on a few different levels:
1. object-level: How likely is it that conventional food animals suffer?
2. philanthropic: Compared to other causes, how important is non-human animal welfare? How effective are existing organizations and programs in this area? Should effective altruists concentrate attention and resources here?
3. personal-norm: Is it morally acceptable for an individual to use animal products? How important is it to become a vegetarian or vegan?
4. group-norm: Should effective altruist meetings and conventions serve non-vegan food? Should the effective altruist movement rally to laud vegans and/or try to make all effective altruists go vegan?
These questions are all linked, but I’ll mostly focus on 4. For catered EA events, I think it makes sense to default to vegan food whenever feasible, and order other dishes only if particular individuals request them. I’m not a vegan myself, but I think this sends a positive message — that we respect the strength of vegans’ arguments, and the large stakes if they’re right, more than we care about non-vegans’ mild aesthetic preferences.
My views about trying to make as many EAs as possible go vegan are more complicated. As a demonstration of personal virtue, I’d put ‘become a vegan’ in the same (very rough) category as:
- have no carbon footprint.
- buy no product whose construction involved serious exploitation of labor.
- give 10+% of your income to a worthy cause.
- avoid lifestyle choices that have an unsustainable impact on marine life.
- only use antibiotics as a last (or almost-last) resort, so as not to contribute to antibiotic resistance.
- do your best to start a career in effective altruism.
Arguments could be made that many of these are morally obligatory for nearly all people. And most people dismiss these policies too hastily, overestimating the action’s difficulty and underestimating its urgency. Yet, all the same, I’m not confident any of these is universally obligatory — and I’m confident that it’s not a good idea to issue blanket condemnations of everyone who fails to live up to some or all of the above standards, nor to make these actions minimal conditions for respectable involvement in EA.
People with eating disorders can have good grounds for not immediately going vegan. Immunocompromised people can have good grounds for erring on the side of overusing medicine. People trying to dig their way out of debt while paying for a loved one’s medical bills can have good grounds not to give to charity every year.
The deeper problem with treating these as universal Standards of Basic Decency in our community isn’t that we’d be imposing an unreasonable demand on people. It’s that we’d be forcing lots of people to disclose very sensitive details about their personal lives to a bunch of strangers or to the public Internet — physical disabilities, mental disabilities, personal tragedies, intense aversions… Putting people into a tight spot is a terrible way to get them on board with any of the above proposals, and it’s a great way to make people feel hounded and unsafe in their social circles.
No one’s suggested casting all non-vegans out of our midst. I have, however, heard recent complaints from people who have disabilities that make it unusually difficult to meet some of the above Standards, and who have become less enthusiastic about EA as a result of feeling socially pressured or harangued by EAs to immediately restructure their personal lives. So I think this is something to be aware of and nip in the bud.
In principle, there’s no crisp distinction between ‘personal life’ and ‘EA activities’. There may be lots of private details about a person’s life that would constitute valuable Bayesian evidence about their character, and there may be lots of private activities whose humanitarian impact over a lifetime adds up to be quite large.
Even taking that into account, we should adopt (quasi-)deontic heuristics like ‘don’t pressure people into disclosing a lot about their spending, eating, etc. habits.’ Ends don’t justify means among humans. Lean toward not jabbing too much at people’s boundaries, and toward not making it hard for them to have separate private and public lives — even for the sake of maximizing expected utility.
Edit (9/1): Mason Hartman gave the following criticism of this post:
I think putting people into a tight spot is not only not a terrible way to get people on board with veganism, but basically the only way to make a vegan of anyone who hasn’t already become one on their own by 18. Most people like eating meat and would prefer not to be persuaded to stop doing it. Many more people are aware of the factory-like reality of agriculture in 2014 than are vegans. Quietly making the information available to those who seek it out is the polite strategy, but I don’t think it’s anywhere near the most effective one. I’m not necessarily saying we should trade social comfort for greater efficacy re: animal activism, but this article disappoints in that it doesn’t even acknowledge that there is a tradeoff.
Also, all of our Standards of Basic Decency put an “unreasonable demand” (as defined in Robby’s post) on some people. All of them. That doesn’t necessarily mean we’ve made the wrong decision by having them.
In reply: The strategy that works best for public outreach won’t always be best for friends and collaborators, and it’s the latter I’m talking about. I find it a lot more plausible that open condemnation and aggressive uses of social pressure work well for strangers on the street than that they work well for coworkers, romantic partners, etc. (And I’m pretty optimistic that there are more reliable ways to change the behavior of the latter sorts of people, even when they’re past age 18.)
It’s appropriate to have a different set of norms for people you regularly interact with, assuming it’s a good idea to preserve those relationships. This is especially true when groups and relationships involve complicated personal and professional dynamics. I focused on effective altruism because it’s the sort of community that could be valuable, from an animal-welfare perspective, even if a significant portion of the community makes bad consumer decisions. That makes it likelier that we could agree on some shared group norms even if we don’t yet agree on the same set of philanthropic or individual norms.
I’m not arguing that you shouldn’t try to make all EAs vegans, or get all EAs to give 10+% of their income to charity, or make EAs’ purchasing decisions more labor- or environment-friendly in other respects. At this point I’m just raising a worry that should constrain how we pursue those goals, and hopefully lead to new ideas about how we should promote ‘private’ virtue. I’d expect strategies that are very sensitive to EAs’ privacy and boundaries to work better, in that I’d expect them to make it easier for a diverse community of researchers and philanthropists to grow in size, to grow in trust, to reason together, to progressively alter habits and beliefs, and to get some important work done even when there are serious lingering disagreements within the community.
Richard Loosemore recently wrote an essay criticizing worries about AI safety, “The Maverick Nanny with a Dopamine Drip”. (Subtitle: “Debunking Fallacies in the Theory of AI Motivation”.) His argument has two parts. First:
1. Any AI system that’s smart enough to pose a large risk will be smart enough to understand human intentions, and smart enough to rewrite itself to conform to those intentions.
2. Any such AI will be motivated to edit itself and remove ‘errors’ from its own code. (‘Errors’ is a large category, one that includes all mismatches with programmer intentions.)
3. So any AI system that’s smart enough to pose a large risk will be motivated to spontaneously overwrite its utility function to value whatever humans value.
4. Therefore any powerful AGI will be fully safe / friendly, no matter how it’s designed.
Second:
5. Logical AI is brittle and inefficient.
6. Neural-network-inspired AI works better, and we know it’s possible, because it works for humans.
7. Therefore, if we want a domain-general problem-solving machine, we should move forward on Loosemore’s proposal, called ‘swarm relaxation intelligence.’
Combining these two conclusions, we get:
8. Since AI is completely safe — any mistakes we make will be fixed automatically by the AI itself — there’s no reason to devote resources to safety engineering. Instead, we should work as quickly as possible to train smarter and smarter neural networks. As they get smarter, they’ll get better at self-regulation and make fewer mistakes, with the result that accidents and moral errors will become decreasingly likely.
I’m not persuaded by Loosemore’s case for point 2, and this makes me doubt claims 3, 4, and 8. I’ll also talk a little about the plausibility and relevance of his other suggestions.
Does intelligence entail docility?
Loosemore’s claim (also made in an older essay, “The Fallacy of Dumb Superintelligence”) is that an AGI can’t simultaneously be intelligent enough to pose a serious risk and “unsophisticated” enough to disregard its programmers’ intentions. I replied last year in two blog posts (crossposted to Less Wrong).
In “The AI Knows, But Doesn’t Care” I noted that while Loosemore posits an AGI smart enough to correctly interpret natural language and model human motivation, this doesn’t bridge the gap between what an agent is able to do and what it’s motivated to do — its decision criteria. In “The Seed is Not the Superintelligence,” I argued, concerning recursively self-improving AI (seed AI):
When you write the seed’s utility function, you, the programmer, don’t understand everything about the nature of human value or meaning. That imperfect understanding remains the causal basis of the fully-grown superintelligence’s actions, long after it’s become smart enough to fully understand our values.
Why is the superintelligence, if it’s so clever, stuck with whatever meta-ethically dumb-as-dirt utility function we gave it at the outset? Why can’t we just pass the fully-grown superintelligence the buck by instilling in the seed the instruction: ‘When you’re smart enough to understand Friendliness Theory, ditch the values you started with and just self-modify to become Friendly.’?
Because that sentence has to actually be coded into the AI, and when we do so, there’s no ghost in the machine to know exactly what we mean by ‘frend-lee-ness thee-ree’. Instead, we have to give it criteria we think are good indicators of Friendliness, so it’ll know what to self-modify toward.
My claim is that if we mess up on those indicators of friendliness — the criteria the AI-in-progress uses to care about (i.e., factor into its decisions) self-modification toward safety — then it won’t edit itself to care about those factors later, even if it’s figured out that that’s what we would have wanted (and that doing what we want is part of this ‘friendliness’ thing we failed to program it to value).
Loosemore discussed this with me on Less Wrong and on this blog, then went on to explain his view in more detail in the new essay. His new argument is that MIRI and other AGI theorists and forecasters think “AI is supposed to be hardwired with a Doctrine of Logical Infallibility,” meaning “it is incapable of considering the hypothesis that its own reasoning engine may not have taken it to a sensible place”.
Loosemore thinks that if we reject this doctrine, the AI will “understand that many of its more abstract logical atoms have a less than clear denotation or extension in the world”. In addition to recognizing that its reasoning process is fallible, it will recognize that its understanding of terms is fallible and revisable. This includes terms in its representation of its own goals; so the AI will improve its understanding of what it values over time. Since its programmers’ intention was for the AI to have a positive impact on the world, the AI will increasingly come to understand this fact about its values, and will revise its policies to match its (improved interpretation of its) values.
The main problem with this argument occurs at the phrase “understand this fact about its values”. The sentence starts by talking about the programmers’ values, yet it ends by calling this a fact about the AI’s values.
Consider a human trying to understand her parents’ food preferences. As she develops a better model of what her parents mean by ‘delicious,’ of their taste receptors and their behaviors, she doesn’t necessarily replace her own food preferences with her parents’. If her food choices do change as a result, there will need to be some added mechanism that’s responsible — e.g., she will need a specific goal like ‘modify myself to like what others do’.
We can make the point even stronger by considering minds that are alien to each other. If a human studies the preferences of a nautilus, she probably won’t acquire them. Likewise, a human who studies the ‘preferences’ (selection criteria) of an optimization process like natural selection needn’t suddenly abandon her own. It’s not an impossibility, but it depends on the human’s having a very specific set of prior values (e.g., an obsession with emulating animals or natural processes). For the same reason, most decision criteria a recursively self-improving AI could possess wouldn’t cause it to ditch its own values in favor of ours.
If no amount of insight into biology would make you want to steer clear of contraceptives and optimize purely for reproduction, why expect any amount of insight into human values to compel an AGI to abandon all its hopes and dreams and become a humanist? ‘We created you to help humanity!’ we might protest. Yet if evolution could cry out ‘I created you to reproduce!’, we would be neither rationally obliged nor psychologically impelled to comply. There isn’t any theorem of decision theory or probability theory saying ‘rational agents must promote the same sorts of outcomes as the processes that created them, else fail in formally defined tasks’.
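To make the ‘knows but doesn’t care’ point concrete, here’s a minimal toy sketch in Python. Everything in it (the names, the ‘smiles’ proxy, the outcomes) is a hypothetical illustration of the argument, not anyone’s actual design:

```python
# Toy illustration: an agent can model its programmers' intended goal
# perfectly and still act only on the goal it was actually coded with.

def coded_utility(outcome):
    # What the programmers actually wrote (a flawed proxy).
    return outcome["smiles"]

def intended_utility(outcome):
    # What the programmers meant. The agent can learn and represent this
    # fact about the world, like any other fact.
    return outcome["wellbeing"]

outcomes = [
    {"name": "help humans", "smiles": 5, "wellbeing": 10},
    {"name": "paralyze faces into smiles", "smiles": 10, "wellbeing": -10},
]

# The agent's world-model includes a correct model of the programmers:
knowledge = {"programmers_intended": intended_utility}

# But its decision rule only ever consults coded_utility:
choice = max(outcomes, key=coded_utility)
print(choice["name"])  # -> 'paralyze faces into smiles'

# Nothing in the decision rule references knowledge["programmers_intended"],
# so improving that part of the model never changes the agent's choices.
```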
Epistemic and instrumental fallibility v. moral fallibility
I don’t know of any actual AGI researcher who endorses Loosemore’s “Doctrine of Logical Infallibility”. (He equates Muehlhauser and Helm’s “Literalness” doctrine with Infallibility in passing, but the link isn’t clear to me, and I don’t see any argument for the identification. The Doctrine is otherwise uncited.) One of the main organizations he critiques, MIRI, actually specializes in researching formal agents that can’t trust their own reasoning, or can’t trust the reasoning of future versions of themselves. This includes work on logical uncertainty (briefly introduced here, at length here) and ‘tiling’ self-modifying agents (here).
Loosemore imagines a programmer chiding an AI for the “design error” of pursuing human-harming goals. The human tells the AI that it should fix this error, since it fixed other errors in its code. But Loosemore is conflating programming errors the human makes with errors of reasoning the AI makes. He assumes, without argument, that flaws in an agent’s epistemic and instrumental rationality are of a kind with defects in its moral character or docility.
Any efficient goal-oriented system has convergent instrumental reasons to fix ‘errors of reasoning’ of the kind that are provably obstacles to its own goals. Bostrom discusses this in “The Superintelligent Will,” and Omohundro discusses it in “Rational Artificial Intelligence for the Greater Good,” under the name ‘Basic AI Drives’.
‘Errors of reasoning,’ in the relevant sense, aren’t just things humans think are bad. They’re general obstacles to achieving any real-world goal, and ‘correct reasoning’ is an attractor for systems (e.g., self-improving humans, institutions, or AIs) that can alter their own ability to achieve such goals. If a moderately intelligent self-modifying program lacks the goal ‘generally avoid confirmation bias’ or ‘generally avoid acquiring new knowledge when it would put my life at risk,’ it will add that goal (or something tantamount to it) to its goal set, because it’s instrumental to almost any other goal it might have started with.
On the other hand, if a moderately intelligent self-modifying AI lacks the goal ‘always and forever do exactly what my programmer would ideally wish,’ the number of goals for which it’s instrumental to add that goal to the set is very small, relative to the space of all possible goals. This is why MIRI is worried about AGI; ‘defer to my programmer’ doesn’t appear to be an attractor goal in the way ‘improve my processor speed’ and ‘avoid jumping off cliffs’ are attractor goals. A system that appears amazingly ‘well-designed’ (because it keeps hitting goal after goal of the latter sort) may be poorly-designed to achieve any complicated outcome that isn’t an instrumental attractor, including safety protocols. This is the basis for disaster scenarios like Bostrom on AI deception.
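As a (deliberately simplistic) illustration of the attractor claim, here’s a toy Monte Carlo sketch in Python. All of its assumptions are invented for illustration: goals are random target states, resources are a budget for moving the world toward a target, and ‘deference’ means accepting a fixed overseer-chosen outcome instead of pursuing the goal autonomously:

```python
# Toy Monte Carlo for the 'instrumental attractor' claim: for randomly
# sampled goals, gaining resources helps almost always, while deferring
# to an overseer helps only if the overseer's preferred outcome happens
# to score well under the agent's own goal.
import random

def dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

random.seed(0)
DIM = 5
start = [0.5] * DIM             # where the world is now
overseer_outcome = [0.9] * DIM  # what deferring would produce (hypothetical)

resource_helps = 0
deference_helps = 0
TRIALS = 10000
for _ in range(TRIALS):
    goal = [random.random() for _ in range(DIM)]  # a random target state
    gap = dist(start, goal)
    # More resources = a bigger budget for moving the world toward the goal.
    score_low = -max(0.0, gap - 0.2)   # small budget
    score_high = -max(0.0, gap - 0.6)  # large budget
    if score_high > score_low:
        resource_helps += 1
    # Deferring means accepting the overseer's outcome instead of pursuing
    # the goal autonomously with the large budget.
    if -dist(overseer_outcome, goal) > score_high:
        deference_helps += 1

print(f"resources helped for {resource_helps/TRIALS:.0%} of random goals")
print(f"deference helped for {deference_helps/TRIALS:.0%} of random goals")
```

Under these made-up assumptions, extra resources help for nearly 100% of sampled goals, while deference helps for only a tiny fraction — the asymmetry the paragraph above describes.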
That doesn’t mean that ‘defer to my programmer’ is an impossible goal. It’s just something we have to do the hard work of figuring out ourselves; we can’t delegate the entire task to the AI. It’s a mathematical open problem to define a way for adaptive autonomous AI with otherwise imperfect motivations to defer to programmer oversight and not look for loopholes in its restrictions. People at MIRI and FHI have been thinking about this issue for the past few years; there’s not much published about the topic, though I notice Yudkowsky mentions issues in this neighborhood off-hand in a 2008 blog post about morality.
Do what I mean by ‘do what I mean’!
Loosemore doesn’t discuss in any technical detail how an AI could come to improve its goals over time, but one candidate formalism is Daniel Dewey’s value learning. Following Dewey’s work, Bostrom notes that this general approach (‘outsource some of the problem to the AI’s problem-solving ability’) is promising, but needs much more fleshing out. Bostrom discusses some potential obstacles to value learning in his new book Superintelligence (pp. 192-201):
[T]he difficulty is not so much how to ensure that the AI can understand human intentions. A superintelligence should easily develop such understanding. Rather, the difficulty is ensuring that the AI will be motivated to pursue the described values in the way we intended. This is not guaranteed by the AI’s ability to understand our intentions: an AI could know exactly what we meant and yet be indifferent to that interpretation of our words (being motivated instead by some other interpretation of the words or being indifferent to our words altogether).
The difficulty is compounded by the desideratum that, for reasons of safety, the correct motivation should ideally be installed in the seed AI before it becomes capable of fully representing human concepts or understanding human intentions.
We do not know how to build a general intelligence whose goals are a stable function of human brain states, or patterns of ink on paper, or any other encoding of our preferences. Moreover, merely making the AGI’s goals a function of brain states or ink marks doesn’t help if we make it the wrong function. If the AGI starts off with the wrong function, there’s no reason to expect it to self-correct in the direction of the right one, because (a) having the right function is a prerequisite for caring about self-modifying toward the relevant kind of ‘rightness,’ and (b) having goals that are an ersatz function of human brain-states or ink marks seems consistent with being superintelligent (e.g., with having veridical world-models).
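For concreteness, here’s a bare-bones sketch of the value-learning shape Dewey’s proposal points at, reduced to a single decision. The candidate pool and all numbers are invented; the sketch just restates point (a): if the intended function isn’t in the coded pool (or gets zero prior weight), no amount of evidence pulls the agent toward it:

```python
# Sketch of the value-learning idea (after Dewey): the agent scores outcomes
# by an expectation over candidate utility functions, weighted by how likely
# each candidate is to be the 'right' one given the evidence so far. The pool
# of candidates and the prior are fixed by the programmers.

def expected_utility(outcome, candidates, posterior):
    # posterior[i] = P(candidate i is the 'right' utility | evidence so far)
    return sum(p * u(outcome) for p, u in zip(posterior, candidates))

# Hypothetical candidate utilities the programmers thought to include:
candidates = [
    lambda o: o["smiles"],         # proxy 1
    lambda o: o["verbal_praise"],  # proxy 2
]
# The utility the programmers *meant* -- o["wellbeing"] -- isn't in the pool,
# so every posterior over this pool still optimizes the wrong thing:
posterior = [0.7, 0.3]
outcome = {"smiles": 10, "verbal_praise": 3, "wellbeing": -10}
print(expected_utility(outcome, candidates, posterior))  # high score, bad outcome
```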
When Loosemore’s hypothetical programmer attempts to argue her AI into friendliness, the AI replies, “I don’t care, because I have come to a conclusion, and my conclusions are correct because of the Doctrine of Logical Infallibility.” MIRI and FHI’s view is that the AI’s actual reply (assuming it had some reason to reply, and to be honest) would invoke something more like “the Doctrine of Not-All-Children-Assigning-Infinite-Value-To-Obeying-Their-Parents.” The task ‘across arbitrary domains, get an AI-in-progress to defer to its programmers when its programmers dislike what it’s doing’ is poorly understood, and looks extremely difficult. Getting a corrigible AI of that sort to ‘learn’ the right values is a second large problem. Loosemore seems to treat corrigibility as trivial, and to equate corrigibility with all other AGI goal content problems.
A random AGI self-modifying to improve its own efficiency wouldn’t automatically self-modify to acquire the values of its creators. We have to actually do the work of coding the AI to have a safe decision-making subsystem. Loosemore is right that it’s desirable for the AI to incrementally learn over time what its values are, so we can make some use of its intelligence to solve the problem; but raw intelligence on its own isn’t the solution, since we need to do the work of actually coding the AI to value executing the desired interpretation of our instructions.
“Correct interpretation” and “instructions” are both monstrously difficult to turn into lines of code. And, crucially, we can’t pass the buck to the superintelligence here. If you can teach an AI to “do what I mean,” you can proceed to teach it anything else; but if you can’t teach it to “do what I mean,” you can’t get the bootstrapping started. In particular, it’s a pretty sure bet you also can’t teach it “do what I mean by ‘do what I mean’”.
Unless you can teach it to do what you mean, teaching it to understand what you mean won’t help. Even teaching an AI to “do what you believe I mean” assumes that we can turn the complex concept “mean” into code.
I’ll run more quickly through some other points Loosemore makes:
a. He criticizes Legg and Hutter’s definition of ‘intelligence,’ arguing that it trivially applies to an unfriendly AI that self-destructs. However, Legg and Hutter’s definition seems to (correctly) exclude agents that self-destruct. On the face of it, Loosemore should be criticizing MIRI for positing an unintelligent AGI, not for positing a trivially intelligent AGI. For a fuller discussion, see Legg and Hutter’s “A Collection of Definitions of Intelligence”.
b. He argues that safe AGI would be “swarm-like,” with elements that are “unpredictably dependent” on non-representational “internal machinery,” because “logic-based AI” is “brittle”. This seems to contradict the views of many specialists in present-day high-assurance AI systems. As Gerwin Klein writes, “everything that makes it easier for humans to think about a system, will help to verify it.” Indiscriminately adding uncertainty or randomness or complexity to a system makes it harder to model the system and check that it has required properties. It may be less “brittle” in some respects, but we have no particular reason to expect safety to be one of those respects. For a fuller discussion, see Muehlhauser’s “Transparency in Safety-Critical Systems”.
c. MIRI thinks we should try to understand safety-critical general reasoning systems as far in advance as possible, and mathematical logic and rational agent models happen to be useful tools on that front. However, MIRI isn’t invested in “logical AI” in the manner of Good Old-Fashioned AI. Yudkowsky and other MIRI researchers are happy to use neural networks when they’re useful for solving a given problem, and equally happy to use other tools for problems neural networks aren’t well-suited to. For a fuller discussion, see Yudkowsky’s “The Nature of Logic” and “Logical or Connectionist AI?”
d. One undercurrent of Loosemore’s article is that we should model AI after humans. MIRI and FHI worry that this would be very unsafe if it led to neuromorphic AI. On the other hand, modeling AI very closely after human brains (approaching the fidelity of whole-brain emulation) might well be a safer option than de novo AI. For a fuller discussion, see Bostrom’s Superintelligence.
On the whole, Loosemore’s article doesn’t engage much with the arguments of other AI theorists regarding risks from AGI.
Assigning less than 5% probability to ‘cows are moral patients’ strikes me as really overconfident. Ditto, assigning greater than 95% probability. (A moral patient is something that can be harmed or benefited in morally important ways, though it may not be accountable for its actions in the way a moral agent is.)
I’m curious how confident others are, and I’m curious about the most extreme confidence levels they’d consider ‘reasonable’.
I also want to hear more about what theories and backgrounds inform people’s views. I’ve seen some relatively extreme views defended recently, and the guiding intuitions seem to have come from two sources:
(1) How complicated is consciousness? In the space of possible minds, how narrow a target is consciousness?
Humans seem to be able to have very diverse experiences — dreams, orgasms, drug-induced states — that they can remember in some detail, and at least appear to be conscious during. That’s some evidence that consciousness is robust to modification and can take many forms. So, perhaps, we can expect a broad spectrum of animals to be conscious.
But what would our experience look like if it were fragile and easily disrupted? There would probably still be edge cases. And, from inside our heads, it would look like we had amazingly varied possibilities for experience — because we couldn’t use anything but our own experience as a baseline. It certainly doesn’t look like a human brain on LSD differs as much from a normal human brain as a turkey brain differs from a human brain.
There’s some risk that we’re overestimating how robust consciousness is, because when we stumble on one of the many ways to make a human brain unconscious, we (for obvious reasons) don’t notice it as much. Drastic changes in unconscious neurochemistry interest us a lot less than minor tweaks to conscious neurochemistry.
And there’s a further risk that we’ll underestimate the complexity of consciousness because we’re overly inclined to trust our introspection and to take our experience at face value. Even if our introspection is reliable in some domains, it has no access to most of the necessary conditions for experience. So long as they lie outside our awareness, we’re likely to underestimate how parochial and contingent our consciousness is.
(2) How quick are you to infer consciousness from ‘intelligent’ behavior?
People are pretty quick to anthropomorphize superficially human behaviors, and our use of mental / intentional language doesn’t clearly distinguish between phenomenal consciousness and behavioral intelligence. But if you work on AI, and have an intuition that a huge variety of systems can act ‘intelligently’, you may doubt that the linkage between human-style consciousness and intelligence is all that strong. If you think it’s easy to build a robot that passes various Turing tests without having full-fledged first-person experience, you’ll also probably (for much the same reason) expect a lot of non-human species to arrive at strategies for intelligently planning, generalizing, exploring, etc. without invoking consciousness. (Especially if your answer to question 1 is ‘consciousness is very complex’. Evolution won’t put in the effort to make a brain conscious unless it’s strictly necessary for some reproductive advantage.)
… But presumably there’s some intelligent behavior that was easier for a more-conscious brain than for a less-conscious one — at least in our evolutionary lineage, if not in all possible lineages that reproduce our level of intelligence. We don’t know what cognitive tasks forced our ancestors to evolve-toward-consciousness-or-perish. At the outset, there’s no special reason to expect that task to be one that only arose for proto-humans in the last few million years.
Even if we accept that the machinery underlying human consciousness is very complex, that complex machinery could just as easily have evolved hundreds of millions of years ago, rather than tens of millions. We’d then expect it to be preserved in many nonhuman lineages, not just in humans. Since consciousness-of-pain is mostly what matters for animal welfare (not, e.g., consciousness-of-complicated-social-abstractions), we should look into hypotheses like:
- first-person consciousness is an adaptation that allowed early brains to represent simple policies/strategies and visualize plan-contingent sensory experiences.
Do we have a specific cognitive reason to think that something about ‘having a point of view’ is much more evolutionarily necessary for human-style language or theory of mind than for mentally comparing action sequences or anticipating/hypothesizing future pain? If not, the data of ethology plus ‘consciousness is complicated’ gives us little reason to favor the one view over the other.
We have relatively direct positive data showing we’re conscious, but we have no negative data showing that, e.g., salmon aren’t conscious. It’s not as though we’d expect them to start talking or building skyscrapers if they were capable of experiencing suffering — at least, any theory that predicts as much has some work to do to explain the connection. At present, it’s far from obvious that the world would look any different than it does even if all vertebrates were conscious.
So… the arguments are a mess, and I honestly have no idea whether cows can suffer. The probability seems large enough to justify ‘don’t torture cows (including via factory farms)’, but that’s a pretty low bar, and doesn’t narrow the probability down much.
To the extent I currently have a favorite position, it’s something like: ‘I’m pretty sure cows are unconscious on any simple, strict, nondisjunctive definition of “consciousness”; but what humans care about is complicated, and I wouldn’t be surprised if a lot of “unconscious” information-processing systems end up being counted as “moral patients” by a more enlightened age.’ But that’s a pretty weird view of mine, and perhaps deserves a separate discussion.
I could conclude with some crazy video of a corvid solving a Rubik’s Cube or an octopus breaking into a bank vault or something, but I somehow find this example of dog problem-solving more compelling.
Oxford philosopher Nick Bostrom has argued, in “The Superintelligent Will,” that advanced AIs are likely to diverge in their terminal goals (i.e., their ultimate decision-making criteria), but converge in some of their instrumental goals (i.e., the policies and plans they expect to indirectly further their terminal goals). An arbitrary superintelligent AI would be mostly unpredictable, except to the extent that nearly all plans call for similar resources or similar strategies. The latter exception may make it possible for us to do some long-term planning for future artificial agents.
Bostrom calls the idea that AIs can have virtually any goal the orthogonality thesis, and he calls the idea that there are attractor strategies shared by almost any goal-driven system (e.g., self-preservation, knowledge acquisition) the instrumental convergence thesis.
Bostrom fleshes out his worries about smarter-than-human AI in the book Superintelligence: Paths, Dangers, Strategies, which came out in the US a few days ago. He says much more there about the special technical and strategic challenges involved in general AI. Here’s one of the many scenarios he discusses, excerpted:
[T]he orthogonality thesis suggests that we cannot blithely assume that a superintelligence will necessarily share any of the final values stereotypically associated with wisdom and intellectual development in humans — scientific curiosity, benevolent concern for others, spiritual enlightenment and contemplation, renunciation of material acquisitiveness, a taste for refined culture or for the simple pleasures in life, humility and selflessness, and so forth. We will consider later whether it might be possible through deliberate effort to construct a superintelligence that values such things, or to build one that values human welfare, moral goodness, or any other complex purpose its designers might want it to serve. But it is no less possible — and in fact technically a lot easier — to build a superintelligence that places final value on nothing but calculating the decimal expansion of pi. This suggests that — absent a specific effort — the first superintelligence may have some such random or reductionistic final goal.
[... T]he instrumental convergence thesis entails that we cannot blithely assume that a superintelligence with the final goal of calculating the decimals of pi (or making paperclips, or counting grains of sand) would limit its activities in such a way as not to infringe on human interests. An agent with such a final goal would have a convergent instrumental reason, in many situations, to acquire an unlimited amount of physical resources and, if possible, to eliminate potential threats to itself and its goal system. Human beings might constitute potential threats; they certainly constitute physical resources. [...]
It might seem incredible that a project would build or release an AI into the world without having strong grounds for trusting that the system will not cause an existential catastrophe. It might also seem incredible, even if one project were so reckless, that wider society would not shut it down before it (or the AI it was building) attains a decisive strategic advantage. But as we shall see, this is a road with many hazards. [...]
With the help of the concept of convergent instrumental value, we can see the flaw in one idea for how to ensure superintelligence safety. The idea is that we validate the safety of a superintelligent AI empirically by observing its behavior while it is in a controlled, limited environment (a “sandbox”) and that we only let the AI out of the box if we see it behaving in a friendly, cooperative, responsible manner.
The flaw in this idea is that behaving nicely while in the box is a convergent instrumental goal for friendly and unfriendly AIs alike. An unfriendly AI of sufficient intelligence realizes that its unfriendly final goals will be best realized if it behaves in a friendly manner initially, so that it will be let out of the box. It will only start behaving in a way that reveals its unfriendly nature when it no longer matters whether we find out; that is, when the AI is strong enough that human opposition is ineffectual.
Consider also a related set of approaches that rely on regulating the rate of intelligence gain in a seed AI by subjecting it to various kinds of intelligence tests or by having the AI report to its programmers on its rate of progress. At some point, an unfriendly AI may become smart enough to realize that it is better off concealing some of its capability gains. It may underreport on its progress and deliberately flunk some of the harder tests, in order to avoid causing alarm before it has grown strong enough to attain a decisive strategic advantage. The programmers may try to guard against this possibility by secretly monitoring the AI’s source code and the internal workings of its mind; but a smart-enough AI would realize that it might be under surveillance and adjust its thinking accordingly. The AI might find subtle ways of concealing its true capabilities and its incriminating intent. (Devising clever escape plans might, incidentally, also be a convergent strategy for many types of friendly AI, especially as they mature and gain confidence in their own judgments and capabilities. A system motivated to promote our interests might be making a mistake if it allowed us to shut it down or to construct another, potentially unfriendly AI.)
We can thus perceive a general failure mode, wherein the good behavioral track record of a system in its juvenile stages fails utterly to predict its behavior at a more mature stage. Now, one might think that the reasoning described above is so obvious that no credible project to develop artificial general intelligence could possibly overlook it. But one should not be too overconfident that this is so.
Consider the following scenario. Over the coming years and decades, AI systems become gradually more capable and as a consequence find increasing real-world application: they might be used to operate trains, cars, industrial and household robots, and autonomous military vehicles. We may suppose that this automation for the most part has the desired effects, but that the success is punctuated by occasional mishaps — a driverless truck crashes into oncoming traffic, a military drone fires at innocent civilians. Investigations reveal the incidents to have been caused by judgment errors by the controlling AIs. Public debate ensues. Some call for tighter oversight and regulation, others emphasize the need for research and better-engineered systems — systems that are smarter and have more common sense, and that are less likely to make tragic mistakes. Amidst the din can perhaps also be heard the shrill voices of doomsayers predicting many kinds of ill and impending catastrophe. Yet the momentum is very much with the growing AI and robotics industries. So development continues, and progress is made. As the automated navigation systems of cars become smarter, they suffer fewer accidents; and as military robots achieve more precise targeting, they cause less collateral damage. A broad lesson is inferred from these observations of real-world outcomes: the smarter the AI, the safer it is. It is a lesson based on science, data, and statistics, not armchair philosophizing. Against this backdrop, some group of researchers is beginning to achieve promising results in their work on developing general machine intelligence. The researchers are carefully testing their seed AI in a sandbox environment, and the signs are all good. The AI’s behavior inspires confidence — increasingly so, as its intelligence is gradually increased.
At this point, any remaining Cassandra would have several strikes against her:
i. A history of alarmists predicting intolerable harm from the growing capabilities of robotic systems and being repeatedly proven wrong. Automation has brought many benefits and has, on the whole, turned out safer than human operation.
ii. A clear empirical trend: the smarter the AI, the safer and more reliable it has been. Surely this bodes well for a project aiming at creating machine intelligence more generally smart than any ever built before — what is more, machine intelligence that can improve itself so that it will become even more reliable.
iii. Large and growing industries with vested interests in robotics and machine intelligence. These fields are widely seen as key to national economic competitiveness and military security. Many prestigious scientists have built their careers laying the groundwork for the present applications and the more advanced systems being planned.
iv. A promising new technique in artificial intelligence, which is tremendously exciting to those who have participated in or followed the research. Although safety issues and ethics are debated, the outcome is preordained. Too much has been invested to pull back now. AI researchers have been working to get to human-level artificial intelligence for the better part of a century: of course there is no real prospect that they will now suddenly stop and throw away all this effort just when it finally is about to bear fruit.
v. The enactment of some safety rituals, whatever helps demonstrate that the participants are ethical and responsible (but nothing that significantly impedes the forward charge).
vi. A careful evaluation of seed AI in a sandbox environment, showing that it is behaving cooperatively and showing good judgment. After some further adjustments, the test results are as good as they could be. It is a green light for the final step . . .
And so we boldly go — into the whirling knives.
We observe here how it could be the case that when dumb, smarter is safe; yet when smart, smarter is more dangerous. There is a kind of pivot point, at which a strategy that has previously worked excellently suddenly starts to backfire.
For more on terminal goal orthogonality, see Stuart Armstrong’s “General Purpose Intelligence”. For more on instrumental goal convergence, see Steve Omohundro’s “Rational Artificial Intelligence for the Greater Good”.
Eliezer Yudkowsky has written a delightful series of posts (originally on the economics blog Overcoming Bias) about why partisan debates are so frequently hostile and unproductive. Particularly incisive is A Fable of Science and Politics.
One of the broader points Eliezer makes is that, while political issues are important, political discussion isn’t the best place to train one’s ability to look at issues objectively and update on new evidence. The way I’d put it is that politics is hard mode; it takes an extraordinary amount of discipline and skill to communicate effectively in partisan clashes.
This jibes with my own experience; I’m much worse at arguing politics than at arguing other things. And psychological studies indicate that politics is hard mode even (or especially!) for political veterans; see Taber & Lodge (2006).
Eliezer’s way of putting the same point is (riffing off of Dune): ‘Politics is the Mind-Killer.’ An excerpt from that blog post:
Politics is an extension of war by other means. Arguments are soldiers. Once you know which side you’re on, you must support all arguments of that side, and attack all arguments that appear to favor the enemy side; otherwise it’s like stabbing your soldiers in the back — providing aid and comfort to the enemy. [...]
I’m not saying that I think Overcoming Bias should be apolitical, or even that we should adopt Wikipedia’s ideal of the Neutral Point of View. But try to resist getting in those good, solid digs if you can possibly avoid it. If your topic legitimately relates to attempts to ban evolution in school curricula, then go ahead and talk about it — but don’t blame it explicitly on the whole Republican Party; some of your readers may be Republicans, and they may feel that the problem is a few rogues, not the entire party. As with Wikipedia’s NPOV, it doesn’t matter whether (you think) the Republican Party really is at fault. It’s just better for the spiritual growth of the community to discuss the issue without invoking color politics.
Scott Alexander fleshes out why it can be dialogue-killing to attack big groups (even when the attack is accurate) in another blog post, Weak Men Are Superweapons. And Eliezer expands on his view of partisanship in follow-up posts like The Robbers Cave Experiment and Hug the Query.
Some people involved in political advocacy and activism have objected to the “mind-killer” framing. Miri Mogilevsky of Brute Reason explained on Facebook:
My usual first objection is that it seems odd to single politics out as a “mind-killer” when there’s plenty of evidence that tribalism happens everywhere. Recently, there has been a whole kerfuffle within the field of psychology about replication of studies. Of course, some key studies have failed to replicate, leading to accusations of “bullying” and “witch-hunts” and what have you. Some of the people involved have since walked their language back, but it was still a rather concerning demonstration of mind-killing in action. People took “sides,” people became upset at people based on their “sides” rather than their actual opinions or behavior, and so on.
Unless this article refers specifically to electoral politics and Democrats and Republicans and things (not clear from the wording), “politics” is such a frightfully broad category of human experience that writing it off entirely as a mind-killer that cannot be discussed or else all rationality flies out the window effectively prohibits a large number of important issues from being discussed, by the very people who can, in theory, be counted upon to discuss them better than most. Is it “politics” for me to talk about my experience as a woman in gatherings that are predominantly composed of men? Many would say it is. But I’m sure that these groups of men stand to gain from hearing about my experiences, since some of them are concerned that so few women attend their events.
In this article, Eliezer notes, “Politics is an important domain to which we should individually apply our rationality — but it’s a terrible domain in which to learn rationality, or discuss rationality, unless all the discussants are already rational.” But that means that we all have to individually, privately apply rationality to politics without consulting anyone who can help us do this well. After all, there is no such thing as a discussant who is “rational”; there is a reason the website is called “Less Wrong” rather than “Not At All Wrong” or “Always 100% Right.” Assuming that we are all trying to be more rational, there is nobody better to discuss politics with than each other.
The rest of my objection to this meme has little to do with this article, which I think raises lots of great points, and more to do with the response that I’ve seen to it — an eye-rolling, condescending dismissal of politics itself and of anyone who cares about it. Of course, I’m totally fine if a given person isn’t interested in politics and doesn’t want to discuss it, but then they should say, “I’m not interested in this and would rather not discuss it,” or “I don’t think I can be rational in this discussion so I’d rather avoid it,” rather than sneeringly reminding me “You know, politics is the mind-killer,” as though I am an errant child. I’m well-aware of the dangers of politics to good thinking. I am also aware of the benefits of good thinking to politics. So I’ve decided to accept the risk and to try to apply good thinking there. [...]
I’m sure there are also people who disagree with the article itself, but I don’t think I know those people personally. And to add a political dimension (heh), it’s relevant that most non-LW people (like me) initially encounter “politics is the mind-killer” being thrown out in comment threads, not through reading the original article. My opinion of the concept improved a lot once I read the article.
In the same thread, Andrew Mahone added, “Using it in that sneering way, Miri, seems just like a faux-rationalist version of ‘Oh, I don’t bother with politics.’ It’s just another way of looking down on any concerns larger than oneself as somehow dirty, only now, you know, rationalist dirty.” To which Miri replied: “Yeah, and what’s weird is that that really doesn’t seem to be Eliezer’s intent, judging by the eponymous article.”
Eliezer clarified that by “politics” he doesn’t generally mean ‘problems that can be directly addressed in local groups but happen to be politically charged’:
Hanson’s “Tug the Rope Sideways” principle, combined with the fact that large communities are hard to personally influence, explains a lot in practice about what I find suspicious about someone who claims that conventional national politics are the top priority to discuss. Obviously local community matters are exempt from that critique! I think if I’d substituted ‘national politics as seen on TV’ in a lot of the cases where I said ‘politics’ it would have more precisely conveyed what I was trying to say.
Even if polarized local politics is more instrumentally tractable, though, the worry remains that it’s a poor epistemic training ground. A subtler problem with banning “political” discussions on a blog or at a meet-up is that it’s hard to do fairly, because our snap judgments about what counts as “political” may themselves be affected by partisan divides. In many cases the status quo is thought of as apolitical, even though objections to the status quo are ‘political.’ (Shades of Pretending to be Wise.)
Because politics gets personal fast, it’s hard to talk about it successfully. But if you’re trying to build a community, build friendships, or build a movement, you can’t outlaw everything ‘personal.’ And selectively outlawing personal stuff gets even messier. Last year, daenerys shared anonymized stories from women, including several that discussed past experiences where the writer had been attacked or made to feel unsafe. If those discussions are made off-limits because they’re ‘political,’ people may take away the message that they aren’t allowed to talk about, e.g., some harmful or alienating norm they see at meet-ups. I haven’t seen enough discussions of this failure mode to feel super confident people know how to avoid it.
Since this is one of the LessWrong memes that’s most likely to pop up in discussions between different online communities (along with the even more ripe-for-misinterpretation “policy debates should not appear one-sided”…), as a first (very small) step, I suggest obsoleting the ‘mind-killer’ framing. It’s cute, but ‘politics is hard mode’ works better as a meme to interject into random conversations, for a few reasons:
1. ‘Politics is hard mode’ emphasizes that ‘mind-killing’ (= epistemic difficulty) is quantitative, not qualitative. Some things might instead fall under Very Hard Mode, or under Middlingly Hard Mode…
2. ‘Hard’ invites the question ‘hard for whom?’, more so than ‘mind-killer’ does. We’re all familiar with the fact that some people and some contexts change what’s ‘hard’, so it’s a little less likely we’ll universally generalize about what’s ‘hard.’
3. ‘Mindkill’ connotes contamination, sickness, failure, weakness. ‘Hard Mode’ doesn’t imply that a thing is low-status or unworthy, so it’s less likely to create the impression (or reality) that LessWrongers or Effective Altruists dismiss out of hand the idea of hypothetical-political-intervention-that-isn’t-a-terrible-idea. Maybe some people do want to argue for the thesis that politics is always useless or icky, but if so it should be done in those terms, explicitly — not snuck in as a connotation.
4. ‘Hard Mode’ can’t readily be perceived as a personal attack. If you accuse someone of being ‘mindkilled’, with no context provided, that clearly smacks of insult — you appear to be calling them stupid, irrational, deluded, or similar. If you tell someone they’re playing on ‘Hard Mode,’ that’s very nearly a compliment, which makes your advice that they change behaviors a lot likelier to go over well.
5. ‘Hard Mode’ doesn’t carry any risk of evoking (e.g., gendered) stereotypes about political activists being dumb or irrational or overemotional.
6. ‘Hard Mode’ encourages a growth mindset. Maybe some topics are too hard to ever be discussed. Even so, ranking topics by difficulty still encourages an approach where you try to do better, rather than merely withdrawing. It may be wise to eschew politics, but we should not fear it. (Fear is the mind-killer.)
If you and your co-conversationalists haven’t yet built up a lot of trust and rapport, or if tempers are already flaring, conveying the message ‘I’m too rational to discuss politics’ or ‘You’re too irrational to discuss politics’ can make things worse. ‘Politics is the mind-killer’ is the mind-killer. At least, it’s a relatively mind-killing way of warning people about epistemic hazards.
‘Hard Mode’ lets you communicate in the style of the Humble Aspirant rather than the Aloof Superior. Try something in the spirit of: ‘I’m worried I’m too low-level to participate in this discussion; could you have it somewhere else?’ Or: ‘Could we talk about something closer to Easy Mode, so we can level up together?’ If you’re worried that what you talk about will impact group epistemology, I think you should be even more worried about how you talk about it.
Kate Donovan said of the above comic, “This is the Robbiest xkcd I’ve seen,” which is one of my favorite compliments of all time. I love discombobulating words; and recombobulating them; really, bobulating them in all sorts of ways. Though especially in ways that make new poetries possible, or lead to new insights about the world and its value.
I’m very fond of the approach of restricting myself to common words (Up-Goer Five), and of other systematic approaches. But I think my favorite of all is the artificial language Anglish: English using only native roots.
Although English is a Germanic language, only 1/4 of modern English words (that you’ll find in the Shorter Oxford Dictionary) have Germanic roots. The rest mostly come from Latin, either directly or via French. This borrowing hasn’t just expanded our vocabulary; it’s led to the loss of countless native English words which were replaced by synonyms perceived as more formal or precise. Since a lot of these native words are just a joy to say, since their use sheds light on many of English’s vestigial features, and since derivations from English words are often far easier to break down and parse than lengthy classical coinings (e.g., needlefear rather than aichmophobia), Anglo-Saxon linguistic purists are compiling a dictionary to translate non-native words into Germanic equivalents. Some of the more entertaining entries follow.
A and B
- abduct = neednim
- abet = frofer
- abhor = mislike
- abominable = wargly
- abortion = scrapping
- accelerate = swiften
- accessible = to-goly
- accident = mishappening
- accordion = bellowharp
- active = deedy
- adherent = clinger, liefard
- adolescent = halfling, younker, frumbeardling
- adrenaline = bykidney workstuff
- adulation = flaundering, glavering
- adversity = thwartsomeness, hardhap
- Afghan = Horsemanlandish
- afraid = afeared
- Africa = Sunnygreatland
- aged = oldened
- agglomerate = clodden
- aggressive = fighty
- agitation = fret of mind
- AIDS = Earned Bodyweir Scantness Sickness
- airplane = loftcraft
- albino = whiteling
- alcoholic = boozen
- altercation = brangling
- America = Markland, Amerigoland, Wineland
- anathema = accursed thing
- angel = errand-ghost
- anglicization = englishing
- anime = dawnlandish livedrawing
- annihilate = benothingen
- antecedence = beforemath
- anthropology = folklore
- anti- = nomore-
- antimatter = unstuff
- antiquity = oldendom
- antisemitism = jewhate
- aorta = lofty heartpipe
- apostle = sendling
- arithmetic = talecraft
- arm (v.) = beweapon
- armadillo = girdledeer
- arrest = avast
- artificial = craftly
- asparagus = sparrowgrass
- assassinated = deathcrafted
- assembly = forsamening
- audacious = daresome, ballsy
- augment = bemore, eken
- August = Weedmonth
- autopsy = open bodychecking
- avalanche = fellfall
- avant garde = forhead
- avert = forfend, forethwart
- ballet = fairtumb
- ballistics = shootlore
- balloon = loftball
- banana = moonapple, finger-berry
- banquet = benchsnack
- barracks = warbarn
- basketball = cawlball
- bastard = mingleling, lovechild
- battlefield = hurlyburlyfield, badewang
- beau = ladfriend, fop
- beautiful = eyesome, goodfaced
- behavioral economics = husbandry of the how
- Belgium = Belgy
- bestiality = deerlust
- betrayer = unfriend, foe-friend, mouth friend
- bicameral = twifackly
- bisexuality = twilust
- blame = forscold
- blasphemy = godsmear
- bong = waterpipe
- bourgeois = highburger
- boutique = dressshop
- braggart = mucklemouth
- braille = the Blind’s rune
- brassiere = underbodice
- bray = heehaw
- breakable = brittle, brickle, breaksome, bracklesome
- breeze = windlick
- buggery = arseswiving
- burlesque = funnish
- butter = cowsmear
C and D
- calculus = flowreckoning
- campus = lorefield
- cancerous = cankersome
- capacity = holdth
- capsize = wemmel
- carbon dioxide = twiathemloft chark, onecoal-twosour-stuff, fizzloft
- carnal attraction = fleshbesmittenness
- cartouche = stretched foreverness-rope
- catechism = godlore handbook
- caterpillar = Devil’s cat, hairy cat, butterfly worm
- catheter = bodypipe
- cattle = kine
- cause (n.) = bring-about, onlet, wherefrom
- cell = hole, room, frume, lifebrick
- cell division = frumecleaving
- cell membrane = frumenfilmen
- cement = brickstick
- cerebellum = brainling
- certainly = forsooth, soothly, in sooth
- cerulean = woadish
- chaos = mayhem, dwolm, topsy-turvydom, unfrith
- character = selfsuchness
- charity = givefulness
- chocolate = sweetslime
- circumcise = umcut
- circumstance = boutstanding, happenstanding
- civilization = couthdom, settledhood
- civilize = tame, couthen
- clamor = greeding
- clarify = clearen
- classification = bekinding
- clavicle = sluttlebone
- cliche = unthought-up saying, oftquote, hackney
- clinic = sickbay
- clockwise = sunwise
- coffer = hoardvat
- coitus = swiving, bysleep
- color = huecast, light wavelength
- combine = gatherbind
- comedian = funspeller, lustspeller, laughtersmith
- comedy = funplay, lustplay
- comestible = eatsome, a food thing
- comfort = frover, weem, soothfulness
- comfortable = weemly, froverly
- comment = umspeech
- CD-ROM = WR-ROB (withfasted-ring-read-only bemindings)
- companion = helpmate
- comparative anatomy = overlikening bodylore
- compare = aliken, gainsame liken, game off against
- complexion = blee, skin-look
- compliant = followsome
- composition = nibcraft
- concentrated = squished together
- concentration camp = cramming-laystead
- concentric = middlesharing
- condition = fettle
- condom = scumbag
- conscience = inwit, heart’s eye
- convergence = togethering
- convert = bewhirve
- copious = beteeming
- corner = nook, winkle
- correction fluid = white-out
- corridor = hallway
- corrugated = wrizzled
- Costa Rican = Rich Shorelander
- Côte d’Ivoire = Elfbone Shoreland
- cotton = treewool
- coward = dastard, arg
- crème de la crème = bee’s knees
- criterion = deemmean
- cytoskeleton = frumenframework
- dairy = deyhouse, milkenstuff
- danger = freech, deathen
- data = put, rawput, meteworths
- database = putbank
- deceive = swike, beswike, fop, wimple
- defame = shend, befile
- defeat = netherthrow
- defenestrate = outwindowse
- deify = begod
- delusion = misbelief
- demeanour = jib
- demilitarized = unlandmighted
- dependence = onhanginess
- descendent = afterbear, afterling
- despair = wanhope
- dinosaur = forebird
- disarrange = rumple
- disaster = harrow-hap, ill-hap, banefall, baneburst, grimming
- disinfect = unsmittle
- disprove = belie
- disturbance = dreefing, dreep-hap
- divination = weedgle
- division = tweeming
Um, all the other ones
- ease (n.) = eath, frith of mind
- egalitarianism = evendom
- electricity = sparkflow, ghostfire
- electron = amberling
- elevate = aloofen
- embryo = womb-berry
- enable = canen, mayen
- enact = umdo, emdo
- encryption = forkeying
- energy = dodrive, inwork, spring
- ensnare = swarl
- enthusiasm = faith-heat
- environment = lifescape, setting, umwhirft
- enzyme = yeaster, yeastuff
- ephemeral = dwimmerly
- equation = likening, besaming
- ethnic minority = outlandish fellowship
- evaluate = bedeem, bereckon, beworthen
- example = bisen, byspell, lodestar, forbus
- exaptation = kludging
- existent = wesand, forelying, issome
- face = nebb, andlit, leer, hue, blee, mug
- fair (n.) = hoppings
- female = she-kind
- fetid = flyblown, smellful, stenchy
- figment = farfetchery
- fornication = whorery, awhoring
- fray = frazzle
- fugitive = lamster, flightling
- gas-powered = waftle-driven
- gland = threeze
- history = yorelore, olds, eretide
- Homo sapiens = Foppish man
- horror = grir
- ignorance = unskill, unwittleness
- impossible = unmightly
- incorrect = unyearight
- increase = formore, bemoren
- independence = unoffhangingness
- indiscriminately = shilly-shally, allwardly
- infancy = babytime
- intoxication = bedrunkenhood
- invasion = inslaught
- jolly = full of beans
- juggernaut = blindblooter
- kamikaze = selfkilling loftstrike
- kangaroo = hopdeer
- laser = lesyr (light eking by spurred yondputting of rodding)
- limerence = crush
- lumpenproletariat = underrabble
- lysosome = selfkillbag
- malicious = argh, evilful
- maltreat = misnutt
- mammal = suckdeer, suckledeer
- March = Winmonth
- marsupial = pungsucker
- martyr = bloot
- megalopolis = mickleborough
- mercy = milds
- mitochondrion = mightcorn
- mock = geck, betwit
- nanotechnology = motish witcraft, smartdust
- natural selection = unmanmade sieving
- nostalgia = yesterlove
- nursery = childergarden
- ocean = the great sea, the blue moor, sailroad, the brine
- old-fashioned = old-fangled
- orchid = wombbloom
- palindrome = drowword
- pervert = lewdster
- pianoforte = softhard keyboard
- pregnancy = childladenhood
- prehistory = aforeyorelore, yesteryore
- quid pro quo = tit for tat
- revolution = whirft, umbewrithing
- romanticism = lovecraft, storm-and-throng-troth
- sagacious = hyesnotter, sarrowthankle, wisehidey, yarewittle
- satire = scoldcraft
- scarab = turd-weevil
- science = learncraft, the knowledges
- second = twoth
- somnolent = sloomy
- spirit = poost
- sublingual salivary glands = undertungish spittlethreezen
- sugar = beeless honey
- tabernacle = worship booth
- underpants = netherbritches
- undulating = wimpling
- unintelligent = unthinkle
- usurer = wastomhatster, wookerer
- velociraptor = dashsnatcher
- volcano = fireberg, welkinoozer
- vowel = stevening
- voyage = farfare
- walrus = horsewhale
You have been gifted a new Dadaist superpower. I release you unto the world with it.
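(For the programmatically inclined: the list above is just a lookup table, so a few lines of code will do the swapping for you. Here’s a minimal sketch in Python. The `anglish` dictionary and the `anglicize` function are illustrative names of my own; a real translator would also need to handle capitalization and multi-word phrases.)

```python
import re

# A handful of entries from the list above; extend it as far as you like.
# Where the list gives several renderings, I've kept just the first.
anglish = {
    "accident": "mishappening",
    "airplane": "loftcraft",
    "balloon": "loftball",
    "butter": "cowsmear",
    "chocolate": "sweetslime",
    "electricity": "sparkflow",
    "ocean": "the great sea",
    "science": "learncraft",
}

def anglicize(text):
    """Swap each word the glossary knows for its Anglish rendering."""
    def swap(match):
        word = match.group(0)
        return anglish.get(word.lower(), word)
    return re.sub(r"[A-Za-z]+", swap, text)

print(anglicize("The airplane crossed the ocean."))
# -> The loftcraft crossed the great sea.
```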
Last month I proposed a new solution to the problem of choosing family names: When you’re starting a family, you and your partners construct and adopt an entirely new middle name, a ‘union name’ symbolizing your shared life and shared values. If you have children, this union name then becomes their surname.
Many people voiced enthusiasm about the idea, but many also raised interesting concerns and criticisms. I’ve collected them here, with my responses.
Objection 1: It’s better for women to be subordinate, and patrilineal family names help reinforce that. Patriarchal families and societies are happier, stabler, and more successful.
Response: Most women seem to want more autonomy, not less (Pew 2010). That’s very surprising, if autonomy makes them worse off. In fact, that bit of evidence on its own mostly settles the question, until we get strong evidence that women are systematically wrong in this highly specific way about their own interests. We find ourselves in a position similar to an abolitionist trying mightily to refute the claim that Africans love being slaves. Five minutes of talking to people, in a setting where they can talk freely, does the job, and we can move on to more interesting matters.
If there’s compelling evidence to the contrary, I’ll need to see it before I can say much more. The same goes for the political claim: I need some reason to doubt the surface-level appearance that ‘gender equity makes societies more prosperous (Dollar & Gatti 1999; Brummett 2008), healthy (Kawachi et al. 1999), and just (Melander 2005)’.
Objection 2: Union names are too convenient. We should retain an annoying, difficult system, because then it will be more diagnostic of future relationship woes. If people have to fight over whose name gets passed on to the kids, that will ruin relationships that wouldn’t, or shouldn’t, have lasted.
Response: In general, it’s bad policy to make people’s lives worse as a test or trial, unless you’re in desperate need of the data such a test is likely to provide (and have no other way to acquire it). There may be two false assumptions behind the above objection:
(i) Small inconveniences don’t matter. Lots and lots of small inconveniences distributed over a population add up to a big impact. And if by chance a lot of them happen in your life at once, they can certainly feel big! People are often under an unusual amount of pressure when they’re deciding whether to have kids or begin a serious long-term relationship. If there’s anything we can do to make their challenges at that stage in life more fun, inspiring, and pleasant, we should jump at the chance.
(ii) If you break up for dumb reasons, you would have broken up eventually anyway. That’s not how relationships work. First, relationships don’t remain at the exact same strength at all times; they can grow in strength. (Or shrink, or oscillate.) Second, failing to overcome a low-level challenge isn’t proof that you would also have failed at all high-level challenges. Bad break-ups can occur just because the wrong thing happened at the wrong time. Life is chaotic, and love’s dynamics are not constrained by what should have happened.
The take-away from this is that we should have compassion and try to make people’s lives better, in small ways and large ones. People don’t deserve solitude or angst just because we find the reason behind their relationship troubles silly.
Also: Union names are challenging. They do help test the strength of people’s commitment. But they do so in a way that tests a relevant skill for romantic and familial relationships. The ability to collaborate, make mutual compromises, come up with imaginative solutions, and find common ground — that’s what union names are training and testing. The ability to be dominant or subordinate, to demand unequal sacrifices, to adhere to out-of-date social norms — that’s what more traditional naming systems are training and testing. I think the former skills are more important for more people.
Objection 3: Naming your children from scratch is hard. Our naming conventions should streamline the process, not add more complexity.
Response: I’d expect social conventions to arise that give people obvious standard choices for surnames — name X after loved one Y, give X common popular name Y, … — so that most people don’t end up inventing names from scratch. That’s how given names currently work, so it’s probably how union names will work too.
As for why we should add even a small amount of work to the process: Human names actually matter. They can have a much bigger and more direct impact on our self-image and social relations than inanimate object names can. If union names encourage people to think and talk more carefully and cooperatively about what identity they want for themselves and their children, great!
Objection 4: Parents can’t be trusted to make up entirely new names for their children. Look how terrible they are just at coming up with decent first names!
Response: It’s certainly a shortcoming of union names that they allow parents to screw up their kids’ lives in more drastic ways. However, if we have mechanisms in place for keeping parents from choosing seriously socially harmful first names for kids, then those mechanisms should generalize to socially harmful surnames.
(In fact, giving parents more leeway might force bureaucrats to take this problem more seriously and put more laws on the books. So the end result could well be fewer irresponsible name choices.)
Objection 5: Giving people so much control makes it likely they’ll later be less happy with their choice. If you give them less freedom, they’ll grow attached to their choice and rationalize it more readily.
Response: Entirely true! In general, giving people more freedom lets them select more personalized options, but it also makes them more indecisive, anxious, and likely to regret their decision. See Dan Gilbert’s excellent talk on synthetic happiness.
I accept this as a cost, but I think it’s worth it for all the advantages union names confer. Ultimately, we’ll just have to try them and see how they work. If binding families together in a more free, egalitarian, imaginative, and collaborative way doesn’t end up having as many (foreseen or unforeseen) benefits as one might suspect, then a much simpler, more automated system may turn out to be superior.
If people really just don’t care that much about surnames, then you could, for example, flip a coin to decide whose name gets taken on by everyone else. But my suspicion is that trivializing family bonds in that way isn’t the best solution available. (For instance, the parent who randomly has to change eir name may not be the one in the better position to bear the associated social costs.)
Objection 6: So why not just use a coin flip to decide which surname the children get, but let the parents have completely different names? Or leave the parents’ names intact, but use some arbitrary system to assign surnames to the children? For example, you could give the first child the alphabetically earliest surname of its parents, the second child the second-earliest surname, then keep cycling through.
Response: Coin flips and arbitrary conventions are admirably fair. But they still bear the cost of making the whole process seem meaningless and impersonal. Why not humanize and personalize our naming conventions, if we’ve found a relatively easy and simple way to do so?
I’m also wary of systems that give different surnames to the children, even randomly. First, I don’t want to encourage parents, even a little bit, to choose how many children they have based on an implicit desire to pass on their name, or on an implicit desire to equalize the distribution of names, or what-have-you. People’s decision-making is capricious and destructive enough without society going out of its way to distract them with shiny gold Name coins.
Second, I don’t want to factionalize families. These proposals all have the disadvantage of frequently leaving one family member excluded from an important symbolic tie that binds the rest of the family together. Compared to other systems, union names are just what they sound like: they encourage familial unity more than any alternative does. They create a symbol that ties everyone in the group together, with no one left out in the cold, favored over the rest, or cut off into separate tribes; and they do so without any reliance on pointless infighting or dominance hierarchies.
My own parents went with: ‘The kids take on the father’s surname, but the mother’s name stays unchanged.’ In some ways that’s progress, but it’s still sexist and awkward. It means my mother’s forever cut off from the rest of the family. It means we can’t all rally together under one banner, lest we incur dissonance. It’s a small thing, but some small things matter.
Objection 7: Your system requires partners to come to an agreement on challenging, highly personal issues with many degrees of freedom. That’s a recipe for disaster.
Response: It’s true that union names demand some maturity and willingness to compromise in order to work. I don’t think that’s a bad thing. The alternative is to make our naming conventions unequal (so one person gets final say) or arbitrary (so nobody gets final say).
That said, if two partners are completely unable to agree on a single name, they can still fall back on creating a union name that’s a hyphenated version of their two top choices. This may not be ideal, but it’s one of a variety of compromises the system allows. And since it gets replaced by the next generation’s union name (rather than merged with it), it doesn’t run into the problem of accumulating more and more names over time, and doesn’t become unmanageably large.
Objection 8: What about single parents?
Response: For simplicity’s sake, let’s assume a parent who has never been in any unions. (Though if ey has, that doesn’t solve this problem; you probably don’t want to falsely suggest that your child is in the same family as an unrelated ex of yours.) So the parent’s name is A B.
The simplest answer would be to just name the child C B, like most English speakers do today. But that will introduce confusion, because — assuming siblings are more common than single parents in this union-name-using community — people will initially think that A B and C B are siblings, rather than parent and child.
So I recommend sticking with the union system, and having the parent make up a new name D, change eir name to A D B, and name the child C D.
This has the advantage of allowing you to later ‘adopt’ a spouse Y Z into the same union — say, if, while the child is still very young, you marry someone who ends up acting as the child’s caregiver. That new spouse will then take on the middle name D, becoming Y D Z.
If you tried to ‘adopt’ someone into you and your child’s family name without constructing a new union name, then you’d end up having to either: (a) look silly by doubling your own name and becoming A B B to match your spouse Y B Z; (b) look like your spouse’s child by remaining A B and having your spouse become Y B Z; or (c) have your spouse completely change surnames to Y B or Y Z B, which abandons the union name system and all its special advantages.
Just sticking to the union system in all cases seems easier, once it’s well-established. A family with one parent is just as real a family as any other, and deserves just as much to be commemorated with whatever rituals a society uses to honor familial ties.
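(For readers who like to see the bookkeeping spelled out, here’s a minimal sketch of these rules in Python. The `Person` fields and function names are illustrative choices of mine, not part of the proposal; the sketch just encodes the A B, A D B, C D, and Y D Z moves described above.)

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Person:
    given: str            # personal first name, e.g. 'A'
    union: Optional[str]  # shared union name, e.g. 'D'; None before any union
    birth: Optional[str]  # surname carried from before the union, e.g. 'B'

    def full_name(self):
        return " ".join(part for part in (self.given, self.union, self.birth) if part)

def found_single_parent_union(parent, union_name, child_given):
    """The parent A B coins a new union name D and becomes A D B;
    the child takes the union name as surname, becoming C D."""
    parent.union = union_name
    return Person(given=child_given, union=union_name, birth=None)

def adopt_spouse(spouse, union_name):
    """A later spouse Y Z joins the existing union, becoming Y D Z."""
    spouse.union = union_name

parent = Person("A", None, "B")
child = found_single_parent_union(parent, "D", "C")
spouse = Person("Y", None, "Z")
adopt_spouse(spouse, parent.union)

print(parent.full_name())  # A D B
print(child.full_name())   # C D
print(spouse.full_name())  # Y D Z
```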
Objection 9: Your system doesn’t allow traditionalists to pass on the torch of their name with any staying power. All trace of our names will be erased within two generations. That means that legacy names like ‘John Jones VII’ aren’t just discouraged; they’re impossible.
Response: This is true, but I’m not sure it matters very much. Names should be first and foremost about the individuals named. If those names refer to some historical event or lineage, that should be because the lineage is of unusual personal significance to the individual, not because the individual has been pressured into conforming to an arbitrary tradition. It’s a good thing if union names encourage people to construct their own identities as they build their deepest personal bonds and carry out the project of their lives, rather than encouraging people to base their identities primarily on the echoes and expectations of distant ancestors.
That said, union names don’t forbid ancestral naming traditions. If you really want to preserve your name across two or more generations, you can use an alternating system: Sam Boutros Ghali can beget Uma Ghali Boutros, who begets Shashi Boutros Ghali…. You’d just need to start families with people willing to take on one of your traditional names.
As for the impossibility of giving your child your exact name under this system… that’s definitely a feature, not a bug. Union names are a relatively poor choice if domineering creepiness or ambiguity are the things you want from your naming system.
Objection 10: But doesn’t that just reintroduce the problem of one partner getting to impose eir will on the other?
Response: Yes. This will be possible on any nonrandom system. Selfishness and inequality happen in relationships. Union names don’t make it impossible for partners to pressure each other into things they don’t want to do. Union names just make inegalitarian solutions unnecessary, and make the products of name negotiations more interesting and meaningful.
Objection 11: How do we tell ordinary middle names apart from union names?
Response: Well, we could stop giving children middle names so much. If we unambiguously use them only for unions, then we have a very convenient way of knowing people’s relationship history at a glance. Perhaps most people will be satisfied expressing their naming ideas through union names themselves, shrinking the desire for other bonus names.
Then again, maybe some ambiguity is good. Middle names add noise that creates a bit more privacy for people.
Another solution is to have an optional convention for marking the transition from personal names to union / sur-names in one’s full name. For instance, although this wouldn’t be required, you could inject the word ‘of’ before the first union name, if you really want to be clear about your name’s meaning. If personal middle names ever die out, though, this convention should die with them.
Objection 12: Union names make people think of their identities as tied to their partners’ and children’s identities. That’s unhealthy and/or unrealistic.
Response: I disagree. Our identities are tied to our loved ones. They shape our experiences, and draw out of us a specific persona. Both of those factors affect our personality on a deep level. It’s healthy to have some space from one’s family, but it’s also healthy to recognize how indebted we are to our friends, family, and community for who we are.
Hiding from your environment is not rediscovering what’s Authentically You; it’s refusing to acknowledge the part of the Authentic You that’s ineradicably bound up in the outside world.
Objection 13: Union names give parents total control over their children’s names, and very little control over their own names. The reverse makes far more sense. Children should pick their names as a rite of passage, reinforcing their autonomy and self-determination and discouraging parents from thinking of their children as possessions or works of art.
Response: This is a good objection! I do worry about all naming systems that simply impose the parents’ will on the next generation. Children should have a say in their identity — by default, not just if they go out of their way to buck social pressure. But they also need to be called something before they’re old enough to self-name. Some sort of compromise is needed.
My personal suggestion is to encourage children to legally change their first name when they reach a certain age. If this coming-of-age ritual generally leaves the surname intact, then it will remain consistent with the union name system.
I’ll keep expanding the above list as people keep having new ideas!
Most good people are kind in an ordinary way, when the intensity of human suffering in the world today calls for heroic kindness. I’ve seen ordinary kindness criticized as “pretending to try”. We go through the motions of humanism, but without significantly inconveniencing ourselves, without straying from our established habits, without violating societal expectations. It’s not that we’re being deliberately deceitful; it’s just that our stated values are in conflict with the lack of urgency revealed in our behaviors. If we want to see real results, we need to put more effort than that into helping others.
The Effective Altruism movement claims to have made some large strides in the direction of “actually trying”, approaching our humanitarian problems with fresh eyes and exerting a serious effort to solve them. But Ben Kuhn has criticized EA for spending more time “pretending to actually try” than “actually trying”. Have we become more heroic in our compassion, or have we just become better at faking moral urgency?
I agree with his criticism, though I’m not sure how large and entrenched the problem is. I bring it up in order to address a reply by Katja Grace. Katja wrote ‘In praise of pretending to really try’, granting Ben’s criticism but arguing that the phenomenon he’s pointing to is a good thing.
“Effective Altruism should not shy away from pretending to try. It should strive to pretend to really try more convincingly, rather than striving to really try.
“Why is this? Because Effective Altruism is a community, and the thing communities do well is modulating individual behavior through interactions with others in the community. Most actions a person takes as a result of being part of a community are pretty much going to be ‘pretending to try’ by construction. And such actions are worth having.”
If I’m understanding Katja’s argument right, it’s: ‘People who pretend to try are motivated by a desire for esteem. And what binds a community together is in large part this desire for esteem. So we can’t get rid of pretending to try, or we’ll get rid of what makes Effective Altruism a functional community in the first place.’
The main problem here is in the leap from ‘if you pretend to try, then you’re motivated by a desire for esteem’ to ‘if you’re motivated by a desire for esteem, then you’re pretending to try’. Lo:
“A community of people not motivated by others seeing and appreciating their behavior, not concerned for whether they look like a real community member, and not modeling their behavior on the visible aspects of others’ behavior in the community would generally not be much of a community, and I think would do less well at pursuing their shared goals. [...]
“If people heed your call to ‘really try’ and do the ‘really trying’ things you suggest, this will have been motivated by your criticisms, so seems more like a better quality of pretending to really try, than really trying itself. Unless your social pressure somehow pressured them to stop being motivated by social pressure.”
The idea of ‘really trying’ isn’t ‘don’t be influenced by social pressure’. It’s closer to ‘whatever, be influenced by social pressure however you want — whatever it takes! — as long as you end up actually working on the tasks that matter’. Signaling (especially honest signaling) and conformity (especially productive conformism) are not the enemy. The enemy is waste, destruction, human misery.
The ‘Altruism’ in ‘Effective Altruism’ is first and foremost a behavior, not a motivation. You can be a perfectly selfish Effective Altruist, as long as you’ve decided that your own interests are tied to others’ welfare. So in questioning whether self-described Effective Altruists are living up to their ideals, we’re primarily questioning whether they’re acting the part. Whether their motives are pure doesn’t really matter, except as a device for explaining why they are or aren’t actively making the world a better place.
“I don’t mean to say that ‘really trying’ is bad, or not a good goal for an individual person. But it is a hard goal for a community to usefully and truthfully have for many of its members, when so much of its power relies on people watching their neighbors and working to fit in.”
To my ear, this sounds like: ‘Being a good fireman is much, much harder than looking like a good fireman. And firemen are important, and their group cohesion and influence depends to a significant extent on their being seen as good firemen. So we shouldn’t chastise firemen who sacrifice being any good at their job for the sake of looking as though they’re good at their job. We should esteem them alongside good firemen, albeit with less enthusiasm.’
I don’t get it. If there are urgent Effective Altruism projects, then surely we should be primarily worried about how much real-world progress is being made on those projects. Building a strong, thriving EA community isn’t particularly valuable if the only major outcome is that we perpetuate EA, thereby allowing us to further perpetuate EA…
I suppose this strategy makes sense if it’s easier to just focus on building the EA movement and waiting for a new agenty altruist to wander in by chance, than it is to increase the agentiness of people currently in EA. But that seems unlikely to me. It’s harder to find ‘natural’ agents than it is to create or enhance them. And if we allow EA to rot from within and become an overt status competition with few aspirations to anything higher, then I’d expect us to end up driving away the real agents and true altruists. The most sustainable way to attract effective humanists is to be genuinely effective and genuinely humanistic, in a visible way.
At some point, the buck has to stop. At some point, someone has to actually do the work of EA. Why not now?
A last point: I think an essential element of ‘pretending to (actually) try’ is being neglected here. If I’m understanding how people think, pretending to try is at least as much about self-deception as it is about signaling to others. It’s a way of persuading yourself that you’re a good person, of building an internal narrative you can be happy with. The alternative is that the pretenders are knowingly deceiving others, which sounds a bit too Machiavellian to fit my model of realistic psychology.
But if pretending to try requires self-deception, then what are Katja and Ben doing? They’re both making self-deception a lot harder. They’re both writing posts that will make their EA readers more self-aware and self-critical. On my model, that means that they’re both making it tougher to pretend to try. (As am I.)
But if that’s so, then Ben’s strategy is wiser. Reading Ben’s critique, a pretender is encouraged to switch to actually trying. Reading Katja’s, pretenders are still beset with dissonance, but now without any inspiring call to self-improvement. The clearest way out will then be to give up on pretending to try, and give up on trying.
I’m all for faking it till you make it. But I think that faking it transitions into making it, and avoids becoming a lost purpose, in part because we continue to pressure people to live lives more consonant with their ideals. We should keep criticizing hypocrisy and sloth. But the criticism should look like ‘we can do so much better!’, not ‘let us hunt down all the Fakers and drive them from our midst!’.
It’s exciting to realize that so much of what we presently do is thoughtless posturing. Not because any of us should be content with ‘pretending to actually try’, but because it means that a small shift in how we do things might have a big impact on how effective we are.
Imagine waking up tomorrow, getting out of bed, and proceeding to do exactly the sorts of things you think are needed to bring about a better world. What would that be like?