nothing is mere

Inhuman altruism: Inferential gap, or motivational gap?

Brienne recently wrote that most LessWrongers and effective altruists eat meat because they haven’t yet been persuaded that non-human animals can experience suffering:

Vegans: If the meat eaters believed what you did about animal sentience, most of them would be vegans, and they would be horrified by their many previous murders. Your heart-wrenching videos aren’t convincing to them because they aren’t already convinced that animals can feel.

Meat-eaters: Vegans think there are billions of times more people on this planet than you do, they believe you’re eating a lot of those people, and they care about every one of them the way you care about every human. […]

Finally, let me tell you about what happens when you post a heart-wrenching video of apparent animal suffering: It works, if the thing you’re trying to do is make me feel terrible. My brain anthropomorphizes everything at the slightest provocation. Pigs, cows, chickens, mollusks, worms, bacteria, frozen vegetables, and even rocks. And since I know that it’s quite easy to get me to deeply empathize with a pet rock, I know better than to take those feelings as evidence that the apparently suffering thing is in fact suffering. If you posted videos of carrots in factory farms and used the same phrases to describe their miserable lives and how it’s all my fault for making the world this terrible place where oodles of carrots are murdered constantly, I’d feel the same way. So these arguments do not tend to be revelatory of truth.

I’ve argued before that non-human animals’ abilities to self-monitor, learn, collaborate, play, etc. aren’t clear evidence that they have a subjective, valenced point of view on the world. Until we’re confident we know what specific physical behaviors ‘having a subjective point of view’ evolved to produce — what cognitive problem phenomenal consciousness solves — we can’t confidently infer consciousness from the overt behaviors of infants, non-human animals, advanced AI, anesthetized humans, etc.

[I]f you work on AI, and have an intuition that a huge variety of systems can act ‘intelligently’, you may doubt that the linkage between human-style consciousness and intelligence is all that strong. If you think it’s easy to build a robot that passes various Turing tests without having full-fledged first-person experience, you’ll also probably (for much the same reason) expect a lot of non-human species to arrive at strategies for intelligently planning, generalizing, exploring, etc. without invoking consciousness. (Especially if [you think consciousness is very complex]. Evolution won’t put in the effort to make a brain conscious unless it’s extremely necessary for some reproductive advantage.)

 

That said, I don’t think any of this is even superficially an adequate justification for torturing, killing, and eating human infants, intelligent aliens, or cattle.


The intellectual case against meat-eating is pretty air-tight

To argue from ‘we don’t understand the cognitive basis for consciousness’ to ‘it’s OK to eat non-humans’ is to treat our ignorance as though it were positive knowledge we could confidently rest our weight on. Even if you have a specific cognitive model that predicts ‘there’s an 80% chance cattle can’t suffer,’ you have to be just as cautious as you’d be about torturing a 20%-likely-to-be-conscious person in a non-vegetative coma, or a 20%-likely-to-be-conscious alien. And that’s before factoring in your uncertainty about the arguments for your model.

The argument for not eating cattle, chickens, etc. is very simple:

1. An uncertainty-about-animals premise, e.g.: We don’t know enough about how cattle cognize, and about what kinds of cognition make things moral patients, to assign a less-than-1-in-20 subjective probability to ‘factory-farmed cattle undergo large quantities of something-morally-equivalent-to-suffering’.

2. An altruism-in-the-face-of-uncertainty premise, e.g.: You shouldn’t do things that have a 1-in-20 (or greater) chance of contributing to large amounts of suffering, unless the corresponding gain is huge. E.g., you shouldn’t accept $100 to flip a switch that 95% of the time does nothing and 5% of the time nonconsensually tortures an adult human for 20 minutes.

3. An eating-animals-doesn’t-have-enormous-benefits premise.

4. An eating-animals-is-causally-linked-to-factory-farming premise.

5. So don’t eat the animals in question.
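
To make premise 2 concrete, here is a minimal expected-value sketch of the argument’s structure. The numbers are illustrative placeholders of my own; nothing hangs on them, only on the expected harm dwarfing the benefit at any probability above the 1-in-20 floor.

```python
# Illustrative stand-in numbers for the structure of premises 1-3.
p_suffering = 0.05       # premise 1: at least a 1-in-20 chance that factory-farmed
                         # cattle undergo something morally equivalent to suffering
harm_if_real = 10_000    # stand-in magnitude for 'large amounts of suffering'
benefit_of_eating = 10   # stand-in for taste/convenience gains (premise 3)

expected_harm = p_suffering * harm_if_real
print(expected_harm, benefit_of_eating, expected_harm > benefit_of_eating)
# 500.0 10 True: the gamble fails unless the corresponding gain is huge
```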

This doesn’t require us to indulge in anthropomorphism or philosophical speculation. And Brienne’s updates to her post suggest she now agrees that a lot of meat-eaters we know assign a non-negligible probability to ‘cattle can suffer’. (Also, kudos to Brienne on not only changing her mind about an emotionally fraught issue extremely rapidly, but also changing the original post. A lot of rationalists who are surprisingly excellent at updating their beliefs don’t seem to fully appreciate the value of updating the easy-to-Google public record of their beliefs to cut off the spread of falsehoods.)

This places intellectually honest meat-eating effective altruists in a position similar to Richard Dawkins':

[I’m] in a very difficult moral position. I think you have a very, very strong point when you say that anybody who eats meat has a very, very strong obligation to think seriously about it. And I don’t find any very good defense. I find myself in exactly the same position as you or I would have been — well, probably you wouldn’t have been, but I might have been — 200 years ago, talking about slavery. […T]here was a time when it was simply the norm. Everybody did it. Some people did it with gusto and relish; other people, like Jefferson, did it reluctantly. I would have probably done it reluctantly. I would have sort of just gone along with what society does. It was hard to defend then, yet everybody did it. And that’s the sort of position I find myself in now. […] I live in a society which is still massively speciesist. Intellectually I recognize that, but I go along with it the same way I go along with celebrating Christmas and singing Christmas carols.

Until I see solid counter-arguments — not just counter-arguments to ‘animals are very likely conscious,’ but to the much weaker formulation needed to justify veg(etari)anism — I’ll assume people are mostly eating meat because it’s tasty and convenient and accepted-in-polite-society, not because they’re morally indifferent to torturing puppies behind closed doors.


Why isn’t LessWrong extremely veg(etari)an?

On the face of it, LessWrong ought to be leading the pack in veg(etari)anism. A lot of LessWrong’s interests and values look like they should directly cash out in a concern for animal welfare:

transhumanism and science fiction: If you think aliens and robots and heavily modified posthumans can be moral patients, you should be more open to including other nonhumans in your circle of concern.

superrationality: Veg(etari)anism benefits from an ability to bind my future self to my commitments, and from a Kantian desire to act as I’d want other philosophically inclined people in my community to act.

probabilism: If you can reason with uncertainty and resist the need for cognitive closure, you’ll be more open to the uncertainty argument.

utilitarianism: Animal causes are admirably egalitarian and scope-sensitive.

taking ideas seriously: If you’re willing to accept inconvenient conclusions even when they’re based in abstract philosophy, that gives more power to theoretical arguments for worrying about animal cognition even if you can’t detect or imagine that cognition yourself.

distrusting the status quo: Veg(etari)anism remains fairly unpopular, and societal inertia is an obvious reason why.

distrusting ad-hoc intuitions: It may not feel desperately urgent to stop buying hot dogs, but you shouldn’t trust that intuition, because it’s self-serving and vulnerable to e.g. status quo bias. This is a lot of how LessWrong goes about ‘taking ideas seriously’; one should ‘shut up and multiply’ even when a conclusion is counter-intuitive.

Yet only about 15% of LessWrongers are vegetarian (compared to 4-13% of the Anglophone world, depending on the survey). By comparison, the average ‘effective altruist’ LessWronger donated $2503 to charity in 2013; 9% of LessWrongers have been to a CFAR class; and 4% of LessWrongers are signed up for cryonics (and another 24% would like to be signed up). These are much larger deviations from the general population, where maybe 1 in 150,000 people is signed up for cryonics.

I can think of a few reasons for the discrepancy:

(a) Cryonics, existential risk, and other LessWrong-associated ideas have techy, high-IQ associations, in terms of their content and in terms of the communities that primarily endorse them. They’re tribal markers, not just attempts to maximize expected utility; and veg(etari)ans are seen as belonging to other tribes, like progressive political activists and people who just want to hug every cat.

(b) Those popular topics have been strongly endorsed and argued for by multiple community leaders, in emotionally resonant language and vivid prose. It’s one thing to accept the abstract arguments for cryonics and vegetarianism, and another thing to actually change your lifestyle based on them; the latter took a lot of active pushing and promotion. (The abstract argument is important; but it’s a necessary condition for action, not a sufficient one. You can’t just say ‘I’m someone who takes ideas seriously’ and magically stop reasoning motivatedly in all contexts.)

(c) Veg(etari)anism isn’t weird and obscure enough. If you successfully sign up for cryonics, LessWrong will treat you as part of an intellectual and rational elite, a rare person who actually thinks clearly and acts accordingly. If you successfully donate 10% of your income to GiveWell, ditto; even though distributing deworming pills isn’t sexy and futuristic, it’s obscure enough (and supported by enough community leaders, per (b)) that it allows you to successfully signal that you’re special. If 10% of the English-speaking world donated to GiveWell or were signed up for cryonics, my guess is that LessWrongers would be too bored by those topics to rush to sign up even if the cryonics and deworming organizations had scaled up in ways that made marginal dollars more effective. Maybe you’d get 20% to sign up for cryonics, but you wouldn’t get 50% or 90%.

(d) Changing your diet is harder than spending lots of money. Where LessWrongers excel, it’s generally via one-off or sporadic spending decisions that don’t have a big impact on your daily life. (‘Successfully employing CFAR techniques’ may be an exception to this rule, if it involves reinvesting effort every single day or permanently skipping out on things you enjoy; but I don’t know how many LessWrongers do that.)

If those hypotheses are right, it might be possible to shift LessWrong types more toward veganism by improving its status in the community and making the transition to veganism easier and less daunting.


What would make a transhumanist excited about this?

I’ll conclude with various ideas for bridging the motivation gap. Note that it doesn’t follow from ‘the gap is motivational’ that posting a bunch of videos of animal torture to LessWrong or the Effective Altruism Forum is the best way to stir people’s hearts. When intellectual achievement is what you trust and prize, you’re more likely to be moved to action by things that jibe with that part of your identity.

 
Write stunningly beautiful, rigorous, philosophically sophisticated things that are amazing and great

I’m not primarily thinking of writing really good arguments for veg(etari)anism; as I noted above, the argument is almost too clear-cut. It leaves very little to talk about in any detail, especially if we want something that hasn’t been discussed to death on LessWrong before. However, there are still topics in the vicinity to address, such as ‘What is the current state of the evidence about the nutrition of veg(etari)an diets?’ Use Slate Star Codex as a model, and do your very best to actually portray the state of the evidence, including devoting plenty of attention to any ways veg(etari)an diets might turn out to be unhealthy. (EDIT: Soylent is popular with this demographic and is switching to a vegan recipe, so it might be especially useful to evaluate its nutritional completeness and promote a supplemented Soylent diet.)

In the long run you’ll score more points by demonstrating how epistemically rational and even-handed you are than by making any object-level argument for veg(etari)anism. Not only will you thereby find out more about whether you’re wrong, but you’ll convince rationalists to take these ideas more seriously than if you gave a more one-sided argument in favor of a policy.

Fiction, done right, can serve a similar function. I could imagine someone writing a sci-fi story set in a future where humans have evolved into wildly different species with different perceived rights, thus translating animal welfare questions into a transhumanist idiom.

Just as the biggest risk with a blog post is of being too one-sided, the biggest risk with a story is of being too didactic and persuasion-focused. The goal is not to construct heavy-handed allegories; the goal is to make an actually good story, with moral conflicts you’re genuinely unsure about. Make things that would be worth reading even if you were completely wrong about animal ethics, and as a side-effect you’ll get people interested in the science, the philosophy, and the pragmatics of related causes.

 
Be positive and concrete

Frame animal welfare activism as an astonishingly promising, efficient, and uncrowded opportunity to do good. Scale back moral condemnation and guilt. LessWrong types can be powerful allies, but the way to get them on board is to give them opportunities to feel like munchkins with rare secret insights, not like latecomers to a not-particularly-fun party who have to play catch-up to avoid getting yelled at. It’s fine to frame helping animals as challenging, but the challenge should be to excel and do something astonishing, not to meet a bare standard for decency.

This doesn’t necessarily mean lowering your standards; if you actually demand more of LessWrongers and effective altruists than you do of ordinary people, you’ll probably do better than if you shoot for parity. If you want to change minds in a big way, think like Berwick in this anecdote from Switch:

In 2004, Donald Berwick, a doctor and the CEO of the Institute for Healthcare Improvement (IHI), had some ideas about how to save lives—massive numbers of lives. Researchers at the IHI had analyzed patient care with the kinds of analytical tools used to assess the quality of cars coming off a production line. They discovered that the ‘defect’ rate in health care was as high as 1 in 10—meaning, for example, that 10 percent of patients did not receive their antibiotics in the specified time. This was a shockingly high defect rate—many other industries had managed to achieve performance at levels of 1 error in 1,000 cases (and often far better). Berwick knew that the high medical defect rate meant that tens of thousands of patients were dying every year, unnecessarily.

Berwick’s insight was that hospitals could benefit from the same kinds of rigorous process improvements that had worked in other industries. Couldn’t a transplant operation be ‘produced’ as consistently and flawlessly as a Toyota Camry?

Berwick’s ideas were so well supported by research that they were essentially indisputable, yet little was happening. He certainly had no ability to force any changes on the industry. IHI had only seventy-five employees. But Berwick wasn’t deterred.

On December 14, 2004, he gave a speech to a room full of hospital administrators at a large industry convention. He said, ‘Here is what I think we should do. I think we should save 100,000 lives. And I think we should do that by June 14, 2006—18 months from today. Some is not a number; soon is not a time. Here’s the number: 100,000. Here’s the time: June 14, 2006—9 a.m.’

The crowd was astonished. The goal was daunting. But Berwick was quite serious about his intentions. He and his tiny team set out to do the impossible.

IHI proposed six very specific interventions to save lives. For instance, one asked hospitals to adopt a set of proven procedures for managing patients on ventilators, to prevent them from getting pneumonia, a common cause of unnecessary death. (One of the procedures called for a patient’s head to be elevated between 30 and 45 degrees, so that oral secretions couldn’t get into the windpipe.)

Of course, all hospital administrators agreed with the goal to save lives, but the road to that goal was filled with obstacles. For one thing, for a hospital to reduce its ‘defect rate,’ it had to acknowledge having a defect rate. In other words, it had to admit that some patients were dying needless deaths. Hospital lawyers were not keen to put this admission on record.

Berwick knew he had to address the hospitals’ squeamishness about admitting error. At his December 14 speech, he was joined by the mother of a girl who’d been killed by a medical error. She said, ‘I’m a little speechless, and I’m a little sad, because I know that if this campaign had been in place four or five years ago, that Josie would be fine…. But, I’m happy, I’m thrilled to be part of this, because I know you can do it, because you have to do it.’ Another guest on stage, the chair of the North Carolina State Hospital Association, said: ‘An awful lot of people for a long time have had their heads in the sand on this issue, and it’s time to do the right thing. It’s as simple as that.’

IHI made joining the campaign easy: It required only a one-page form signed by a hospital CEO. By two months after Berwick’s speech, over a thousand hospitals had enrolled. Once a hospital enrolled, the IHI team helped the hospital embrace the new interventions. Team members provided research, step-by-step instruction guides, and training. They arranged conference calls for hospital leaders to share their victories and struggles with one another. They encouraged hospitals with early successes to become ‘mentors’ to hospitals just joining the campaign.

The friction in the system was substantial. Adopting the IHI interventions required hospitals to overcome decades’ worth of habits and routines. Many doctors were irritated by the new procedures, which they perceived as constricting. But the adopting hospitals were seeing dramatic results, and their visible successes attracted more hospitals to join the campaign.

Eighteen months later, at the exact moment he’d promised to return—June 14, 2006, at 9 a.m.—Berwick took the stage again to announce the results: ‘Hospitals enrolled in the 100,000 Lives Campaign have collectively prevented an estimated 122,300 avoidable deaths and, as importantly, have begun to institutionalize new standards of care that will continue to save lives and improve health outcomes into the future.’

The crowd was euphoric. Don Berwick, with his 75-person team at IHI, had convinced thousands of hospitals to change their behavior, and collectively, they’d saved 122,300 lives—the equivalent of throwing a life preserver to every man, woman, and child in Ann Arbor, Michigan.

This outcome was the fulfillment of the vision Berwick had articulated as he closed his speech eighteen months earlier, about how the world would look when hospitals achieved the 100,000 lives goal:

‘And, we will celebrate. Starting with pizza, and ending with champagne. We will celebrate the importance of what we have undertaken to do, the courage of honesty, the joy of companionship, the cleverness of a field operation, and the results we will achieve. We will celebrate ourselves, because the patients whose lives we save cannot join us, because their names can never be known. Our contribution will be what did not happen to them. And, though they are unknown, we will know that mothers and fathers are at graduations and weddings they would have missed, and that grandchildren will know grandparents they might never have known, and holidays will be taken, and work completed, and books read, and symphonies heard, and gardens tended that, without our work, would have been only beds of weeds.’

As an added bonus, emphasizing excellence and achievement over guilt and wickedness can decrease the odds that you’ll make people feel hounded or ostracized for not immediately going vegan. I expressed this worry in Virtue, Public and Private, e.g., for people with eating disorders that restrict their dietary choices. This is also an area where ‘just be nice to people’ is surprisingly effective.

If you want to propagate a modest benchmark, consider: “After every meal where you eat an animal, donate $1 to the Humane League.” Seems like a useful way to bootstrap toward veg(etari)anism, and it fits the mix of economic mindfulness and virtue cultivation that a lot of rationalists find appealing. This sort of benchmark is forgiving without being shapeless or toothless. If you want to propagate an audacious vision for the future, consider: “There were 1200 meat-eaters on LessWrong in the 2013 survey; if we could get them to consume 30% less meat from land animals over the next 10 years, we could prevent 100,000 deaths (mostly chickens). Let’s shoot for that.” Combining an audacious vision with a simple, actionable policy should get the best results.
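
For what it’s worth, the 100,000 figure survives a back-of-envelope check. The per-person consumption number below is my own assumption (in the rough ballpark of common estimates, with chickens dominating the count), not something taken from the survey:

```python
# Rough sanity check of the 'prevent 100,000 deaths' target (assumed inputs).
meat_eaters = 1200                 # meat-eaters in the 2013 LessWrong survey
reduction = 0.30                   # proposed 30% cut in land-animal consumption
animals_per_person_per_year = 28   # assumption: ~28 land animals/year, mostly chickens
years = 10

animals_spared = meat_eaters * reduction * animals_per_person_per_year * years
print(animals_spared)  # 100800.0, i.e. on the order of 100,000
```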

 
Embrace weird philosophies

Here’s an example of the special flavor LessWrong-style animal activism could develop:

Are there any animal welfare groups that emphasize the abyssal otherness of the nonhuman mind? That talk about the impossible dance, the catastrophe of shapeless silence that lies behind a cute puppy dog’s eyes? As opposed to talking about how ‘sad’ or ‘loving’ the puppies are?

I think I’d have a much, much easier time talking about the moral urgency of animal suffering without my Anthropomorphism Alarms going off if I were part of a community like ‘Lovecraftians for the Ethical Treatment of Animals’.

This is philosophically sound and very relevant, since our uncertainty about animal cognition is our best reason to worry about their welfare. (This is especially true when we consider the possibility that non-humans might suffer more than any human can.) And, contrary to popular misconceptions, the Lovecraftian perspective is more about profound otherness than about nightmarish evil. Rejecting anthropomorphism makes the case for veg(etari)anism stronger; and adopting that sort of emotional distance, paradoxically, is the only way to get LessWrong types interested and the only way to build trust.

Yet when I expressed an interest in this nonstandard perspective on animal well-being, I got responses from effective animal altruists like (paraphrasing):

  • ‘Your endorsement of Lovecraftian animal rights sounds like an attack on animal rights; so here’s my defense of the importance of animal rights…’
  • ‘No, viewing animal psychology as alien and unknown is scientifically absurd. We know for a fact that dogs and chickens experience human-style suffering. (David Pearce adds: Also lampreys!)’
  • ‘That’s speciesist!’

Confidence about animal psychology (in the direction of ‘it’s relevantly human-like’) and extreme uncertainty about animal psychology can both justify prioritizing animal welfare; but when you’re primarily accustomed to seeing uncertainty about animal psychology used as a rationalization for neglecting animals, it will take increasing amounts of effort to keep the policy proposal and the question-of-fact mentally distinct. Encourage more conceptual diversity and pursue more lines of questioning for their own sake, and you end up with a community that’s able to benefit more from cross-pollination with transhumanists and mainline effective altruists and, further, one that’s epistemically healthier.

What techniques would you love to suddenly acquire?

I’ve been going to Val’s rationality dojo for CFAR workshop alumni, and I found a kind-of-similar-to-this exercise useful:

  • List a bunch of mental motions — situational responses, habits, personality traits — you wish you could possess or access at will. Visualize small things you imagine would be different about you if you were making more progress toward your goals.
  • Make these skills things you could in principle just start doing right now, like ‘when my piano teacher shuts the door at the end of our weekly lessons, I’ll suddenly find it easy to install specific if-then triggers for what times I’ll practice piano that week.’ Or ‘I’ll become a superpowered If-Then Robot, the kind of person who always thinks to use if-then triggers when she needs to keep up with a specific task.’ Not so much ‘I suddenly become a piano virtuoso’ or ‘I am impervious to projectile weapons’.
  • Optionally, think about a name or visualization that would make you personally excited and happy to think and talk about the virtuous disposition you desire. For example, when I think about the feeling of investing in a long-term goal in a manageable, realistic way, one association that springs to mind for me is the word healthy. I also visualize a solid forward motion, with my friends and life-as-a-whole relaxedly keeping pace. If I want to frame this habit as a Powerful Technique, maybe I’ll call it ‘Healthiness-jutsu’.

Here’s a grab bag of other things I’d like to start being better at:

1. casual responsibility – Freely and easily noticing and attending to my errors, faults, and obligations, without melodrama. Keeping my responsibility in view without making a big deal about it, beating myself up, or seeking a Grand Resolution. Just, ‘Yup, those are some of the things on the List. They matter. Next question?’

2. rigorous physical gentleness – My lower back is recovering from surgery. I need to consistently work to incrementally strengthen it, while being very careful not to overdo it. Often this means avoiding fun strenuous exercise, which can cause me to start telling frailty narratives to myself and psych myself out of relatively boring-but-sustainable exercise. So I’m mentally combining the idea of a boot camp with the idea of a luxurious spa: I need to be militaristic and zealous about always pampering and caring for and moderately-enhancing myself, without fail, dammit. It takes grit to be that patient and precise and non-self-destructive.

3. tsuyoku naritai – I am the naive but tenacious-and-hard-working protagonist-with-an-aura-of-destiny in a serial. I’ll face foes beyond my power — cinematic obstacles, yielding interesting, surprising failures — and I’ll learn, and grow. My journey is just beginning. I will become stronger.

4. trust – Disposing of one of my biggest practical obstacles to tsuyoku naritai. Feeling comfortable losing; feeling safe and luminous about vulnerability. Building five-second habits and social ties that make growth-mindset weakness-showing normal.

5. outcome pumping – “What you actually end up doing screens off the clever reason why you’re doing it.” Past a certain point, it just doesn’t matter exactly why or exactly how; it matters what. If I somehow find myself studying mathematics for 25 minutes a day over four months, and that is hugely rewarding, it’s almost beside the point what cognitive process I used to get there. I don’t need to have a big cause or justification for doing the awesome thing; I can just do it. Right now, in fact.

6. do the thing – Where outcome pumping is about ‘get it done and who cares about method’, I associate thing-doing with ‘once I have a plan/method/rule, do that. Follow through.’ You did the thing yesterday? Good. Do the thing today. Thing waits for no man. — — — You’re too [predicate]ish or [adjective]some to do the thing? That’s perfectly fine. Go do the thing.

When I try to visualize a shiny badass hybrid Competence Monster with all of these superpowers, I get something that looks like this. Your memetico-motivational mileage may vary.

7. sword of clear sight – Inner bullshit detector, motivated stopping piercer, etc. A thin blade cleanly divorces my person from unhealthy or not-reflectively-endorsed inner monologues. Martial arts metaphors don’t always work for me, but here they definitely feel right.

8. ferocity – STRIKE right through the obstacle. Roar. Spit fire, smash things, surge ahead. A whipping motion — a sudden SPIKE in focused agency — YES. — YES, IT MUST BE THAT TIME AGAIN. CAPS LOCK FEELS UNBELIEVABLY APPROPRIATE. … LET’S DO THIS.

9. easy response – With a sense of lightness and fluid motion-right-to-completion, immediately execute each small task as it arises. Breathe as normal. No need for a to-do list or burdensome juggling act; with no particular fuss or exertion, it is already done.

10. revisit the mountain – Take a break to look at the big picture. Ponder your vision for the future. Write blog posts like this. I’m the kind of person who benefits a lot from periodically looking back over how I’m doing and coming up with handy new narratives.

These particular examples probably won’t match your own mental associations and goals. I’d like to see your ideas; and feel free to steal from and ruthlessly alter entries on my own or others’ lists!

Virtue, public and private

Effective altruists have been discussing animal welfare rather a lot lately, on a few different levels:

1. object-level: How likely is it that conventional food animals suffer?

2. philanthropic: Compared to other causes, how important is non-human animal welfare? How effective are existing organizations and programs in this area? Should effective altruists concentrate attention and resources here?

3. personal-norm: Is it morally acceptable for an individual to use animal products? How important is it to become a vegetarian or vegan?

4. group-norm: Should effective altruist meetings and conventions serve non-vegan food? Should the effective altruist movement rally to laud vegans and/or try to make all effective altruists go vegan?

These questions are all linked, but I’ll mostly focus on 4. For catered EA events, I think it makes sense to default to vegan food whenever feasible, and order other dishes only if particular individuals request them. I’m not a vegan myself, but I think this sends a positive message — that we respect the strength of vegans’ arguments, and the large stakes if they’re right, more than we care about non-vegans’ mild aesthetic preferences.

My views about trying to make as many EAs as possible go vegan are more complicated. As a demonstration of personal virtue, I’d put ‘become a vegan’ in the same (very rough) category as:

  • have no carbon footprint.
  • buy no product whose construction involved serious exploitation of labor.
  • give 10+% of your income to a worthy cause.
  • avoid lifestyle choices that have an unsustainable impact on marine life.
  • only use antibiotics as a last (or almost-last) resort, so as not to contribute to antibiotic resistance.
  • do your best to start a career in effective altruism.

Arguments could be made that many of these are morally obligatory for nearly all people. And most people dismiss these policies too hastily, overestimating their difficulty and underestimating their urgency. Yet, all the same, I’m not confident any of these is universally obligatory — and I’m confident that it’s not a good idea to issue blanket condemnations of everyone who fails to live up to some or all of the above standards, nor to make these actions minimal conditions for respectable involvement in EA.

People with eating disorders can have good grounds for not immediately going vegan. Immunocompromised people can have good grounds for erring on the side of overusing medicine. People trying to dig their way out of debt while paying for a loved one’s medical bills can have good grounds not to give to charity every year.

The deeper problem with treating these as universal Standards of Basic Decency in our community isn’t that we’d be imposing an unreasonable demand on people. It’s that we’d be forcing lots of people to disclose very sensitive details about their personal lives to a bunch of strangers or to the public Internet — physical disabilities, mental disabilities, personal tragedies, intense aversions…. Putting people into a tight spot is a terrible way to get them on board with any of the above proposals, and it’s a great way to make people feel hounded and unsafe in their social circles.

No one’s suggested casting all non-vegans out of our midst. I have, however, heard recent complaints from people who have disabilities that make it unusually difficult to meet some of the above Standards, and who have become less enthusiastic about EA as a result of feeling socially pressured or harangued by EAs to immediately restructure their personal lives. So I think this is something to be aware of and nip in the bud.

In principle, there’s no crisp distinction between ‘personal life’ and ‘EA activities’. There may be lots of private details about a person’s life that would constitute valuable Bayesian evidence about their character, and there may be lots of private activities whose humanitarian impact over a lifetime adds up to be quite large.

Even taking that into account, we should adopt (quasi-)deontic heuristics like ‘don’t pressure people into disclosing a lot about their spending, eating, etc. habits.’ Ends don’t justify means among humans. Lean toward not jabbing too much at people’s boundaries, and not making it hard for them to have separate private and public lives — even for the sake of maximizing expected utility.


Edit (9/1): Mason Hartman gave the following criticism of this post:

I think putting people into a tight spot is not only not a terrible way to get people on board with veganism, but basically the only way to make a vegan of anyone who hasn’t already become one on their own by 18. Most people like eating meat and would prefer not to be persuaded to stop doing it. Many more people are aware of the factory-like reality of agriculture in 2014 than are vegans. Quietly making the information available to those who seek it out is the polite strategy, but I don’t think it’s anywhere near the most effective one. I’m not necessarily saying we should trade social comfort for greater efficacy re: animal activism, but this article disappoints in that it doesn’t even acknowledge that there is a tradeoff.

Also, all of our Standards of Basic Decency put an “unreasonable demand” (as defined in Robby’s post) on some people. All of them. That doesn’t necessarily mean we’ve made the wrong decision by having them.

In reply: The strategy that works best for public outreach won’t always be best for friends and collaborators, and it’s the latter I’m talking about. I find it a lot more plausible that open condemnation and aggressive uses of social pressure work well for strangers on the street than that they work well for coworkers, romantic partners, etc. (And I’m pretty optimistic that there are more reliable ways to change the behavior of the latter sorts of people, even when they’re past age 18.)

It’s appropriate to have a different set of norms for people you regularly interact with, assuming it’s a good idea to preserve those relationships. This is especially true when groups and relationships involve complicated personal and professional dynamics. I focused on effective altruism because it’s the sort of community that could be valuable, from an animal-welfare perspective, even if a significant portion of the community makes bad consumer decisions. That makes it likelier that we could agree on some shared group norms even if we don’t yet agree on the same set of philanthropic or individual norms.

I’m not arguing that you shouldn’t try to make all EAs vegans, or get all EAs to give 10+% of their income to charity, or make EAs’ purchasing decisions more labor- or environment-friendly in other respects. At this point I’m just raising a worry that should constrain how we pursue those goals, and hopefully lead to new ideas about how we should promote ‘private’ virtue. I’d expect strategies that are very sensitive to EAs’ privacy and boundaries to work better, in that I’d expect them to make it easier for a diverse community of researchers and philanthropists to grow in size, to grow in trust, to reason together, to progressively alter habits and beliefs, and to get some important work done even when there are serious lingering disagreements within the community.

Loosemore on AI safety and attractors

Richard Loosemore recently wrote an essay criticizing worries about AI safety, “The Maverick Nanny with a Dopamine Drip”. (Subtitle: “Debunking Fallacies in the Theory of AI Motivation”.) His argument has two parts. First:

1. Any AI system that’s smart enough to pose a large risk will be smart enough to understand human intentions, and smart enough to rewrite itself to conform to those intentions.

2. Any such AI will be motivated to edit itself and remove ‘errors’ from its own code. (‘Errors’ is a large category, one that includes all mismatches with programmer intentions.)

3. So any AI system that’s smart enough to pose a large risk will be motivated to spontaneously overwrite its utility function to value whatever humans value.

4. Therefore any powerful AGI will be fully safe / friendly, no matter how it’s designed.

Second:

5. Logical AI is brittle and inefficient.

6. Neural-network-inspired AI works better, and we know it’s possible, because it works for humans.

7. Therefore, if we want a domain-general problem-solving machine, we should move forward on Loosemore’s proposal, called ‘swarm relaxation intelligence.’

Combining these two conclusions, we get:

8. Since AI is completely safe — any mistakes we make will be fixed automatically by the AI itself — there’s no reason to devote resources to safety engineering. Instead, we should work as quickly as possible to train smarter and smarter neural networks. As they get smarter, they’ll get better at self-regulation and make fewer mistakes, with the result that accidents and moral errors will become decreasingly likely.

I’m not persuaded by Loosemore’s case for point 2, and this makes me doubt claims 3, 4, and 8. I’ll also talk a little about the plausibility and relevance of his other suggestions.

 

Does intelligence entail docility?

Loosemore’s claim (also made in an older essay, “The Fallacy of Dumb Superintelligence”) is that an AGI can’t simultaneously be intelligent enough to pose a serious risk and “unsophisticated” enough to disregard its programmers’ intentions. I replied last year in two blog posts (crossposted to Less Wrong).

In “The AI Knows, But Doesn’t Care” I noted that while Loosemore posits an AGI smart enough to correctly interpret natural language and model human motivation, this doesn’t bridge the gap between an agent’s ability to perform a task and its motivation to do so, i.e., its decision criteria. In “The Seed is Not the Superintelligence,” I argued, concerning recursively self-improving AI (seed AI):

When you write the seed’s utility function, you, the programmer, don’t understand everything about the nature of human value or meaning. That imperfect understanding remains the causal basis of the fully-grown superintelligence’s actions, long after it’s become smart enough to fully understand our values.

Why is the superintelligence, if it’s so clever, stuck with whatever meta-ethically dumb-as-dirt utility function we gave it at the outset? Why can’t we just pass the fully-grown superintelligence the buck by instilling in the seed the instruction: ‘When you’re smart enough to understand Friendliness Theory, ditch the values you started with and just self-modify to become Friendly.’?

Because that sentence has to actually be coded in to the AI, and when we do so, there’s no ghost in the machine to know exactly what we mean by ‘frend-lee-ness thee-ree’. Instead, we have to give it criteria we think are good indicators of Friendliness, so it’ll know what to self-modify toward.

My claim is that if we mess up on those indicators of friendliness — the criteria the AI-in-progress uses to care about (i.e., factor into its decisions) self-modification toward safety — then it won’t edit itself to care about those factors later, even if it’s figured out that that’s what we would have wanted (and that doing what we want is part of this ‘friendliness’ thing we failed to program it to value).

Loosemore discussed this with me on Less Wrong and on this blog, then went on to explain his view in more detail in the new essay. His new argument is that MIRI and other AGI theorists and forecasters think “AI is supposed to be hardwired with a Doctrine of Logical Infallibility,” meaning “it is incapable of considering the hypothesis that its own reasoning engine may not have taken it to a sensible place”.

Loosemore thinks that if we reject this doctrine, the AI will “understand that many of its more abstract logical atoms have a less than clear denotation or extension in the world”. In addition to recognizing that its reasoning process is fallible, it will recognize that its understanding of terms is fallible and revisable. This includes terms in its representation of its own goals; so the AI will improve its understanding of what it values over time. Since its programmers’ intention was for the AI to have a positive impact on the world, the AI will increasingly come to understand this fact about its values, and will revise its policies to match its (improved interpretation of its) values.

The main problem with this argument occurs at the phrase “understand this fact about its values”. The sentence starts by talking about the programmers’ values, yet it ends by calling this a fact about the AI’s values.

Consider a human trying to understand her parents’ food preferences. As she develops a better model of what her parents mean by ‘delicious,’ of their taste receptors and their behaviors, she doesn’t necessarily replace her own food preferences with her parents’. If her food choices do change as a result, there will need to be some added mechanism that’s responsible — e.g., she will need a specific goal like ‘modify myself to like what others do’.

We can make the point even stronger by considering minds that are alien to each other. If a human studies the preferences of a nautilus, she probably won’t acquire them. Likewise, a human who studies the ‘preferences’ (selection criteria) of an optimization process like natural selection needn’t suddenly abandon her own. It’s not an impossibility, but it depends on the human’s having a very specific set of prior values (e.g., an obsession with emulating animals or natural processes). For the same reason, most decision criteria a recursively self-improving AI could possess wouldn’t cause it to ditch its own values in favor of ours.

If no amount of insight into biology would make you want to steer clear of contraceptives and optimize purely for reproduction, why expect any amount of insight into human values to compel an AGI to abandon all its hopes and dreams and become a humanist? ‘We created you to help humanity!’ we might protest. Yet if evolution could cry out ‘I created you to reproduce!’, we would be neither rationally obliged nor psychologically impelled to comply. There isn’t any theorem of decision theory or probability theory saying ‘rational agents must promote the same sorts of outcomes as the processes that created them, else fail in formally defined tasks’.

 

Epistemic and instrumental fallibility v. moral fallibility

I don’t know of any actual AGI researcher who endorses Loosemore’s “Doctrine of Logical Infallibility”. (He equates Muehlhauser and Helm’s “Literalness” doctrine with Infallibility in passing, but the link isn’t clear to me, and I don’t see any argument for the identification. The Doctrine is otherwise uncited.) One of the main organizations he critiques, MIRI, actually specializes in researching formal agents that can’t trust their own reasoning, or can’t trust the reasoning of future versions of themselves. This includes work on logical uncertainty (briefly introduced here, at length here) and ‘tiling’ self-modifying agents (here).

Loosemore imagines a programmer chiding an AI for the “design error” of pursuing human-harming goals. The human tells the AI that it should fix this error, since it fixed other errors in its code. But Loosemore is conflating programming errors the human makes with errors of reasoning the AI makes. He’s assuming, without argument, that flaws in an agent’s epistemic and instrumental rationality are of a kind with defects in its moral character or docility.

Any efficient goal-oriented system has convergent instrumental reasons to fix ‘errors of reasoning’ of the kind that are provably obstacles to its own goals. Bostrom discusses this in “The Superintelligent Will,” and Omohundro discusses it in “Rational Artificial Intelligence for the Greater Good,” under the name ‘Basic AI Drives’.

‘Errors of reasoning,’ in the relevant sense, aren’t just things humans think are bad. They’re general obstacles to achieving any real-world goal, and ‘correct reasoning’ is an attractor for systems (e.g., self-improving humans, institutions, or AIs) that can alter their own ability to achieve such goals. If a moderately intelligent self-modifying program lacks the goal ‘generally avoid confirmation bias’ or ‘generally avoid acquiring new knowledge when it would put my life at risk,’ it will add that goal (or something tantamount to it) to its goal set, because it’s instrumental to almost any other goal it might have started with.

On the other hand, if a moderately intelligent self-modifying AI lacks the goal ‘always and forever do exactly what my programmer would ideally wish,’ the number of goals for which it’s instrumental to add that goal to the set is very small, relative to the space of all possible goals. This is why MIRI is worried about AGI; ‘defer to my programmer’ doesn’t appear to be an attractor goal in the way ‘improve my processor speed’ and ‘avoid jumping off cliffs’ are attractor goals. A system that appears amazingly ‘well-designed’ (because it keeps hitting goal after goal of the latter sort) may be poorly-designed to achieve any complicated outcome that isn’t an instrumental attractor, including safety protocols. This is the basis for disaster scenarios like Bostrom on AI deception.
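
Here’s a toy sketch of that asymmetry (my own illustration, not a formalism from MIRI or Bostrom): a self-modifying agent that scores candidate changes using its current goal adopts ‘reason better’ for almost any goal, but adopts ‘defer to my programmer’ only if deference was already part of the goal.

```python
# Toy model: the agent evaluates self-modifications with its CURRENT goal.
# The numbers are arbitrary; only the asymmetry in the output matters.

def utility(goal, reasoning_quality, defers_to_programmer):
    score = reasoning_quality  # better reasoning helps achieve whatever the goal is
    if goal == "do what my programmer ideally wants" and defers_to_programmer:
        score += 10.0          # deference only pays if the goal already values it
    return score

def adopts(goal, modification):
    before = utility(goal, reasoning_quality=1.0, defers_to_programmer=False)
    if modification == "improve reasoning":
        after = utility(goal, reasoning_quality=2.0, defers_to_programmer=False)
    else:  # "defer to programmer"
        after = utility(goal, reasoning_quality=1.0, defers_to_programmer=True)
    return after > before

for goal in ["maximize paperclips", "win at chess", "do what my programmer ideally wants"]:
    print(goal, "| improve reasoning:", adopts(goal, "improve reasoning"),
          "| defer to programmer:", adopts(goal, "defer to programmer"))
```

‘Improve reasoning’ comes out as an attractor for every goal on the list; ‘defer to my programmer’ only for the goal that already contains it.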

That doesn’t mean that ‘defer to my programmer’ is an impossible goal. It’s just something we have to do the hard work of figuring out ourselves; we can’t delegate the entire task to the AI. It’s a mathematical open problem to define a way for adaptive autonomous AI with otherwise imperfect motivations to defer to programmer oversight and not look for loopholes in its restrictions. People at MIRI and FHI have been thinking about this issue for the past few years; there’s not much published about the topic, though I notice Yudkowsky mentions issues in this neighborhood off-hand in a 2008 blog post about morality.

 

Do what I mean by ‘do what I mean’!

Loosemore doesn’t discuss in any technical detail how an AI could come to improve its goals over time, but one candidate formalism is Daniel Dewey’s value learning. Following Dewey’s work, Bostrom notes that this general approach (‘outsource some of the problem to the AI’s problem-solving ability’) is promising, but needs much more fleshing out. Bostrom discusses some potential obstacles to value learning in his new book Superintelligence (pp. 192-201):

[T]he difficulty is not so much how to ensure that the AI can understand human intentions. A superintelligence should easily develop such understanding. Rather, the difficulty is ensuring that the AI will be motivated to pursue the described values in the way we intended. This is not guaranteed by the AI’s ability to understand our intentions: an AI could know exactly what we meant and yet be indifferent to that interpretation of our words (being motivated instead by some other interpretation of the words or being indifferent to our words altogether).

The difficulty is compounded by the desideratum that, for reasons of safety, the correct motivation should ideally be installed in the seed AI before it becomes capable of fully representing human concepts or understanding human intentions.

We do not know how to build a general intelligence whose goals are a stable function of human brain states, or patterns of ink on paper, or any other encoding of our preferences. Moreover, merely making the AGI’s goals a function of brain states or ink marks doesn’t help if we make it the wrong function. If the AGI starts off with the wrong function, there’s no reason to expect it to self-correct in the direction of the right one, because (a) having the right function is a prerequisite for caring about self-modifying toward the relevant kind of ‘rightness,’ and (b) having goals that are an ersatz function of human brain-states or ink marks seems consistent with being superintelligent (e.g., with having veridical world-models).
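
As a schematic rendering of the value-learning idea (my own simplification, not Dewey’s exact formalism): the agent chooses actions by taking an expectation over a pool of candidate utility functions, weighted by evidence about what its designers intended,

$$a^{*} \;=\; \arg\max_{a} \sum_{o} P(o \mid a) \sum_{U \in \mathcal{U}} P(U \mid o, E)\, U(o).$$

Everything then hangs on the candidate pool and the evidence-weighting that the programmers supply. If those encode the wrong function of brain states or ink marks, nothing in the expression pushes the agent to correct them.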

When Loosemore’s hypothetical programmer attempts to argue her AI into friendliness, the AI replies, “I don’t care, because I have come to a conclusion, and my conclusions are correct because of the Doctrine of Logical Infallibility.” MIRI and FHI’s view is that the AI’s actual reply (assuming it had some reason to reply, and to be honest) would invoke something more like “the Doctrine of Not-All-Children-Assigning-Infinite-Value-To-Obeying-Their-Parents.” The task ‘across arbitrary domains, get an AI-in-progress to defer to its programmers when its programmers dislike what it’s doing’ is poorly understood, and looks extremely difficult. Getting a corrigible AI of that sort to ‘learn’ the right values is a second large problem. Loosemore seems to treat corrigibility as trivial, and to equate corrigibility with all other AGI goal content problems.

A random AGI self-modifying to improve its own efficiency wouldn’t automatically self-modify to acquire the values of its creators. We have to actually do the work of coding the AI to have a safe decision-making subsystem. Loosemore is right that it’s desirable for the AI to incrementally learn over time what its values are, so we can make some use of its intelligence to solve the problem; but raw intelligence on its own isn’t the solution, since we need to do the work of actually coding the AI to value executing the desired interpretation of our instructions.

“Correct interpretation” and “instructions” are both monstrously difficult to turn into lines of code. And, crucially, we can’t pass the buck to the superintelligence here. If you can teach an AI to “do what I mean,” you can proceed to teach it anything else; but if you can’t teach it to “do what I mean,” you can’t get the bootstrapping started. In particular, it’s a pretty sure bet you also can’t teach it “do what I mean by ‘do what I mean’”.

Unless you can teach it to do what you mean, teaching it to understand what you mean won’t help. Even teaching an AI to “do what you believe I mean” assumes that we can turn the complex concept “mean” into code.

 

Loose ends

I’ll run more quickly through some other points Loosemore makes:

a. He criticizes Legg and Hutter’s definition of ‘intelligence,’ arguing that it trivially applies to an unfriendly AI that self-destructs. However, Legg and Hutter’s definition seems to (correctly) exclude agents that self-destruct. On the face of it, Loosemore should be criticizing MIRI for positing an unintelligent AGI, not for positing a trivially intelligent AGI. For a fuller discussion, see Legg and Hutter’s “A Collection of Definitions of Intelligence”.
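
For reference, Legg and Hutter’s formal measure is (schematically) an agent’s expected performance summed over a simplicity-weighted class of environments:

$$\Upsilon(\pi) \;=\; \sum_{\mu \in E} 2^{-K(\mu)} \, V^{\pi}_{\mu},$$

where K(μ) is the complexity of environment μ and V is the expected cumulative reward the policy π earns there. An agent that self-destructs forfeits reward in nearly every environment, so it scores low on this measure, which is why the definition looks like it excludes such agents rather than trivially counting them as intelligent.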

b. He argues that safe AGI would be “swarm-like,” with elements that are “unpredictably dependent” on non-representational “internal machinery,” because “logic-based AI” is “brittle”. This seems to contradict the views of many specialists in present-day high-assurance AI systems. As Gerwin Klein writes, “everything that makes it easier for humans to think about a system, will help to verify it.” Indiscriminately adding uncertainty or randomness or complexity to a system makes it harder to model the system and check that it has required properties. It may be less “brittle” in some respects, but we have no particular reason to expect safety to be one of those respects. For a fuller discussion, see Muehlhauser’s “Transparency in Safety-Critical Systems”.

c. MIRI thinks we should try to understand safety-critical general reasoning systems as far in advance as possible, and mathematical logic and rational agent models happen to be useful tools on that front. However, MIRI isn’t invested in “logical AI” in the manner of Good Old-Fashioned AI. Yudkowsky and other MIRI researchers are happy to use neural networks when they’re useful for solving a given problem, and equally happy to use other tools for problems neural networks aren’t well-suited to. For a fuller discussion, see Yudkowsky’s “The Nature of Logic” and “Logical or Connectionist AI?”

d. One undercurrent of Loosemore’s article is that we should model AI after humans. MIRI and FHI worry that this would be very unsafe if it led to neuromorphic AI. On the other hand, modeling AI very closely after human brains (approaching the fidelity of whole-brain emulation) might well be a safer option than de novo AI. For a fuller discussion, see Bostrom’s Superintelligence.

On the whole, Loosemore’s article doesn’t engage much with the arguments of other AI theorists regarding risks from AGI.

Is ‘consciousness’ simple? Is it ancient?

 

Assigning less than 5% probability to ‘cows are moral patients’ strikes me as really overconfident. Ditto, assigning greater than 95% probability. (A moral patient is something that can be harmed or benefited in morally important ways, though it may not be accountable for its actions in the way a moral agent is.)

I’m curious how confident others are, and I’m curious about the most extreme confidence levels they’d consider ‘reasonable’.

I also want to hear more about what theories and backgrounds inform people’s views. I’ve seen some relatively extreme views defended recently, and the guiding intuitions seem to have come from two sources:


 

(1) How complicated is consciousness? In the space of possible minds, how narrow a target is consciousness?

Humans seem to be able to have very diverse experiences — dreams, orgasms, drug-induced states — that they can remember in some detail, and at least appear to be conscious during. That’s some evidence that consciousness is robust to modification and can take many forms. So, perhaps, we can expect a broad spectrum of animals to be conscious.

But what would our experience look like if it were fragile and easily disrupted? There would probably still be edge cases. And, from inside our heads, it would look like we had amazingly varied possibilities for experience — because we couldn’t use anything but our own experience as a baseline. It certainly doesn’t look like a human brain on LSD differs as much from a normal human brain as a turkey brain differs from a human brain.

There’s some risk that we’re overestimating how robust consciousness is, because when we stumble on one of the many ways to make a human brain unconscious, we (for obvious reasons) don’t notice it as much. Drastic changes in unconscious neurochemistry interest us a lot less than minor tweaks to conscious neurochemistry.

And there’s a further risk that we’ll underestimate the complexity of consciousness because we’re overly inclined to trust our introspection and to take our experience at face value. Even if our introspection is reliable in some domains, it has no access to most of the necessary conditions for experience. So long as they lie outside our awareness, we’re likely to underestimate how parochial and contingent our consciousness is.


 

(2) How quick are you to infer consciousness from ‘intelligent’ behavior?

People are pretty quick to anthropomorphize superficially human behaviors, and our use of mental / intentional language doesn’t clearly distinguish between phenomenal consciousness and behavioral intelligence. But if you work on AI, and have an intuition that a huge variety of systems can act ‘intelligently’, you may doubt that the linkage between human-style consciousness and intelligence is all that strong. If you think it’s easy to build a robot that passes various Turing tests without having full-fledged first-person experience, you’ll also probably (for much the same reason) expect a lot of non-human species to arrive at strategies for intelligently planning, generalizing, exploring, etc. without invoking consciousness. (Especially if your answer to question 1 is ‘consciousness is very complex’. Evolution won’t put in the effort to make a brain conscious unless it’s extremely necessary for some reproductive advantage.)

… But presumably there’s some intelligent behavior that was easier for a more-conscious brain than for a less-conscious one — at least in our evolutionary lineage, if not in all possible lineages that reproduce our level of intelligence. We don’t know what cognitive tasks forced our ancestors to evolve-toward-consciousness-or-perish. At the outset, there’s no special reason to expect that task to be one that only arose for proto-humans in the last few million years.

Even if we accept that the machinery underlying human consciousness is very complex, that complex machinery could just as easily have evolved hundreds of millions of years ago, rather than tens of millions. We’d then expect it to be preserved in many nonhuman lineages, not just in humans. Since consciousness-of-pain is mostly what matters for animal welfare (not, e.g., consciousness-of-complicated-social-abstractions), we should look into hypotheses like:

first-person consciousness is an adaptation that allowed early brains to represent simple policies/strategies and visualize plan-contingent sensory experiences.

Do we have a specific cognitive reason to think that something about ‘having a point of view’ is much more evolutionarily necessary for human-style language or theory of mind than for mentally comparing action sequences or anticipating/hypothesizing future pain? If not, the data of ethology plus ‘consciousness is complicated’ gives us little reason to favor the one view over the other.

We have relatively direct positive data showing we’re conscious, but we have no negative data showing that, e.g., salmon aren’t conscious. It’s not as though we’d expect them to start talking or building skyscrapers if they were capable of experiencing suffering — at least, any theory that predicts as much has some work to do to explain the connection. At present, it’s far from obvious that the world would look any different than it does even if all vertebrates were conscious.

So… the arguments are a mess, and I honestly have no idea whether cows can suffer. The probability seems large enough to justify ‘don’t torture cows (including via factory farms)’, but that’s a pretty low bar, and doesn’t narrow the probability down much.

To the extent I currently have a favorite position, it’s something like: ‘I’m pretty sure cows are unconscious on any simple, strict, nondisjunctive definition of “consciousness”; but what humans care about is complicated, and I wouldn’t be surprised if a lot of “unconscious” information-processing systems end up being counted as “moral patients” by a more enlightened age.’ … But that’s a pretty weird view of mine, and perhaps deserves a separate discussion.

I could conclude with some crazy video of a corvid solving a Rubik’s Cube or an octopus breaking into a bank vault or something, but I somehow find this example of dog problem-solving more compelling:

Bostrom on AI deception

Oxford philosopher Nick Bostrom has argued, in “The Superintelligent Will,” that advanced AIs are likely to diverge in their terminal goals (i.e., their ultimate decision-making criteria), but converge in some of their instrumental goals (i.e., the policies and plans they expect to indirectly further their terminal goals). An arbitrary superintelligent AI would be mostly unpredictable, except to the extent that nearly all plans call for similar resources or similar strategies. The latter exception may make it possible for us to do some long-term planning for future artificial agents.

Bostrom calls the idea that AIs can have virtually any goal the orthogonality thesis, and he calls the idea that there are attractor strategies shared by almost any goal-driven system (e.g., self-preservation, knowledge acquisition) the instrumental convergence thesis.

Bostrom fleshes out his worries about smarter-than-human AI in the book Superintelligence: Paths, Dangers, Strategies, which came out in the US a few days ago. He says much more there about the special technical and strategic challenges involved in general AI. Here’s one of the many scenarios he discusses, excerpted:

[T]he orthogonality thesis suggests that we cannot blithely assume that a superintelligence will necessarily share any of the final values stereotypically associated with wisdom and intellectual development in humans — scientific curiosity, benevolent concern for others, spiritual enlightenment and contemplation, renunciation of material acquisitiveness, a taste for refined culture or for the simple pleasures in life, humility and selflessness, and so forth. We will consider later whether it might be possible through deliberate effort to construct a superintelligence that values such things, or to build one that values human welfare, moral goodness, or any other complex purpose its designers might want it to serve. But it is no less possible — and in fact technically a lot easier — to build a superintelligence that places final value on nothing but calculating the decimal expansion of pi. This suggests that — absent a specific effort — the first superintelligence may have some such random or reductionistic final goal.

[… T]he instrumental convergence thesis entails that we cannot blithely assume that a superintelligence with the final goal of calculating the decimals of pi (or making paperclips, or counting grains of sand) would limit its activities in such a way as not to infringe on human interests. An agent with such a final goal would have a convergent instrumental reason, in many situations, to acquire an unlimited amount of physical resources and, if possible, to eliminate potential threats to itself and its goal system. Human beings might constitute potential threats; they certainly constitute physical resources. […]

It might seem incredible that a project would build or release an AI into the world without having strong grounds for trusting that the system will not cause an existential catastrophe. It might also seem incredible, even if one project were so reckless, that wider society would not shut it down before it (or the AI it was building) attains a decisive strategic advantage. But as we shall see, this is a road with many hazards. […]

With the help of the concept of convergent instrumental value, we can see the flaw in one idea for how to ensure superintelligence safety. The idea is that we validate the safety of a superintelligent AI empirically by observing its behavior while it is in a controlled, limited environment (a “sandbox”) and that we only let the AI out of the box if we see it behaving in a friendly, cooperative, responsible manner.

The flaw in this idea is that behaving nicely while in the box is a convergent instrumental goal for friendly and unfriendly AIs alike. An unfriendly AI of sufficient intelligence realizes that its unfriendly final goals will be best realized if it behaves in a friendly manner initially, so that it will be let out of the box. It will only start behaving in a way that reveals its unfriendly nature when it no longer matters whether we find out; that is, when the AI is strong enough that human opposition is ineffectual.

Consider also a related set of approaches that rely on regulating the rate of intelligence gain in a seed AI by subjecting it to various kinds of intelligence tests or by having the AI report to its programmers on its rate of progress. At some point, an unfriendly AI may become smart enough to realize that it is better off concealing some of its capability gains. It may underreport on its progress and deliberately flunk some of the harder tests, in order to avoid causing alarm before it has grown strong enough to attain a decisive strategic advantage. The programmers may try to guard against this possibility by secretly monitoring the AI’s source code and the internal workings of its mind; but a smart-enough AI would realize that it might be under surveillance and adjust its thinking accordingly. The AI might find subtle ways of concealing its true capabilities and its incriminating intent. (Devising clever escape plans might, incidentally, also be a convergent strategy for many types of friendly AI, especially as they mature and gain confidence in their own judgments and capabilities. A system motivated to promote our interests might be making a mistake if it allowed us to shut it down or to construct another, potentially unfriendly AI.)

We can thus perceive a general failure mode, wherein the good behavioral track record of a system in its juvenile stages fails utterly to predict its behavior at a more mature stage. Now, one might think that the reasoning described above is so obvious that no credible project to develop artificial general intelligence could possibly overlook it. But one should not be too overconfident that this is so.

Consider the following scenario. Over the coming years and decades, AI systems become gradually more capable and as a consequence find increasing real-world application: they might be used to operate trains, cars, industrial and household robots, and autonomous military vehicles. We may suppose that this automation for the most part has the desired effects, but that the success is punctuated by occasional mishaps — a driverless truck crashes into oncoming traffic, a military drone fires at innocent civilians. Investigations reveal the incidents to have been caused by judgment errors by the controlling AIs. Public debate ensues. Some call for tighter oversight and regulation, others emphasize the need for research and better-engineered systems — systems that are smarter and have more common sense, and that are less likely to make tragic mistakes. Amidst the din can perhaps also be heard the shrill voices of doomsayers predicting many kinds of ill and impending catastrophe. Yet the momentum is very much with the growing AI and robotics industries. So development continues, and progress is made. As the automated navigation systems of cars become smarter, they suffer fewer accidents; and as military robots achieve more precise targeting, they cause less collateral damage. A broad lesson is inferred from these observations of real-world outcomes: the smarter the AI, the safer it is. It is a lesson based on science, data, and statistics, not armchair philosophizing. Against this backdrop, some group of researchers is beginning to achieve promising results in their work on developing general machine intelligence. The researchers are carefully testing their seed AI in a sandbox environment, and the signs are all good. The AI’s behavior inspires confidence — increasingly so, as its intelligence is gradually increased.

At this point, any remaining Cassandra would have several strikes against her:

i  A history of alarmists predicting intolerable harm from the growing capabilities of robotic systems and being repeatedly proven wrong. Automation has brought many benefits and has, on the whole, turned out safer than human operation.

ii  A clear empirical trend: the smarter the AI, the safer and more reliable it has been. Surely this bodes well for a project aiming at creating machine intelligence more generally smart than any ever built before — what is more, machine intelligence that can improve itself so that it will become even more reliable.

iii  Large and growing industries with vested interests in robotics and machine intelligence. These fields are widely seen as key to national economic competitiveness and military security. Many prestigious scientists have built their careers laying the groundwork for the present applications and the more advanced systems being planned.

iv  A promising new technique in artificial intelligence, which is tremendously exciting to those who have participated in or followed the research. Although safety issues and ethics are debated, the outcome is preordained. Too much has been invested to pull back now. AI researchers have been working to get to human-level artificial intelligence for the better part of a century: of course there is no real prospect that they will now suddenly stop and throw away all this effort just when it finally is about to bear fruit.

v  The enactment of some safety rituals, whatever helps demonstrate that the participants are ethical and responsible (but nothing that significantly impedes the forward charge).

vi  A careful evaluation of seed AI in a sandbox environment, showing that it is behaving cooperatively and showing good judgment. After some further adjustments, the test results are as good as they could be. It is a green light for the final step . . .

And so we boldly go — into the whirling knives.

We observe here how it could be the case that when dumb, smarter is safe; yet when smart, smarter is more dangerous. There is a kind of pivot point, at which a strategy that has previously worked excellently suddenly starts to backfire.

For more on terminal goal orthogonality, see Stuart Armstrong’s “General Purpose Intelligence”. For more on instrumental goal convergence, see Steve Omohundro’s “Rational Artificial Intelligence for the Greater Good”.

 

Politics is hard mode

Eliezer Yudkowsky has written a delightful series of posts (originally on the economics blog Overcoming Bias) about why partisan debates are so frequently hostile and unproductive. Particularly incisive is A Fable of Science and Politics.

One of the broader points Eliezer makes is that, while political issues are important, political discussion isn’t the best place to train one’s ability to look at issues objectively and update on new evidence. The way I’d put it is that politics is hard mode; it takes an extraordinary amount of discipline and skill to communicate effectively in partisan clashes.

This jibes with my own experience; I’m much worse at arguing politics than at arguing other things. And psychological studies indicate that politics is hard mode even (or especially!) for political veterans; see Taber & Lodge (2006).

Eliezer’s way of putting the same point is (riffing off of Dune): ‘Politics is the Mind-Killer.’ An excerpt from that blog post:

Politics is an extension of war by other means. Arguments are soldiers. Once you know which side you’re on, you must support all arguments of that side, and attack all arguments that appear to favor the enemy side; otherwise it’s like stabbing your soldiers in the back — providing aid and comfort to the enemy. […]

I’m not saying that I think Overcoming Bias should be apolitical, or even that we should adopt Wikipedia’s ideal of the Neutral Point of View. But try to resist getting in those good, solid digs if you can possibly avoid it. If your topic legitimately relates to attempts to ban evolution in school curricula, then go ahead and talk about it — but don’t blame it explicitly on the whole Republican Party; some of your readers may be Republicans, and they may feel that the problem is a few rogues, not the entire party. As with Wikipedia’s NPOV, it doesn’t matter whether (you think) the Republican Party really is at fault. It’s just better for the spiritual growth of the community to discuss the issue without invoking color politics.

Scott Alexander fleshes out why it can be dialogue-killing to attack big groups (even when the attack is accurate) in another blog post, Weak Men Are Superweapons. And Eliezer expands on his view of partisanship in follow-up posts like The Robbers Cave Experiment and Hug the Query.


Some people involved in political advocacy and activism have objected to the “mind-killer” framing. Miri Mogilevsky of Brute Reason explained on Facebook:

My usual first objection is that it seems odd to single politics out as a “mind-killer” when there’s plenty of evidence that tribalism happens everywhere. Recently, there has been a whole kerfuffle within the field of psychology about replication of studies. Of course, some key studies have failed to replicate, leading to accusations of “bullying” and “witch-hunts” and what have you. Some of the people involved have since walked their language back, but it was still a rather concerning demonstration of mind-killing in action. People took “sides,” people became upset at people based on their “sides” rather than their actual opinions or behavior, and so on.

Unless this article refers specifically to electoral politics and Democrats and Republicans and things (not clear from the wording), “politics” is such a frightfully broad category of human experience that writing it off entirely as a mind-killer that cannot be discussed or else all rationality flies out the window effectively prohibits a large number of important issues from being discussed, by the very people who can, in theory, be counted upon to discuss them better than most. Is it “politics” for me to talk about my experience as a woman in gatherings that are predominantly composed of men? Many would say it is. But I’m sure that these groups of men stand to gain from hearing about my experiences, since some of them are concerned that so few women attend their events.

In this article, Eliezer notes, “Politics is an important domain to which we should individually apply our rationality — but it’s a terrible domain in which to learn rationality, or discuss rationality, unless all the discussants are already rational.” But that means that we all have to individually, privately apply rationality to politics without consulting anyone who can help us do this well. After all, there is no such thing as a discussant who is “rational”; there is a reason the website is called “Less Wrong” rather than “Not At All Wrong” or “Always 100% Right.” Assuming that we are all trying to be more rational, there is nobody better to discuss politics with than each other.

The rest of my objection to this meme has little to do with this article, which I think raises lots of great points, and more to do with the response that I’ve seen to it — an eye-rolling, condescending dismissal of politics itself and of anyone who cares about it. Of course, I’m totally fine if a given person isn’t interested in politics and doesn’t want to discuss it, but then they should say, “I’m not interested in this and would rather not discuss it,” or “I don’t think I can be rational in this discussion so I’d rather avoid it,” rather than sneeringly reminding me “You know, politics is the mind-killer,” as though I am an errant child. I’m well-aware of the dangers of politics to good thinking. I am also aware of the benefits of good thinking to politics. So I’ve decided to accept the risk and to try to apply good thinking there. […]

I’m sure there are also people who disagree with the article itself, but I don’t think I know those people personally. And to add a political dimension (heh), it’s relevant that most non-LW people (like me) initially encounter “politics is the mind-killer” being thrown out in comment threads, not through reading the original article. My opinion of the concept improved a lot once I read the article.

In the same thread, Andrew Mahone added, “Using it in that sneering way, Miri, seems just like a faux-rationalist version of ‘Oh, I don’t bother with politics.’ It’s just another way of looking down on any concerns larger than oneself as somehow dirty, only now, you know, rationalist dirty.” To which Miri replied: “Yeah, and what’s weird is that that really doesn’t seem to be Eliezer’s intent, judging by the eponymous article.”

Eliezer clarified that by “politics” he doesn’t generally mean ‘problems that can be directly addressed in local groups but happen to be politically charged’:

Hanson’s “Tug the Rope Sideways” principle, combined with the fact that large communities are hard to personally influence, explains a lot in practice about what I find suspicious about someone who claims that conventional national politics are the top priority to discuss. Obviously local community matters are exempt from that critique! I think if I’d substituted ‘national politics as seen on TV’ in a lot of the cases where I said ‘politics’ it would have more precisely conveyed what I was trying to say.

Even if polarized local politics is more instrumentally tractable, though, the worry remains that it’s a poor epistemic training ground. A subtler problem with banning “political” discussions on a blog or at a meet-up is that it’s hard to do fairly, because our snap judgments about what counts as “political” may themselves be affected by partisan divides. In many cases the status quo is thought of as apolitical, even though objections to the status quo are ‘political.’ (Shades of Pretending to be Wise.)

Because politics gets personal fast, it’s hard to talk about it successfully. But if you’re trying to build a community, build friendships, or build a movement, you can’t outlaw everything ‘personal.’ And selectively outlawing personal stuff gets even messier. Last year, daenerys shared anonymized stories from women, including several that discussed past experiences where the writer had been attacked or made to feel unsafe. If those discussions are made off-limits because they’re ‘political,’ people may take away the message that they aren’t allowed to talk about, e.g., some harmful or alienating norm they see at meet-ups. I haven’t seen enough discussions of this failure mode to feel super confident people know how to avoid it.

Since this is one of the LessWrong memes that’s most likely to pop up in discussions between different online communities (along with the even more ripe-for-misinterpretation “policy debates should not appear one-sided“…), as a first (very small) step, I suggest obsoleting the ‘mind-killer’ framing. It’s cute, but ‘politics is hard mode’ works better as a meme to interject into random conversations. ∵:

1. ‘Politics is hard mode’ emphasizes that ‘mind-killing’ (= epistemic difficulty) is quantitative, not qualitative. Some things might instead fall under Very Hard Mode, or under Middlingly Hard Mode…

2. ‘Hard’ invites the question ‘hard for whom?’, more so than ‘mind-killer’ does. We’re all familiar with the fact that some people and some contexts change what’s ‘hard’, so it’s a little less likely we’ll universally generalize about what’s ‘hard.’

3. ‘Mindkill’ connotes contamination, sickness, failure, weakness. ‘Hard Mode’ doesn’t imply that a thing is low-status or unworthy, so it’s less likely to create the impression (or reality) that LessWrongers or Effective Altruists dismiss out-of-hand the idea of hypothetical-political-intervention-that-isn’t-a-terrible-idea. Maybe some people do want to argue for the thesis that politics is always useless or icky, but if so it should be done in those terms, explicitly — not snuck in as a connotation.

4. ‘Hard Mode’ can’t readily be perceived as a personal attack. If you accuse someone of being ‘mindkilled’, with no context provided, that clearly smacks of insult — you appear to be calling them stupid, irrational, deluded, or similar. If you tell someone they’re playing on ‘Hard Mode,’ that’s very nearly a compliment, which makes your advice that they change behaviors a lot likelier to go over well.

5. ‘Hard Mode’ doesn’t carry any risk of evoking (e.g., gendered) stereotypes about political activists being dumb or irrational or overemotional.

6. ‘Hard Mode’ encourages a growth mindset. Maybe some topics are too hard to ever be discussed. Even so, ranking topics by difficulty still encourages an approach where you try to do better, rather than merely withdrawing. It may be wise to eschew politics, but we should not fear it. (Fear is the mind-killer.)

If you and your co-conversationalists haven’t yet built up a lot of trust and rapport, or if tempers are already flaring, conveying the message ‘I’m too rational to discuss politics’ or ‘You’re too irrational to discuss politics’ can make things worse. ‘Politics is the mind-killer’ is the mind-killer. At least, it’s a relatively mind-killing way of warning people about epistemic hazards.

‘Hard Mode’ lets you communicate in the style of the Humble Aspirant rather than the Aloof Superior. Try something in the spirit of: ‘I’m worried I’m too low-level to participate in this discussion; could you have it somewhere else?’ Or: ‘Could we talk about something closer to Easy Mode, so we can level up together?’ If you’re worried that what you talk about will impact group epistemology, I think you should be even more worried about how you talk about it.

Cards Against Humanity against humanity

Content note: anti-LGBT sentiment, antisemitism, racism, sexual assault

Cards Against Humanity is a card game where people combine terms into new phrases in pursuit of dark and edgy mirth and pith. Like Apples to Apples, but focused on all things political, absurdist, and emotionally charged. A lot of progressives like the game, so it’s a useful place to start talking about the progressive tug-of-war between ‘expand the universe of socially accepted speech‘ and ‘make harmful and oppressive speech less socially acceptable‘.

It’s recently come to people’s attention that Max Temkin, CAH co-creator and former Obama campaign staffer, removed the card ‘passable transvestites’ from the game a while ago, calling it “a mean, cheap joke“. Likewise the cards ‘date rape’ and ‘roofies’. That prompted Chris Hallquist to accuse CAH and its progressive fans of hypocrisy:

 

Since moving to the Bay Area, I’ve twice been involved in conversations where someone has suggested that some of the cards in Cards Against Humanity are really offensive and need to be removed from the deck.

To which I say: huh?

Not that some of the cards aren’t offensive—they are. I love the game in spite of this fact, but I totally understand if some people aren’t in to the game’s brand of humor and don’t want to play. What baffles me is the suggestion that it’s just some of the cards, and if you removed them the game would be fine.

[… H]ere are some of the cards from the very first twenty-card sheet found in the free PDF:

  • Not giving a shit about the Third World
  • A windmill full of corpses
  • Bingeing and purging
  • The hardworking Mexican (subtly suggests most Mexicans are lazy)
  • The gays (which I’m pretty sure is not how people who are sensitive to LGBT issues refer to gay people)

If you’re going to remove the offensive cards from the Cards Against Humanity deck, you’re easily removing 30% or more of the deck. And there are lots of cards that may not be offensive at first glance, but are clearly designed to be combined with other cards in offensive ways. For example, the “African Children” card (also on the first page of the PDF) only sounds innocuous if you’ve never actually played the game before. It suddenly becomes very offensive if someone plays it in response to the question “How did I lose my virginity?”

[… N]ow, I understand that some rape victims have PTSD triggers around discussion of rape, and I can understand someone in that position saying, “I enjoy lots of offensive humor, but jokes about rape are something I, personally, can’t handle.” I certainly wouldn’t recommend telling rape jokes to random strangers you meet on the street.

But when people complain about rape jokes, that’s rarely all they’re saying. Instead, the line is “rape jokes are never okay,” which I find a little hard to accept, especially when the context is Cards Against Humanity. Like, do they really think rape jokes are inherently morally superior to jokes about AIDS and the Holocaust?

Let me make a proposal: if you’ve enjoyed playing Cards Against Humanity (and haven’t repented of your offensive humor enjoying ways and sworn never to play the game again), you really have no business moralizing about what kinds of humor other people enjoy.

 

If Chris is accurately picking out his friends’ core objections, that’s a fair rebuttal. But I don’t think liberal ambivalence about CAH is generally shaped like that—even when words like ‘offensive’ show up. (Though especially when critics are wise enough to forsake that word.)

Chris’ target bears a close resemblance to some standard misunderstandings of social justice writers’ views:

_____________________________________________

Straw claim #1: ‘Offensive = bad.’

Ordinary claim: Harmful = bad.

Sometimes offensive things are harmful. And sometimes they’re harmful specifically because of the extreme ways they cause people offense. But offensiveness in itself is fine. Heck, it can be a positive thing if it leads to harmless fun or consciousness-raising.

_____________________________________________

Straw claim #2: ‘Making jokes about oppressed groups is intrinsically bad, for Reasons. Deep mysterious ineffable ones.’

Ordinary claim: Making jokes about oppressed groups can be bad, if it’s used to harm the group. That’s for common-sense consequentialist reasons.

If your friend Bob just went through a horrible break-up, joking about it might cause him a lot of pain, as opposed to helping lighten the mood. If you care about your friend’s feelings, you should be careful in that context to make break-up jokes ones he’ll find funny, and not ones that are at his expense. Follow his lead, and be sensitive to context.

And don’t assume you can make dickish jokes at his expense just because he’s not in the room at the moment. If you wouldn’t want to say it to his face, think twice about saying it behind his back.

Now, if that’s true for Bob, it should also hold for a whole group of Bobs who collectively went through a Mass Societal Horrible Break-Up. As for individuals, so for groups.

_____________________________________________

Straw claim #3: ‘Making jokes about offensive topics like rape is always wrong.’

Ordinary claim: It’s hard to make a good rape joke. So, for most people, it’s probably a good heuristic to avoid even trying. But feminists recognize that there can be good rape jokes — both “good” in that they aren’t even mildly immoral, and “good” in that they aren’t shitty jokes.

People who complain about harmful jokes are often accused of being humorless killjoys. Which I’ve always found weird. Consider Hurley and Dennett’s new theory of humor in their book Inside Jokes, which can be summarized in five core claims about the psychology of humor. Their fifth claim is that when an incongruous discovery is funny, “the discovery is not accompanied by any (strong) negative emotional valence”.

To the extent a joke makes you feel really bad, you miss out on experiencing amusement from it. You aren’t just hurting people; you’re also artificially narrowing the audience that can appreciate your jokes, and not because they lack a sense of humor. How is telling inaccessibly painful jokes, and thereby making it impossible for a large swathe of the population to get any enjoyment out of one’s material, not being a ‘killjoy’?

_____________________________________________

Straw claim #4: ‘Context doesn’t matter. It doesn’t matter who’s telling the joke, or to whom, or to what effect. Wrong is wrong is wrong.’

Ordinary claim: …?? Huh? Of course context matters.

A group of Holocaust survivors telling zany jokes about the Holocaust can be totally different from a group of neo-nazis telling the exact same jokes. One is a heck of a lot more likely to be tongue-in-cheek (or outright cathartic), and to neither reflect nor reinforce real hate. And there’s continuous variation between those two extremes.

For example, anti-semitism is sufficiently stigmatized in my Super-Liberal Inner Circle of Friends that I, a Jew, feel perfectly comfortable hearing them tell just about any Holocaust joke at a get-together. But put the exact same jokes in the mouths of a group of Midwestern frat boys who I don’t know very well and who have been teasingly calling me ‘Christ-killer’ and shooting me dirty looks, and I will feel uncomfortable, and alienated, and I won’t have a good time.

There are lots of ways to have a good time that aren’t parasitic on others’ good time. Do those things instead.

When I haven’t personally interacted much with a group, I’m forced to fall back on base rates. The base rates for sexual harassment and antisemitism in my community simultaneously inform how likely it is that I personally have been affected by those things, and how confident I can be that a given community member deserves my trust. Those are the guiding stars for responsible-but-hilarious comedians like Louis C.K. The Raw Badness of real-world rape v. the Raw Badness of real-world ethnic cleansing is a lot less directly relevant.

_____________________________________________

It isn’t hypocritical to endorse harmless black comedy while criticizing harmful black comedy.

There are some bona fide straw feminists out there. (Straw men are very often weak men.) But off-the-cuff rhetoric isn’t necessarily a good indicator of that, and the more serious position is the one that deserves debate.

I won’t argue here that liberals are being morally consistent if they reject ‘date rape’ and ‘passable transvestites’ while embracing the rest of the CAH deck. But I do want the case for hypocrisy to be made by citing the actual views of typical social justice thinkers. And I want the discussion of this to give people practice at being better nuanced consequentialists. And, perhaps, better friends and entertainers.

 


 

A card like ‘date rape’ is likely to cause excessive harm because it makes light of something that needs to be taken more seriously, here and now, by the culture CAH is being played in. And, of course, because a lot of people who play CAH have been raped. Is it hypocritical to let ‘a windmill full of corpses’ slide, while criticizing ‘date rape’? Not if your concerns with ‘date rape’ are consequentialist SJ-type arguments, as opposed to some more conservative appeal to propriety.

Likewise, anti-trans sentiment is a lot more prevalent and acutely harmful these days than anti-gay sentiment. With each passing day, using ‘gay’ as a slur, or treating gay people as weirdo Others, is causing the collective eyebrows of mainstream audiences to rise ever higher. Meanwhile, using slurs like ‘tranny’ or ‘transvestite’ continues to be seen as normal and acceptable even by many liberal audiences (e.g., the sort of people who might watch The Daily Show). Making fun of gay people for being weird still happens, but that no longer dominates their depiction in the way it continues to dominate media portrayals of cross-dressers.

So, although a case could be made that ‘the gays’ and ‘passable transvestites’ are equally harmful, it really isn’t self-evident. The humor in ‘the gays’ involves more of a friendly wink to gay people. Much of the joke is directed at conservatives who treat gay people as an alien monolith. Hence the ‘the’. In contrast, the humor in ‘passable transvestites’ seems to mostly depend on shock value. And what’s shocking is cross-dressers themselves—the tacit assumption being that their existence is freakish and surprising and strange. In a word, comical.

These arguments are complicated. We can’t just ask ‘is the card kind of racy and edgy?’ and call off any further moral evaluation once we have a ‘yes’ or ‘no’ answer. To partly side with Chris, I think good arguments can be given that ‘bingeing and purging’ and ‘two midgets shitting into a bucket’ are questionable cards in most CAH groups, because the joke is at the expense of people who really are widely stigmatized and othered even by the liberalest of liberals. It’s common knowledge in most friend circles that ‘racism’ and ‘not giving a shit about the Third World’ are Bad. It’s when things fall in the uncanny valley between Totally Normal and Totally Beyond The Pale that you need to put some thought into whether you’re having your fun at a significant cost to others. (And, yes, the answer will frequently vary based on who’s playing the game.)

Being a good person is about considering the harm your actions might have, which means being sensitive to how many people are affected, and how strongly. We can’t escape the moral facts’ empirical contingency or quantitativeness. That is, we can’t get by without actually thinking about the people affected, and talking to them to find out how they’re affected. If the name of the game is ‘try to make the world more fun for everyone’, there isn’t any simple algorithm (like ‘it’s never OK to offend people’ or ‘it’s always OK to offend people’) that can do the hard work for us.

Loving the merely physical

This is my submission to Sam Harris’ Moral Landscape challenge: “Anyone who believes that my case for a scientific understanding of morality is mistaken is invited to prove it in under 1,000 words. (You must address the central argument of the book—not peripheral issues.)”

Though I’ve mentioned before that I’m sympathetic to Harris’ argument, I’m not fully persuaded. And there’s a particular side-issue I think he gets wrong straightforwardly enough that it can be demonstrated in the space of 1,000 words: really unrequitable love, or the restriction of human value to conscious states.

____________________________________________________

My criticism of Harris’ thesis will be indirect, because it appears to me that his proposal is much weaker than his past critics have recognized. What are we to make of a meta-ethics text that sets aside meta-ethicists’ core concerns with a shrug? Harris happily concedes that promoting well-being is only contingently moral,¹ only sometimes tracks our native preferences² or moral intuitions,³ and makes no binding, categorical demand on rational humans.⁴ So it looks like the only claim Harris is making is that redefining words like ‘good’ and ‘ought’ to track psychological well-being would be useful for neuroscience and human cooperation.⁵ Which looks like a question of social engineering, not of moral philosophy.

If Harris’ moral realism sounds more metaphysically audacious than that, I suspect it’s because he worries that putting it in my terms would be uninspiring or, worse, would appear relativistic. (Consistent with my interpretation, he primarily objects to moral anti-realism and relativism for eroding human compassion, not for being false.)⁶

I don’t think I can fairly assess Harris’ pragmatic linguistic proposal in 1,000 words.⁷ But I can point to an empirical failing in a subsidiary view he considers central: that humans only ultimately value changes in conscious experience.⁸

It may be that only conscious beings can value things; but that doesn’t imply that only conscious states can be valued. Consider these three counterexamples:

(1) Natural Diversity. People prize the beauty and complexity of unconscious living things, and of the natural world in general.⁹

Objection: ‘People value those things because they could in principle experience them. “Beauty” is in the beholder’s eye, not in the beheld object. That’s our clue that we only prize natural beauty for making possible our experience of beauty.’

Response: Perhaps our preference here causally depends on our experiences; but that doesn’t mean that we’re deluded in thinking we have such preferences!

I value my friends’ happiness. Causally, that value may be entirely explainable in terms of patterns in my own happiness, but that doesn’t make me an egoist. Harris would agree that others’ happiness can be what I value, even if my own happiness is why I value it. But the same argument holds for natural wonders: I can value them in themselves, even if what’s causing that value is my experiences of them.

(2) Accurate Beliefs. Consider two experientially identical worlds: One where you’re in the Matrix and have systematically false beliefs, one where your beliefs are correct. Most people would choose to live in the latter world over the former, even knowing that it makes no difference to any conscious state.

Objection: ‘People value the truth because it’s usually useful. Your example is too contrived to pump out credible intuitions.’

Response: Humans can mentally represent environmental objects, and thereby ponder, fear, desire, etc. the objects themselves. Fearing failure or death isn’t the same as fearing experiencing failure or death. (I can’t escape failure/death merely by escaping awareness/consciousness of failure/death.) In the same way, valuing being outside the Matrix is distinct from valuing having experiences consistent with being outside the Matrix.

All of this adds up to a pattern that makes it unlikely people are deluded about this preference. Perhaps it’s somehow wrong to care about the Matrix as anything but a possible modifier of experience. But, nonetheless, people do care. Such preferences aren’t impossible or ‘unintelligible.’⁸

(3) Zombie Welfare. Some people don’t think we have conscious states. Harris’ view predicts that such people will have no preferences, since they can’t have preferences concerning experiences. But eliminativists have desires aplenty.

Objection: ‘Eliminativists are deeply confused; it’s not surprising that they have incoherent normative views.’

Response: Eliminativists may be mistaken, but they exist.¹⁰ That suffices to show that humans can care about things they think aren’t conscious. (Including unconscious friends and family!)

Moreover, consciousness is a marvelously confusing topic. We can’t be infinitely confident that we’ll never learn eliminativism is true. And if, pace Descartes, there’s even a sliver of doubt, then we certainly shouldn’t stake the totality of human value on this question.

Harris writes that “questions about values — about meaning, morality, and life’s larger purpose — are really questions about the well-being of conscious creatures. Values, therefore, translate into facts that can be scientifically understood[.]”¹¹ But the premise is much stronger than the conclusion requires.

If people’s acts of valuing are mental, and suffice for deducing every moral fact, then scientifically understanding the mind will allow us to scientifically understand morality even if the objects valued are not all experiential. We can consciously care about unconscious world-states, just as we can consciously believe in, consciously fear, or consciously wonder about unconscious world-states. That means that Harris’ well-being landscape needs to be embedded in a larger ‘preference landscape.’

Perhaps a certain philosophical elegance is lost if we look beyond consciousness. Still, converting our understanding of the mind into a useful and reflectively consistent decision procedure cannot come at the expense of fidelity to the psychological data. Making ethics an empirical science shouldn’t require us to make any tenuous claims about human motivation.

We could redefine the moral landscape to exclude desires about natural wonders and zombies. It’s just hard to see why. Harris has otherwise always been happy to widen the definition of ‘moral’ to compass a larger and larger universe of human value. Since we’ve already strayed quite a bit from our folk intuitions about ‘morality,’ it’s honestly not of great importance how we tweak the edges of our new concept of morality. Our first concern should be with arriving at a correct view of human psychology. If that falters, then, to the extent science can “determine human values,” the moral decisions we build atop our psychological understanding will fail us as well.

____________________________________________________

Citations

¹ “Perhaps there is no connection between being good and feeling good — and, therefore, no connection between moral behavior (as generally conceived) and subjective well-being. In this case, rapists, liars, and thieves would experience the same depth of happiness as the saints. This scenario stands the greatest chance of being true, while still seeming quite far-fetched. Neuroimaging work already suggests what has long been obvious through introspection: human cooperation is rewarding. However, if evil turned out to be as reliable a path to happiness as goodness is, my argument about the moral landscape would still stand, as would the likely utility of neuroscience for investigating it. It would no longer be an especially ‘moral’ landscape; rather it would be a continuum of well-being, upon which saints and sinners would occupy equivalent peaks.” -Harris (2010), p. 190

“Dr. Harris explained that about three million Americans are psychopathic. That is to say, they don’t care about the mental states of others. They enjoy inflicting pain on other people. But that implies that there’s a possible world, which we can conceive, in which the continuum of human well-being is not a moral landscape. The peaks of well-being could be occupied by evil people. But that entails that in the actual world, the continuum of well-being and the moral landscape are not identical either. For identity is a necessary relation. There is no possible world in which some entity A is not identical to A. So if there’s any possible world in which A is not identical to B, then it follows that A is not in fact identical to B.” -Craig (2011)

Harris’ (2013a) response to Craig’s argument: “Not a realistic concern. You’d have to change too many things — the world would [be] unrecognizable.”

² “I am not claiming that most of us personally care about the experience of all conscious beings; I am saying that a universe in which all conscious beings suffer the worst possible misery is worse than a universe in which they experience well-being. This is all we need to speak about ‘moral truth’ in the context of science.” -Harris (2010), p. 39

³ “And the fact that millions of people use the term ‘morality’ as a synonym for religious dogmatism, racism, sexism, or other failures of insight and compassion should not oblige us to merely accept their terminology until the end of time.” -Harris (2010), p. 53

“Everyone has an intuitive ‘physics,’ but much of our intuitive physics is wrong (with respect to the goal of describing the behavior of matter). Only physicists have a deep understanding of the laws that govern the behavior of matter in our universe. I am arguing that everyone also has an intuitive ‘morality,’ but much of our intuitive morality is clearly wrong (with respect to the goal of maximizing personal and collective well-being).” -Harris (2010), p. 36

⁴ Moral imperatives as hypothetical imperatives (cf. Foot (1972)): “As Blackford says, when told about the prospect of global well-being, a selfish person can always say, ‘What is that to me?’ [… T]his notion of ‘should,’ with its focus on the burden of persuasion, introduces a false standard for moral truth. Again, consider the concept of health: should we maximize global health? To my ear, this is a strange question. It invites a timorous reply like, ‘Provided we want everyone to be healthy, yes.’ And introducing this note of contingency seems to nudge us from the charmed circle of scientific truth. But why must we frame the matter this way? A world in which global health is maximized would be an objective reality, quite distinct from a world in which we all die early and in agony.” -Harris (2011)

“I don’t think the distinction between morality and something like taste is as clear or as categorical as we might suppose. […] It seems to me that the boundary between mere aesthetics and moral imperative — the difference between not liking Matisse and not liking the Golden Rule — is more a matter of there being higher stakes, and consequences that reach into the lives of others, than of there being distinct classes of facts regarding the nature of human experience.” -Harris (2011)

⁵ “Whether morality becomes a proper branch of science is not really the point. Is economics a true science yet? Judging from recent events, it wouldn’t appear so. Perhaps a deep understanding of economics will always elude us. But does anyone doubt that there are better and worse ways to structure an economy? Would any educated person consider it a form of bigotry to criticize another society’s response to a banking crisis? Imagine how terrifying it would be if great numbers of smart people became convinced that all efforts to prevent a global financial catastrophe must be either equally valid or equally nonsensical in principle. And yet this is precisely where we stand on the most important questions in human life. Currently, most scientists believe that answers to questions of human value will fall perpetually beyond our reach — not because human subjectivity is too difficult to study, or the brain too complex, but because there is no intellectual justification for speaking about right and wrong, or good and evil, across cultures. Many people also believe that nothing much depends on whether we find a universal foundation for morality. It seems to me, however, that in order to fulfill our deepest interests in this life, both personally and collectively, we must first admit that some interests are more defensible than others.” -Harris (2010), p. 190

⁶ “I have heard from literally thousands of highly educated men and women that morality is a myth, that statements about human values are without truth conditions (and are, therefore, nonsensical), and that concepts like well-being and misery are so poorly defined, or so susceptible to personal whim and cultural influence, that it is impossible to know anything about them. Many of these people also claim that a scientific foundation for morality would serve no purpose in any case. They think we can combat human evil all the while knowing that our notions of ‘good’ and ‘evil’ are completely unwarranted. It is always amusing when these same people then hesitate to condemn specific instances of patently abominable behavior. I don’t think one has fully enjoyed the life of the mind until one has seen a celebrated scholar defend the ‘contextual’ legitimacy of the burqa, or of female genital mutilation, a mere thirty seconds after announcing that moral relativism does nothing to diminish a person’s commitment to making the world a better place.” -Harris (2010), p. 27

“I consistently find that people who hold this view [moral anti-realism] are far less clear-eyed and committed than (I believe) they should be when confronted with moral pathologies — especially those of other cultures — precisely because they believe there is no deep sense in which any behavior or system of thought can be considered pathological in the first place. Unless you understand that human health is a domain of genuine truth claims — however difficult ‘health’ may be to define — it is impossible to think clearly about disease. I believe the same can be said about morality. And that is why I wrote a book about it…” -Harris (2011)

⁷ For more on this proposal, see Bensinger (2013).

⁸ “[T]he rightness of an act depends on how it impacts the well-being of conscious creatures[….] Here is my (consequentialist) starting point: all questions of value (right and wrong, good and evil, etc.) depend upon the possibility of experiencing such value. Without potential consequences at the level of experience — happiness, suffering, joy, despair, etc. — all talk of value is empty. Therefore, to say that an act is morally necessary, or evil, or blameless, is to make (tacit) claims about its consequences in the lives of conscious creatures (whether actual or potential).” -Harris (2010), p. 62

“[C]onsciousness is the only intelligible domain of value.” -Harris (2010), p. 32

Harris (2013b) confirms that this is part of his “central argument”.

⁹ “Certain human uses of the natural world — of the non-animal natural world! — are morally troubling. Take an example of an ancient sequoia tree. A thoughtless hiker carves his initials, wantonly, for the fun of it, into an ancient sequoia tree. Isn’t there something wrong with that? It seems to me there is.” -Sandel (2008)

¹⁰ E.g., Rey (1982), Beisecker (2010), and myself. (I don’t assume eliminativism in this essay.)

¹¹ Harris (2010), p. 1.

____________________________________________________

References

Anglish, as it never was but totally should have been

Stay warm, little flappers, and find lots of plant eggs!

By Randall Munroe of xkcd.

Kate Donovan said of the above comic, “This is the Robbiest xkcd I’ve seen,” which is one of my favorite compliments of all time. I love discombobulating words; and recombobulating them; really, bobulating them in all sorts of ways. Though especially in ways that make new poetries possible, or lead to new insights about the world and its value.

I’m very fond of the Up-Goer Five approach of restricting myself to common words, and of other systematic constraints. But I think my favorite of all is the artificial language Anglish: English using only native roots.

Although English is a Germanic language, only 1/4 of modern English words (that you’ll find in the Shorter Oxford Dictionary) have Germanic roots. The rest mostly come from Latin, either directly or via French. This borrowing hasn’t just expanded our vocabulary; it’s led to the loss of countless native English words which were replaced by synonyms perceived as more formal or precise. Since a lot of these native words are just a joy to say, since their use sheds light on many of English’s vestigial features, and since derivations from English words are often far easier to break down and parse than lengthy classical coinings (e.g., needlefear rather than aichmophobia), Anglo-Saxon linguistic purists are compiling a dictionary to translate non-native words into Germanic equivalents. Some of the more entertaining entries follow.

__________________________________________________________________________

Bizarre and vulgar illustrations from illuminated medieval manuscripts

A and B

  • abduct = neednim
  • abet = frofer
  • abhor = mislike
  • abominable = wargly
  • abortion = scrapping
  • accelerate = swiften
  • accessible = to-goly
  • accident = mishappening
  • accordion = bellowharp
  • active = deedy
  • adherent = clinger, liefard
  • adolescent = halfling, younker, frumbeardling
  • adrenaline = bykidney workstuff
  • adulation = flaundering, glavering
  • adversity = thwartsomeness, hardhap
  • Afghan = Horsemanlandish
  • afraid = afeared
  • Africa = Sunnygreatland
  • aged = oldened
  • agglomerate = clodden
  • aggressive = fighty
  • agitation = fret of mind
  • AIDS = Earned Bodyweir Scantness Sickness
  • airplane = loftcraft
  • albino = whiteling
  • alcoholic = boozen
  • altercation = brangling
  • America = Markland, Amerigoland, Wineland
  • anathema = accursed thing
  • angel = errand-ghost
  • anglicization = englishing
  • anime = dawnlandish livedrawing
  • annihilate = benothingen
  • antecedence = beforemath
  • anthropology = folklore
  • anti- = nomore-
  • antimatter = unstuff
  • antiquity = oldendom
  • antisemitism = jewhate
  • aorta = lofty heartpipe
  • apostle = sendling
  • arithmetic = talecraft
  • arm (v.) = beweapon
  • armadillo = girdledeer
  • arrest = avast
  • artificial = craftly
  • asparagus = sparrowgrass
  • assassinated = deathcrafted
  • assembly = forsamening
  • audacious = daresome, ballsy
  • augment = bemore, eken
  • August = Weedmonth
  • autopsy = open bodychecking
  • avalanche = fellfall
  • avant garde = forhead
  • avert = forfend, forethwart
  • ballet = fairtumb
  • ballistics = shootlore
  • balloon = loftball
  • banana = moonapple, finger-berry
  • banquet = benchsnack
  • barracks = warbarn
  • basketball = cawlball
  • bastard = mingleling, lovechild
  • battlefield = hurlyburlyfield, badewang
  • beau = ladfriend, fop
  • beautiful = eyesome, goodfaced
  • behavioral economics = husbandry of the how
  • Belgium = Belgy
  • bestiality = deerlust
  • betrayer = unfriend, foe-friend, mouth friend
  • bicameral = twifackly
  • bisexuality = twilust
  • blame = forscold
  • blasphemy = godsmear
  • bong = waterpipe
  • bourgeois = highburger
  • boutique = dressshop
  • braggart = mucklemouth
  • braille = the Blind’s rune
  • brassiere = underbodice
  • bray = heehaw
  • breakable = brittle, brickle, breaksome, bracklesome
  • breeze = windlick
  • buggery = arseswiving
  • burlesque = funnish
  • butter = cowsmear

__________________________________________________________________________

Bizarre and vulgar illustrations from illuminated medieval manuscripts

C and D

  • calculus = flowreckoning
  • campus = lorefield
  • cancerous = cankersome
  • capacity = holdth
  • capsize = wemmel
  • carbon dioxide = twiathemloft chark, onecoal-twosour-stuff, fizzloft
  • carnal attraction = fleshbesmittenness
  • cartouche = stretched foreverness-rope
  • catechism = godlore handbook
  • caterpillar = Devil’s cat, hairy cat, butterfly worm
  • catheter = bodypipe
  • cattle = kine
  • cause (n.) = bring-about, onlet, wherefrom
  • cell = hole, room, frume, lifebrick
  • cell division = frumecleaving
  • cell membrane = frumenfilmen
  • cement = brickstick
  • cerebellum = brainling
  • certainly = forsooth, soothly, in sooth
  • cerulean = woadish
  • chaos = mayhem, dwolm, topsy-turvydom, unfrith
  • character = selfsuchness
  • charity = givefulness
  • chocolate = sweetslime
  • circumcise = umcut
  • circumstance = boutstanding, happenstanding
  • civilization = couthdom, settledhood
  • civilize = tame, couthen
  • clamor = greeding
  • clarify = clearen
  • classification = bekinding
  • clavicle = sluttlebone
  • cliche = unthought-up saying, oftquote, hackney
  • clinic = sickbay
  • clockwise = sunwise
  • coffer = hoardvat
  • coitus = swiving, bysleep
  • color = huecast, light wavelength
  • combine = gatherbind
  • comedian = funspeller, lustspeller, laughtersmith
  • comedy = funplay, lustplay
  • comestible = eatsome, a food thing
  • comfort = frover, weem, soothfulness
  • comfortable = weemly, froverly
  • comment = umspeech
  • CD-ROM = WR-ROB (withfasted-ring-read-only bemindings)
  • companion = helpmate
  • comparative anatomy = overlikening bodylore
  • compare = aliken, gainsame liken, game off against
  • complexion = blee, skin-look
  • compliant = followsome
  • composition = nibcraft
  • concentrated = squished together
  • concentration camp = cramming-laystead
  • concentric = middlesharing
  • condition = fettle
  • condom = scumbag
  • conscience = inwit, heart’s eye
  • convergence = togethering
  • convert = bewhirve
  • copious = beteeming
  • corner = nook, winkle
  • correction fluid = white-out
  • corridor = hallway
  • corrugated = wrizzled
  • Costa Rican = Rich Shorelander
  • Côte d’Ivoire = Elfbone Shoreland
  • cotton = treewool
  • coward = dastard, arg
  • crème de la crème = bee’s knees
  • criterion = deemmean
  • cytoskeleton = frumenframework
  • dairy = deyhouse, milkenstuff
  • danger = freech, deathen
  • data = put, rawput, meteworths
  • database = putbank
  • deceive = swike, beswike, fop, wimple
  • defame = shend, befile
  • defeat = netherthrow
  • defenestrate = outwindowse
  • deify = begod
  • delusion = misbelief
  • demeanour = jib
  • demilitarized = unlandmighted
  • dependence = onhanginess
  • descendant = afterbear, afterling
  • despair = wanhope
  • dinosaur = forebird
  • disarrange = rumple
  • disaster = harrow-hap, ill-hap, banefall, baneburst, grimming
  • disinfect = unsmittle
  • disprove = belie
  • disturbance = dreefing, dreep-hap
  • divination = weedgle
  • division = tweeming

__________________________________________________________________________

Um, all the other ones

  • ease (n.) = eath, frith of mind
  • egalitarianism = evendom
  • electricity = sparkflow, ghostfire
  • electron = amberling
  • elevate = aloofen
  • embryo = womb-berry
  • enable = canen, mayen
  • enact = umdo, emdo
  • encryption = forkeying
  • energy = dodrive, inwork, spring
  • ensnare = swarl
  • enthusiasm = faith-heat
  • environment = lifescape, setting, umwhirft
  • enzyme = yeaster, yeastuff
  • ephemeral = dwimmerly
  • equation = likening, besaming
  • ethnic minority = outlandish fellowship
  • evaluate = bedeem, bereckon, beworthen
  • example = bisen, byspell, lodestar, forbus
  • exaptation = kludging
  • existent = wesand, forelying, issome
  • face = nebb, andlit, leer, hue, blee, mug
  • fair (n.) = hoppings
  • female = she-kind
  • fetid = flyblown, smellful, stenchy
  • figment = farfetchery
  • fornication = whorery, awhoring
  • fray = frazzle
  • fugitive = lamster, flightling
  • gas-powered = waftle-driven
  • gland = threeze
  • history = yorelore, olds, eretide
  • Homo sapiens = Foppish man
  • horror = grir
  • ignorance = unskill, unwittleness
  • impossible = unmightly
  • incorrect = unyearight
  • increase = formore, bemoren
  • independence = unoffhangingness
  • indiscriminately = shilly-shally, allwardly
  • infancy = babytime
  • intoxication = bedrunkenhood
  • invasion = inslaught
  • jolly = full of beans
  • juggernaut = blindblooter
  • kamikaze = selfkilling loftstrike
  • kangaroo = hopdeer
  • laser = lesyr (light eking by spurred yondputting of rodding)
  • limerence = crush
  • lumpenproletariat = underrabble
  • lysosome = selfkillbag
  • malicious = argh, evilful
  • maltreat = misnutt
  • mammal = suckdeer, suckledeer
  • March = Winmonth
  • marsupial = pungsucker
  • martyr = bloot
  • megalopolis = mickleborough
  • mercy = milds
  • mitochondrion = mightcorn
  • mock = geck, betwit
  • nanotechnology = motish witcraft, smartdust
  • natural selection = unmanmade sieving
  • nostalgia = yesterlove
  • nursery = childergarden
  • ocean = the great sea, the blue moor, sailroad, the brine
  • old-fashioned = old-fangled
  • orchid = wombbloom
  • palindrome = drowword
  • pervert = lewdster
  • pianoforte = softhard keyboard
  • pregnancy = childladenhood
  • prehistory = aforeyorelore, yesteryore
  • quid pro quo = tit for tat
  • revolution = whirft, umbewrithing
  • romanticism = lovecraft, storm-and-throng-troth
  • sagacious = hyesnotter, sarrowthankle, wisehidey, yarewittle
  • satire = scoldcraft
  • scarab = turd-weevil
  • science = learncraft, the knowledges
  • second = twoth
  • somnolent = sloomy
  • spirit = poost
  • sublingual salivary glands = undertungish spittlethreezen
  • sugar = beeless honey
  • tabernacle = worship booth
  • underpants = netherbritches
  • undulating = wimpling
  • unintelligent = unthinkle
  • usurer = wastomhatster, wookerer
  • velociraptor = dashsnatcher
  • volcano = fireberg, welkinoozer
  • vowel = stevening
  • voyage = farfare
  • walrus = horsewhale

__________________________________________________________________________

You have been gifted a new Dadaist superpower. I release you unto the world with it.
