Are Musk’s Mars spaceship problems problems?

Elon Musk is planning to start a colony on Mars. Jason Torchinsky proposed some improvements to Musk’s proposed spaceship design, but some commenters on social media questioned Torchinsky’s proposals. I’ve reproduced these comments below, so that I can link to them more easily.

Amateur rocket engineer Evan Daniel writes:

1) I’m not sure how luxurious the actual craft will be. It should clearly be more luxurious than Apollo, to keep the passengers sane. But it being more spartan than Elon was talking about, especially early, seems likely.

2) Elon clearly likes the simplicity of only one upper stage hull design. The cargo, passenger, and fuel versions share a hull. This makes a great deal of sense for version one to me. Adding a second, third, or fourth major ship type is for later.

2a) If the author is keen on their hab module thing, they might as well go all the way to a cycler, which plenty of other people have talked about. I’m confused by them not mentioning this along with the L1 garage idea, given that it should further save propellant.

3) That means you don’t actually want to shrink the passenger ship. Sure you could build everything, first stage and fuel stages included, to a smaller scale… but the large scale is part of why it will be cheap per passenger or per pound. So “more spartan” translates as “more passengers” or “more cargo on board”, not “smaller ship”.

3a) That combined with number 2 might mean that open space is weirdly cheaper than you’d think. I’d have to investigate in more detail (aka break out the spreadsheets) to be sure. I’m not certain on this. (Mass for stuff is definitely still as pricey as you think, but the passenger version might have “too much volume” because the fuel-carrying version needs it for tanks and they share a hull.)

4) On-orbit transfer of people is complicated. Propellant transfer is far less so. The ITS as proposed is actually a very conservative design in some ways; these changes are less so. In particular they cost development money in an attempt to save operating costs, while making operations more complex. This seems misguided, given that SpaceX is probably short on funds for development (relatively speaking). Before you call for making operations more complex, think hard about the F9H schedule slips (while noting that F9H should be cheaper per pound launched than F9).

Anyway, I definitely don’t have enough info to say who is right. But I definitely know enough to think these proposed changes are not obviously a good thing. Especially the parts that advocate for more complex development and operations in an attempt to reduce operating costs. I’ll put my money on the SpaceX crew mostly knowing what they’re talking about in this case. (I’d also put money on the ITS having meaningful changes from this before its first passenger-carrying Mars flight, but I suspect they won’t be the ones listed here.)

Or, more simply: the author didn’t pay enough attention to the most relevant slide of all:


The dominant cost of the flight to Mars is the ship that goes to Mars. Not the stage to launch it, not the other launches to fuel it. The dominant cost of that ship is the development and construction cost. If your “cost saving” measures are saving elsewhere by making that budget item bigger, you’re probably doing it wrong.

James Tillman writes:

Well, one difficulty with the proposed “space only” module is that you cannot aerobrake with it. The proposed Muskian solution is to aerobrake as you’re entering the Mars atmosphere, thus decreasing overall fuel requirements for going from Earth to Mars; you can also aerobrake when entering Earth’s atmosphere, and decrease fuel requirements for going from Mars to Earth. So that’s a big cost: the hab-and-lander solution will require fuel for braking both on the way from Earth to Mars and on the way from Mars to Earth, for both the hab and the lander.

Someone raises this point in the comments, and [Torchinsky] says, “throw an aerobraking shell on it,” but I’m not sure it’s that simple. An inflatable structure designed for zero g (which is the whole attraction of the concept) wouldn’t work well for aerobraking, I’d guess.

The second difficulty is fuel requirements. Less aerobraking means that you’ll need more fuel for the Mars-to-Earth rendezvous, and that makes me wonder if the smaller ship will have difficulty generating enough fuel through in-situ methane production to launch from Mars, push the larger structure to Earth, and then brake both the larger structure and itself into orbit. So you’re dealing with potentially (absolutely) increased fuel requirements, unless you can cut the mass by enough, while you’re also decreasing the amount of fuel you bring from the Martian surface.

Oh, re the concern about difficulty landing the ship on Mars: Mars dust storms are extremely weak compared to Earth storms (the least accurate part of The Martian), so that would be no problem.

Re the suggestion of launching the tanker before the crewed ship: this has been raised, and I think Musk has said they might do this; I’m unsure of the advantages and disadvantages.

Of course I’d need to do the math to really know anything about this.
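Tillman mentions needing to do the math; a first pass at that math is the Tsiolkovsky rocket equation, which converts a delta-v budget into a required propellant fraction. Here’s a minimal sketch in Python (the Isp value and the 4 km/s braking figure are illustrative assumptions on my part, not SpaceX numbers):

```python
import math

# Tsiolkovsky rocket equation: delta_v = Isp * g0 * ln(m0 / mf).
# All figures below are illustrative assumptions, not SpaceX data.
G0 = 9.80665  # standard gravity, m/s^2


def propellant_fraction(delta_v, isp):
    """Fraction of initial mass that must be propellant to supply delta_v (m/s)."""
    return 1.0 - math.exp(-delta_v / (isp * G0))


# Assumed vacuum Isp for a methane/oxygen engine, in seconds:
ISP = 380

# If skipping aerobraking at Mars meant adding, say, a 4 km/s braking burn:
frac = propellant_fraction(4000, ISP)
print(f"{frac:.2f}")  # roughly 0.66: about two-thirds of arrival mass would be propellant
```

The sketch makes the stakes concrete: every kilometer per second that aerobraking no longer absorbs has to come out of the mass ratio instead, which is why giving up aerobraking capability is such an expensive design change.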

And mathematician Jonathan Lee adds:

The Soyuz is lighter than the Apollo capsules because the orbital and instrument modules did not have to be recovered intact; they burned up on Earth reentry. It’s an approach that’s useful when you do not need everything back.

The object of the ITS is to, well, get one’s ass to Mars. The galleys, toilets, etc. (which are not actually the main mass drivers, but let us not have mere facts get in the way) need to get down onto the surface, at least until the crew have built out a bunch of extra infrastructure. This all needs to go there. Fuel on Mars is pretty explicitly not precious; the entire architecture is based around solar power driving electrolysis and Sabatier reactions to make fuel. You actually want the lander to have enough delta-V to push the entire Mars–Earth return stack back to Earth.

Scale is actually important here. They are ultimately not trying to get minimum-cost flags-and-footprints. They are trying to build an architecture that reliably moves 200–500 tonnes of stuff to Mars. That is the point: to get enough mass there that you can build a self-sustaining colony in a place colder, drier, and with less atmosphere than Antarctica. They need the mass moved.

Oh, winds. Ha. Yeah, so the atmosphere on Mars is about 1% the density of Earth’s. Wind forces scale with density times the square of velocity, so at 1% density you can convert Mars winds to equivalent Earth winds by dividing the speed by 10. The highest wind speeds that I can find reported for Mars are about 90mph (solid hurricane territory). So the dynamic pressures are about those of a 10mph wind on Earth. So, no, wind is not a problem.

The in-transit model being proposed cannot aerobrake, cannot even pretend to provide radiation protection, and means that now you need to have orbital assembly at both Mars and Earth on every launch. Oh, and have those docked connections take the not-inconsiderable forces of the injection burns each way.

I asked aerospace engineer Lloyd Strohl III to review this post, and he affirmed that Daniel and Lee’s comments here look correct.
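Lee’s wind arithmetic is easy to sanity-check. A minimal sketch, taking the ~1% density ratio from his comment as an assumption rather than a measured figure:

```python
# Sanity check of Lee's wind arithmetic. Dynamic pressure is q = 0.5 * rho * v^2,
# so equal pressure implies v_earth = v_mars * sqrt(rho_mars / rho_earth).
# The 1% density ratio is the rough figure from the comment, not a measurement.
RHO_RATIO = 0.01  # assumed Mars/Earth atmospheric density ratio


def equivalent_earth_wind(v_mars_mph, rho_ratio=RHO_RATIO):
    """Earth wind speed (mph) exerting the same dynamic pressure as a Mars wind."""
    return v_mars_mph * rho_ratio ** 0.5


# A 90 mph Mars gust presses about as hard as a ~9 mph Earth breeze:
print(equivalent_earth_wind(90))
```

Since the conversion goes by the square root of the density ratio, sqrt(0.01) = 0.1, which is exactly Lee’s “divide velocity by 10” rule.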

Library of Scott Alexandria

I’ve said before that my favorite blog — and the one that’s shifted my views in the most varied and consequential ways — is Scott Alexander’s Slate Star Codex. Scott has written a lot of good stuff, and it can be hard to know where to begin; so I’ve listed below what I think are the best pieces for new readers to start with. This includes older writing, e.g., from Less Wrong.

The list should make the most sense to people who start from the top and read through it in order, though skipping around is encouraged too — many of the posts are self-contained. The list isn’t chronological. Instead, I’ve tried to order things by a mix of “where do I think most people should start reading?” plus “sorting related posts together.” If stuff doesn’t make sense, you may want to Google terms or read background material in Rationality: From AI to Zombies.

This is a work in progress; you’re invited to suggest things you’d add, remove, or shuffle around.

__________________________________________________

I. Rationality and Rationalization
○   Blue- and Yellow-Tinted Choices
○   The Apologist and the Revolutionary
○   Historical Realism
○   Simultaneously Right and Wrong
○   You May Already Be A Sinner
○   Beware the Man of One Study
○   Debunked and Well-Refuted
○   How to Not Lose an Argument
○   The Least Convenient Possible World
○   Bayes for Schizophrenics: Reasoning in Delusional Disorders
○   Generalizing from One Example
○   Typical Mind and Politics

II. Probabilism
○   Confidence Levels Inside and Outside an Argument
○   Schizophrenia and Geomagnetic Storms
○   Talking Snakes: A Cautionary Tale
○   Arguments from My Opponent Believes Something
○   Statistical Literacy Among Doctors Now Lower Than Chance
○   Techniques for Probability Estimates
○   On First Looking into Chapman’s “Pop Bayesianism”
○   Utilitarianism for Engineers
○   If It’s Worth Doing, It’s Worth Doing with Made-Up Statistics
○   Marijuana: Much More Than You Wanted to Know
○   Are You a Solar Deity?
○   The “Spot the Fakes” Test
○   Epistemic Learned Helplessness

III. Science and Doubt
○   Google Correlate Does Not Imply Google Causation
○   Stop Confounding Yourself! Stop Confounding Yourself!
○   Effects of Vertical Acceleration on Wrongness
○   90% Of All Claims About The Problems With Medical Studies Are Wrong
○   Prisons are Built with Bricks of Law and Brothels with Bricks of Religion, But That Doesn’t Prove a Causal Relationship
○   Noisy Poll Results and the Reptilian Muslim Climatologists from Mars
○   Two Dark Side Statistics Papers
○   Alcoholics Anonymous: Much More Than You Wanted to Know
○   The Control Group Is Out Of Control
○   The Cowpox of Doubt
○   The Skeptic’s Trilemma
○   If You Can’t Make Predictions, You’re Still in a Crisis

IV. Medicine, Therapy, and Human Enhancement
○   Scientific Freud
○   Sleep – Now by Prescription
○   In Defense of Psych Treatment for Attempted Suicide
○   Who By Very Slow Decay
○   Medicine, As Not Seen on TV
○   Searching for One-Sided Tradeoffs
○   Do Life Hacks Ever Reach Fixation?
○   Polyamory is Boring
○   Can You Condition Yourself?
○   Wirehead Gods on Lotus Thrones
○   Don’t Fear the Filter
○   Transhumanist Fables

V. Introduction to Game Theory
○   Backward Reasoning Over Decision Trees
○   Nash Equilibria and Schelling Points
○   Introduction to Prisoners’ Dilemma
○   Real-World Solutions to Prisoners’ Dilemmas
○   Interlude for Behavioral Economics
○   What is Signaling, Really?
○   Bargaining and Auctions
○   Imperfect Voting Systems
○   Game Theory as a Dark Art

VI. Promises and Principles
○   Beware Trivial Inconveniences
○   Time and Effort Discounting
○   Applied Picoeconomics
○   Schelling Fences on Slippery Slopes
○   Democracy is the Worst Form of Government Except for All the Others Except Possibly Futarchy
○   Eight Short Studies on Excuses
○   Revenge as Charitable Act
○   Would Your Real Preferences Please Stand Up?
○   Are Wireheads Happy?
○   Guilt: Another Gift Nobody Wants

VII. Cognition and Association
○   Diseased Thinking: Dissolving Questions about Disease
○   The Noncentral Fallacy — The Worst Argument in the World?
○   The Power of Positivist Thinking
○   When Truth Isn’t Enough
○   Ambijectivity
○   The Blue-Minimizing Robot
○   Basics of Animal Reinforcement
○   Wanting vs. Liking Revisited
○   Physical and Mental Behavior
○   Trivers on Self-Deception
○   Ego-Syntonic Thoughts and Values
○   Approving Reinforces Low-Effort Behaviors
○   To What Degree Do We Have Goals?
○   The Limits of Introspection
○   Secrets of the Eliminati
○   Tendencies in Reflective Equilibrium
○   Hansonian Optimism

VIII. Doing Good
○   Newtonian Ethics
○   Efficient Charity: Do Unto Others…
○   The Economics of Art and the Art of Economics
○   A Modest Proposal
○   The Life Issue
○   What if Drone Warfare Had Come First?
○   Nefarious Nefazodone and Flashy Rare Side-Effects
○   The Consequentialism FAQ
○   Doing Your Good Deed for the Day
○   I Myself Am A Scientismist
○   Whose Utilitarianism?
○   Book Review: After Virtue
○   Read History of Philosophy Backwards
○   Virtue Ethics: Not Practically Useful Either
○   Last Thoughts on Virtue Ethics
○   Proving Too Much

IX. Liberty
○   The Non-Libertarian FAQ (aka Why I Hate Your Freedom)
○   A Blessing in Disguise, Albeit a Very Good Disguise
○   Basic Income Guarantees
○   Book Review: The Nurture Assumption
○   The Death of Wages is Sin
○   Thank You For Doing Something Ambiguously Between Smoking And Not Smoking
○   Lies, Damned Lies, and Facebook (Part 1 of ∞)
○   The Life Cycle of Medical Ideas
○   Vote on Values, Outsource Beliefs
○   A Something Sort of Like Left-Libertarian-ist Manifesto
○   Plutocracy Isn’t About Money
○   Against Tulip Subsidies
○   SlateStarCodex Gives a Graduation Speech

X. Progress
○   Intellectual Hipsters and Meta-Contrarianism
○   A Signaling Theory of Class x Politics Interaction
○   Reactionary Philosophy in an Enormous, Planet-Sized Nutshell
○   A Thrive/Survive Theory of the Political Spectrum
○   We Wrestle Not With Flesh And Blood, But Against Powers And Principalities
○   Poor Folks Do Smile… For Now
○   Apart from Better Sanitation and Medicine and Education and Irrigation and Public Health and Roads and Public Order, What Has Modernity Done for Us?
○   The Wisdom of the Ancients
○   Can Atheists Appreciate Chesterton?
○   Holocaust Good for You, Research Finds, But Frequent Taunting Causes Cancer in Rats
○   Public Awareness Campaigns
○   Social Psychology is a Flamethrower
○   Nature is Not a Slate. It’s a Series of Levers.
○   The Anti-Reactionary FAQ
○   The Poor You Will Always Have With You
○   Proposed Biological Explanations for Historical Trends in Crime
○   Society is Fixed, Biology is Mutable

XI. Social Justice
○   Practically-a-Book Review: Dying to be Free
○   Drug Testing Welfare Users is a Sham, But Not for the Reasons You Think
○   The Meditation on Creepiness
○   The Meditation on Superweapons
○   The Meditation on the War on Applause Lights
○   The Meditation on Superweapons and Bingo
○   An Analysis of the Formalist Account of Power Relations in Democratic Societies
○   Arguments About Male Violence Prove Too Much
○   Social Justice for the Highly-Demanding-of-Rigor
○   Against Bravery Debates
○   All Debates Are Bravery Debates
○   A Comment I Posted on “What Would JT Do?”
○   We Are All MsScribe
○   The Spirit of the First Amendment
○   A Response to Apophemi on Triggers
○   Lies, Damned Lies, and Social Media: False Rape Accusations
○   In Favor of Niceness, Community, and Civilization

XII. Politicization
○   Right is the New Left
○   Weak Men are Superweapons
○   You Kant Dismiss Universalizability
○   I Can Tolerate Anything Except the Outgroup
○   Five Case Studies on Politicization
○   Black People Less Likely
○   Nydwracu’s Fnords
○   All in All, Another Brick in the Motte
○   Ethnic Tension and Meaningless Arguments
○   Race and Justice: Much More Than You Wanted to Know
○   Framing for Light Instead of Heat
○   The Wonderful Thing About Triggers
○   Fearful Symmetry
○   Archipelago and Atomic Communitarianism

XIII. Competition and Cooperation
○   The Demiurge’s Older Brother
○   Book Review: The Two-Income Trap
○   Just for Stealing a Mouthful of Bread
○   Meditations on Moloch
○   Misperceptions on Moloch
○   The Invisible Nation — Reconciling Utilitarianism and Contractualism
○   Freedom on the Centralized Web
○   Book Review: Singer on Marx
○   Does Class Warfare Have a Free Rider Problem?
○   Book Review: Red Plenty

__________________________________________________

 

 

If you liked these posts and want more, I suggest browsing the Slate Star Codex archives.

The seed is not the superintelligence

This is the conclusion of a LessWrong post, following The AI Knows, But Doesn’t Care.

If an artificial intelligence is smart enough to be dangerous to people, we’d intuitively expect it to be smart enough to know how to make itself safe for people. But that doesn’t mean all smart AIs are safe. To turn that capacity into actual safety, we have to program the AI at the outset — before it becomes too fast, powerful, or complicated to reliably control — to already care about making its future self care about safety.

That means we have to understand how to code safety. We can’t pass the entire buck to the AI, when only an AI we’ve already safety-proofed will be safe to ask for help on safety issues! Generally: If the AI is weak enough to be safe, it’s too weak to solve this problem. If it’s strong enough to solve this problem, it’s too strong to be safe.

This is an urgent public safety issue, given the five theses and given that we’ll likely figure out how to make a decent artificial programmer before we figure out how to make an excellent artificial ethicist.

[Image: an ouroboros]

The AI’s trajectory of self-modification has to come from somewhere.

“Take an AI in a box that wants to persuade its gatekeeper to set it free. Do you think that such an undertaking would be feasible if the AI was going to interpret everything the gatekeeper says in complete ignorance of the gatekeeper’s values? […] I don’t think so. So how exactly would it care to follow through on an interpretation of a given goal that it knows, given all available information, is not the intended meaning of the goal? If it knows what was meant by ‘minimize human suffering’ then how does it decide to choose a different meaning? And if it doesn’t know what is meant by such a goal, how could it possible [sic] convince anyone to set it free, let alone take over the world?”
               —Alexander Kruel
“If the AI doesn’t know that you really mean ‘make paperclips without killing anyone’, that’s not a realistic scenario for AIs at all–the AI is superintelligent; it has to know. If the AI knows what you really mean, then you can fix this by programming the AI to ‘make paperclips in the way that I mean’.”
               —Jiro

The wish-granting genie we’ve conjured — if it bothers to even consider the question — should be able to understand what you mean by ‘I wish for my values to be fulfilled.’ Indeed, it should understand your meaning better than you do. But superintelligence only implies that the genie’s map can compass your true values. Superintelligence doesn’t imply that the genie’s utility function has terminal values pinned to your True Values, or to the True Meaning of your commands.

The critical mistake here is to not distinguish the seed AI we initially program from the superintelligent wish-granter it self-modifies to become. We can’t use the genius of the superintelligence to tell us how to program its own seed to become the sort of superintelligence that tells us how to build the right seed. Time doesn’t work that way.

We can delegate most problems to the FAI. But the one problem we can’t safely delegate is the problem of coding the seed AI to produce the sort of superintelligence to which a task can be safely delegated.

When you write the seed’s utility function, you, the programmer, don’t understand everything about the nature of human value or meaning. That imperfect understanding remains the causal basis of the fully-grown superintelligence’s actions, long after it’s become smart enough to fully understand our values.

Why is the superintelligence, if it’s so clever, stuck with whatever meta-ethically dumb-as-dirt utility function we gave it at the outset? Why can’t we just pass the fully-grown superintelligence the buck by instilling in the seed the instruction: ‘When you’re smart enough to understand Friendliness Theory, ditch the values you started with and just self-modify to become Friendly.’?

Because that sentence has to actually be coded into the AI, and when we do so, there’s no ghost in the machine to know exactly what we mean by ‘frend-lee-ness thee-ree’. Instead, we have to give it criteria we think are good indicators of Friendliness, so it’ll know what to self-modify toward. And if one of the landmarks on our ‘frend-lee-ness’ road map is a bit off, we lose the world.

Yes, the UFAI will be able to solve Friendliness Theory. But if we haven’t already solved it on our own power, we can’t pinpoint Friendliness in advance, out of the space of utility functions. And if we can’t pinpoint it with enough detail to draw a road map to it and it alone, we can’t program the AI to care about conforming itself with that particular idiosyncratic algorithm.

Yes, the UFAI will be able to self-modify to become Friendly, if it so wishes. But if there is no seed of Friendliness already at the heart of the AI’s decision criteria, no argument or discovery will spontaneously change its heart.

And, yes, the UFAI will be able to simulate humans accurately enough to know that its own programmers would wish, if they knew the UFAI’s misdeeds, that they had programmed the seed differently. But what’s done is done. Unless we ourselves figure out how to program the AI to terminally value its programmers’ True Intentions, the UFAI will just shrug at its creators’ foolishness and carry on converting the Virgo Supercluster’s available energy into paperclips.

And if we do discover the specific lines of code that will get an AI to perfectly care about its programmer’s True Intentions, such that it reliably self-modifies to better fit them — well, then that will just mean that we’ve solved Friendliness Theory. The clever hack that makes further Friendliness research unnecessary is Friendliness.

Not all small targets are alike.

“You write that the worry is that the superintelligence won’t care. My response is that, to work at all, it will have to care about a lot. For example, it will have to care about achieving accurate beliefs about the world. It will have to care to devise plans to overpower humanity and not get caught. If it cares about those activities, then how is it more difficult to make it care to understand and do what humans mean? […]
“If an AI is meant to behave generally intelligent [sic] then it will have to work as intended or otherwise fail to be generally intelligent.”
            —Alexander Kruel

It’s easy to get a genie to care about (optimize for) something-or-other; what’s hard is getting one to care about the right something.

‘Working as intended’ is a simple phrase, but behind it lies a monstrously complex referent. It doesn’t clearly distinguish the programmers’ (mostly implicit) true preferences from their stated design objectives; an AI’s actual code can differ from either or both of these. Crucially, what an AI is ‘intended’ for isn’t all-or-nothing. It can fail in some ways without failing in every way, and small errors will tend to kill Friendliness much more easily than intelligence.

It may be hard to build self-modifying AGI. But it’s not the same hardness as the hardness of Friendliness Theory. Being able to hit one small target doesn’t entail that you can or will hit every small target it would be in your best interest to hit. Intelligence on its own does not imply Friendliness. And there are three big reasons to think that AGI may arrive before Friendliness Theory is solved:

(i) Research Inertia. Far more people are working on AGI than on Friendliness. And there may not come a moment when researchers will suddenly realize that they need to take all their resources out of AGI and pour them into Friendliness. If the status quo continues, the default expectation should be UFAI.

(ii) Disjunctive Instrumental Value. Being more intelligent — that is, better able to manipulate diverse environments — is of instrumental value to nearly every goal. Being Friendly is of instrumental value to barely any goals. This makes it more likely by default that short-sighted humans will be interested in building AGI than in developing Friendliness Theory. And it makes it much likelier that an attempt at Friendly AGI that has a slightly defective goal architecture will retain the instrumental value of intelligence than of Friendliness.

(iii) Incremental Approachability. Friendliness is an all-or-nothing target. Value is fragile and complex, and a half-good being editing its morality drive is at least as likely to move toward 40% goodness as 60%. Cross-domain efficiency, in contrast, is not an all-or-nothing target. If you just make the AGI slightly better than a human at improving the efficiency of AGI, then this can snowball into ever-improving efficiency, even if the beginnings were clumsy and imperfect. It’s easy to put a reasoning machine into a feedback loop with reality in which it is differentially rewarded for being smarter; it’s hard to put one into a feedback loop with reality in which it is differentially rewarded for picking increasingly correct answers to ethical dilemmas.

The ability to productively rewrite software and the ability to perfectly extrapolate humanity’s True Preferences are two different skills. (For example, humans have the former capacity, and not the latter. Most humans, given unlimited power, would be unintentionally Unfriendly.)

It’s true that a sufficiently advanced superintelligence should be able to acquire both abilities. But we don’t have them both, and a pre-FOOM self-improving AGI (‘seed’) need not have both. Being able to program good programmers is all that’s required for an intelligence explosion; but being a good programmer doesn’t imply that one is a superlative moral psychologist or moral philosopher.

If the programmers don’t know in mathematical detail what Friendly code would even look like, then the seed won’t be built to want to build toward the right code. And if the seed isn’t built to want to self-modify toward Friendliness, then the superintelligence it sprouts also won’t have that preference, even though — unlike the seed and its programmers — the superintelligence does have the domain-general ‘hit whatever target I want’ ability that makes Friendliness easy.

And that’s why some people are worried.

A non-technical introduction to AI risk

In the summer of 2008, experts attending the Global Catastrophic Risk Conference assigned a 5% probability to the human species’ going extinct due to “superintelligent AI” by the year 2100. New organizations, like the Centre for the Study of Existential Risk and the Machine Intelligence Research Institute, are springing up to face the challenge of an AI apocalypse. But what is artificial intelligence, and why do people think it’s dangerous?

As it turns out, studying AI risk is useful for gaining a deeper understanding of philosophy of mind and ethics, and a lot of the general theses are accessible to non-experts. So I’ve gathered here a list of short, accessible, informal articles, mostly written by Eliezer Yudkowsky, to serve as a philosophical crash course on the topic. The first half will focus on what makes something intelligent, and what an Artificial General Intelligence is. The second half will focus on what makes such an intelligence ‘friendly’ — that is, safe and useful — and why this matters.

____________________________________________________________________________

Part I. Building intelligence.

An artificial intelligence is any program or machine, like Google Maps or a Xerox machine, that can autonomously and efficiently complete a complex task. One of the largest obstacles to assessing AI risk is overcoming anthropomorphism, the tendency to treat non-humans as though they were quite human-like. Because AIs have complex goals and behaviors, it’s especially difficult not to think of them as people. Having a better understanding of where human intelligence comes from, and how it differs from other complex processes, is an important first step in approaching this challenge with fresh eyes.

1. Power of Intelligence. Why is intelligence important?

2. Ghosts in the Machine. Is building an intelligence from scratch like talking to a person?

3. Artificial Addition. What can we conclude about the nature of intelligence from the fact that we don’t yet understand it?

4. Adaptation-Executers, not Fitness-Maximizers. How do human goals relate to the ‘goals’ of evolution?

5. The Blue-Minimizing Robot. What are the shortcomings of thinking of things as ‘agents’, ‘intelligences’, or ‘optimizers’ with defined values/goals/preferences?

Part II. Intelligence explosion.

Forecasters are worried about Artificial General Intelligence (AGI), an AI that, like a human, can achieve a wide variety of different complex aims. An AGI could think faster than a human, making it better at building new and improved AGI — which would be better still at designing AGI. As this snowballed, AGI would improve itself faster and faster, becoming increasingly unpredictable and powerful as its design changed. The worry is that we’ll figure out how to make self-improving AGI before we figure out how to safety-proof every link in this chain of AGI-built AGIs.

6. Optimization and the Singularity. What is optimization? As optimization processes, how do evolution, humans, and self-modifying AGI differ?

7. Efficient Cross-Domain Optimization. What is intelligence?

8. The Design Space of Minds-In-General. What else is universally true of intelligences?

9. Plenty of Room Above Us. Why should we expect self-improving AGI to quickly become superintelligent?

Part III. AI risk.

In the Prisoner’s Dilemma, it’s better for both players to cooperate than for both to defect; and we have a natural disdain for human defectors. But an AGI is not a human; it’s just a process that increases its own ability to produce complex, low-probability situations. It doesn’t necessarily experience joy or suffering, doesn’t necessarily possess consciousness or personhood. When we treat it like a human, we not only unduly weight its own artificial ‘preferences’ over real human preferences, but also mistakenly assume that an AGI is motivated by human-like thoughts and emotions. This makes us reliably underestimate the risk involved in engineering an intelligence explosion.

10. The True Prisoner’s Dilemma. What kind of jerk would Defect even knowing the other side Cooperated?

11. Basic AI drives. Why are AGIs dangerous even when they’re indifferent to us?

12. Anthropomorphic Optimism. Why do we think things we hope happen are likelier?

13. The Hidden Complexity of Wishes. How hard is it to directly program an alien intelligence to enact my values?

14. Magical Categories. How hard is it to program an alien intelligence to reconstruct my values from observed patterns?

15. The AI Problem, with Solutions. How hard is it to give AGI predictable values of any sort? More generally, why does AGI risk matter so much?

Part IV. Ends.

A superintelligence has the potential not only to do great harm, but also to greatly benefit humanity. If we want to make sure that whatever AGIs people make respect human values, then we need a better understanding of what those values actually are. Keeping our goals in mind will also make it less likely that we’ll despair of solving the Friendliness problem. The task looks difficult, but we have no way of knowing how hard it will end up being until we’ve invested more resources into safety research. Keeping in mind how much we have to gain, and to lose, advises against both cynicism and complacency.

16. Could Anything Be Right? What do we mean by ‘good’, or ‘valuable’, or ‘moral’?

17. Morality as Fixed Computation. Is it enough to have an AGI improve the fit between my preferences and the world?

18. Serious Stories. What would a true utopia be like?

19. Value is Fragile. If we just sit back and let the universe do its thing, will it still produce value? If we don’t take charge of our future, won’t it still turn out interesting and beautiful on some deeper level?

20. The Gift We Give To Tomorrow. In explaining value, are we explaining it away? Are we making our goals less important?

In conclusion, a summary of the core argument: Five theses, two lemmas, and a couple of strategic implications.

____________________________________________________________________________

If you’re convinced, MIRI has put together a list of ways you can get involved in promoting AI safety research. You can also share this post and start conversations about it, to put the issue on more people’s radars. If you want to read on, check out the more in-depth articles below.

____________________________________________________________________________

Further reading

Evolution: Six myths

MYTH 1: Evolution is just a theory, not a fact.

When I say I have a theory, it means that I have a guess, a conjecture. But when a scientist says she has a theory, it means that she has a working explanation for a large set of facts. When we confuse these two senses of ‘theory’, we can misunderstand the scientific standing of the ‘theory’ of evolution.

In science, a theory is one of the most sturdy and well-tested ideas, rather than one of the least. Strictly speaking, a scientific theory can never become a ‘fact’ no matter how well-supported it is, because a theory is an overarching explanation rather than a mere observation. Thus, the idea that matter is made of atoms is still the ‘atomic theory’, and the idea that microorganisms cause disease is still the ‘germ theory’.

Because theories make predictions about what will happen, they can be tested and refined over time. Around the 1930s, Darwin’s original theory was replaced by the modern synthetic theory. This “neo-Darwinian” theory incorporated Gregor Mendel’s account of how offspring inherit traits. Mendelian genetics has helped produce the scientific definition of the actual observed process of evolution — ‘change in a population’s inherited traits’. A common source of confusion is mixing up the physical process, ‘evolution’, with ‘the theory of evolution’ (which explains the process). The process — which can be seen every time children aren’t exact clones of their parents! — can be called a ‘fact’ in the strict sense, whereas the theory of evolution is only a ‘fact’ in the looser sense of being ‘something we know is true’.

So what does the theory actually tell us?

In biology, evolution is the change in a population’s inheritable traits from generation to generation. It boils down to 4 core ideas:

  1. Heredity. Parents pass on their traits to offspring.
  2. Variation. Offspring differ slightly from their parents, and from each other.
  3. Fitness. Some of these differences are more helpful for reproducing than others.
  4. Selection. Offspring with more helpful traits will in turn have more offspring, making the traits more common in the population.

Over time, this simple process of small incremental changes can have dramatic results. As traits become more common or rare in the population over millions of years, a species gradually changes, either randomly or by the environment’s selection of certain helpful traits, into a new species — or branches off into several. This process is called speciation.
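The four steps above can be sketched as a toy simulation. This is only an illustrative model with made-up numbers, not a claim about any real population: a single helpful trait is inherited, carriers leave proportionally more offspring, and each generation is a finite random sample (which also supplies drift).

```python
import random

def simulate_selection(pop_size=1000, generations=100,
                       start_freq=0.01, fitness_advantage=0.1,
                       seed=42):
    """Toy model of heredity, variation, fitness, and selection
    acting on one helpful trait in a fixed-size population."""
    random.seed(seed)
    freq = start_freq  # fraction of the population carrying the trait
    for _ in range(generations):
        # Fitness + selection: carriers are weighted more heavily
        # when 'parents' are chosen for the next generation.
        weight_carrier = freq * (1 + fitness_advantage)
        weight_other = 1 - freq
        p = weight_carrier / (weight_carrier + weight_other)
        # Heredity + variation: each offspring inherits its parent's
        # trait, and the next generation is a finite random sample.
        carriers = sum(1 for _ in range(pop_size) if random.random() < p)
        freq = carriers / pop_size
    return freq

final = simulate_selection()
print(f"trait frequency after 100 generations: {final:.2f}")
```

Even a small per-generation advantage compounds: a trait starting in 1% of the population can come to dominate it, which is the “dramatic results from small incremental changes” the text describes.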

MYTH 2: Evolution teaches that we should live by ‘survival of the fittest’.

‘Survival of the fit’ is a better characterization of Darwin’s theory of selection than ‘survival of the fittest’, a phrase coined by the social theorist Herbert Spencer. Selection simply says that organisms which ‘fit’ their environments will survive. When resources are limited, competition to survive certainly plays a role — but cooperation does too.

The idea of ‘survival of the fittest’ has been used to justify Social ‘Darwinism’, which amounts to a philosophy of ‘every man for himself’. However, a number of biological misunderstandings underlie the notion that pure selfishness helps the group or the individual. First, even if this were true in nature, that would not automatically make it a good thing — germs cause disease in nature, yet that doesn’t mean we should try to make ourselves sick. Do we want to emulate nature’s brutality, or mitigate it?

Second, the ‘fittest’ species are often the best cooperators — symbiotes, colonies of insects and bacteria, schools of fish, flocks of birds, herds and packs of mammals, and of course societies of humans. Moreover, most ‘weaknesses’ are not genetic, nor so severe that they make the individual unable to contribute to society.

‘Fitness’ also changes depending on the environment. There is no context-free measure of fitness, and what is ‘weak’ today may be ‘strong’ tomorrow. Mammals were ‘weak’ when dinosaurs dominated the planet, but strong afterwards. Which brings us to the next myth…

MYTH 3: Evolution is progress.

When we speak of something’s ‘evolving’, we usually mean that it is improving. But in biology, this is not the case. Most evolution is neutral — organisms simply change randomly, by mutation and other processes, without even changing in fitness. And although evolution can never be harmful in the short term, since a harmful trait by definition won’t be selected, the problem is that evolution only cares about the short term. Although in the long run small evolutionary improvements can add up to massive advantages, it’s also possible for short-sighted, immediate benefits to evolve which doom a species in the long run. This is especially common if the environment changes.

The illusion of progress is created because all the evolutionary ‘dead ends’ tend to end up, well, dead — dead as the dodo. But the idea that evolution has any long-term ‘goals’ in mind derives from us not noticing two things. First, we don’t notice how meandering evolution is — an animal might become slightly larger one century, slightly smaller the next. And the reason neither process amounts to ‘evolving backward’ is because neither process is ‘evolving forward’ either — all hereditary change is evolution, regardless of ‘direction’.

Second, although we enjoy thinking of ourselves as the ‘goal’ of evolution, we don’t think about the hundreds of millions of other species that were perfectly happy evolving into organisms radically different from us. Bacteria make up the majority of life’s diversity; if intelligent bipeds were the aim of all evolution, we would expect them to have evolved thousands of times, not just once.

MYTH 4: We evolved from monkeys.

You may have heard the question asked: ‘If humans descended from monkeys, why are there still monkeys?’ Next time you hear this, feel free to reply: ‘If Australians descended from Europeans, why are there still Europeans?’

Biologists have never claimed that humans evolved from monkeys. Biologists do believe that humans and monkeys are related — but as cousins, not parent and child.

But then, biologists also claim that all life is distantly related. This theory, common descent, is the real shocker: You’re not only related to monkeys, but to bananas as well! This is based on the fact that all life shares astonishing molecular and anatomical similarities, and these commonalities seem to ‘cluster’ around otherwise-similar species, like lions and tigers. Just as DNA tests make it possible to determine how closely related two human beings are, so too, by the same principle, do they allow us to test how closely related two different species are.
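The underlying comparison can be illustrated with a toy calculation. The sequences below are invented for illustration, not real genes: the point is only that fewer mismatches between aligned DNA sequences indicate a more recent common ancestor.

```python
def percent_identity(seq_a, seq_b):
    """Fraction of matching bases between two aligned,
    equal-length DNA sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return matches / len(seq_a)

# Made-up 20-base sequences for three imaginary species.
lion  = "ATGCCGTAAGCTTACGGATC"
tiger = "ATGCCGTAAGCTTACGGTTC"  # differs from 'lion' at one site
human = "ATGACGTTAGCATACCGATG"  # differs from 'lion' at several sites

print(percent_identity(lion, tiger))  # high similarity: close cousins
print(percent_identity(lion, human))  # lower similarity: distant cousins
```

Real comparative genomics works on far longer sequences and must first align them, but the ‘clustering’ the text describes falls out of the same principle: similarity scores group lions with tigers before they group either with us.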

In the case of humans, the molecular evidence suggests, not that we descended from monkeys, but that we shared a common ancestor with them tens of millions of years ago. This ancestor was neither a modern monkey nor a human, but a now-extinct primate. Humans and monkeys have both evolved a great deal since that time. We’ve just evolved in very different ways.

It should also be noted that we are much closer to the other apes than to true ‘monkeys’ (which have longer tails). Humans are classified in the great ape family, the hominids, alongside our closest living cousins: the chimpanzee, gorilla, and orangutan, with the gibbons as slightly more distant ape cousins — but we didn’t evolve from them, any more than they evolved from us.

MYTH 5: Evolution is random.

It’s sometimes suggested that evolution is too ‘blind’ and ‘random’ to result in complicated structures. But natural selection is only ‘random’ in the sense that physical processes like gravity are ‘random’. Although the genetic differences between organisms derive in part from random mutations, natural selection nonrandomly ‘filters’ those differences based on how well they help the organism survive and reproduce in its environment. The overall process of evolution, therefore, isn’t simply random: Species change in particular ways for particular reasons, such as because of a new predator in the region, or for that matter because of the absence of a predator.

A related myth is the notion that evolutionary theory claims ‘life arose by chance’. This is not an aspect of the theory of evolution, which only describes how life changes after it has already originated. Instead, this is relevant to the study of abiogenesis, the origin of life.

MYTH 6: We still haven’t found a ‘missing link’.

It’s not always clear what this fabled ‘missing link’ is supposed to be, as thousands of fossils of early humans and hominids have been discovered. The problem is that every time a new fossil is found that fills a ‘gap’ in the evolutionary record, it just creates two new gaps — one right before the fossil species, and one right after.

Transitional fossils linking major groups of organisms are also abundant in the record. One of the most famous dates back to Darwin’s day: Archaeopteryx, a proto-bird with feathers, teeth, and clawed fingers. More recent examples include Tiktaalik, a fish with primitive limbs (including wrists) and a neck, predating the amphibians.

More to the point, the central lesson evolution has to teach us is that every organism is a ‘link’ — all life is connected, and every organism that has offspring is equally ‘transitional’, because life is constantly changing. The change is gradual, certainly, and seems minute on a human time scale — but one of the profoundest lessons science has to offer is that drops of water, given time, can hollow a stone.

For more information, see Talk.Origins.

What can we reasonably concede to unreason?

This post first appeared on the Secular Alliance at Indiana University blog.

In October, SAIU members headed up to Indianapolis for the Center for Inquiry’s “Defending Science: Challenges and Strategies” workshop. Massimo Pigliucci and Julia Galef, co-hosts of the podcast Rationally Speaking, spoke about natural deficits in reasoning, while Jason Rodriguez and John Shook focused on deliberate attempts to restrict scientific inquiry.

Julia Galef drew our attention to the common assumption that being rational means abandoning all intuition and emotion, an assumption she dismissed as a flimsy Hollywood straw man, or “straw vulcan”. True rationality, Julia suggested, is about the skillful integration of intuitive and deliberative thought. As she noted in a similar talk at the Singularity Summit, these skills demand constant cultivation and vigilance. In their absence, we all predictably fall victim to an array of cognitive biases.

To that end, Galef spoke of suites of indispensable “rationality skills”:

  • Know when to override an intuitive judgment with a reasoned one. Recognize cases where your intuition reliably fails, but also cases where intuition tends to perform better than reason.
  • Learn how to query your intuitive brain. For instance, to gauge how you really feel about a possibility, visualize it concretely, and perform thought experiments to test how different parameters and framing effects are influencing you.
  • Persuade your intuitive system of what your reason already knows. For example: Anna Salamon knew intellectually that wire-guided sky jumps are safe, but was having trouble psyching herself up. So she made her knowledge of statistics concrete, imagining thousands of people jumping before her eyes. This helped trick her affective response into better aligning with her factual knowledge.

Massimo Pigliucci’s talk, “A Very Short Course in Intellectual Self-Defense”, was in a similar vein. Pigliucci drew our attention to common formal and informal fallacies, and to the limits of deductive, inductive, and mathematical thought. Dissenting from Thomas Huxley’s view that ordinary reasoning is a great deal like science, Pigliucci argued that science is cognitively unnatural. This is why untrained reasoners routinely fail to properly amass and evaluate data.

While it’s certainly important to keep in mind how much hard work empirical rigor demands, I think we should retain a qualified version of Huxley’s view. It’s worth emphasizing that careful thought is not the exclusive property of professional academics, that the basic assumptions of science are refined versions of many of the intuitions we use in navigating our everyday environments. Science’s methods are rarefied, but not exotic or parochial. If we forget this, we risk giving too much credence to presuppositionalist apologetics.

Next, Jason Rodriguez discussed the tactics and goals of science organizations seeking to appease, work with, or reach out to the religious. Surveying a number of different views on the creation-evolution debate, Rodriguez questioned when it is more valuable to attack religious doctrines head-on, and when it is more productive to avoid conflict or make concessions.

This led into John Shook’s vigorous talk, “Science Must Never Compromise With Religion, No Matter the Metaphysical or Theological Temptations”, and a follow-up Rationally Speaking podcast with Galef and Pigliucci. As you probably guessed, it focused on attacking metaphysicians and theologians who seek to limit the scope or undermine the credibility of scientific inquiry. Shook’s basic concern was that intellectuals are undermining the authority of science when they deem some facts ‘scientific’ and others ‘unscientific’. This puts undue constraints on scientific practice. Moreover, it gives undue legitimacy to those philosophical and religious thinkers who think abstract thought or divine revelation grant us access to a special domain of Hidden Truths.

Shook’s strongest argument was against attempts to restrict science to ‘the natural’. If we define ‘Nature’ in terms of what is scientifically knowable, then this is an empty and useless constraint. But defining the natural instead as the physical, or the spatiotemporal, or the unmiraculous, deprives us of any principled reason to call our research programs ‘methodologically naturalistic’. We could imagine acquiring good empirical evidence for magic, for miracles, even for causes beyond our universe. So science’s skepticism about such phenomena is a powerful empirical conclusion. It is not an unargued assumption or prejudice on the part of scientists.

Shook also argued that metaphysics does not provide a special, unscientific source of knowledge; the claims of metaphysicians are pure and abject speculation. I found this part of the talk puzzling. Metaphysics, as the study of the basic features of reality, does not seem radically divorced from theoretical physics and mathematics, which make similar claims to expand at least our pool of conditional knowledge, knowledge of the implications of various models. Yet Shook argued, not for embracing metaphysics as a scientific field, but for dismissing it as fruitless hand-waving.

Perhaps the confusion stemmed from a rival conception of ‘metaphysics’, not as a specific academic field, but as the general practice of drawing firm conclusions about ultimate reality from introspection alone — what some might call ‘armchair philosophy’ or ‘neoscholasticism’. Philosophers of all fields — and, for that matter, scientists — would do well to more fully internalize the dangers of excessive armchair speculation. But the criticism is only useful if it is carefully aimed. If we fixate on ‘metaphysics’ and ‘theology’ as the sole targets of our opprobrium, we risk neglecting the same arrogance in other guises, while maligning useful exploration into the contents, bases, and consequences of our conceptual frameworks. And if we restrict knowledge to science, we risk not only delegitimizing fields like logic and mathematics, but also putting undue constraints on science itself. For picking out a special domain of purported facts as ‘metaphysical’, and therefore unscientific, has exactly the same risks as picking out a special domain as ‘non-natural’ or ‘supernatural’.

To defend science effectively, we have to pick our battles with care. This clearly holds true in public policy and education, where it is most useful in some cases to go for the throat, in other cases to make compromises and concessions. But it also applies to our own personal struggles to become more rational, where we must carefully weigh the costs of overriding our unreasoned intuitions, taking a balanced and long-term approach. And it also holds in disputes over the philosophical foundations and limits of scientific knowledge, where the cost of committing ourselves to unusual conceptions of ‘science’ or ‘knowledge’ or ‘metaphysics’ must be weighed against any argumentative and pedagogical benefits.

This workshop continues to stimulate my thought, and continues to fuel my drive to improve science education. The central insight the speakers shared was that the practices we group together as ‘science’ cannot be defended or promoted in a vacuum. We must bring to light the psychological and philosophical underpinnings of science, or we will risk losing sight of the real object of our hope and concern.

Ends

═══════════════════════════

Poets say science takes away from the beauty of the stars — mere globs of gas atoms.

Nothing is “mere.”

I too can see the stars on a desert night, and feel them. But do I see less or more? The vastness of the heavens stretches my imagination — stuck on this carousel my little eye can catch one-million-year-old light. A vast pattern — of which I am a part — perhaps my stuff was belched from some forgotten star, as one is belching there. Or see them with the greater eye of Palomar, rushing all apart from some common starting point when they were perhaps all together.

What is the pattern, or the meaning, or the why? It does not do harm to the mystery to know a little about it. For far more marvelous is the truth than any artists of the past imagined! Why do the poets of the present not speak of it? What men are poets who can speak of Jupiter if he were like a man, but if he is an immense spinning sphere of methane and ammonia must be silent?

═══════════════════════════

Nothing is mere?

Nothing? That can’t be right. One might as well proclaim that nothing is big. Or that nothing is undelicious.

What could that even mean? It sounds… arbitrary. Frivolous. An insult to the extraordinary.

But there’s a whisper of a lesson here. Value is arbitrary. It’s just what moves us. And the stars are lawless. And they nowhere decree what we ought to weep for, fight for, rejoice in. Love and terror, nausea and grace — these are born in us, not in the lovely or the terrible. ‘Arbitrary’ itself first meant ‘according to one’s will’. And by that standard nothing could be more arbitrary than the will itself.

Richard Feynman saw that mereness comes from our attitudes, our perspectives on things. And those can change. (With effort, and with time.) Sometimes the key to appreciating the world is to remake it in our image, draw out of it an architecture deserving our reverence and joy. But sometimes the key is to reshape ourselves. Sometimes the things we should prize are already hidden in the world, and we have only to unblind ourselves to some latent dimension of merit.

Our task of tasks is to create a correspondence between our values and our world. But to do that, we must bring our values into harmony with themselves. And to do that, we must come to know ourselves.

Through Nothing Is Mere, I want to come to better understand the relationship between the things we care about and the things we believe. The topics I cover will vary wildly, but should all fall under four humanistic umbrellas.

  • Epistemology: What is it reasonable for us to believe? How do we make our beliefs more true, and why does truth matter?
  • Philosophy of Mind: What are we? Can we rediscover our most cherished and familiar concepts of ourselves in the great unseeing cosmos?
  • Value Theory: What is the nature of our moral, prudential, aesthetic, and epistemic norms? Which of our values run deepest?
  • Applied Philosophy: What now? How do we bring all of the above to bear on our personal development, our relationships, our discourse, our political and humanitarian goals?

Saying a little about my background in existential philosophy should go a long way toward explaining why I’m so interested in the project of humanizing Nature, and of naturalizing our humanity.

Two hundred years ago yesterday, the Danish theologian Søren Kierkegaard was born. SK was a reactionary romantic, a navel-gazing amoralist, an anti-scientific irrationalist, a gadfly, a child. But, for all that, he came to wisdom in a way very few do.

It sounds strange, but the words his hands penned taught me how to take my own life seriously. He forced me to see that my life’s value, at each moment, had to come from itself. And that it did. I really do care for myself, and I care for this world, and I need no one’s permission, no authority’s approval, to render my values legitimate.

SK feared the furious apathy of the naturalists, the Hegelians, the listless Christian throngs. He saw with virtuosic clarity the subjectivity of value, saw the value of subjectivity, saw the value of value itself. He saw that it is a species of madness to refuse in any way to privilege your own perspective, to value scientific objectivity so completely that the human preferences that make that objectivity worthwhile get lost in a fog, objectivity becoming an end in itself rather than a tool for realizing the things we cherish.

The path of objective reflection makes the subject accidental, and existence thereby into something indifferent, vanishing. Away from the subject, the path of reflection leads to the objective truth, and while the subject and his subjectivity become indifferent, the truth becomes that too, and just this is its objective validity; because interest, just like decision, is rooted in subjectivity. The path of objective reflection now leads to abstract thinking, to mathematics, to historical knowledge of various kinds, and always leads away from the subject, whose existence or non-existence becomes, and from the objective point of view quite rightly, infinitely indifferent[…. I]n so far as the subject fails to become wholly indifferent to himself, this only shows that his objective striving is not sufficiently objective.

But SK’s corrective was to endorse a rival lunacy. Fearing the world’s scientific mereness, its alien indifference, he fled from the world.

If there were no eternal consciousness in a man, if at the bottom of everything there were only a wild ferment, a power that twisting in dark passions produced everything great or inconsequential; if an unfathomable, insatiable emptiness lay hid beneath everything, what then would life be but despair? If it were thus, if there were no sacred bond uniting mankind, if one generation rose up after another like the leaves of the forest, if one generation succeeded the other as the songs of birds in the woods, if the human race passed through the world as a ship through the sea or the wind through the desert, a thoughtless and fruitless whim, if an eternal oblivion always lurked hungrily for its prey and there were no power strong enough to wrest it from its clutches — how empty and devoid of comfort would life be! But for that reason it is not so[.]

SK shared Feynman’s worry about the poet who cannot bring himself to embrace the merely real. He wanted to transform himself into the sort of person who could love himself, and love the world, purely and completely. But he simply couldn’t do it. So he cast himself before a God that would be for him the perfect lover, the perfect beloved, everything he wished he were.

[H]e sees in secret and recognizes distress and counts the tears and forgets nothing.

But everything moves you, and in infinite love. Even what we human beings call a trifle and unmoved pass by, the sparrow’s need, that moves you; what we so often scarcely pay attention to, a human sigh, that moves you, Infinite Love.

To SK’s God, it all matters. But SK’s God is a God of solitude and self-deception. Striving for perfect Subjectivity leads to confusion and despair, just as surely as does striving for perfect, impersonal Objectivity. SK saw that we are the basis for the poetry of the world. What he sought in fantasy, we have now to discover — to create — in our shared world, our home.

Five years have passed, and I still return to Kierkegaard’s secret. He reminds me of what this is all for. We’re doing this for us, and it is we, at last, who must define our ends. I remain in his debt for that revelation. Asleep, I did not notice myself. Within a dream, I feel him shaking me awake with a terrifying urgency — and I wake, and it is night, and I am alone again with the light of the stars.