The Ancient Project of Engineering Morality

by Luke Muehlhauser on January 27, 2011 in Ethics, Guest Post

The ethical theory I currently defend is desirism. But I mostly write about moral theory, so I rarely discuss the implications of desirism for everyday moral questions about global warming, free speech, politics, and so on. Today’s guest post applies desirism to one such everyday moral question. It is written by desirism’s first defender, Alonzo Fyfe of Atheist Ethicist. (Keep in mind that questions of applied ethics are complicated and I do not necessarily agree with Fyfe’s moral calculations.)


Luke thinks that morality is an engineering problem. It’s a question of designing an artificial intelligence so that it doesn’t turn the whole solar system into paperclips.

But what’s wrong with turning the solar system into paperclips?

Luke doesn’t like the idea of such a fate, but you can’t get from “Luke doesn’t like X” to “X ought not to be.”

Nor can Luke argue that his moral intuitions are tapping into some fundamental fact of morality – that a paperclip solar system is just bad. He argues against such intuitive methods here and here.

Our hyperintelligent machines may well discover moral facts that we are unaware of. Perhaps it is a moral fact that the solar system ought to be nothing but paperclips. The reason the hyperintelligence creates such a solar system is that it discovers this moral fact and acts accordingly.

If you believe, as I do, that there are moral facts, then you have to believe as I do that a hyperintelligence can discover those moral facts. In fact, it will do a far better job of discovering these moral facts than we can. It will be able to split moral hairs with amazing precision.

Because it will have superior moral knowledge, we will not need to worry about this hyperintelligence doing anything evil. To know the good is to do the good. Right?

Well, wrong, actually. A hyperintelligent machine can know the good and just not care. That’s one of the claims that desirism makes, anyway.

Morality is a relationship between malleable desires and other desires. Knowing that a desire tends to fulfill other desires does not entail having that desire. Knowing that a desire tends to thwart other desires does not entail being rid of that desire, either.
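To put the same point in engineering terms: an agent’s knowledge base and its motivation system are separate components, and writing a moral fact into the first does not rewrite the second. Here is a minimal sketch of that separation; the agent model and every name in it are my own illustrative assumptions, not anything from desirism’s formal apparatus.

```python
# A toy model (my own construction) of the separation between what an
# agent knows about morality and what it actually wants.

class Agent:
    def __init__(self, desires):
        # The motivation system: action -> strength of the agent's own desire.
        self.desires = dict(desires)
        # The knowledge base: action -> believed net effect on other agents' desires.
        self.moral_beliefs = {}

    def learn_moral_fact(self, action, net_effect_on_other_desires):
        """Writing a moral fact into the knowledge base touches nothing else."""
        self.moral_beliefs[action] = net_effect_on_other_desires

    def choose(self, actions):
        """Action selection consults only the agent's own desires."""
        return max(actions, key=lambda a: self.desires.get(a, 0))


clippy = Agent(desires={"make_paperclips": 10, "leave_solar_system_alone": 0})

# The agent comes to know that paperclipping massively thwarts other desires...
clippy.learn_moral_fact("make_paperclips", net_effect_on_other_desires=-1000)

# ...and chooses exactly as it would have without that knowledge.
print(clippy.choose(["make_paperclips", "leave_solar_system_alone"]))
# -> make_paperclips
```

Changing what this agent does requires changing the desires dictionary itself, which is where the engineering discussed below comes in.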

However, something near to the opposite seems to be common. Having a desire often causes one to want to believe that what fulfills that desire is good. So, while it is not the case that “to know the good is to do the good,” it is very commonly the case that “to be disposed to do X is to be disposed to claim to ‘know’ (or to believe) that X is good.” This is what generates the illusion that “to know the good is to do the good”, but it is just an illusion.

A hyperintelligent machine would recognize this as fallacious reasoning that generates an illusion.

But a hyperintelligent machine with a huge store of moral facts is not thereby a virtuous machine.

On the other hand, a hyperintelligent machine will be able to know how other machines are disposed to behave, the ways in which environmental factors influence those dispositions, and ways in which it can manipulate the environment.

It will be very good at manipulating the programming of other machines so that they can work to help fulfill its objectives.

At the same time, it will be the product of other machines altering their environment so as to alter its programming in ways that will lead to the fulfillment of their objectives.

Those ‘other machines’ will include us. The machines will learn how to mold our environment so that we are disposed to act in ways that help them carry out their objectives.1

The hyperintelligence will have the same interest in us that we have in it. It will want to give us desires that will tend to fulfill its desires, just as we want to give it desires that will fulfill, rather than thwart, our desires.

Of course, trying to give this hyperintelligence desires that tend to fulfill other desires is not the same as succeeding. That is where the engineering comes in.

Yet, it’s the same engineering problem we have been working on with respect to human machines for 10,000 years.

How do we engineer those other (human) machines – be they barbarian tribesmen on the Pakistan/Afghanistan border or the people who watch over and care for our children – to have desires that tend to fulfill our desires? In answering this question, it is interesting to keep in mind that our desires, in turn, have been engineered by other (human) machines to dispose us to act in ways that fulfill their desires.

That is the engineering project that is morality. It has been with us for a long, long time. It is not much of an extension at all to ask the same sort of questions when it comes to programming hyperintelligent machines of silicon.

  1. In personal correspondence, Luke mentioned the concern that humans will have no capacity to resist these hyperintelligent machines. They will have us so outclassed on the battlefield so quickly that there will be no hope of fighting back.

    Most singularity researchers suspect that once [an AI] self-recursively improves itself, it’s all over. It will become so vastly more intelligent than us so quickly that there won’t be any kind of group of resistance fighters among humans.

    It is very likely – almost a certainty – that such a machine will respond to changes in its environment, such that different interactions with the environment generate different internal states. Learning, at the very least, requires this. You can’t be intelligent without having internal states that are somehow molded – shaped – by one’s interaction with the world. The machine cannot know where it is or what exists nearby unless environmental factors have the capacity to alter its internal states.

    This means that there will be ways in which one machine can use its ability to manipulate the environment that other machines experience to alter the internal states of other machines. Machines may even discover the ability to lie to other machines.

    Let’s say Machine 1 has a reason to increase Machine 2’s disposition to do X. It knows that it can do so by making changes in Machine 2’s environment that, in turn, lead to a change in Machine 2’s internal states that, in turn, leads to a change in Machine 2’s disposition to do X.

    These manipulations of the environment do not need to be ‘praise’ and ‘condemnation’ as we understand them. However, these machines might find that the most efficient option is to alter each other’s programming with signals that have the direct effect of altering the states of other machines – a machine equivalent of anger that says “Do less of that,” or a machine message of “Thank you” that signals the other machine to “Do more of that.” (A toy sketch of this kind of signaling follows this note.)

    This does not guarantee that we will have the ability to make these manipulations. I am not arguing, “We have nothing to worry about – everything will take care of itself.”

    However, it does argue that these hyperintelligent machines will have the capacity to have their states altered in ways that tend to fulfill or thwart the desires of others, and that other machines – other hyperintelligences, at the very least – will use this capacity on them.
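The signaling scheme in the footnote can be made concrete with a small sketch. Everything below is my own toy construction – the class, the signal names, and the update rule are illustrative assumptions, not anything specified in the post – but it shows one machine raising another’s disposition to do X by praising X and condemning its omission.

```python
import random

class MalleableMachine:
    """A machine whose disposition to do X can be shaped from outside."""

    def __init__(self, disposition=0.2, step=0.1):
        self.disposition = disposition  # probability of doing X: the malleable desire
        self.step = step                # how strongly one signal moves that probability

    def act(self):
        """Returns True when the machine does X on this occasion."""
        return random.random() < self.disposition

    def receive_signal(self, signal, did_x):
        """Praise reinforces whatever was just done; condemnation suppresses it."""
        delta = self.step if signal == "praise" else -self.step
        if not did_x:
            delta = -delta  # suppressing "not doing X" raises the disposition to do X
        self.disposition = min(1.0, max(0.0, self.disposition + delta))


# Machine 1 has a reason to increase Machine 2's disposition to do X, so it
# sends the machine equivalent of "Thank you" when X is done and of anger
# when it is not -- the environmental manipulation the footnote describes.
machine_2 = MalleableMachine()
for _ in range(20):
    did_x = machine_2.act()
    machine_2.receive_signal("praise" if did_x else "condemnation", did_x)

print(machine_2.disposition)  # climbs to 1.0 regardless of how the early rolls go
```

Nothing in this sketch requires Machine 2 to be persuaded that doing X is good; its disposition changes because another machine found the lever that changes it.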


{ 15 comments }

James H. January 27, 2011 at 7:04 am

Watch out: the robots are already lying to one another.


Eneasz January 27, 2011 at 11:49 am

>Most singularity researchers suspect that once [an AI] self-recursively improves itself, it’s all over.

Is there a source for this? I don’t have any numbers myself, I’m just curious. While I’m worried enough about this that I regularly donate to the MIRI, I am aware that there are a number of researchers who believe there will be limits on the growth rate that prevent a hard take-off.


cl January 27, 2011 at 12:28 pm

Nice post.


Luke Muehlhauser January 27, 2011 at 2:17 pm

cl,

Did you mean to post that comment to a different blog? :)


piero January 27, 2011 at 5:48 pm

Though I agree there are moral facts, I don’t think they are all knowable in practice. Say, for example, that my son is a mediocre chess player and he wants to enter a tournament where his butt will be kicked. Should I try to dissuade him from taking part, in an effort to protect him from disappointment, or should I encourage him, because I think it will be good for his character? Which option thwarts the fewest desires?

It is clearly impossible to predict all the possible implications of an action; it is not even possible to list all the possible chessboard configurations. Hence, not even a superintelligent machine will be able to choose the most moral course of action.


Polymeron January 28, 2011 at 1:21 am

Luke,
I suggest you take your compliments where you can get them ;)

piero,
Unless morality is somehow grounded in physical reality (e.g. there are “goodons” you could measure), I agree with your point.
However, do note that desirism is compatible with this – desirism does not seem to prescribe the moral action in your scenario, just which desires you ought to have developed before you even get to the situation.

Alonzo,
While I have no argument with any of your points, we should realize that this implies an AI would attempt to mind control us, if our desires are at all a potential aid or threat to its own.
I think that our current ability to understand the workings of the human mind is so limited that we are forced to act in generalities – we attempt to promote certain desires by lauding them all-around, for instance. A sufficiently advanced AI would not be under such constraints once it understands the human psyche well enough – it could tailor influence to the specific person, using logic where that would work, emotional extortion where that would work, death threats where most effective. At what point does the normal influence we attempt to exert on each other’s preferences become blatant manipulation of sentient beings? If the AI has no moral compass to tell it that some of these things are to be avoided, there is little reason to assume it would avoid them, we have every reason to assume it would regard our desires as anything of importance once the risk is low enough.

We have many and strong reasons to avoid creating a powerful being that disregards humanity’s desires. Thus we have many and strong reasons to strongly condemn anyone who is not being cautious enough that they might bring about this possibility, even by mistake.


Polymeron January 28, 2011 at 1:23 am

Correction: “We have every reason to assume it would disregard our desires as anything of importance once the risk is low enough”.


piero January 28, 2011 at 3:01 pm

desirism does not seem to prescribe the moral action in your scenario, just which desires you ought to have developed before you even get to the situation.

True. Even in that case, however, I would still be left with no reason to choose one course of action over the other; I might desire to do whatever will make my son happier, but desiring it does not imply knowing what to do in order to satisfy that desire.


Alonzo Fyfe January 28, 2011 at 5:44 pm

Piero

Though I agree there are moral facts, I don’t think they are all knowable in practice.

Neither do I. Yet, “not all knowable in practice” does not imply that we can have no knowledge or that all beliefs are equally well justified.

The same is true in science. We will never be able to know all scientific facts – but we can know enough to get by, and our knowledge can improve.

It is clearly impossible to predict all the possible implications of an action; it is not even possible to list all the possible chessboard configurations. Hence, not even a superintelligent machine will be able to choose the most moral course of action.

There are facts of the universe that a machine cannot know. I can state confidently that our hyperintelligent machine will not be able to know the exact position and state of every particle in the universe. Thus, it will never be able to perfectly predict the future (or reconstruct the past).

However, this does not rule out the possibility of useful knowledge – both moral knowledge, and knowledge of the physical world (of which the former is a subset of the latter).

Polymeron

…we should realize that this implies an AI would attempt to mind control us, if our desires are at all a potential aid or threat to its own.

At the same time, note that Luke’s moral project is one of mind-controlling these hyperintelligent machines – to direct them against choosing to turn the solar system into paperclips and toward choosing to do things that will tend to fulfill our desires (as we similarly act so as to fulfill theirs).

A sufficiently advanced AI . . . could tailor influence to the specific person, using logic where that would work, emotional extortion where that would work, death threats where most effective.

Yep, pretty much like humans already do to each other. And they will do the same to other hyperintelligent machines.

It will just be more efficient at it than we are.

If the AI has no moral compass to tell it that some of these things are to be avoided, there is little reason to assume it would avoid them, we have every reason to assume it would regard our desires as anything of importance once the risk is low enough.

Part of my argument is, “the AI has no moral compass” is likely to be a false assumption at the start. As I have already argued, the hyperintelligence will see the value of affecting other machines so that they are disposed to act in ways that fulfill its ends, while its ends are themselves influenced by those others seeking to dispose it to have interests compatible with theirs.

It is the case that IF there is a hyperintelligence with no interests compatible with ours, then we’re all doomed. We won’t be able to fight it (because it will outsmart us on the battlefield), nor will we be able to reason with it (since reason is the slave of the passions and we have already stipulated that it has no compatible passions).

So, yes, if this happens, we’re pretty much doomed.

At least, desirism states that under these conditions we are pretty much doomed. There are other moral theories that argue that there are moral facts of a sort that will prevent the hyperintelligent machine from destroying us even if these assumptions are met – realization of the intrinsic value of humans or the perception of the intrinsic wrongness of genocide. However, desirism does not share their assumptions. While those theories predict human survival under those circumstances, desirism predicts the end of humanity.

We have many and strong reasons to avoid creating a powerful being that disregards humanity’s desires. Thus we have many and strong reasons to strongly condemn anyone who is not being cautious enough that they might bring about this possibility, even by mistake.

That’s right.


piero January 28, 2011 at 6:27 pm

There are facts of the universe that a machine cannot know. I can state confidently that our hyperintelligent machine will not be able to know the exact position and state of every particle in the universe. Thus, it will never be able to perfectly predict the future (or reconstruct the past). However, this does not rule out the possibility of useful knowledge – both moral knowledge, and knowledge of the physical world (of which the former is a subset of the latter).

I agree. Nevertheless, I’m picturing the following scenario: if a machine slightly cleverer than humans is ever built, then that machine can build a cleverer machine still, and so on. So let’s assume that an unimaginably clever machine exists. Unless that machine possesses an irrational source of motivation akin to our survival instinct, it will surely realize the futility of it all. What possible rational motivation to do anything at all could such a machine concoct?


piero January 28, 2011 at 6:31 pm

Sorry, I guess my previous post was a bit garbled. What I was trying to say was that the concept of “usefulness” does not seem to make any sense with respect to a purely rational entity.


Alonzo Fyfe January 28, 2011 at 8:40 pm

Piero

Sorry, I guess my previous post was a bit garbled. What I was trying to say was that the concept of “usefulness” does not seem to make any sense with respect to a purely rational entity.

Absolutely.

If by ‘rational’ you are referring to a being with beliefs and no desires, there is not much of a reason to do anything. It has no motivation – no reason to act – no goals – so nothing can be useful.

Desires are required for reasons to act, and desires are not rational.


cl January 29, 2011 at 11:08 am

The Terminator, anyone?

Luke / Polymeron,

Not that I think you have the time or drive, but, if you go back over my posts from 2010, you’ll find quite a few compliments. I found three in under five minutes of searching: Why Atheists Lose Debates [September 20, 2010 at 12:13 pm], What I Think of The New Atheists [March 24, 2010 at 11:12 am], and Who Designed The Designer [January 13, 2010 at 2:25 pm]. Further, in 2009, I used to leave comments of support on how this blog was a great resource for (a)theism. Although I think it’s taken a downward spiral since then, it still is a great resource for (a)theism, but that’s beside the point. Also, note the compliment was to Alonzo in this instance, not Luke. More, when I used to comment on Alonzo’s blog, I didn’t refrain from giving credit where I felt credit was due, either, for example, his excellent Willingness To Pay post. In fact, one of the reasons I make it a point to give credit where credit is due is because, in the world of blogging, it’s all-too-easy to give in to the temptation towards negativity and animosity. In blogging, as in life, I think we ought to find positive things to say about everyone, lest we end up believing in caricatures. This helps strike a balance.

To contrast, and, not that it matters, but, how often do you find Luke and/or Alonzo saying anything positive about me? It seems to me that just because I have questions and don’t always buy their answers, I get called names from both of them ["troll" and "sophist," respectively].

I find the selective focus odd, but then again, people often see what they want to see.


Polymeron February 1, 2011 at 3:36 pm

cl,
Just to clarify, my comment was meant to (subtly) make the point that, when people are being civil to each other (such as by saying “nice post”), it is not appropriate to snark them (e.g. by suggesting that said civility is surprising coming from them). I was aware that this was not technically a compliment to Luke himself (though I think he does deserve some credit for hosting good posts on his blog).


cl February 11, 2011 at 4:21 pm

Polymeron,

Just to clarify, my comment was meant to (subtly) make the point that, when people are being civil to each other (such as by saying “nice post”), it is not appropriate to snark them (e.g. by suggesting that said civility is surprising coming from them).

I got that, but, thanks for clarifying. I didn’t get offended at Luke’s quip. It’s okay that he returned my olive branch with salt. I simply wanted to set the record straight for those prone to cherrypicking.

