In episode 16 of Morality in the Real World, Alonzo Fyfe and I discuss reward-based theories of desire.
Every five episodes we answer audience questions, so please do post your questions and objections below. Make sure your questions address the topics of this episode only. If we plan to address the subject of your question in a later episode, we will not answer it in the next Q & A episode. You can also leave your question in audio and we will play it back during the Q & A episode and respond to it: call 413-723-0175 and press 1 to leave a voicemail.
Transcript of episode 16:
ALONZO: Good morning, Luke. Are you ready?
LUKE: No. Sleepy.
ALONZO: Well, wake up. We have work to do.
ALONZO: In the last session you said something about associating desire to a third system in the brain. We have looked at associating it with motivation – the system we use. We looked at associating it with pleasure. Then, in the last episode, you spoke about associating desire with the reward-learning system.
LUKE: Well, not me, actually. Tim Schroeder proposed that theory. He’s the guy who wrote the article on the subject in the Stanford Encyclopedia of Philosophy and the book Three Faces of Desire.
ALONZO: Well, then, what is this theory?
LUKE: Okay. In the Stanford Encyclopedia of Philosophy article, Tim Schroeder expressed it this way:
For an organism to desire p is for it to use representations of p to drive reward-based learning.
There’s a lot to explain about that, but I think a good way to think about this is to think about teaching a dog a trick. You have the dog do the trick and then you reward the dog. At first, the dog performs the trick because the dog wants the reward. But, sooner or later, the dog is performing the trick without any reward. The same applies with house-breaking a pet – punishing the pet in order to stop certain behaviors.
Now, the reward has to be something that the dog wants. Or, in the case of punishment, the punishment has to be something that the dog wants to avoid. So, ultimately, a reward-based theory of desire says that to desire something is to have it serve as a reward, and to be averse to something is to have it serve as a punishment.
ALONZO: So, our desires are like the rewards and punishments used to house-break a pet?
LUKE: Well, you know how you have always talked about how rewards and punishments are methods that we use to mold the desires of others – that some desires are malleable and, with rewards such as praise and punishments such as condemnation, we can strengthen some desires and inhibit others?
ALONZO: I probably said something like that once or twice.
LUKE: Per day, maybe. Well, Schroeder gives a short description of how this system works in the brain:
When an organism like us is rewarded by, say, being given a bicycle, the first thing that happens in its brain is that it represents being given a bicycle. This representation causes activity elsewhere in the brain, that categorizes the represented event as a reward. Meanwhile, other brain structures have been attempting to predict the rewards and punishments the organism was going to receive at this moment. The combination of current reward information and predicted reward information is used by the brain to calculate the difference between the rewards that had previously been predicted and the rewards that have actually materialized. The result is released to the rest of the brain in the form of a very specific signal, one causing a very specific form of learning. This signal has effects upon the short-term operation of the brain and upon the long-term dispositions; effects that, in organisms like us, affect our feelings and modify our dispositions to act, think and experience, all in ways that tend to increase the acquisition of rewards and the avoidance of punishments.
ALONZO: Meaning, that if we get something we want – such as a bicycle, or an award, or praise – this causes a set of changes in the brain that, in the end, tends to cause us to be more strongly disposed to act in those ways that brought about the reward.
Similarly, when we experience something we do not want – such as an electric shock, nausea, or condemnation – this causes a change in our brain that makes us less strongly disposed to act in ways that bring about that which we don’t like.
LUKE: Right, that’s what Schroeder is saying.
ALONZO: I want to call attention to the fact that Schroeder’s “very specific form of learning” isn’t about learning facts – like, learning the capitals of all the states. It is learning, as Schroeder says, that alters our long-term dispositions.
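[Editor's sketch: the mechanism in Schroeder's passage (compare the predicted reward with the actual one, then let the difference adjust long-term dispositions) can be illustrated with a few lines of Python. This is only a toy model under stated assumptions. The names `State` and `learn`, the learning rate of 0.5, and the simple additive update are all hypothetical choices made for illustration. They are not taken from Schroeder or from the neuroscience literature.]

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class State:
    disposition: float  # strength of the tendency to repeat the action (hypothetical scale)
    predicted: float    # reward the organism currently expects


def learn(state: State, actual: float, rate: float = 0.5) -> Tuple[State, float]:
    """Apply one outcome; return the updated state and the learning signal."""
    # The signal is the difference between what materialized and what was predicted.
    signal = actual - state.predicted
    # Assumption: the same signal nudges both the disposition and the prediction.
    new_state = State(
        disposition=state.disposition + rate * signal,
        predicted=state.predicted + rate * signal,
    )
    return new_state, signal


# An unexpected reward (the bicycle) strengthens the disposition.
rewarded, reward_signal = learn(State(disposition=0.0, predicted=0.0), actual=1.0)

# An unexpected punishment (the electric shock) weakens it.
punished, punish_signal = learn(State(disposition=0.0, predicted=0.0), actual=-1.0)
```

[In this sketch a positive signal raises `disposition` and a negative signal lowers it, which is the sense in which the learning alters dispositions rather than storing facts.]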
LUKE: Right. And actually, there’s something about the reward learning system that might not fit with what you’ve been saying before, Alonzo. Research shows that a fully predicted reward has no effect on learning. Here, let me quote the Scholarpedia article on reward signals:
The response to reward appears to code the discrepancy between the reward and [predicted reward]… such that an unpredicted reward elicits an activation… a fully predicted reward elicits no response, and the omission of a predicted reward induces a depression…
Now, Alonzo, I don’t think you have accounted for that fact anywhere in your theory so far. You have rewards and punishments producing changes in behavior, but I’ve never heard you say anything about the actual reward having to be different from the predicted reward in order to produce such a change in our dispositions to act, as encoded in the brain.
ALONZO: Um . . . no . . . I haven’t.
LUKE: Well, then, there’s something new for you to consider!
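[Editor's sketch: the three cases in the Scholarpedia quote can be written out as a small Python function. It is a simplified illustration, not a model of real neurons. The function name `reward_response`, the numeric reward scale, and the tolerance used to decide that a reward was "fully predicted" are assumptions introduced here.]

```python
def reward_response(actual: float, predicted: float, tol: float = 1e-9) -> str:
    """Classify the response to a reward by comparing it with the prediction."""
    error = actual - predicted  # discrepancy between reward and predicted reward
    if error > tol:
        return "activation"     # reward was better than predicted
    if error < -tol:
        return "depression"     # reward was worse than predicted, or omitted
    return "no response"        # reward was fully predicted


unpredicted = reward_response(actual=1.0, predicted=0.0)
fully_predicted = reward_response(actual=1.0, predicted=1.0)
omitted = reward_response(actual=0.0, predicted=1.0)
```

[The middle case is the one Luke is pressing: when the actual reward equals the predicted reward, the discrepancy is zero, so in this sketch nothing is left to drive a change in dispositions.]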
ALONZO: I am wondering if we can pause for just a moment. I’m beginning to sense that we are using ‘reward’ and ‘punishment’ in two different ways, and I’m starting to get concerned that this is going to trip us up.
ALONZO: Typically, when people talk about ‘reward’ and ‘punishment’, this assumes some sort of approval and disapproval. A reward in this sense might be an award for something done well. A punishment might be something like being sent to jail.
But reward and punishment in the biological sense doesn’t come with these ideas attached. A biological reward – getting food, for example – doesn’t contain any sense of approval. A biological punishment – an electric shock – doesn’t imply disapproval.
LUKE: Okay, but the way most people talk and the way scientists talk – about reward and punishment – do seem somewhat related.
ALONZO: Well, actually, I would argue that what we do is take what scientists call rewards and punishments and simply add this element of approval and disapproval.
A full discussion of that would take us off track, but I think it’s important to note that when we talk about reward and punishment here, we are talking about a sense that does not include expressions of approval or disapproval. These are biological rewards and punishments.
LUKE: Okay, fair enough. Now that we’ve gone through some of the basics of the reward-learning system in the brain, we can talk about Schroeder’s reward-based theory of desire.
Remember, Schroeder claimed that when we say that an agent desires that P, that means that P serves as a reward for that agent in reward-based learning.
ALONZO: Let me get this straight. You’re not just saying that what is desired also tends to serve as a reward in reward-based learning. Instead, what is desired and what serves as a reward in reward-based learning are the same things.
LUKE: That’s right. Remember what Schroeder said:
For an organism to desire p is for it to use representations of p to drive reward-based learning.
ALONZO: I disagree.
ALONZO: . . . but I am not going to argue over definitions.
LUKE: Oh, good.
ALONZO: Still, we should go through his theory so that we can figure out how to translate between his language and ours, and to see whether we disagree on any of the facts of the matter.
LUKE: Sounds good. Now, we already found one point of disagreement. You didn’t have a place in your theory for fully predicted rewards having no effect.
ALONZO: We don’t disagree. I was wrong. I took a glance at the research and saw the opinions of experts in the field and . . . well, my previous assumptions did not match our current scientific understanding of the subject. It’s interesting. I’m going to have to give some thought to what that implies.
But, let’s find out what else Schroeder says about desires and see if we can find anything else… interesting.
LUKE: Luckily, Schroeder was kind enough to provide us with a list of propositions for his theory, much like we did in the first episode of this season. So let’s look through them and see where you and he might disagree, and why. Here is Schroeder’s first claim:
Claim 1: Intrinsic desires, wants, and wishes form a natural grouping, closely related to one another but distinguished from other pro attitudes such as trying or intending.
ALONZO: Okay, we’re trying to get past language issues, so – instead of arguing definitions – let’s just figure out how to translate between what Schroeder is saying and what we are saying.
What Schroeder calls “intrinsic desires”, we call “desires as ends” – to distinguish them from what an agent desires as a means. We discussed Schroeder’s claims about ‘trying’ in Episode 12, where I argued that “to try” means “to succeed if one can” – which is an action-based definition.
As for “intending”, belief-desire theory holds that beliefs and desires-as-ends either constitute or create intentions. Either way, intentions will end up being distinct from desires-as-ends by themselves.
So, I think I agree with this claim when it is translated to the way I’m using these words, which is fine.
LUKE: Okay, let’s continue.
Claim 2: Intrinsic desires may be distinguished from instrumental ones.
ALONZO: Desiring something as an end is not the same as desiring something as a means to an end. And we see both types of desires working in the real world.
LUKE: Right. Okay, Schroeder’s third claim.
Claim 3: All desires are desires that P (‘P’ standing for some proposition).
ALONZO: Yes. I think that’s true. That’s our claim that desires, like beliefs, are propositional attitudes.
LUKE: So, no real disagreements yet?
ALONZO: Nope. Only differences in language.
LUKE: Alright, let’s keep going.
Claim 4: One can, in principle, intrinsically desire that P for any state of affairs P one can perceptually or cognitively represent.
ALONZO: Yeah. The way I have always said it is that the range of propositions that an agent can seek to make or keep true is about equal to the range of propositions an agent can believe. At least there is no prima facie reason to believe that they are significantly different. An agent who can believe that the moon Pandora can continue to exist, can desire that the moon Pandora continue to exist.
LUKE: Next one:
Claim 5: A thing is a reward or punishment only if it is wanted or unwanted… respectively, by the recipient.
ALONZO: Hmmm. Something that an agent does not want cannot serve as a reward. And something that an agent does not want to avoid cannot serve as a punishment.
That seems right, so far.
Claim 6: To be a desire is to be a representational capacity contributing to a reward or punishment signal.
ALONZO: Ah ha! Yep. Here is where he is going to define desire in terms of reward and punishment.
And here is where we are supposed to enter into a huge discussion bringing in all sorts of conceptual analysis examining how we intuitively apply our concepts to a range of bizarre scenarios in order to show that the best super-dictionary definition of the term either does or does not fit this claim.
In fact, to be honest, in the original script for this episode, I wrote about 10 minutes of material where we did just that. Now it’s gone.
LUKE: Oh good!
ALONZO: I agree that a reward system exists. There are certain things – which we have grown accustomed to calling rewards and punishments – that trigger changes in a person’s dispositions to behave. I agree that rewards are generally wanted and punishments are generally unwanted.
I’m not inclined to say that a desire has to be a part of a reward learning system. I’m willing to say that a creature that doesn’t even have a reward learning system can still have desires.
But we do have a reward learning system where things promote or inhibit dispositions to act. These rewards and punishments that trigger changes in dispositions exist, and Schroeder can call them “desires” and “aversions” if he wants to.
LUKE: So, no substantive disagreement, then?
ALONZO: Not really. Not if Schroeder is willing to admit that a motivational state does not have to be a part of a reward learning system. If so, we seem to just be using different definitions for the word ‘desire.’
LUKE: Okay. I suspect that you will have the same reaction to Claim 7.
Claim 7: To be a desire is to be a representational capacity contributing to a certainly mathematically describable form of learning.
ALONZO: Does he actually say, “certainly mathematically describable form of learning”?
LUKE: Um… Yes.
ALONZO: Well, I think he means that the mathematical description is certain or fixed. But, I don’t think we need to worry about that detail.
LUKE: Yeah, that was a little confusing to me, too, but… okay. Good.
ALONZO: Now, if we replace the symbol with the substance, I see no reason to reject the claim that our reward-learning system involves some “representational capacity contributing to a certainly mathematically describable form of learning.” I would not give it the name “desire” – but we are not going to argue about definitions, only about what is real, and this representational capacity is real.
For me, talking about motivational states, all I need in order to suggest that you have an aversion to heights, for example, is your reluctance to choose to approach a ledge. Again, motivational states do not require a reward learning system. But, I don’t think that Schroeder would disagree with that – so no disagreement in substance.
Claim 8: Desires are realized in human beings and other animals like us by the biological reward system, centered around the dopamine-releasing neurons of the SNpc and VTA.
ALONZO: Um . . . okay . . . what’s that?
LUKE: Right, so, there are some neurons in these two parts of the brain called the substantia nigra pars compacta and the ventral tegmental area that release the chemical dopamine to other structures in the brain like the striatum and the prefrontal cortex. For our purposes right now, it’s not that important to know where those parts of the brain are, but it is important to understand what the dopamine does when it is released.
Dopamine changes the sensitivity of certain neurons to future stimulation in ways that change our dispositions to act.
ALONZO: Okay. I’m not a neuroscientist, but I don’t have any reason to disagree with any of those things. The only difference is that I would apply the term “desire” to those dispositions to act that are being changed, not to the rewards and punishments that are causing those changes.
LUKE: Okay, let’s move on to Schroeder’s Claim 9.
Claim 9: Desiring that P is what makes it possible for people to learn certain sorts of habits and to have certain sorts of modifications to their sensory capacities, and probably what makes it possible for them to undergo other, less well studied long-term psychological changes.
ALONZO: Okay, one of the things that I notice, is that Schroeder and I seem to be attaching the word ‘desire’ to different ends of the reward-learning system.
Schroeder wants to apply the term “desire” to the agents of change – the things that cause differences in our dispositional states. I typically attach the word ‘desire’ to the things that are changed – the dispositional motivational states that the agents of change can affect.
We are not disagreeing over the fact that these agents of change exist. Nor do we disagree over the fact that there are things that these agents of change act upon and . . . well . . . change. So, there is still no significant disagreement over the facts.
I am going to have to add that the things acted upon by these agents of change in the reward-learning system also provide the rewards and punishments. It’s not a straight line from reward to desire. It’s a feedback loop. Which is where we get Schroeder’s claim that something has to be wanted to serve as a reward, and unwanted to serve as a punishment.
Claim 10: Pleasure and displeasure are representations of net positive and negative (respectively) change (relative to expectations) in desire satisfaction.
ALONZO: On this one, I don’t know.
Schroeder is saying that pleasure itself is the measure of the positive difference between expected rewards and actual rewards, and displeasure is the measure of the negative difference between the two.
I saw the evidence that supported Schroeder’s claim that only unexpected results have an effect on learning and I have accepted that. I don’t see the evidence to support this claim, however. Schroeder’s own book offers anecdotal evidence supporting this claim, but I haven’t found anything resembling a controlled scientific experiment that draws this conclusion.
I’m not saying that I disagree. I am saying I don’t know.
But, as it turns out, it won’t affect desirism one way or the other, so it is not going to be important to our project.
LUKE: Right. Also, I would say that the very recent evidence we presented in the last episode about how ‘wanting’ and ‘liking’ are encoded by slightly divergent neural pathways in the brain may suggest that Schroeder’s claim here is false, but I’m not sure. Anyway, let’s go on to Claim 11.
Claim 11: People are typically moved to action, or inhibited from acting, because there is something they want to achieve [or avoid], and they see a way of achieving [or avoiding] it.
ALONZO: Well, yeah. Actually, I would say “always” motivated to act by a desire or aversion. But, again, Schroeder probably doesn’t want to say “always” because an agent of change in learning need not always motivate an agent to act. And I would agree with what Schroeder says about these agents of change.
Again, we’re just using the word ‘desire’ a bit differently, but our agreement or disagreement on matters of fact becomes clear when we replace the symbol ‘desire’ with the substance of what each of us is talking about. For Schroeder, it’s the agents of change in reward-based learning. For me, it’s the dispositions to act that are changed by those reward signals.
Claim 12: Desires can cause one to form prior intentions, and cause one to try.
ALONZO: That’s true.
Schroeder says that desires can cause one to form prior intentions.
I have already said that there is an overlap between what Schroeder calls desires and what I call desires – the dispositional states acted on by rewards and punishments also serve as rewards and punishments. So . . . yeah . . . because of this overlap, desires in both senses can cause one to form prior intentions.
LUKE: Okay, next.
Claim 13: Desires are not the only causes of goal-directed movement, but of all the pro attitudes, they are the most fundamental causes.
ALONZO: Okay. Again, we have to keep in mind that Schroeder is saying that the agents of change in a reward-learning system are not the only causes – but are the most fundamental causes – of goal-directed movement.
I would say that desires are always the cause of goal-directed movement because that’s how I’m using the term ‘desire’ – to name those things that direct movement.
However, I do not disagree with what Schroeder says about the most fundamental causes of goal-directed movement being found in those agents of change.
LUKE: Okay, let’s move on to Schroeder’s 14th claim.
Claim 14: A human being with no desires is incapable of normal goal-directed movement.
ALONZO: I would accept this, of course. The way I use the term “desire”, there is no motivation without desire.
Schroeder is saying that there is no normal goal-directed movement without these agents of change. I’m not sure about that, but it’s not going to be important.
Claim 15: Moral thinking has no special power to move us except insofar as we have desires regarding morality.
ALONZO: Yep, that one is true, too.
Claim 16: There are no fleeting desires.
ALONZO: You know, my whole approach to Schroeder’s claims changed once we cleared up that part about definitions two episodes ago.
When we first wrote this episode, I looked at something like Claim 16 and took the word “desires” to have its on-the-street sense, and asked if it would make sense given the way people would usually use the term.
Now, the first thing I do is I remind myself, “Okay, we’re talking about Schroederian desires, which are the agents of change in a reward-learning system. This means, he is saying that there are no fleeting agents of change in a reward-learning system. Is that true or false?”
I don’t see why it has to be true of Schroederian desires. I see no reason to think that a brain structure that instantiates a particular reward can’t come into being for a day, or a week, and then disappear again. That doesn’t seem impossible – though it may be unlikely.
Anyway, then I ask myself whether the same statement is true using the concept of desires defined in my way – as motivational states. Can there be fleeting motivational-state desires?
Yeah, that seems possible as well. Why can’t the brain structure for a certain motivational state come into existence and then disappear again? The brain is always changing. That is how we store facts and memories. Learning itself requires changes to the brain. In this constantly changing brain, it seems possible – though certainly not common – for a particular motivational reason to come into existence, then fade away rather quickly.
LUKE: But, Alonzo, one of the important facts about desirism is that desires are persistent entities. The desires that you have that will be responsible for your choices in a given situation are desires that will stick with you across a range of circumstances. This persistence is important in knowing which desires to promote and which to inhibit.
ALONZO: Oh, certainly. Generally, desires are persistent. They tend to stick around for a long time. But this doesn’t mean that fleeting desires are impossible, only that they are not the norm for desires.
LUKE: Okay. That’s it, then. That was Schroeder’s last claim. It seems that there’s not much disagreement between you and Schroeder on the facts, it’s mostly just that you’re using certain words to mean slightly different things.
ALONZO: Right. I think I disagree on some minor issues about “trying” and the possibility of fleeting desires – either as agents of change or as motivational states. But nothing significant.
Now, I could go into 60 minutes of conceptual analysis to show that Schroeder’s use of the word ‘desire’ does not fit our best super-dictionary definition of the term…
LUKE: But you are not going to do that, right?
ALONZO: But, I am not going to.
LUKE: Good. Now, in our next episode, you want to discuss good-based theories of desire. These are theories that say that for an agent to desire something is for that agent to believe that it is good.
ALONZO: Yeah. It’s something I hear a lot – usually from people who are trying to save the idea of some sort of intrinsic value. But that will be for next time.
LUKE: Okay! See you next time.
(in order of appearance)
- “Hour Five” from Somnium by Robert Rich
- “Psalm 4” from Of Psalms by Date Palms
- “The Chairman Dances” from Harmonielehre / Chairman Dances / Short Ride in a Fast Machine by John Adams
- “String Quartet No. 3, Mishima V: Blood Oath” from String Quartets Nos. 1-4 by Philip Glass
- “Blue Bossanova” from Hey Sugar by Bossanova
* marks royalty-free music. With copyrighted music, we use only short clips and hope this qualifies as Fair Use. Fair Use is defined in the courts, but please note that we make no profit from this podcast, and we hope to bring profit to the copyright owners by linking listeners to somewhere they can purchase the music. If you are a copyright owner and have a complaint, please contact us and we will respond immediately. The text and the recordings of Luke and Alonzo for this podcast are licensed with Creative Commons license Attribution-Noncommercial-Share Alike 3.0, which means you are welcome to republish or remix this work as long as you (1) cite the original source, and (2) share your remix using the same license, and (3) do not use it for commercial purposes.