AI researcher Eliezer Yudkowsky is something of an expert at human rationality, and at teaching it to others. His hundreds of posts at Less Wrong are a treasure trove for those who want to improve their own rationality. As such, I’m reading all of them, chronologically.
I suspect some of my readers want to “level up” their rationality, too. So I’m keeping a diary of my Yudkowsky reading. Feel free to follow along.
In his 480th post, Yudkowsky finally begins to share with us his own “coming of age” story. My Childhood Death Spiral begins:
My parents always used to downplay the value of intelligence. And play up the value of – effort, as recommended by the latest research? No, not effort. Experience. A nicely unattainable hammer with which to smack down a bright young child, to be sure. That was what my parents told me when I questioned the Jewish religion, for example. I tried laying out an argument, and I was told something along the lines of: “Logic has limits, you’ll understand when you’re older that experience is the important thing, and then you’ll see the truth of Judaism.”
…The moral I derived when I was young, was that anyone who downplayed the value of intelligence didn’t understand intelligence at all. My own intelligence had affected every aspect of my life and mind and personality; that was massively obvious, seen at a backward glance. “Intelligence has nothing to do with wisdom or being a good person” – oh, and does self-awareness have nothing to do with wisdom, or being a good person? Modeling yourself takes intelligence. For one thing, it takes enough intelligence to learn evolutionary psychology.
…But the young Eliezer was a transhumanist. Giving away IQ points was going to take more work than if I’d just been born with extra money. But it was a fixable problem, to be faced up to squarely, and fixed. Even if it took my whole life. “The strong exist to serve the weak,” wrote the young Eliezer, “and can only discharge that duty by making others equally strong.” I was annoyed with the Randian and Nietzschean trends in SF, and as you may have grasped, the young Eliezer had a tendency to take things too far in the other direction. No one exists only to serve. But I tried, and I don’t regret that.
…And then Eliezer1996 encountered the concept of the Singularity. Was it a thunderbolt of revelation? Did I jump out of my chair and shout “Eurisko!”? Nah. I wasn’t that much of a drama queen. It was just massively obvious in retrospect that smarter-than-human intelligence was going to change the future more fundamentally than any mere material science. And I knew at once that this was what I would be doing with the rest of my life, creating the Singularity…
Was this a happy death spiral? As it turned out later, yes: that is, it led to the adoption even of false happy beliefs about intelligence. Perhaps you could draw the line at the point where I started believing that surely the lightspeed limit would be no barrier to superintelligence. (It’s not unthinkable, but I wouldn’t bet on it.)
But the real wrong turn came later, at the point where someone said, “Hey, how do you know that superintelligence will be moral? Intelligence has nothing to do with being a good person, you know – that’s what we call wisdom, young prodigy.”
What was the wrong turn? Eliezer thought that superintelligence would lead to supermorality. The story continues in My Best and Worst Mistake:
My youthful disbelief in a mathematics of general intelligence was simultaneously one of my all-time worst mistakes, and one of my all-time best mistakes.
Because I disbelieved that there could be any simple answers to intelligence, I went and I read up on cognitive psychology, functional neuroanatomy, computational neuroanatomy, evolutionary psychology, evolutionary biology, and more than one branch of Artificial Intelligence…
…When you blank out all the wrong conclusions and wrong justifications, and just ask what that belief led the young Eliezer to actually do…
Then the belief that Artificial Intelligence was sick and that the real answer would have to come from healthier fields outside, led him to study lots of cognitive sciences;
The belief that AI couldn’t have simple answers, led him to not stop prematurely on one brilliant idea, and to accumulate lots of information;
The belief that you didn’t want to define intelligence, led to a situation in which he studied the problem for a long time before, years later, he started to propose systematizations.
This is what I refer to when I say that this is one of my all-time best mistakes.
…So what makes this one of my all-time worst mistakes? Because sometimes “informal” is another way of saying “held to low standards”. I had amazing clever reasons why it was okay for me not to precisely define “intelligence”, and certain of my other terms as well: namely, other people had gone astray by trying to define it. This was a gate through which sloppy reasoning could enter.
And then back to Eliezer’s coming-of-age story with That Tiny Note of Discord:
When we last left Eliezer1997, he believed that any superintelligence would automatically do what was “right”, and indeed would understand that better than we could; even though, he modestly confessed, he did not understand the ultimate nature of morality. Or rather, after some debate had passed, Eliezer1997 had evolved an elaborate argument, which he fondly claimed to be “formal”, that we could always condition upon the belief that life has meaning; and so cases where superintelligences did not feel compelled to do anything in particular, would fall out of consideration…
So far, the young Eliezer is well on the way toward joining the “smart people who are stupid because they’re skilled at defending beliefs they arrived at for unskilled reasons”. All his dedication to “rationality” has not saved him from this mistake, and you might be tempted to conclude that it is useless to strive for rationality.
Most people do not crawl out of their own hole, but Eliezer did. And it all began with a tiny note of discord:
And yet then the notion occurs to him:
Maybe some people would prefer an AI do particular things, such as not kill them, even if life is meaningless?
His immediately following thought is the obvious one, given his premises:
In the event that life is meaningless, nothing is the “right” thing to do; therefore it wouldn’t be particularly right to respect people’s preferences in this event.
This is the obvious dodge. The thing is, though, Eliezer2000 doesn’t think of himself as a villain…
So Eliezer2000 doesn’t just grab the obvious out. He keeps thinking.
But if people believe they have preferences in the event that life is meaningless, then they have a motive to dispute my Singularity project and go with a project that respects their wish in the event life is meaningless. This creates a present conflict of interest over the Singularity, and prevents right things from getting done in the mainline event that life is meaningful.
…The new objection seems to poke a minor hole in the airtight wrapper. This is worth patching. If you have something that’s perfect, are you really going to let one little possibility compromise it?
So Eliezer2000 doesn’t even want to drop the issue; he wants to patch the problem and restore perfection.
…And so Eliezer2000 begins to really consider the question: Supposing that “life is meaningless” (that superintelligences don’t produce their own motivations from pure logic), then how would you go about specifying a fallback morality? Synthesizing it, inscribing it into the AI?
…That’s the only thing that matters, in the end. His previous philosophizing wasn’t enough to force his brain to confront the details. This new standard is strict enough to require actual work. Morality slowly starts being less mysterious to him – Eliezer2000 is starting to think inside the black box.
…and so, over succeeding years, understanding begins to dawn on that past Eliezer, slowly.