AI researcher Eliezer Yudkowsky is something of an expert on human rationality, and on teaching it to others. His hundreds of posts at Less Wrong are a treasure trove for those who want to improve their own rationality. So I’m reading all of them, in chronological order.
I suspect some of my readers want to “level up” their rationality, too. So I’m keeping a diary of my Yudkowsky reading. Feel free to follow along.
Up until that point, I’d never quite admitted to myself that Eliezer1997’s AI goal system design would definitely, no two ways about it, pointlessly wipe out the human species. Now, however, I looked back, and I could finally see what my old design really did, to the extent it was coherent enough to be talked about. Roughly, it would have converted its future light cone into generic tools – computers without programs to run, stored energy without a use…
…how on Earth had I, the fine and practiced rationalist, how on Earth had I managed to miss something that obvious, for six damned years?
That was the point at which I awoke clear-headed, and remembered; and thought, with a certain amount of embarrassment: I’ve been stupid.
I understood that you could do everything that you were supposed to do, and Nature was still allowed to kill you. That was when my last trust broke. And that was when my training as a rationalist began.
No! Try not! Do, or do not. There is no try.
Today’s post is a tad gloomier than usual, as I measure such things. It deals with a thought experiment I invented to smash my own optimism, after I realized that optimism had misled me. Those readers sympathetic to arguments like, “It’s important to keep our biases because they help us stay happy,” should consider not reading. (Unless they have something to protect, including their own life.)
Which leads, finally, to: My Bayesian Enlightenment. Eliezer’s Bayesian enlightenment did not occur when he learned Bayes’ Rule, nor when he discovered the study of human heuristics and biases, but when he read Probability Theory: The Logic of Science:
…it was PT:TLOS that did the trick. Here was probability theory, laid out not as a clever tool, but as The Rules, inviolable on pain of paradox. If you tried to approximate The Rules because they were too computationally expensive to use directly, then, no matter how necessary that compromise might be, you would still end doing less than optimal. Jaynes would do his calculations different ways to show that the same answer always arose when you used legitimate methods; and he would display different answers that others had arrived at, and trace down the illegitimate step. Paradoxes could not coexist with his precision. Not an answer, but the answer.
And the story concludes:
If there’s one thing I’ve learned from this history, it’s that saying “Oops” is something to look forward to. Sure, the prospect of saying “Oops” in the future, means that the you of right now is a drooling imbecile, whose words your future self won’t be able to read because of all the wincing. But saying “Oops” in the future also means that, in the future, you’ll acquire new Jedi powers that your present self doesn’t dream exist. It makes you feel embarrassed, but also alive. Realizing that your younger self was a complete moron means that even though you’re already in your twenties, you haven’t yet gone over your peak. So here’s to hoping that my future self realizes I’m a drooling imbecile: I may plan to solve my problems with my present abilities, but extra Jedi powers sure would come in handy.
That scream of horror and embarrassment is the sound that rationalists make when they level up. Sometimes I worry that I’m not leveling up as fast as I used to, and I don’t know if it’s because I’m finally getting the hang of things, or because the neurons in my brain are slowly dying.