Reading Yudkowsky, part 28

by Luke Muehlhauser on April 12, 2011 in Eliezer Yudkowsky, Resources, Reviews

AI researcher Eliezer Yudkowsky is something of an expert at human rationality, and at teaching it to others. His hundreds of posts at Less Wrong are a treasure trove for those who want to improve their own rationality. As such, I’m reading all of them, chronologically.

I suspect some of my readers want to “level up” their rationality, too. So I’m keeping a diary of my Yudkowsky reading. Feel free to follow along.

His 217th post is Absolute Authority:

The one comes to you and loftily says:  “Science doesn’t really know anything.  All you have are theories – you can’t know for certain that you’re right.  You scientists changed your minds about how gravity works – who’s to say that tomorrow you won’t change your minds about evolution?”

Behold the abyssal cultural gap.  If you think you can cross it in a few sentences, you are bound to be sorely disappointed.

In the world of the unenlightened ones, there is authority and un-authority.  What can be trusted, can be trusted; what cannot be trusted, you may as well throw away.  There are good sources of information and bad sources of information.  If scientists have changed their stories ever in their history, then science cannot be a true Authority, and can never again be trusted – like a witness caught in a contradiction, or like an employee found stealing from the till.

Plus, the one takes for granted that a proponent of an idea is expected to defend it against every possible counterargument and confess nothing.  All claims are discounted accordingly.  If even the proponent of science admits that science is less than perfect, why, it must be pretty much worthless.

When someone has lived their life accustomed to certainty, you can’t just say to them, “Science is probabilistic, just like all other knowledge.”  They will accept the first half of the statement as a confession of guilt; and dismiss the second half as a flailing attempt to accuse everyone else to avoid judgment.

What can you say to reach such a person? Eliezer offers some suggestions.

  • “The power of science comes from having the ability to change our minds and admit we’re wrong.  If you’ve never admitted you’re wrong, it doesn’t mean you’ve made fewer mistakes.”
  • “Anyone can say they’re absolutely certain.  It’s a bit harder to never, ever make any mistakes.  Scientists understand the difference, so they don’t say they’re absolutely certain.  That’s all.  It doesn’t mean that they have any specific reason to doubt a theory – absolutely every scrap of evidence can be going the same way, all the stars and planets lined up like dominos in support of a single hypothesis, and the scientists still won’t say they’re absolutely sure, because they’ve just got higher standards.  It doesn’t mean scientists are less entitled to certainty than, say, the politicians who always seem so sure of everything.”
  • “Scientists don’t use the phrase ‘not absolutely certain’ the way you’re used to from regular conversation.  I mean, suppose you went to the doctor, and got a blood test, and the doctor came back and said, ‘We ran some tests, and it’s not absolutely certain that you’re not made out of cheese, and there’s a non-zero chance that twenty fairies made out of sentient chocolate are singing the ‘I love you’ song from Barney inside your lower intestine.’  Run for the hills, your doctor needs a doctor.  When a scientist says the same thing, it means that he thinks the probability is so tiny that you couldn’t see it with an electron microscope, but he’s willing to see the evidence in the extremely unlikely event that you have it.”
  • “Would you be willing to change your mind about the things you call ‘certain’ if you saw enough evidence?  I mean, suppose that God himself descended from the clouds and told you that your whole religion was true except for the Virgin Birth.  If that would change your mind, you can’t say you’re absolutely certain of the Virgin Birth.  For technical reasons of probability theory, if it’s theoretically possible for you to change your mind about something, it can’t have a probability exactly equal to one.  The uncertainty might be smaller than a dust speck, but it has to be there.  And if you wouldn’t change your mind even if God told you otherwise, then you have a problem with refusing to admit you’re wrong that transcends anything a mortal like me can say to you, I guess.”
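
A quick aside on the “technical reasons of probability theory” in that last suggestion: under Bayes’ rule, a probability of exactly 1 (or 0) can never be moved by any evidence whatsoever, which is why a mind that could conceivably update can’t really be at 1. Here is a minimal sketch in Python – the function and the example numbers are mine, not Yudkowsky’s:

```python
def update(prior, likelihood_if_true, likelihood_if_false):
    """Bayes' rule: P(H|E) = P(E|H)P(H) / [P(E|H)P(H) + P(E|~H)P(~H)]."""
    numerator = likelihood_if_true * prior
    return numerator / (numerator + likelihood_if_false * (1 - prior))

# Evidence a thousand times more likely under ~H cannot budge a prior of exactly 1:
print(update(1.0, 0.001, 0.999))       # -> 1.0, no matter how often you repeat it
# ...whereas a merely very confident prior does respond to the same evidence:
print(update(0.999999, 0.001, 0.999))  # -> ~0.999, and it keeps falling with more evidence
```

If some conceivable evidence would change your mind, your prior was never exactly 1 to begin with.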

Infinite Certainty asks: Can we have absolute certainty of mathematical truths?

We must distinguish between the map and the territory.  Given the seeming absolute stability and universality of physical laws, it’s possible that never, in the whole history of the universe, has any particle exceeded the local lightspeed limit.  That is, the lightspeed limit may be, not just true 99% of the time, or 99.9999% of the time, or (1 – 1/googolplex) of the time, but simply always and absolutely true.

But whether we can ever have absolute confidence in the lightspeed limit is a whole ‘nother question.  The map is not the territory.

It may be entirely and wholly true that a student plagiarized their assignment, but whether you have any knowledge of this fact at all – let alone absolute confidence in the belief – is a separate issue.  If you flip a coin and then don’t look at it, it may be completely true that the coin is showing heads, and you may be completely unsure of whether the coin is showing heads or tails.  A degree of uncertainty is not the same as a degree of truth or a frequency of occurrence.

The same holds for mathematical truths.  It’s questionable whether the statement “2 + 2 = 4” or “In Peano arithmetic, SS0 + SS0 = SSSS0” can be said to be true in any purely abstract sense, apart from physical systems that seem to behave in ways similar to the Peano axioms.  Having said this, I will charge right ahead and guess that, in whatever sense “2 + 2 = 4” is true at all, it is always and precisely true, not just roughly true (“2 + 2 actually equals 4.0000004”) or true 999,999,999,999 times out of 1,000,000,000,000.

I’m not totally sure what “true” should mean in this case, but I stand by my guess.  The credibility of “2 + 2 = 4 is always true” far exceeds the credibility of any particular philosophical position on what “true”, “always”, or “is” means in the statement above.

This doesn’t mean, though, that I have absolute confidence that 2 + 2 = 4.  See the previous discussion on how to convince me that 2 + 2 = 3, which could be done using much the same sort of evidence that convinced me that 2 + 2 = 4 in the first place.  I could have hallucinated all that previous evidence, or I could be misremembering it.  In the annals of neurology there are stranger brain dysfunctions than this.

Suppose you say that you’re 99.99% confident that 2 + 2 = 4.  Then you have just asserted that you could make 10,000 independent statements, in which you repose equal confidence, and be wrong, on average, around once.  Maybe for 2 + 2 = 4 this extraordinary degree of confidence would be possible: “2 + 2 = 4” is extremely simple, and mathematical as well as empirical, and widely believed socially (not with passionate affirmation but just quietly taken for granted).  So maybe you really could get up to 99.99% confidence on this one.

I don’t think you could get up to 99.99% confidence for assertions like “53 is a prime number”.  Yes, it seems likely, but by the time you tried to set up protocols that would let you assert 10,000 independent statements of this sort – that is, not just a set of statements about prime numbers, but a new protocol each time – you would fail more than once…

As for the notion that you could get up to 100% confidence in a mathematical proposition – well, really now!  If you say 99.9999% confidence, you’re implying that you could make one million equally fraught statements, one after the other, and be wrong, on average, about once.  That’s around a solid year’s worth of talking, if you can make one assertion every 20 seconds and you talk for 16 hours a day.

Assert 99.9999999999% confidence, and you’re taking it up to a trillion.  Now you’re going to talk for a hundred human lifetimes, and not be wrong even once?
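
The arithmetic behind these calibration claims is worth spelling out: a calibrated confidence of p means that among N statements held with equal confidence you expect about N × (1 − p) errors, and at one assertion every 20 seconds for 16 hours a day you get 2,880 assertions per day. A back-of-the-envelope check in Python, using Yudkowsky’s own figures:

```python
SECONDS_PER_ASSERTION = 20
HOURS_TALKING_PER_DAY = 16
assertions_per_day = HOURS_TALKING_PER_DAY * 3600 // SECONDS_PER_ASSERTION  # 2880

def days_of_talking(n_statements):
    """Days needed to assert n equally confident statements aloud."""
    return n_statements / assertions_per_day

print(days_of_talking(10**6))          # ~347 days: about a solid year for 99.9999%
print(days_of_talking(10**12) / 365)   # ~950,000 years of talking for 99.9999999999%
```

(If anything, the trillion-statement case comes out to far more than a hundred lifetimes of talking, which only strengthens the point.)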

Also see 0 and 1 Are Not Probabilities. Beautiful Math makes a short point about math, followed up by Expecting Beauty:

Looking for mathematical beauty you haven’t found yet, is not so sure as expecting the Sun to rise in the east.  But neither does it seem like the same shade of uncertainty as expecting a purple polka-dot fairy – not after you ponder the last fifty-seven thousand cases where humanity found hidden order.

And yet in mathematics the premises and axioms are closed systems – can we expect the messy real world to reveal hidden beauty?

The answer comes in Is Reality Ugly?

Beneath the complex forms and shapes of the surface world, there is a simple level, an exact and stable level, whose laws we name “physics”.  This discovery, the Great Surprise, has already taken place at our point in human history – but it does not do to forget that it was surprising.  Once upon a time, people went in search of underlying beauty, with no guarantee of finding it; and once upon a time, they found it; and now it is a known thing, and taken for granted.

Then why can’t we predict the location of every tiger in the bushes as easily as we predict the sixth cube?

I count three sources of uncertainty even within worlds of pure math – two obvious sources, and one not so obvious.

The first source of uncertainty is that even a creature of pure math, living embedded in a world of pure math, may not know the math.   Humans walked the Earth long before Galileo/Newton/Einstein discovered the law of gravity that prevents us from being flung off into space.  You can be governed by stable fundamental rules without knowing them.  There is no law of physics which says that laws of physics must be explicitly represented, as knowledge, in brains that run under them.

…The second obvious source of uncertainty is that even when you know all the relevant laws of physics, you may not have enough computing power to extrapolate them.

…The third source of uncertainty is… [that] to figure out what the night sky should look like, it’s not enough to know the laws of physics.  It’s not even enough to have logical omniscience over their consequences.  You have to know where you are in the universe.  You have to know that you’re looking up at the night sky from Earth. The information required is not just the information to locate Earth in the visible universe, but in the entire universe, including all the parts that our telescopes can’t see because they are too distant…

But uncertainty exists in the map, not in the territory.  If we are ignorant of a phenomenon, that is a fact about our state of mind, not a fact about the phenomenon itself.  Empirical uncertainty, logical uncertainty, and indexical uncertainty are just names for our own bewilderment.  The best current guess is that the world is math and the math is perfectly regular.  The messiness is only in the eye of the beholder.

Beautiful Probability, then, asks:

Should we expect rationality to be, on some level, simple?  Should we search and hope for underlying beauty in the arts of belief and choice?

Bayesians like Yudkowsky see probability theory as a self-consistent set of theorems, but Old School rationalists use a variety of not-necessarily-consistent tools to get at the truth. Isn’t that required by the messiness of life?

…should rationality be math?  It is by no means a foregone conclusion that probability should be pretty.  The real world is messy – so shouldn’t you need messy reasoning to handle it?  Maybe the non-Bayesian statisticians, with their vast collection of ad-hoc methods and ad-hoc justifications, are strictly more competent because they have a strictly larger toolbox.  It’s nice when problems are clean, but they usually aren’t, and you have to live with that.

After all, it’s a well-known fact that you can’t use Bayesian methods on many problems because the Bayesian calculation is computationally intractable.  So why not let many flowers bloom?  Why not have more than one tool in your toolbox?

That’s the fundamental difference in mindset.  Old School statisticians thought in terms of tools, tricks to throw at particular problems.  Bayesians – at least this Bayesian, though I don’t think I’m speaking only for myself – we think in terms of laws…

No, you can’t always do the exact Bayesian calculation for a problem.  Sometimes you must seek an approximation; often, indeed.  This doesn’t mean that probability theory has ceased to apply, any more than your inability to calculate the aerodynamics of a 747 on an atom-by-atom basis implies that the 747 is not made out of atoms.  Whatever approximation you use, it works to the extent that it approximates the ideal Bayesian calculation – and fails to the extent that it departs.

Bayesianism’s coherence and uniqueness proofs cut both ways.  Just as any calculation that obeys Cox’s coherency axioms (or any of the many reformulations and generalizations) must map onto probabilities, so too, anything that is not Bayesian must fail one of the coherency tests.  This, in turn, opens you to punishments like Dutch-booking (accepting combinations of bets that are sure losses, or rejecting combinations of bets that are sure gains).

You may not be able to compute the optimal answer.  But whatever approximation you use, both its failures and successes will be explainable in terms of Bayesian probability theory.  You may not know the explanation; that does not mean no explanation exists.

We aren’t enchanted by Bayesian methods merely because they’re beautiful.  The beauty is a side effect.  Bayesian theorems are elegant, coherent, optimal, and provably unique because they are laws.
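
The Dutch-booking punishment mentioned above is easy to make concrete: if your betting prices imply probabilities that violate the coherence axioms – say, P(A) + P(¬A) > 1 – a bookie can sell you a combination of bets that loses money no matter what happens. A minimal sketch, with prices invented purely for illustration:

```python
# You price a ticket paying $1 if A at 60 cents, and a ticket paying $1 if not-A
# at 50 cents: implied probabilities 0.6 + 0.5 = 1.1, which violates coherence.
price_A, price_not_A = 0.60, 0.50

cost = price_A + price_not_A   # you pay $1.10 for both tickets
payout = 1.00                  # exactly one of A / not-A occurs, so one ticket pays $1
print(cost - payout)           # 0.10: a guaranteed 10-cent loss, whatever the outcome
```

Coherent probabilities – ones summing to 1 over exclusive, exhaustive outcomes – are precisely the prices that cannot be exploited this way.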

The next post discusses Trust in Math – or, dare I say, faith in math?

Comments

Esteban R. (Formerly Steven R.) April 12, 2011 at 3:42 pm

I think Yudkowsky can add another good rhetorical retort to his “Absolute Authority” article:

“The only thing that has replaced science is new, more accurate information that improves upon our previous knowledge and better explains the new information. In other words, the only thing that has ever outdone and disproved science is even more advanced science. Therefore, we conclude that God (or any other unquestionable authority) plays no logical, rational or imminent role in the development and education of human beings and can be successfully discarded as a needless idea that hinders more practical reasoning.”

Rufus April 12, 2011 at 10:11 pm

“The same holds for mathematical truths. It’s questionable whether the statement ‘2 + 2 = 4’ or ‘In Peano arithmetic, SS0 + SS0 = SSSS0’ can be said to be true in any purely abstract sense, apart from physical systems that seem to behave in ways similar to the Peano axioms. Having said this, I will charge right ahead and guess that, in whatever sense ‘2 + 2 = 4’ is true at all, it is always and precisely true, not just roughly true (‘2 + 2 actually equals 4.0000004’) or true 999,999,999,999 times out of 1,000,000,000,000.”

It seems that Yudkowsky is suggesting that arithmetic equations can only be known to be true a posteriori. This is a somewhat controversial claim, no? What is Yudkowsky’s argument for arithmetic knowledge being an instance of a posteriori knowledge?

Here is my problem: if it is possible for a future experience to discount or disprove something like 2+2=4, then how could we possibly make sense of saying that you know 2+2=4 with 99% confidence. If such a calculation were available to provide epistemic confidence as a percentage, certainly this calculation could stand or fall based on more fundamental presumptions in arithmetic. In other words, “99% confidence” may only be intelligible within the framework in which 2+2=4. I suspect that his entire probabilistic calculus would be transformed by such counter-examples in which 2+2=5, or -45, or pi.

In other words, he would have to say it is 99% likely that “99% likelihood” is actually some value other than we currently understand it to be. I am not sure how to make sense of something like that. There seems to be some sort of self-referential paradox that creeps in if we base our certitude in mathematical knowledge on Bayesian-like analyses.

Perhaps I have misunderstood EY. I hope I am expressing my problem clearly, but I would be interested in what others think about this.

Best,

Rufus

Esteban R. (Formerly Steven R.) April 12, 2011 at 11:25 pm

Rufus, I can’t respond to that (try posting it on Less Wrong – I’d like to see Yudkowsky answer this), but I can say that what you said made perfect sense. Sorry if that isn’t much help :\

mister k April 13, 2011 at 2:48 am

I disagree with EY when he talks about probabilities never being 0 or 1. It first of all seems to indicate some ignorance of probability theory (events with probability 0 happen all the time), but I can also cheat to get those probabilities. Let’s suppose I assign a probability of p to an event A occurring, so the probability of not A occurring is 1-p. So therefore the probability of A or not A occurring is 1!

He also is a bit obsessed with pointing out the difference between Bayesians and frequentists, which, while existent, is not as big as he thinks. It’s not that the tools frequentists use are wrong, but that they can be very easily misused. They are the most common tools used, and so we get confused by confidence intervals and p-values. The notion that if we lived in a Bayesian paradigm practitioners wouldn’t be abusing Bayesian methods is clearly wrong – I’ve already observed such things occurring in practice.

mister k April 13, 2011 at 2:49 am

Also, under the definition given for prime numbers, and assuming that my observations of reality are accurate, 53 is prime. It will be until someone defines prime numbers, or numbers, or division differently.

Yair April 13, 2011 at 8:08 am

I find that the argument from the beauty of the laws of nature is theism’s best argument. Though I find the theistic explanation childish, I have not found a truly satisfying atheist explanation, and I’m sorry to say EY hasn’t provided one either. I was hoping he’d give me some idea I hadn’t thought of, but he didn’t.

Since I’m commenting on this, I’d also mention that as I understand it, the idea on math is not that math can be wrong but rather that we’re mistaken in our utterances on it. To use EY’s own terms, at best he can say we’re wrong about the “map” of mathematics, about the utterances we make about math, but not about the math itself – which is, after all, rigorously deduced. While that is possible, the probability that we’re wrong in our deduction, memory, or so on when we write down “2+2=4” is far, far less than 0.001. Furthermore, by construction EY cannot make a similar claim regarding the more basic logical inference rules, as his Bayesian rationality assumes them – if logic is uncertain, the very talk of probability (uncertainty) becomes meaningless. I’m not sure, but I suspect this implies that mathematical proofs you can grasp wholly and at once, what Descartes would have called “clear and immediate” truths, are also immune to this kind of doubt (which Descartes did call by another name, long before EY came onto the scene…). One must assume these things (a sufficiently broad intellectual basis to stand on) in order to get the whole rational project going, and one’s assumptions are always beyond doubt from within the system built on them.

Yair

stag April 15, 2011 at 2:09 am

I like your objection, Rufus. Classic.
I would challenge anyone to offer a refutation.

The only way out I can think of is in the direction of mathematical metalanguage. But this only postpones the problem ad infinitum.

Rufus April 15, 2011 at 7:57 am

Quoting myself from above: “In other words, he would have to say it is 99% likely that ‘99% likelihood’ is actually some value other than we currently understand it to be.”

I should have said that it’s 99% likely that “99% likelihood” is the percentage value we understand it to be. The point is the same though.

Stag, I would agree, though I know very little about mathematical metalanguage. One solution that I think EY could exploit is to say something like “I am 99% confident that 2+2=4, where 99% means such-and-such at t=0.” Then if at some other time 99% turns out to mean something else, perhaps it would be possible to translate 99% @ t=0 into a new expression.

However, that assumes that the new arithmetic system is translatable from our former mistaken knowledge of it at t=0.

I think the problem is more serious for EY. In assuming that all knowledge is probabilistic, EY is susceptible to the following problem:

The claim “All knowledge is probabilistic” is either a priori certain or probabilistically true. It cannot be a priori certain, since that would be self-defeating. We are happy, then, to say that it is only probably true that all knowledge is probabilistic. But does that work? Let’s make up the fact that it’s 99% probable that all knowledge is probabilistic. That means that it’s 1% probable that the claim “all knowledge is probabilistic” is non-probabilistic, or a priori certain. But we already said that this is a self-defeating position, i.e. it is logically incoherent. So we are left saying that there is a 99% probability that the claim “all knowledge is probabilistic” is true and a 1% probability that the claim is self-defeating. Since the claim can’t be self-defeating and still be a meaningful claim, we are stuck saying that if it is a meaningful claim at all, it is certain that all knowledge is probabilistic, which is self-defeating once again. This means that either way, the claim is self-defeating.

I am not sure how to escape this, unless I have created a false dichotomy between a priori certainty and probabilistic knowledge. If we propose a third kind of non-probabilistic knowledge, I think we may stumble into similar problems though…

Best,

Rufus

stag April 22, 2011 at 8:46 pm

Yeah, I think it is a similar problem, of the kind many reductionist theories fall into when they are asked to solidly ground their own epistemological status. Determinism, for example. Or logical positivism. You either have to take some arbitrary statement as an axiom with no foundation (“all knowledge is probabilistic”, “free will does not exist”), or justify the statement at the meta-linguistic level – but as your dilemma shows, any such meta-language can never get beyond the precise limits set by the initial statement without contradiction. Hence “all knowledge is probabilistic” will be probabilistic, as will “‘all knowledge is probabilistic’ is probabilistic” and so on. Only a behaviouristic foundation can be given to such axioms.

With determinism, as far as I can see, the same thing happens. “There is no free will”, as a speech act, is not free. Neither is “‘There is no free will’ is not freely affirmed” and so on. Jumping up to second order and third order just postpones the arbitrary adoption of an axiom. The whole lot is just ‘stuff that happens’ – including my knowledge of that very ‘fact’.

It has something of Gödel’s incompleteness theorem about it. You try to demonstrate second-order completeness and you inevitably fall into contradiction…
