CPBD 061: Paul Almond – Why God is a Terrible Explanation for Anything

by Luke Muehlhauser on August 22, 2010 in Podcast

(Listen to other episodes of Conversations from the Pale Blue Dot here.)

Today I interview Paul Almond. Among other things, we discuss:

  • God as an explanation
  • Faith and the problem of induction
  • Is God necessary for morality?
  • Could God be beyond logic?

Download CPBD episode 061 with Paul Almond. Total time is 44:03.

Note: in addition to the regular blog feed, there is also a podcast-only feed. You can also subscribe on iTunes.

Transcript

(Transcript prepared by CastingWords via two anonymous donors. If you’d like to pay for transcripts of past or future episodes, please contact me.)

Luke Muehlhauser: Paul Almond is a mysterious figure. He works in business but publishes articles on cognition, artificial intelligence, and philosophy of religion on the side. He is known for his many in-depth articles at paul-almond.com. Paul, welcome to the show!

Paul Almond: Thank you. I’m glad to be talking to you.

Luke: Well, Paul, much of your work has focused on simplicity and theory selection. How can we tell if something is really simple, and how can we go about choosing the best theory among several alternatives, do you think?

Paul: Well, for a start, ideas such as simple or complex can have a number of different meanings. In this particular context, when we are talking about theory selection, a simple theory would be one which doesn’t contain much information. And what we’ve got to start with is what we expect of a theory or an explanation. And I think some people, particularly the religious people, they have the wrong expectations. They think in terms of some ultimate explanation. And, really, you can never have that because if you explain something in terms of this, you’ve got to explain this in terms of that. You never get anything which doesn’t need some sort of explanation.

All an explanation is, really, is a model of reality which gives you the ability to make predictions while containing less information than the previous level of explanation.

I’ll give you an example: gravity. Just prior to Newton, gravity wasn’t really understood. They had two separate models for two phenomena. They had a model of objects falling near the Earth, the observation that objects accelerate as they move downwards, and they had a model of how the planets of the solar system move around.

Now, they could use both of these models for making predictions. What Newton did, is he took both of those models, and he replaced them by a single model, the theory of gravity, which explains why objects fall down and explains why the planets orbit in the way that they do.

Now, the point of this is, he didn’t give an ultimate explanation. There’s no “why” to this. There’s no final answer. What Newton did really was he reduced the amount of information we need in our model of reality. He improved our prediction mechanism, and he made it more efficient and more reliable. And that is all you can ask of a theory.

Theories are there to lead to predictions. Explanations are simply theories which have been improved a bit. An explanation is simply a more economical thing than the thing it replaces, and that is all we ask of it. A theory should ideally be expressed formally, and, for any computer specialists amongst your listeners, you could say that it comes down to the number of information bits in the theory.

So we might think of a theory in terms of some sort of computer software paradigm. Does that make sense?

Luke: Yeah. I wonder if you could give another example of how our successful theories have been ones that contain less information than our previous way of understanding certain phenomena.

Paul: OK, yeah. Chemistry is a classic example. Prior to the current understanding of chemistry in terms of atoms and molecules, you’ve got all these thousands and thousands of different substances. You’ve got wood, you’ve got glass, you’ve got all the metals. And all the world is, is a million and one different special cases. If you don’t have a theory of chemistry for you to describe the world, there’s no kind of cohesive nature to it. It is just a big long list. When you’ve got a theory of chemistry, all that immediately starts to collapse a bit. You’ve still got all of this stuff. You’ve still got richness on top with all the different substances, but underneath that is the explanation.

The explanation is an explanation because it gives rise to all that, but it is a more efficient model. You’ve got a periodic table, you’ve got some physics associated with it. It might be complicated from a human point of view. The sort of complexity and the simplicity we’re talking about here has nothing at all to do with psychology. It is to do with information content. Be clear that the amount of information content in the basic principles of chemistry is much less than the amount of information content you would need to describe pretty much everything on the planet.
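Paul’s “computer software paradigm” can be made concrete with a toy sketch. This is only an illustration, not anything from the conversation: it treats a theory as a short rule that regenerates a long table of observations, so the rule carries far less information than the list of special cases it replaces. The free-fall formula and the constant g are standard physics; the framing as code is an assumption of mine.

```python
# Toy illustration of a theory as compression: a short rule
# regenerates a long lookup table of "observations".

# The pre-theory model: one stored number per observation
# (distance fallen from rest, in metres, after t seconds).
table = {t: 0.5 * 9.8 * t**2 for t in range(100)}

# The Newton-style model: a one-line rule with a single constant.
def distance(t, g=9.8):
    """Distance fallen from rest after t seconds."""
    return 0.5 * g * t**2

# The rule reproduces every entry, so the table's information is
# redundant once you have the rule plus one constant.
assert all(table[t] == distance(t) for t in table)
print(distance(2))  # metres fallen after two seconds
```

The point of the sketch is that the hundred stored numbers add no predictive power over the one-line rule; in Paul’s terms, the rule is the better explanation because it is the smaller description.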

Luke: Right, yes. So let’s apply that type of thinking to theism. Many theologians are fond of saying that God is conceived of as a very simple being, and that the “God did it” theory is a good explanation for certain things like fine-tuning or biological complexity or consciousness or whatever. What do you think of that claim about using God as an explanation and a simple explanation at that?

Paul: I think what you have there is a total train wreck. That is my opinion of theistic explanations. For a start, when they say “God is the simplest explanation,” who says that? Suppose instead of Newton, someone had come along and said, “I think that your inverse square law of gravity is much too complicated. I’m going to explain the motion of the planets in terms of woo-woo.” And you say, “What’s woo-woo?” And they say, “The woo-woo is the thing which makes the planets move. It’s the single perfect entity which makes them go round in their orbits. And what can be simpler than woo-woo?”

Now, what you might see there is I’ve pulled a fast one because I’ve cheated. I haven’t actually given you the theory. All I’ve done is given you a word. Anybody can say that a word is simple; anyone can say that a concept is simple if they don’t have to explain the details. The thing is when I dropped my woo-woo theory in there, it didn’t have any predictive power.

I said earlier that the idea of a theory is a model used to explain and predict things. Well, to predict things, really. And that’s all they can really do. You can never have an ultimate explanation. So all a theory is good for is making predictions. A God theory isn’t going to make predictions unless you specify it with a lot more information.

So you can either specify it vaguely, and just say “God did it,” or you can try to turn it into some sort of predictive model. Let’s look at what predictive model you’ll have. God is a person. What we have is the insanity of using a psychology-based model.

Brains are about the most complex thing in the known universe. The only psychology we’ve ever seen is in brains, and it is by far the most complex thing we can think of. And you’re dropping that into your bottom-level description of nature, where the last thing that you want is complexity; where you should be wanting simplicity. It is an absolute nonstarter.

Luke: What I hear you saying is that either the theologian is going to say that God is very simple and God did it and that’s all the explanation there is. But in that case, that’s really just a woo-woo explanation. It’s a word, and there’s no predictive power to the explanation at all. It’s not even worth calling it an explanation. Or they could actually turn the God hypothesis into a predictive model and make it into an actual explanation. But the only conceivable way they would do that, because God is thought of as a person and not a mechanical force, is to talk about God’s beliefs and desires and so on. And so this happens because God desired it and he willed to do it and that kind of thing.

But then you’re invoking psychology, and the only psychology we know of is just about the most complex thing in the universe. So that really makes things so much worse. You’re offering an explanation that is more complicated than that which you’re trying to explain, which doesn’t really work.

Paul: Yep, I would go along with that. Now, they could argue with this. They could say that I’m trying to impose some kind of computational paradigm on God. An obvious thing is to say, “Aren’t you trying to think of God as some sort of computer program running on a substrate, which needs a lot of information to write?” That’s an obvious thing they’re going to say. The point here is I’m not saying that at all. Here, I’m not criticizing God. God can do what he wants and you’ll never know that he exists. I’m criticizing the explanation. The explanation is a psychological one. The explanation would require an enormous amount of information as a predictive model and I’m not assuming that God has to work like a computer program.

But what I’m saying is the explanation has to be formally expressed to be any good, and a formally expressed God explanation is… The way you described it was very good: it’s got to just be a complete monster of inefficiency.

Luke: [laughs] A monster of inefficiency.

Paul: Let’s look at how monstrous it would be. Suppose you had a computer and you said to your computer, “Act like a person,” and you’d have to describe to your computer how a person works. That is going to be one big description.

Luke: Yeah, that’s going to be a computer program with trillions of lines of code.

Paul: Mm-hmm.

Luke: Yeah. So your point here really isn’t that God himself must necessarily be complex in some way, but that the God explanation on offer, if you put it formally, just becomes hugely complex. That’s where the real problem with offering God as an explanation comes from.

Paul: Exactly! And the theists could counter that by saying, “Well, God is beyond formal explanation.” Now, if they want to do that, fine. But they are then in the position of someone who’s brought a pair of skis to a football match. They wanted to play this game. They wanted to play the theory selection game and try to win it with their theory. If they want to win it with their theory, their theory should be able to survive some sort of formal theory selection process. If you want to bring along a theory and then say you don’t want to play, that it can’t be tested like a theory, that it can’t even be expressed like a theory, you are doing something quite ridiculous.

One point I would make as well on this: I’m not saying that any explanation involving psychology and intelligence is wrong. Right now, you’re talking to me on the phone, and you’re explaining the sounds you hear in terms of psychology. You think there’s a person you’re talking to, and you happen to be right. The difference here is, that isn’t intrinsic information. You know that I can be accounted for by physics, by the theory of evolution.

The assumptions you have to make to believe that I’m in the world, that other people are in the world, are not the same. The problem with God is that all of that is assumed at the bottom level of reality. You’ve got all of this psychology, all of this complexity, with no hope of anything simpler underpinning it.

It’s this whole non-contingency, this fundamental, basic nature. All of the things theists claim of God, that there is no base to him, nothing underpinning him, are what fatally kill him off, because only something simple underpinning him would give you any chance at all of saying that there could be some simplicity behind this which might give it merit.

Luke: So this is actually, in a way, much, much worse than talking about explanations in terms of human psychology. Because at least with human psychology, we’re starting to get some kind of hint at the story of how human psychology arises from much simpler mechanical and well-understood processes, whereas the theist is positing some kind of divine psychology as the most fundamental thing in the universe with nothing behind… There’s no God atoms supposedly, according to theists, that make up God’s mind. So this seems just even worse than human psychology, as far as I can tell, for an explanation.

Paul: Exactly. In fact, if you think about it, it’s worse than anything you could possibly imagine, really. I mean, let’s throw out some examples. Aliens. Theists often get upset that a lot of atheists think aliens might exist. Alien psychology could be accounted for by the theory of evolution, which is a very, very efficient way of explaining things. And in fact one of the ingenious things about the theory of evolution, and I think the most important thing, is that it allows us to accept the presence of other minds, other psychology, the potential for them in the universe, without having to assume this enormous baggage of intrinsic information.

That only leaves you with God as an issue. It deals with everyone else’s mind. There’s no way around it. It really does make everything these people say a non-starter.

Luke: Yeah. So you were starting to compare a God-based explanation to, say, an alien-based explanation. And one of the reasons that an alien-based explanation for — let’s say the pyramids, or consciousness, or whatever. Let’s say the explanation for those things was that aliens did it. Now, with our current information that’s an extremely implausible explanation. But at least there the information content of that theory could be much lower than the information content of a necessarily predictive God hypothesis, because we have some ideas of how aliens could exist by way of more simple processes, like evolution and chemistry and physics. Whereas the God hypothesis is just starting out at base as the most complex, information-rich hypothesis you could ever imagine.

Paul: That is exactly correct. I couldn’t have put it better, and I agree with you. And so that no listener misunderstands this: I am not trying to push things like aliens as an explanation for anything. As you say, the idea that the pyramids were made by aliens is implausible. But even though, to me, that is an implausible, ridiculous idea, it is nowhere near as ridiculous as the God concept. The God concept is so ridiculous that it’s more ridiculous than all the other ridiculous ideas, because of its huge amount of information content.

Luke: Is it really fair to criticize the theists for coming to a football match with skis in the sense that aren’t there other accepted ways of formulating theories and then comparing them on something other than information content? I mean, aren’t there systems of evaluating theories that are used by scientists or philosophers of science that the theists could appeal to?

Paul: That’s a fair comment. And in fact, I’ve got to be honest with you: the sorts of methods I’m talking about are rarely used in practice, because they are not really easy to apply. Most of the time theory selection is done by an intuitive process. When Einstein came up with general relativity, he didn’t go home, write the theory down, turn it into bits of computer code, count them, and see how many there were. He knew he was right. It was an intuitive process.

Now what I’m saying is, when Einstein did that, he would have had some intuitive grasp that he had somehow reduced the information content of science. I’m not saying he knew that specifically. I’m not saying he consciously knew that. But part of the way the human brain works when it goes for patterns – we actually do this in daily life. We try to reduce the information content we need to explain things.

And all I’m saying is this idea of information content really has to underpin everything. In other words, if you propose a theory to me, I might not be doing some formal process of counting all the information in it, but I should at least intuitively have some idea of what sorts of results I would get if I did. And that is what we’ve been talking about.

And this basic process, I really do think it has to come down to that. And I can justify it. The reason I can justify it is that it’s about specificity. We don’t know what sort of universe we’re in, but let’s say that there’s a whole range of possible worlds we could inhabit which are consistent with what we know.

We could live in a world where the sun’s going to rise tomorrow as it does now. We could live in a world where the sun’s going to explode tomorrow. We could live in a world where the pixies are going to land outside of our house. All this kind of stuff.

Now, we want to have some sort of expectation about what’s going to happen tomorrow. We have to make some model. If we come up with a model of reality which has a lot of information in it, a much smaller proportion of all the possible worlds we could inhabit will actually comply with that model.

If we come up with a model which has a smaller amount of information, the chances, the proportion of possible worlds which we could inhabit which follow that rule, which comply with that, are going to be larger.

So when you make your model very, very big, you are getting specific, and it is much less likely that the reality that you live in – out of all the possible realities – actually agrees with you.

And that’s why it really has to come down to information content. Every bit of information you add to your theory makes it less likely that reality is going to actually obey it.

Luke: Now do you think that it’s wise to express Occam’s Razor in terms of information content?

Paul: Yes, I do. I want to be clear as well on this. This is not necessarily what Occam said. Occam came up with something which is possibly a bit vaguer. Occam says don’t multiply entities unnecessarily. Now I think the word “entity” is so vague. What’s an entity? It might be an object in a theory. But where does one entity end and another one start?

I would say the modern equivalent of that is that we could consider an entity to be simply a bit of information in the theory. And this actually makes sense. We should view theories in terms of information content. We might not always be able to formally apply that process – we’re going to be using human intuition a lot of the time – but that is what we should be looking for.

When we look at a theory, we should look at what it’s doing, what its predictive power is, and what its information content is relative to other theories which achieve the same ends.

Luke: Yeah. And we humans are limited beings, so we’re always taking shortcuts. We can’t calculate every single thing all the way to the millionth digit. But what you’re saying is that objectively it’s going to be the theory with the least information content that achieves the same predictive results that’s going to be the most probable in this space of possible worlds. And so what scientists and philosophers can do is try to guess as best they can, by looking at the theory, what the information content would turn out to be if we were actually able to do all these calculations.

Paul: Yes. And I think a lot of the time it is done simply by thinking, does it look right? A lot of the time, scientists have often said they look for beauty in a theory, they look for elegance. And I think that is human intuition, doing these similar sorts of things.

Luke: Moving on to another topic, some believers will bring up Hume’s Problem of Induction to say that science requires just as much faith as religion does. Could you tell us what was Hume’s Problem of Induction? And then what do you think of the theists’ claim that science requires just as much faith as religion does?

Paul: I think it is wrong, and I’ll try and give you an idea of why I think this. What people think of as Hume’s Problem of Induction is this: we see patterns in the past and then we expect those patterns to continue in the future. Hume’s Problem of Induction, I consider it in terms of a weak sense and a strong sense. The weak sense says that you might not be certain that the patterns are going to continue in the future, but you might be able to apply statistics.

Now if you apply Hume’s Problem of Induction in a strong sense, you might go even further – and this is what some theists do. What they say is that the past is no evidence at all for any sort of expectation about the future, or at least that you can’t philosophically justify such an expectation.

In other words, the sun has come up every day for a million years; is the sun going to come up tomorrow? You have no philosophical justification for actually saying that. However, we expect that to happen, and we expect a scientific law which has worked for a hundred years to carry on working tomorrow.

What the theist is saying is that this is faith-based, and thereby they’re trying to lower science to the same sort of level as religion. I think this is wrong, and some of what I’ve already said might have started to approach why. If we want to know what is going to happen, then we should consider these sets of possible worlds. We could call this a reference class. A reference class is just the set of all the possibilities in some statistics exercise.

What people are implicitly doing when they argue this is constructing the reference class bit by bit. In other words, you’ve got a world where the sun rises every day for a million years and then explodes, you’ve got a world where the sun rises every day for a million years and then changes into tea leaves, a world where it rises every day for a million years and then rises again normally, and so on.

What we’ve got to look at is how this reference class is constructed. If you’re going to construct a reference class by saying, here’s an instant in time, stick another instant in time onto it, stick on another instant, another one, and so on, and you’re going to do that for every sort of world, you are just building this set of possible worlds up, bit by bit.

You’re going to end up with these weird creative results where there’s no real reason to presume that the future is going to be anything like the past, because the way you’ve constructed your reference class is just by randomly sticking bits of worlds together.

What I’m saying here is that that is a flawed idea. What we should think of as the reference class is the formal description of the world as a single object. So really what we should do is look at the state of reality, look at the world together with all of its time, and say that the world and everything which happens in it, its entire history, has a description.

We would never get that description, it’s going to be inaccessible to us. But that description is going to be some kind of formal description containing information, and the reference class of possible worlds we could be living in should be made up like that.

Now the point of this is: a world where something crazy is going to happen is going to need a lot more information to describe. In other words, if you want to write down a description of a world where the sun rises every day for a million years and then rises again, that isn’t too bad.

If you want a world where the sun rises every day for a million years and then changes into a tree, if you want that sort of world, it is going to take a lot more information, because to make the crazy things happen, you’re going to have to start putting in lots of information to describe the crazy things.

To have something sensible happen over a period of time is going to be very, very efficient in terms of information. What I’m saying is, when you consider every possible world we could be living in, the ones that behave reasonably sensibly, the ones where the future is very similar to the past, the ones where the future is a sensible continuation of the past, are going to represent a much bigger set, because a much smaller amount of information is needed to describe them. So they’re going to be the much larger share of the possibilities.
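Paul’s claim that sensible histories need less information than crazy ones can be loosely illustrated with off-the-shelf compression. This is a rough stand-in only: zlib’s output length is a crude upper bound on description length, not a measure of true algorithmic complexity, and the event names here are made up for the sketch.

```python
import random
import zlib

# A regular history: the same event, every day, for a long run.
regular = b"sunrise\n" * 1000

# A "crazy" history: a patternless jumble of events, each of which
# has to be spelled out individually.
random.seed(0)
events = [random.choice(["sunrise", "explodes", "tea leaves", "tree"])
          for _ in range(1000)]
crazy = ("\n".join(events) + "\n").encode()

# Compressed size as a crude proxy for how much information each
# history's description needs.
size_regular = len(zlib.compress(regular))
size_crazy = len(zlib.compress(crazy))
print(size_regular, size_crazy)  # the regular history compresses far smaller
```

The regular history collapses to a few dozen bytes, while the jumbled one stays orders of magnitude larger: the crazy events are exactly the “lots of information” Paul says you have to keep putting in.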

In that sense, there’s no reason to take this problem of induction seriously at all. Simple statistical common sense would lead us to expect the future to be like the past.

Now the one thing the theists could criticize me on here is one decision I’ve made. The theists seem to be saying – they aren’t saying this explicitly, but they are implying it when they come out with this nonsense – that the reference class is simply the history of every possible world written down.

I’m saying the reference class is the formal description of everything that happens all the time in every possible world. And they could say I’m wrong to do that. They could say, for example, “How do you know that worlds have to follow these formal descriptions? How do you know that it has to go along like that?”

I’m not making any assumption at all. All I’m saying is that a formal description of reality is about the most basic thing you could have. You aren’t even assuming that reality is going to be sensible. You could have a formal description of reality which acts crazy. You could have a formal description of reality which acts completely randomly. The formal description would just be, do this crazy thing, do this crazy thing, do this crazy thing, and so on.

Luke: Yeah, you can write a formal description of the world in which the sun suddenly turns into a tree. You can do that.

Paul: Yep, you can do that. And if you keep going, you’re going to have a very, very long description of that reality. And it is very, very unlikely that you live in such a reality. It is much more likely that the sun is going to rise tomorrow, because less information is needed to make it do that, and therefore a bigger proportion of the possible worlds will behave that way. And it’s more likely that one of those happens to be the real one.

Luke: Right. So if we’re looking at the total space of all the ways that the universe could be, all the possible worlds that are out there, then the ones that have less information content are going to be a lot more probable. They’re going to take up a much larger space of the probability space of possible worlds. And so it’s far more likely that we’re in one of those worlds, rather than in one of these really, really complex worlds where there’s just a million lines of code, shall we say, that say do this crazy thing and then do that crazy thing next and then do that crazy thing next.

So when we’re looking at the space of possible worlds, it’s much more likely that we’re in a world that makes use of all of these regularities that would make us predict that the future will be pretty much like the past.

Paul: Yes, with one qualification. I’m not necessarily saying that the world has to be simple. I’m not saying that simple worlds are going to be more common than complex worlds. Our explanation of the world, our description of that, is different than the actual world. We might have a complex description of the world, and then that would mean that very few of the possible worlds would actually conform to that description. Now, we might have a simple description of the world, and that would mean that many more of the possible worlds conform with that description. Each of those worlds might be simple or complex.

So it’s not about whether the actual world itself is simple or complex. It’s about whether a description made by humans, which is only a partial description of a world that may be complex, is likely to match up with the actual real world.

There’s a slight difference here between the descriptions and the actual worlds. The description is only a small piece of the world. What I’m saying here is the real world is going to be all the information in our description and a load more information besides. So just because we’ve come up with a simple theory it doesn’t necessarily mean the world is simple.

Luke: So why is it that the worlds that can be described more simply take up a larger space in this probability space of possible worlds?

Paul: It is because the more information you put into a model, the more specificity you have, and the more chances reality has of not matching it. Let’s say that you have a theory and you describe it with three binary digits: one, zero and one. It means that for the world to match that theory, it’s got to agree with one, zero, one. Even if the world is just random, a mess, you’ve got a reasonable chance of that.

Now if you say the world has to match a theory which has one, zero, one, one, zero, one, every time you add a digit you are demanding far more of the world: each extra digit halves the chance that the world matches.
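Paul’s binary-digit example can be checked by brute force. A hypothetical sketch, not anything from the interview: treat each possible “world” as a ten-bit string and a theory as a constraint on its leading bits, then count the fraction of worlds that comply.

```python
from itertools import product

def fraction_matching(theory, world_len=10):
    """Fraction of all world_len-bit worlds whose leading bits agree with the theory."""
    worlds = list(product((0, 1), repeat=world_len))
    hits = sum(1 for w in worlds if list(w[:len(theory)]) == list(theory))
    return hits / len(worlds)

# Three digits (one, zero, one): one world in eight complies.
print(fraction_matching([1, 0, 1]))           # 0.125
# Six digits: each added digit halved the share, down to one in sixty-four.
print(fraction_matching([1, 0, 1, 1, 0, 1]))  # 0.015625
```

The halving with every added bit is exactly the specificity penalty Paul describes: a theory of k bits is satisfied by a 2^-k share of the possible worlds, regardless of how long the worlds themselves are.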

I’ll try and give an analogy in more human terms. Let’s suppose you were in a library, and that library contains every novel which could ever possibly exist. So the reference class here is every single novel which anybody could write is actually there.

Now you read part of a novel, and let’s say that the novel describes some events happening. You want to guess what happens next, but you’ve put the book back in the library, and you can’t actually see what’s next. So you want to guess what happened next in that novel that you’re reading. You are going to come up with some kind of description of what was happening in that novel.

If your description is actually a very unwieldy complex description, it is not likely that many novels in that library are going to match up with your description, because the more description you have, the more you are demanding of it. If your description is economical, then it’s much more likely that you’re going to find a lot of novels which actually match up with it.

It’s really a statistics game. It’s about specificity, and it’s about how the more you put into a model, then the less likely it is that the reality is going to match up with it. Because the reality’s got to match up with every single bit of your model. The reality’s got to match up with this bit, that bit, that bit, that bit, that bit. And the more bits you have, the less likely it is that the reality is going to play along.

Luke: So the response to the theist is, no, science isn’t depending on faith in the way that religion is. We actually have a good statistical reason to expect that the future will behave much like the past.

Paul: Yes, exactly.

Luke: Paul, another topic now. Many theists claim that God is somehow necessary for morality. What do you think of that idea?

Paul: I think it’s flawed. But then again, you knew I was going to say that.

Luke: I knew it!

Paul: OK. Firstly, all the morality which you see in a religion tends to be human morality which has been projected onto the religion in the first place. The best indication of this is when theists start to interpret the holy books, most holy books contain things which are objectionable by modern standards. And you get theists saying, well, that doesn’t apply now. A good example is, there’s a bit in the Bible where you’re supposed to stone disobedient children to death. Few Christians would actually do that. And that’s interesting because if God is an ultimate source of morality, why not do it?

Now, some Christians might say, well, hang on, God didn’t really write that. Not all Christians have a literal interpretation of the Bible. So some Christians might say, well, that isn’t really part of what God said.

Now if you’re doing that, if you’re picking and choosing like that, you are referring to a morality which is independent of the actual Bible. If you’ve got to refer to some sort of morality outside of the Bible to decide which bits of the Bible are the ones which God came out with and meant, and which ones aren’t, it rather makes a mockery of this idea that the Bible itself is a source of morality.

It’s the other way round. What’s happening is people are fitting the Bible, or any other religious book, to the morality which they actually find acceptable. It doesn’t mean that’s always good morality.

Somebody might live in a society where they think it is OK to kill people who have a certain lifestyle. And therefore, if a religious book says that, then that is fine, they’re going to do it. But if a religious book says something which they find objectionable, or which causes them problems…

Luke: Yeah, like give to the poor, or something. That’s no good. [laughs]

Paul: Yes. Let’s look at one example in the Bible. In the Bible you’ve got passages which are rather prejudicial to gay people, and you’ve got passages telling you not to eat shellfish. Now amongst a lot of conservative Christians, which one of those is more common? Of course, there’s probably a lot more prejudice against gay people than there is against eating shellfish. I think the reason is simple. Having prejudice against gay people is probably a lot more fun, if you’re a fundamentalist Christian. Not eating shellfish is less fun, and it’s inconvenient.

So what you’re doing is, people are just picking and choosing the bits that they want, and that is not being used as a source of objective morality. It’s being used to fit what you want to do in the first place.

Luke: Another question is when theists get trapped in the corner and it seems like the atheist has shown that their idea of God is incoherent or problematic, they’ll sometimes resort to saying that God is beyond logic or superior to logic or something, and therefore logic can’t disprove God. What do you think of that approach?

Paul: I think it’s another fairly cheap evasion tactic to try to put their God beyond any sort of argument or refutation. Firstly, if a theist comes out with a lot of claims and proof and evidence for God, of anything that God has done, and God is beyond logic, the problem is I’m not arguing with the God. I’m arguing with the claims, and the claims aren’t beyond logic. The claims are just information.

Let’s look at a simple example. You’ve a got Christian apologist, William Lane Craig. William Lane Craig writes a book claiming to show how there’s a God. If I argue with that book, then a theist could say, “No, you can’t do that. God is beyond logic. God won’t be subject to your petty, atheist, evil Satan logic.” And it’s supposed to be game over then.

But hang on. I wasn’t arguing with God. I was arguing with a book by William Lane Craig. Is William Lane Craig’s book beyond logic? I think not.

I think the idea of anything beyond logic is ridiculous in the first place, so I’m not even entertaining it as a coherent idea. But even if we did, it is ridiculous to think that someone’s ideas and claims, and the expression of those claims, are somehow immune to attack.

Even worse, even if we assumed it was coherent for something to be beyond logic and even if we assumed the universe was caused by something beyond logic, even if we accepted that, you would actually destroy any chance you had of proving a God.

Because if you’ve got something beyond logic, you imply there’s some area of philosophy or there is some sort of domain which is beyond logic, which logic can’t enter. I suppose it’s like a philosophical twilight zone. You’re saying there’s a twilight zone in philosophy where logic doesn’t work. The universe came out of this twilight zone where logic doesn’t work, and that’s it. But then what do they do?

They try to tell us the cause of the universe must be intelligent. The cause of the universe must be personal. The cause of the universe must want us not to work on certain days of the year. The cause of the universe must have a mind. It must be a person. It must have consciousness and intentionality.

What they are doing is they are telling us that there is a logic-free twilight zone in philosophy, and then they are telling us what is going on inside it. How do they justify this? If a theist says that the cause of the universe must be intelligent, why? If the cause of the universe is beyond logic, then surely the cause of the universe could be a pile of carrots. And if you’re going to say, “Well, how could a pile of carrots cause the universe?” I don’t know. It’s beyond logic. It’s not really something we should have to consider, is it?

Once you’ve accepted something beyond logic, you’ve thrown away any chance you had of making any arguments about what that sort of thing is or what it’s doing or what its properties are. And the fact that theists consistently claim that God is beyond logic and consistently tell us what the properties of God are, using various logical arguments, implies that this is purely just a self-serving argument.

They are trying to have it both ways. They are trying to tell us about a logic-free domain, and then they are trying to use logic in that domain to tell us what it’s like in there.

Luke: Now, Paul, you’ve leveled a lot of criticisms at theism, and I’ve been helping you along. But do you think that atheism can be positively justified? Do you think we can say that we know there is no God? Or is your atheism more of a negative position? Like, well, we can’t really know, but so far we don’t have any good reasons to think that God does exist. Which way would you express it?

Paul: OK. Well, even to have this discussion, we’ve got to assume someone’s got a coherent description of God, and with a lot of theists I’ve spoken to, that’s a bit of a reach. However, let’s assume someone’s managed to put a coherent description together. I will actually say there is no God. That is my position. I don’t sort of go along saying, “Well, I don’t believe in a God. I have no reason to think that a God exists.” I actually just say there is no God, and I can justify that position.

This is how I would justify it. In everyday life, we could throw up extreme skepticism on practically any statement we could make, and we don’t usually acknowledge that sort of skepticism, that sort of possibility.

I have an example. If you’re sitting in a restaurant with a cup of coffee on the table in front of you, then if someone says to you, “What is that on the table in front of you?” you are going to say, “It is a cup of coffee.” Now, when you say that, have you proven with 100% confidence that that cup of coffee isn’t a hallucination, that it isn’t a miniature alien battleship from the planet Zog in stealth disguise mode, that it isn’t somehow an emissary from the pixie empire in disguise mode?

Now the point is, what I’ve just said sounds stupid. Most people, even theists, would say that because of lack of evidence for such ridiculous ideas and because they are so extreme, they don’t even need to be given house room. But in normal language, we don’t need to acknowledge that level of uncertainty. We can just say it’s a cup of coffee.

I’m quite happy to say that there is no God in the same way that I will say there is a cup of coffee on the table. It doesn’t mean I’m claiming 100% proof. All it means is that I’m satisfied that the extreme nature of the God concept — and we discussed earlier in this interview why I think the God concept is so extreme — combined with the lack of evidence needed to support such an extreme claim, makes the chances that the hypothesis is correct so ridiculously low that, essentially, it is probably less likely than lots of other things we would never even consider mentioning in normal everyday conversation, like cups of coffee being aliens in disguise.

And while I might philosophically accept that there could be some sort of possibility, that is purely mathematical hairsplitting. In normal everyday language, it is well below the level at which it would be recognized.

Luke: I’m going to start being more wary about cups of coffee and seeing if they’re setting their lasers to kill or something.

Paul: Actually, given the information loading issues I’ve mentioned, you probably wouldn’t be surprised if I said that that is more likely than the existence of a God. Aliens who evolve, who build miniature battleships disguised as cups of coffee, who come here to spy on us, could be explained with a lot less information than you need to explain a God. It is a ridiculous idea. Both of them are ridiculous ideas, but you could probably see how God just wins out for ridiculousness every time.

Luke: It’s been a pleasure speaking with you, and thanks for coming on the show.

Paul: Thank you very much, and it’s been a pleasure talking to you.


{ 45 comments }

Steve Maitzen August 22, 2010 at 7:07 am

Luke,

Fascinating interview, and thanks for posting a transcript, which I found indispensable. I think Almond’s comments are spot-on in many ways, but let me defend Hume’s challenge to induction against his reply.

Almond responded “Yes, exactly” to your claim “We actually have a good statistical reason to expect that the future will behave much like the past.” But that claim answers Hume only if we have non-circular reason to expect statistical patterns to be a reliable guide for prediction. Almond’s reply faces a dilemma. Either it assumes that

(a) Statistically unlikely events can’t happen,

or it assumes that

(b) The tautology “Statistically unlikely events are statistically unlikely to happen” gives us information to guide prediction.

I expect Almond denies (a), although I’m not sure. But (b) is surely false: no tautology has enough content to guide prediction at all. Hume lives on.


Mike August 22, 2010 at 7:08 am

Not a philosopher here, but someone who is familiar with Kolmogorov & Shannon theories of information. I do have a mild reservation about the assertion that [paraphrasing] “the smaller the description of a universe, the more likely a universe is to agree with it.”

Paul uses the analogy of library books matching up, but let’s use an analogy of numbers. If I pick a completely random real number, then its shortest description will not be any shorter than simply writing down the entire number itself. This would be one of those “weird” universes where the sun turns into a tree. Note that I could make a description also that says “any number that contains _____ as a subsequence”. What I just wrote would be a long description that matches infinitely many numbers.

On the other hand, consider the number pi. You might think that pi contains a lot of information, since it is irrational and its digits never repeat. However, I could describe pi with a very small amount of information (in fact, computer programs to generate pi can be less than 100 bytes). By analogy, this is one of the “simple” or “regular” universes. But this short description doesn’t make it any more likely that a number will match the description. The description of pi is very short, but it is still completely exclusive — only one number matches the description.
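To make that concrete, here is a minimal Java sketch (my illustration, not the actual sub-100-byte program; it uses Machin’s formula and only reaches double precision, but the point is how short a description of pi can be):

```java
// pi from a tiny description: Machin's formula,
// pi/4 = 4*atan(1/5) - atan(1/239).
public class TinyPi {
    public static void main(String[] args) {
        double pi = 4 * (4 * Math.atan(1.0 / 5) - Math.atan(1.0 / 239));
        System.out.println(pi); // within floating-point rounding of Math.PI
    }
}
```

A few hundred bytes of source, yet it singles out (an approximation of) exactly one number; arbitrary-precision spigot versions stay nearly as short.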

So from the perspective of information theory, I don’t agree that a shorter, “simpler” description necessarily is less specific / more inclusive. I have given an example of a long description that matches infinitely many numbers, and a short description that matches only one number. Perhaps my problem is an artifact of a simplified, informal exposition for the benefit of the non-specialist audience, so I’m open to having the formal/mathematical details clarified.

Anyway, interesting interview, thanks for posting it!


lukeprog August 22, 2010 at 7:54 am

Interesting critiques, Steve Maitzen and Mike. I hope Paul Almond replies to them.


lukeprog August 22, 2010 at 7:56 am

Steve Maitzen,

Which parts of Almond’s case do you think are defensible?


Steve Maitzen August 22, 2010 at 10:11 am

Steve Maitzen,

Which parts of Almond’s case do you think are defensible?

Luke Muehlhauser,

Well, for starters I’d say he’s bang-on about the senselessness of regarding God, or anything else, as “beyond logic.” His defense of induction against Hume’s critique leaves me less convinced, however.


Leomar August 22, 2010 at 10:35 am

Thanks for the transcript, it was really helpful. As I’m from a non-English country, sometimes interpreting the pronunciation gets in my way. Post one every time you have the chance.


Chris K August 22, 2010 at 10:57 am

Luke,

You say that theists sometimes claim that God is beyond logic, and Almond states that theists consistently claim that he is. Now, maybe it’s because I don’t live near these theists, but I only know of one person who has claimed that God is beyond logic: Descartes. And many if not most theists will argue that God is not beyond logic, as Plantinga has argued in his “Does God Have a Nature?”.

I agree that the idea that God is beyond logic is a pretty absurd claim, and I wonder, who exactly is claiming such things?


Muto August 22, 2010 at 11:20 am

Chris K,
Quite a few people I know argue this way.
However, professional philosophers seem to avoid this line of argumentation.


TaiChi August 22, 2010 at 4:17 pm

I’d like to second Mike’s comment – I too am not sure why a smaller description is more likely. In fact, since the shorter descriptions under discussion were scientific theories which manage to generate highly specific predictions over broad swathes of phenomena, I would’ve thought that these would be incredibly unlikely, a priori.


Sly August 22, 2010 at 11:01 pm

Thank you SO MUCH for the transcript. Podcasts are too slow for me, and I vastly prefer reading.


Paul Almond August 23, 2010 at 12:58 am

And indeed Almond is here. I would like to thank people for making these comments. I’m not able to spend time on giving answers to these right now: I have things to do. However, I would like to assure everyone that I will find time to deal with these comments shortly.


lukeprog August 23, 2010 at 1:15 am

Sly,

I would do a transcript for every episode if I had the time or money, but I don’t. I rely on donations for that.


Mark August 23, 2010 at 9:57 am

Luke, did you discuss theists’ argument that God actually is extremely simple, because his properties all flow from the simple description “having all perfections?”


stamati August 23, 2010 at 11:16 am

Mike and Tai Chi,

I disagree with your criticisms on an intuitive level, but I’m having a hard time formulating my thoughts, so I apologize in advance and please bear with me.

Tai Chi, I think the point is not the amount of information or specificity of a theory’s prediction, but the amount of information in the theory itself.

Perhaps what Almond is saying makes most sense in terms of volume of information in theories regarding the same phenomenon. I think you were comparing apples and oranges, Mike. So for example, wasn’t there some theory of cosmic motion that worked out, but was super bulky? I think it was the celestial spheres model, and although it worked, more information had to be added to it as time progressed in order for it to continue working. Then Kepler comes along and is like, “Whammy! This shit is elliptical.” His theory reduced the amount of information needed to accurately predict the apparent locations of the planets, and unlike the celestial spheres model, is representative of what the planets do IRL.

So given two theories that predict the *same* phenomenon, e.g. the universe, the theory with less information is the one that will correspond more closely to said phenomenon. Right?


Mike August 23, 2010 at 2:46 pm

stamati, Kepler’s model is a nice example to keep in mind. Epicycles were required to get the Ptolemaic model to work, but even as observations got more detailed, the epicycles needed epicycles and so on.

To be clear, I do think we should favor simpler theories, and I think it makes sense to formulate Occam’s razor in terms of information theory. But I don’t know if I have good philosophical reasoning to support it. The reasoning given by Almond seems to me to be a misuse of information theory.

Why should the universe favor simple or elegant theories? What’s stopping physics from being very convoluted to describe? A Ptolemaist would object to Kepler because circles are more fundamental and “perfect” than ellipses, so dammit circles had better be involved in planetary motion. Here it was wrong to assume the universe obeyed some kind of standard of “elegance” — was it the principle of elegance that was not well-founded, or were we just measuring elegance in the wrong way?


Richard Wein August 24, 2010 at 4:13 am

Hi. Lots of good stuff in that interview. Thanks. I do have an objection, though, to Paul’s dismissal of the problem of induction. Even if he succeeds in justifying induction per se, he must still appeal to something more primitive that cannot be justified. No method of justification can itself be ultimately justified. Let’s say for the sake of argument that Paul’s justification of induction is valid in terms of our usual rules for justification. Why should we assume that a conclusion justified by these rules tells us anything useful? If you appeal to past experience, then you’re invoking another rule. How do you justify that rule?

And there are other places where even scientists must put a degree of “faith” in their judgements. All reasoning from observation to explanation suffers from the problem of “underdetermination”. We may invoke such principles as parsimony to justify “best” explanations. But even if we assume (for the sake of argument) that you can fully justify those broad principles, the principles cannot be precise enough to formally (by deductive logic) identify one particular explanation. Paul wants to cash out parsimony in terms of “information”. That may well be a useful approach, but I doubt he will be able to come up with a definition of “information” which is both sufficiently general and sufficiently precise to avoid the need for personal judgement. There is always going to be an element of intuitive (subconscious) judgement involved in the identification of best explanations.*

Finally, even if we had a perfectly precise set of rules of justification (which is perhaps the case in a purely formal system like mathematics), we can never justify the claim that we have applied the rules correctly. We may set out a step-by-step mathematical proof. But how do you justify the claim that one step follows from the previous one? You may invoke the rule that allegedly justifies that step. But still, how do you justify your claim that you’ve applied the rule correctly?

Ultimately, we cannot avoid the need to put some faith in our own cognitive abilities. So the claim that “science requires faith too” is correct to that extent. It’s misleading, however, to say that science requires “just as much” faith as religion. Science attempts to reduce the need for faith to a minimum by subjecting beliefs to conscious (and communal) rational scrutiny as far as possible. Moreover, our practice of science (or broader scientific thinking) improves our intuitive judgement (the part of our reasoning in which we need “faith”).
———

* I think some philosophers like to appeal to the conclusions of an “ideally rational person” as the standard for whether a belief is rationally justified. If such a person were theoretically possible you could imagine in principle writing a program to emulate their thinking and make that program your perfect rule for justification. But I don’t think the concept of ideal rationality makes sense.


Richard Wein August 24, 2010 at 5:42 am

On reflection I’d like to give a more specific objection to Paul than the one I gave above. I think the crucial step in his argument was this one:

If we come up with a model which has a smaller amount of information, the chances, the proportion of possible worlds which we could inhabit which follow that rule, which comply with that, are going to be larger.

OK, if we select a model and then pick a world with uniform probability from the set of all possible worlds, there is a higher probability that we will pick a world that complies with our model if we selected a low-information model. But how do you get from there to justifying a preference for the low-information model?

I’m not denying that past experience causes us to prefer such models, or that preferring such models has been an effective strategy in the past. I’m just saying you can’t get away from the fact that there must be some level at which we are just doing what’s worked in the past, without having any ultimate justification for doing so.

By the way, are you familiar with the AIC and BIC information criteria for model fitting?


Hendy August 24, 2010 at 7:03 am

@Mike:

Been pondering these snippets:

Note that I could make a description also that says “any number that contains _____ as a subsequence”. What I just wrote would be a long description that matches infinitely many numbers.

However, I could describe pi with a very small amount of information (in fact, computer programs to generate pi can be less than 100 bytes).

…I don’t agree that a shorter, “simpler” description necessarily is less specific / more inclusive. I have given an example of a long description that matches infinitely many numbers, and a short description that matches only one number.

I disagree. A program to iterate through numbers containing some sequence xyz is shorter and simpler than even tinypi4, which has ~100 lines of code. With a few ‘for’ or ‘while’ blocks in Java and an array of variables to track the xyz sequence, one could print out all the sequences from 1 -> whatever that contained xyz. It would not take 100 lines. Heck, Perl or awk could probably do this in a couple of lines.

Also note that for the description of any sequence, all a computer (or human) needs to do is match numbers. Compare that to your linked author’s pi program which uses “Klingenstierna’s arctangent formula for pi”. One needs more definitions and established principles to begin getting to pi. I think we’re taking for granted that computers already have all of the mathematical foundations built in to use all kinds of subroutines and methods pre-hardcoded for use. All you need to match a sequence (if we’re talking 0s and 1s) is awareness of what they look like or perhaps simply knowledge of whole numbers.

Finally, I think what Paul says still holds. The more specific you get, the less likely you are to find something that matches that description. We already know pi exists, but I took his point to be discussing possible worlds. If you began putting forth numbers that might be contained in possible worlds, wouldn’t it be more a priori likely to posit whole numbers or just the numbers 0 and 1 vs. pi out to 5,000 digits as existing in those worlds? These programs are also only getting you an approximation (though an extremely precise one), and whatever a program outputs to 5,000 places will also match an infinite set of “fake pi’s” that differ on decimal digits 5001 -> 10^8.

Anyway, just some thoughts.


Hendy August 24, 2010 at 7:16 am

@Mike:

Why should the universe favor simple or elegant theories? What’s stopping physics from being very convoluted to describe?

Now that’s a great point/question!

Maybe this gets back into induction and Hume? We assume that the universe favors them because that’s how we’ve found things in the past? I’d be interested to see what others say on this one.

Fluid mechanics can be very convoluted to describe! Either it’s an aberration from the simplicity/less info system or we just don’t really know yet. There are a good number of formulas that still rely on empirical data and thus one can get some extremely odd exponents and constants to make things work. Nowhere near as elegant as many other areas…

Regarding circles and ellipses as orbit shapes, perhaps we focus too directly on the orbits (as if they could be any shape and thus why an ellipse) rather than asking if something simpler than G*m1*m2/r^2 could be the case such that circles would be preferred? Circles only work (from what I’ve read) if one mass is negligible or the two are equal.


Mike August 24, 2010 at 8:00 am

A program to iterate through numbers containing some sequence xyz is shorter and simpler than even tinypi4 which has ~100 lines of code.

Nope, it depends on the choice of xyz (maybe “simpler” in terms of programming constructs and mathematical background needed to write such a program, but certainly not shorter). Choose xyz to be a million-digit number. The first thing you learn in Kolmogorov’s information theory is that the vast majority of N-digit numbers have no description shorter than just writing down the number itself. So you can find (and are indeed highly likely to stumble upon) an xyz such that “all numbers that contain xyz as a subsequence” has no description shorter than, say, 1 million bytes. Compared to pi, which can be described in ~100 bytes.
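The counting argument behind that “vast majority” claim is short (a standard fact, sketched here with binary strings; the same holds digit-wise):

```latex
% Descriptions are themselves binary strings, so the number of
% descriptions shorter than $N$ bits is at most
\sum_{k=0}^{N-1} 2^k \;=\; 2^N - 1 \;<\; 2^N .
% There are $2^N$ binary strings of length $N$, so at least one
% (in fact, the vast majority) of them has no description shorter
% than $N$ bits.
```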

Finally, I think what Paul says still holds. The more specific you get, the less likely you are to find something that matches that description.

Your second sentence is tautological, so there’s not much to disagree with. But the specificity of a description is unrelated to the length of the description itself. That’s the point of my example above.

In Paul’s statements, a physical (philosophical?) theory describes a set of possible worlds. He claims that, as a rule, the larger the information content of the theory’s description, the smaller the size (more precisely, the measure) of the set of worlds described by the theory. But it does not follow — replace “set of possible worlds” with “set of real numbers” and the above example shows the independence of set size and (minimal) description size / information content of a description.

In his analogy with library books, Paul only considers a naive theory of information, in which every length-N description is simply of the form “the book must have character 1 in position 1, character 2 in position 2, … character N in position N”. When restricted to this simple model, the correspondence between description size and specificity is true, but this is not a realistic theory of “information”. It would be like saying that every theory of gravity must simply be a collection of data points like “when 5kg is dropped from 2m, it hits the ground after X seconds; when 4kg is dropped from 8m, it hits the ground after Y seconds.” A long description of this form is of course more specific than a short one, but the length of the description doesn’t tell you how much intrinsic information there is. You realize that its information content is quite small when you notice that you could have just said “G*m1*m2/r^2”. Similarly, writing down the first billion digits of pi seems like a lot of information, but you could have just written down “ratio of circumference/diameter of a circle” instead.
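By way of illustration, a short sketch (my addition, assuming the usual no-air-resistance idealization and surface gravity g = 9.81 m/s²; none of these numbers come from the thread itself):

```java
// An arbitrarily long table of "mass m dropped from height h" data
// points compresses to one short law: t = sqrt(2h/g), mass-independent.
public class FallTimes {
    static final double G = 9.81; // m/s^2, assumed surface gravity

    static double fallTime(double heightMetres) {
        return Math.sqrt(2 * heightMetres / G);
    }

    public static void main(String[] args) {
        System.out.println(fallTime(2.0)); // the "5kg from 2m" row
        System.out.println(fallTime(8.0)); // the "4kg from 8m" row
    }
}
```

Every row of the table is regenerated by these few lines, which is why the table’s intrinsic information content is small even though its literal length can be huge.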


Mike August 24, 2010 at 8:07 am

@Hendy:

think we’re taking for granted that computers already have all of the mathematical foundations built in to use all kinds of subroutines and methods pre-hardcoded for use.

This is a reasonable objection, but Kolmogorov information theory is robust to changes in the computational model (see what I did there? I turned this philosophy blog into a theoretical computer science blog). So if you object to all these trig functions being built in, suppose you could implement all the trig you needed in N bytes of library code. Then add 100 bytes to compute pi on top of that. Ok, so you can compute pi in N+100 bytes. There are still sets of numbers that can be described as “numbers containing xyz as a subsequence”, and that cannot be described in any less than N+1000000 bytes.

No matter how you slice it, you can have any combination of {large,small} sets of objects with {high,low} information content.


Hendy August 24, 2010 at 8:50 am

@Mike:

Good point, though are you sure we’re not confusing the information content of the method/description with the information content of the resultant set?

If Paul is correct, the less information in the description, the more information should be in the resultant set that matches and I think that holds for this example.

- The method for obtaining pi is N+100 bytes and contains one resultant

- I dare say that the method for creating possible numbers containing xyz can be written in just a few lines (N+[x < 100 bytes]) but the result would be far greater than one number

If this is the case, pi's method contains a lot of information and results in little information and xyz number's method contains little information but results in a set containing a lot of information. I think what you propose appears to work because you have blended the method with the resultant set.

Also, you shifted the definition to something I thought unnecessary. You have brought in “N” to describe some starting base and I’m saying that there are two bases:

N = base info needed for computing pi
M = base needed to compare numbers and see if xyz is in a set

I think that one would find N > M already.

In any case, when you say:

There are still sets of numbers that can be described as “numbers containing xyz as a subsequence”, and that cannot be described in any less than N+1000000 bytes.

I think you are proving Paul correct. The method for describing “numbers containing xyz as a subsequence” is extremely short (low information) but it results in a set containing a lot of information. But the theory Paul describes had never been concerned with housing all of the resultant possibilities and gauging their information content. It’s been about the information content in the method/description/explanation.

As such, xyz-containing-numbers can be done in far less bytes than computing pi.


Mike August 24, 2010 at 9:55 am

@Hendy:

Perhaps a lot of the disagreement is moot until Paul clarifies exactly what he means, no? I’m assuming it to mean something along the lines of Kolmogorov information theory.

Good point, though are you sure we’re not confusing the information content of the method/description with the information content of the resultant set?

In information theory, the information content of an object is the length of the smallest unambiguous description of that object. In Kolmogorov theory in particular, the description is a computer program which outputs/constructs the object.

I have been carelessly interchanging “information content of an object” and “information content in the description” — I’ll be more careful, but I always mean to refer to the same thing: the length of the shortest description of the object.

The method for describing “numbers containing xyz as a subsequence” is extremely short

I just generated a random 100-digit number. So let xyz = 7264457375810908214821635893822034892556424996418858519175698531079448997597917507072346205063033451. How will you write a computer program that spits out all numbers that contain this particular xyz as a subsequence? I daresay that your program won’t be able to do much better than the obvious one that has xyz hardcoded in. That’s the essence of randomness/incompressibility of Kolmogorov theory — for most objects, their shortest description is the primitive one that simply has the object itself hard-coded and prints it out. To be sure, apart from the hard coding of xyz, there will not be much to the program. But you must account for the space required to hard-code xyz.

I feel like you are making a distinction between code and data. You only want to count the code that says “compare two subsequences” (which is not much code). This seems to be what you mean when you talk about the “bases” M & N. Is this accurate?

But code is data and data is code — it’s not a meaningful distinction. In the example above, without the hard coded data you have not unambiguously described the object in question (for the particular choice of xyz). Plus, if the rules don’t force you to count “hard-coded data”, you can cheat by saying “execute the hard coded data as if it were code”. Now you can smuggle in anything, call it hard-coded data, and you don’t have to count it.

  (Quote)

Hendy August 24, 2010 at 10:16 am

@Mike:

Good points.

First, regarding M & N: I’m simply talking about the “toolboxes”, and from my peek at the pi code, it looks like one would need more data to generate pi mathematically than to compare digits. The “toolbox” is already bigger, in other words.

On to the xyz example. I don’t know how one would find a number containing xyz without working with xyz… It seems like you’re asking how I’d find numbers containing xyz if I didn’t give the program xyz to work with? But I would…

I’d do something like this (rough Java sketch – using java.math.BigInteger, since an int can’t hold 100+ digits and ^ isn’t a power operator in Java):

String xyz = "1234…n"; // the digits to match

for (BigInteger i = BigInteger.ZERO; i.compareTo(BigInteger.TEN.pow(200)) < 0; i = i.add(BigInteger.ONE)) {
    if (i.toString().contains(xyz)) {
        System.out.println(i);
    }
}

Something like that. I’d just chug through numbers from 1 -> whatever and figure out which ones have xyz somewhere inside. Short code.

I suppose the bytes to store a 100-digit number are going to come into effect, so you would be right that hard-coding might take its toll on size, depending on the string to match.

To be fair, you’ve chosen a number that can be reached with formulas. In that case, apples to apples would seem to require comparing pi to a 100-digit number that can be found with a mathematical formula. Picking a prime number will, of course, require hard-coding the number. In this sense, the comparison seems unfair, for there is a relatively simple system in place to keep punching out digits of pi with 100 bytes of code, but no system to define some arbitrary number of 100+ digits without simply writing out those 100+ digits.

Maybe another way to look at it, though, is that even if it requires a lot of information to specify 100 digits hard-coded and go from there, the numbers matching this 100-digit subset should be far more specific than numbers having a 3-digit subset.

In that case, Paul’s example still holds, for a more specific definition (one containing 100 digits to match) will require more information than a less specific one (one with only 3).

Obviously you are getting at the fact that the matches for a number containing the 100-digit string, even though specific, are still far more numerous than the one number matching pi, which is obtainable with just 100 bytes. I’m pointing out that the two are in different families of numbers: those defined by simple relationships and those definable only by “brute definition.”

  (Quote)

Mike August 24, 2010 at 11:07 am

In that case, apples to apples would seem to require comparing pi to a 100 digit number that can be found with a mathematical formula.

Distinguishing apples from non-apples is the point of appealing to information theory! Because pi has a succinct mathematical formula, the singleton set {π} has very little information. Pick a 10-billion digit number x at random and the set {z | z contains x as a substring} will have a lot of information. Yes, x is unlikely to have a succinct formula — that’s a convenient interpretation of the fact that x has high information content.

(I hope it’s not too confusing going back and forth between information content of individual numbers and sets of numbers. In Paul’s example, theories describe sets of possible universes. I’m using sets of numbers as an analog. Of course, if an object x has high information content, then the set {z | z contains x} also has high information content.)

Another analogy: Are people with shorter addresses necessarily taller than people with long addresses? Of course not; being tall is a totally unrelated property to whether or not your address is short or long. Similarly, are sets of numbers (or possible worlds) with short descriptions necessarily large sets? No, being a large or small set has very little to do with how long your [shortest] description is.

Maybe another way to look at it, though, is that even if it requires a lot of information to specify 100 digits hard-coded and go from there, the numbers matching this 100-digit subset should be far more specific than numbers having a 3-digit subset.

This is what Paul’s simple library book analogy says. I agree that {z | z contains x as a substring} is a “larger” set if x has fewer digits. But the statement has very little to do with the amount of intrinsic information in such sets. Measuring “information” this way is unrealistic. Physical and philosophical theories are not simply checklists of one-to-one correspondences with the description (in our analogy world, lists of digits that have to match). They are frameworks for deriving many testable correspondences (in our analogy world, succinct formulae for deriving many relationships — generally more relationships than the length of the formula).
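Mike’s point that theories are generators rather than checklists can be sketched with a toy “theory” whose handful of characters yields arbitrarily many testable predictions (the function and the rule are invented purely for illustration):

```python
# A toy "theory": a tiny formula, not a checklist of observations.
def theory(x):
    return 2 * x + 1  # the entire description

# The short description generates unboundedly many testable predictions,
# far more correspondences than the formula's own length.
predictions = [theory(x) for x in range(5)]
print(predictions)  # [1, 3, 5, 7, 9]
```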

So if Paul really means his library example to be an entirely accurate analogy (checking off a list of one-to-one correspondences), then he is right that shorter lists admit more possible worlds. But then this “checklist” model is a very bad description of what physical/philosophical theories actually are.

On the other hand, if he really does mean to make a formal appeal to information theory, then it is no longer true that theories with smaller information content necessarily admit more possible worlds.

  (Quote)

Hendy August 24, 2010 at 12:48 pm

@Mike:

I see your points. I guess at this point I’d say that if what is needed is an exception to the rule that all low-information explanations produce large sets of satisfying results, then you certainly have an example.

You have a low-information description (pi’s) whose resultant set contains only one number, and an example of a long, clumsy description (all numbers with xyz as a subsequence) whose resultant set contains much more information.

It’d be interesting for Paul to come back through and comment on your examples to see if he would say that his analogy is only a “most of the time” theory or he sees things differently somehow…

  (Quote)

Paul Almond August 24, 2010 at 1:02 pm

Thank you to everyone for listening to the interview and posting comments. I see that there is a lot of good discussion here, and I will do my best to answer what I can. Being the person interviewed makes this a bit difficult: I feel some responsibility to respond, but if I answer the earlier comments first, I will be behind a discussion that has since moved on, and some of these issues would ideally be treated in a full-length article. I think the best approach is simply to start answering what I can, and to aim for reasonably substantive responses rather than being unsatisfactorily shallow about everything. I won’t be able to do it all at once, and I ask for people’s understanding. Anyone who specifically wants me to focus on an issue or answer an objection can always ask me directly, and if people in this discussion want me to handle things a certain way, I will do my best.

First, the issues raised by Steve Maitzen about what I said about Hume’s problem of induction.

Steve Maitzen: “Almond responded “Yes, exactly” to your claim “We actually have a good statistical reason to expect that the future will behave much like the past.”

Yes, I do think that, but one thing I want to be clear on is that I am not claiming it can be justified just on the basis of “It has happened that way before” or “statistics”.

Steve Maitzen: “But that claim answers Hume only if we have non-circular reason to expect statistical patterns to be a reliable guide for prediction.”

Steve Maitzen: “Almond’s reply faces a dilemma. Either it assumes that (a) Statistically unlikely events can’t happen, or it assumes that (b) The tautology “Statistically unlikely events are statistically unlikely to happen” gives us information to guide prediction.”

Well, I think statistically unlikely events can happen – they are just unlikely, but I think you assumed I meant that anyway. However, I don’t think the justification for that is merely “statistics”. I agree with you that (b) is just a tautology: Throwing the explanation “statistics” at this will be circular, which is why we might seem to have the problem of induction in the first place. This subject really needs an article on its own, and I think I will write one sometime. For now, I will try to give an idea of how a justification would work – and I haven’t committed this to text before: You’re getting some improvisation from me here. It will seem somewhat rough and haphazard.

I think the justification for assuming that statistics works looks a lot like a justification of Occam’s razor: The two are very closely related. The main difference is that the issue of reference class is at the center of the problem of induction: A justification of induction should really justify how we construct the reference class of possible worlds: the collection of possible worlds of which we are assuming that “our world” or “the real world” (and which you assume doesn’t matter – none of this depends on modal realism being true or false) is a member. Once we have a reference class, it is reasonable to assume that, if we eliminate all worlds that are not consistent with what we know (if we haven’t already done that), then of those worlds that remain, any of them is as likely as any other to be “our world” or “the real world”. This might seem to be an attempt to sneak statistics in by the back door, but saying that any possible world is as likely as any other to be “ours” isn’t really assuming anything: In fact, it is specifically admitting that we don’t know anything at all about where in the reference class our world is supposed to come from. If we were going to start sneaking statistics in, we would be doing something like assuming that there are some preferred worlds from the reference class – the ones where the future looks like the past – and we specifically aren’t doing that.

If our reference class is “every description of a world that can be formally expressed” then we are now in Occam’s razor territory and we can just justify Occam’s razor by talking about the amount of information in theories and the “measure across possible worlds”. Theories with less information content will, by definition, have to involve patterns that continue over time, and these will tend to be more common in the reference class of possible worlds that are consistent with what we know. I won’t go further into that aspect of things, here, as it is really an issue about Occam’s razor and justifying it, and it is getting a lot of discussion separately.
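The information-measure point Paul is leaning on here can be sketched with the standard counting argument from Kolmogorov theory (this sketch is mine, not Paul’s exact formulation; the function is just an illustrative upper bound):

```python
# There are 2**n binary strings of length n, but fewer than 2**k
# descriptions (programs) shorter than k bits, so only a tiny fraction
# of n-bit strings can be compressed below k bits; the rest have no
# description shorter than themselves, i.e. no exploitable pattern.
def fraction_compressible(n, k):
    # upper bound on the fraction of n-bit strings describable in < k bits
    return (2**k - 1) / 2**n

print(fraction_compressible(100, 90) < 0.001)  # True: under 0.1% shrink by 10+ bits
```

So worlds picked out by genuinely short descriptions are necessarily the patterned ones, simply because there are not enough short descriptions to go around.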

The controversial issue is going to be the idea that the reference class is “every description of a world that can be formally expressed”. When people raise the problem of induction, I think they don’t tend to see it that way. People have this idea that you have a “piece of the world” and then “another piece stuck next to it” (“next to it” meaning either in time or in space), and then another piece and so on. Someone arguing this way once used the idea of someone going along dropping M&Ms on the floor, saying that unless you assume something about the person dropping the M&Ms (and Hume is telling us we can’t just assume like that) thinking that one part of the pattern tells you about another part is a fallacy.

I think this kind of view is wrong. It is actually an assumption that the reference class is constructed in a very specific way – as if each world were constructed by an algorithm that goes through every coordinate of space-time, saying “Put a particle here, don’t put a particle here, put a particle here, and so on..” or something similar. There is nothing wrong with possible worlds having descriptions like that, but by restricting the reference class to containing worlds like that only we are making an unwarranted assumption about how it is constructed. We should simply say that:

We live in a world.
We can view the world as an object – including any history that we think it has.
That object has a formally expressible description.
That description must be a member of the set of all formally expressible world descriptions consistent with what we know about the world.

And we should make the reference class as unrestricted as possible: We shouldn’t restrict descriptions to a “stick a bit here in space-time, then a bit here, then a bit here…” approach. I think advocates of that approach are making the cognitive error of imagining things from their own point of view – embedded in the world at some point in space and some moment in time, seeing events unfold literally bit by bit. The correct view for building a reference class is that of an observer who isn’t restricted by the world’s nature, but can simply perceive it as an object: Then the issue becomes one of what the possible objects are, and we are into object descriptions.

Another problem with the idea of building each world in the reference class out of “bits of space-time” is that it implicitly assumes that space and time are fundamental parts of reality. People would laugh if I suggested building each world out of “cars” or “trees”, because they are higher-level things. How do we know space-time is any different? Some people might object by saying that our experience shows that it is fundamental. I would question that, but even if it were the case, if we can’t do induction, our experience doesn’t tell us much about anything anyway. Some people might say that it does not matter: that underneath everything there must be something that is fundamental, and the worlds in the reference class must be made by sticking lots of bits of it together. But hang on! That is exactly what we are doing when we say that the reference class is a formal description of each world. We are effectively saying that the most basic element of reality, when we are describing it, is just a bit of information in a formal description, and how this bit relates to space-time – whether it must relate to something happening at a single point in space and time, or whether it is part of a description of something extended over time – is of no concern to us: If it can be formally expressed, and correspond to a world, and it isn’t inconsistent with our world, it goes into the reference class. The idea that the description itself could be viewed as the “basic reality” may seem weird to people, but it is exactly how things should look from the perspective of a being unconstrained by the limitations of our situation as observers embedded in reality – and when we start to stick the basic description together “bit by bit” (literally bits now, as we are into information), the reference class is entirely different: It becomes one in which a single bit of information can relate to something that spans time and space.

If I’m not going to be allowed to make the reference class out of “every formally expressible world-description”, another issue is that things become incoherent. Suppose we ask what reality will be like in the next instant. An anti-inductionist (not that I think there are any here: I’m sure everyone here at least accepts induction in some weak, statistical sense, as I do, and I understand people are mainly saying the problem needs answering) might say that, in the next instant, matter could be in all kinds of crazy configurations. I say the anti-inductionist, in saying this, is actually betraying the fact that he has bought into some form of induction himself! HE EXPECTS THERE TO BE AN INSTANT AFTER THIS ONE! He is expecting reality to keep progressing instant by instant, and he may also expect space to continue being made out of point after point stuck together. Where does he get this knowledge from? His experience of reality? Hang on! Problem of induction, remember? That tells us nothing! It should be obvious that this strikes a fatal blow to the idea that we can naively construct each world in the reference class as if painting it “pixel by pixel”. Without assuming some form of induction is workable, we can’t validly adopt a Cartesian view of reality. Therefore, we can’t demand that descriptions of possible worlds are even restricted to describing things in Cartesian terms. Therefore, we are back to just having every possible formally expressible description.

To highlight what this means, I would go as far as to say that I am not assuming that instant must always follow instant. As far as I am concerned, this world has some description, and what I know of it suggests that part of it involves instants being stuck together in some way – but I am sure you could build world descriptions in the reference class where there wasn’t even an instant after this one – where the concept had no meaning and where the description didn’t even SAY what happens “next”, but instead described lots of other things that cannot even be expressed in Cartesian terms. Ironically, it is the anti-inductionist, with his naïve, Cartesian approach to world-building in the reference class, who is making all the assumptions about what a world has to do or not do. Once we throw out induction, we have to throw out any assumptions of a Cartesian reality, and then when we stick bits together in the reference class they can be bits of anything – and the whole reference class of formally described worlds emerges.

Now, that would need explaining in more detail, probably: It is only the outline of an argument. It does not finish the issue. Once we have the reference class, we still have the issues of description length, etc, but that can all be covered within the context of discussing Occam’s razor.

  (Quote)

Sharkey August 24, 2010 at 1:15 pm

Mike:

I think you’re right with your criticism based on information theory. It’s possible that I’ve been reading too much topology lately, but perhaps Paul is appealing to the idea of a minimal basis from topology: a minimum set of objects that can generate the complete topology when composed under simple rules.

In that case, his library analogy holds. A small set with the right operations can generate the universe, but adding new, specific bases (i.e., more information) to the generating set will result in a topology that is either:
a) consistent but redundant, or
b) inconsistent and therefore invalid.

Under this analogy, the scientific process is the process of adding the minimal number of bases to our understanding of the world, such that the result is consistent and complete with our observations.

PS: I’ll see your theoretical CS and raise you advanced mathematics :)

  (Quote)

Mike August 24, 2010 at 1:49 pm

@Paul Almond: Thanks for clarifying your framework of information & descriptions. I’m interested to hear your justification for low information (short description) being related to higher likelihood, if you have the time.

I think your argument for induction (assuming an information-theoretic Occam’s razor) makes sense. A very succinct description of a very large object does imply at least some kind of “regularity” or “uniformity” of the large object.

@Sharkey:

PS: I’ll see your theoretical CS and raise you advanced mathematics :)

Hm, I didn’t know it was legal to “raise” to a lower amount in poker! ;)

  (Quote)

Steve Maitzen August 24, 2010 at 2:37 pm

I say the anti-inductionist, in saying this, is actually betraying the fact that he has actually bought into some form of induction himself! HE EXPECTS THERE TO BE AN INSTANT AFTER THIS ONE! He is expecting reality to keep progressing instant by instant, and he may also expect space to continue being made out of point after point stuck together.

@Paul Almond: Thanks for your comments. I can’t reply to them all, but I think there’s an uncharitable reading of Hume’s challenge in your remark quoted above. Like the rest of us, Hume relied on induction, but he argued that no noncircular justification of that reliance is possible. That stance isn’t inconsistent. Furthermore, Hume’s challenge to induction needn’t assume that the world will go on existing in order to show that we have no good reason to believe (1) that it will go on existing or (2) that it will go on as it has if it goes on existing.

  (Quote)

Paul Almond August 24, 2010 at 3:00 pm

Mike – Yes, I will try to get round to all of this.

Steve Maitzen – Well, there are the other points I made as well. Also, I do not think I was being uncharitable. Suppose we say that everything disappeared, or even that time itself ended as one of the options, so that after that there is nothing. I still say that even if you say that what happens next is restricted to things like that, you are assuming a Cartesian reality: You are assuming that any weirdness that reality is going to throw at you, in the absence of any working induction method, is going to be limited to things in space-time.

This isn’t about whether or not “something is assumed to happen next”, but rather about how the reference class gets constructed in the first place. If we assume that any “next moment” is as likely as any other “next moment” – and let’s throw “non-existence” in there as well, if we think things might just end – then we are assuming that each possible world in the reference class is put together out of “Cartesian pieces”. It is the language of saying that we don’t know what will happen next that is the issue: It betrays the Cartesian assumptions behind the method of reference class construction. In fact, I think I might just call this “the Cartesian jigsaw” approach to reference class construction, where we are supposed to assume that each piece may as well be randomly selected – and if we have seen a pattern in part of the object, so what?

If you like, we can easily be charitable and assume Hume didn’t mean to use a “Cartesian jigsaw” approach. In that case, we have no reason to assume that any particular kind of description is required, and we are back in the realms of admitting all formal descriptions of worlds – which then opens the door up to information-theoretical justifications of Occam’s razor (and I admit that information-theoretical justification of Occam’s razor is going to be involved.)

  (Quote)

Paul Almond August 24, 2010 at 3:05 pm

(Sorry about the usage of [i] in that previous post.)

I’ll add something else: I’ve proposed that the reference class be every possible formally expressible world description – something which should seem likely, at least, to open the door to a justification of Occam’s razor, and therefore clearly to the failure of Hume’s argument.

We need a reference class to even start discussing issues like this. Does anyone else have one? I’ve already argued that an anti-inductionist would tend to assume a reference class of Cartesian worlds, each built out of unrelated Cartesian jigsaw pieces, and I’ve explained the problems I have with that construction method, unless someone wants to justify that method?

  (Quote)

stamati August 24, 2010 at 6:24 pm

Oh my sweet Jesus, my brain just leaked out of my nostrils.

  (Quote)

Richard Wein August 25, 2010 at 12:30 am

I think it might be worth checking that we agree on what the term “problem of induction” refers to. The SEP has a long page on the subject, and there seem to be a number of related issues, but I believe the common interpretation (and the one that people have in mind when they say it shows that “science relies on faith too”) is that we cannot justify induction, and therefore we have no reason to trust our inductive inferences.
http://plato.stanford.edu/entries/induction-problem/#IndJus

I think it’s understood (if not explicitly stated) that people who consider this a “problem” are looking for justification “all the way down”, i.e. ultimate justification. If you can justify induction by invoking some other epistemological principle (like parsimony), but you can’t justify that other principle, then you haven’t solved the problem.

But we cannot hope to justify our inductive inferences “all the way down”. Every attempt at justification must itself invoke some principle (rule) of justification, and we can always ask, “but how do you justify that principle?”. The demand for ultimate justification must lead to circularity or infinite regress, i.e. failure. Our knowledge must ultimately be based on one or more epistemological principles which we have acquired by experience but which we cannot justify.

One thing that complicates discussions of justification is that we have many other words we can use to express claims of justification. We can say that a belief or inference is “reasonable” or “rational”. We can say that we have a “reason” to believe something. We can say that a belief is “probable” or “likely” to be true. (I suggest to Paul that he clarify what he means when he makes probabilistic statements, as I feel he may be conflating different meanings of probability.)

  (Quote)

stamati August 25, 2010 at 8:52 am

@Richard

Exactly.

  (Quote)

a Nadder August 25, 2010 at 2:37 pm

Richard, but in that case I don’t see how this would be different to asking that we justify deductive logic as well, using the same all-the-way-down principles? How is that different to saying “you can’t prove to me that deductive logic works except by using deductive logic, but this is circular”?

  (Quote)

Paul Almond August 25, 2010 at 3:11 pm

I will try to deal with Mike’s comment, early in the discussion: I am sure this will get more involved as well, as I try to work my way through all these.

Mike: “Not a philosopher here, but someone who is familiar with Kolmogorov & Shannon theories of information.”

And it is indeed the Kolmogorov idea of complexity that is being used here. Incidentally, for anyone else reading this, the term “complexity” can mean different things. Rather than get into all that, I will just say I am going by information content – the length of the shortest program to describe something. It is actually completely different from my view of “complexity” in everyday life. Regarding Shannon – well, it would be hard to avoid Shannon in any discussion of information anyway, so we can safely assume we won’t avoid Shannon here.

Mike: “I do have a mild reservation about the assertion that [paraphrasing] ‘the smaller the description of a universe, the more likely a universe is to agree with it.’”

Mike: “Paul uses the analogy of library books matching up, but let’s use an analogy of numbers.”

The library analogy was a simplification to try to give a very general idea of the sort of thing going on. It glosses over the whole issue of encoding systems, for a start. It was to try to give people an idea of how an argument to justify Occam’s razor would work in an interview with limited time. I would not expect the library story to be the actual argument. Nevertheless, I accept that the ideas in that analogy still need defending.

Mike: “If I pick a completely random real number, then its shortest description will not be any shorter than simply writing down the entire number itself.”

Agreed. If you could shorten the description, there would be some pattern to it. In fact, that gives us a good definition of “random”. However, we don’t seem to live in a universe like that: We seem to live in one with some regularity. Our experimental observations so far don’t look like a sequence of random numbers.
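That working definition of randomness can be illustrated with an ordinary compressor standing in (imperfectly) for Kolmogorov’s shortest program; zlib and the seed here are incidental choices of mine, not anything from the discussion:

```python
import random
import zlib

random.seed(1)  # incidental seed
patterned = b"3141592653" * 10                            # 100 bytes with a repeating pattern
noise = bytes(random.randrange(256) for _ in range(100))  # 100 random bytes

# The patterned bytes compress; the random bytes barely compress, if at
# all (format overhead can even make them slightly larger).
print(len(zlib.compress(patterned)) < len(patterned))  # True
print(len(zlib.compress(noise)) > len(noise) - 10)     # True
```

A real compressor only finds some patterns, but the asymmetry it exposes is exactly the one the Kolmogorov definition formalizes.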

Mike: “This would be one of those “weird” universes where the sun turns into a tree. Note that I could make a description also that says “any number that contains _____ as a subsequence”. What I just wrote would be a long description that matches infinitely many numbers.”

I disagree here. In principle “one of those ‘weird’ universes where the sun turns into a tree” could be an example of such a universe, but it would be a very atypical example. Most (not all) of those universes wouldn’t have anything as well-organized as suns, sunrises or trees – which are abruptly replaced by a chaotic mess: They would be a chaotic mess from the start. Now, of course there would be some universes like this where the universe just happens to “get lucky” and contain things like sunrises before it abruptly starts doing something else, but I would suggest that the information needed to describe a universe that does that is much greater than the information you need to describe a universe where the sun just keeps rising normally – or where some kind of normal operation of physics continues. Universes where physics works normally for a while and then keeps working would be represented much more.

Mike: “On the other hand, consider the number pi. You might think that pi contains a lot of information, since it is irrational and its digits never repeat. However, I could describe a pi with a very small amount of information (in fact, computer programs to generate pi can be less than 100 bytes).”

I accept all that.
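For the record, Mike’s “under 100 bytes” claim is easy to believe: a complete, unbounded generator for pi’s digits fits in a few lines. This is the well-known Gibbons spigot algorithm, written out readably in Python rather than golfed:

```python
from itertools import islice

def pi_digits():
    # Gibbons's unbounded spigot: streams the decimal digits of pi forever.
    q, r, t, k, n, l = 1, 0, 1, 1, 3, 3
    while True:
        if 4*q + r - t < n*t:
            yield n
            q, r, n = 10*q, 10*(r - n*t), (10*(3*q + r)) // t - 10*n
        else:
            q, r, t, k, n, l = q*k, (2*q + r)*l, t*l, k + 1, (q*(7*k + 2) + r*l) // (t*l), l + 2

print(list(islice(pi_digits(), 8)))  # [3, 1, 4, 1, 5, 9, 2, 6]
```

A few hundred characters of code thus "contain" infinitely many digits, which is the sense in which pi has low information content.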

Mike: “By analogy, this is one of the “simple” or “regular” universes. But this short description doesn’t make it any more likely that a number will match the description. The description of pi is very short, but it is still completely exclusive — only one number matches the description.”

One thing to make clear here: I want to make sure that there is no confusion between the basic description of a universe and the description in a theory. The description in a theory is a partial model. It is an algorithm intended to predict experimental results, and it is not reality itself. I’ve been arguing here that theories (partial models) should have minimal information content, but that shouldn’t be confused with any idea that reality itself is expected to have minimal information content. In fact, there might be a case for thinking that any universe’s description is an endless sequence of bits! Suppose we consider all universes with descriptions 1,000,000 bits long. There are 2^1,000,000 of them. Now suppose we consider all universes with descriptions 1,000,000,000,000 bits long: There are 2^1,000,000,000,000 of them. This might suggest that, all else being equal, the universe is more likely to have a 1,000,000,000,000-bit description than a 1,000,000-bit one, but we could say this for any description length: No matter how long the description, you could find many more universes with a longer description, so you can always justify thinking that there are more description bits. Partial models (theories) should be simple, but this is completely different from any idea that the universe itself needs to be simple, so I have issues with the idea of “one of the ‘simple’ or ‘regular’ universes” here.
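The doubling behind that counting point is worth seeing with small numbers (20 and 30 bits below are arbitrary stand-ins for the huge lengths in the discussion):

```python
# Each extra description bit doubles the number of possible universes,
# so longer descriptions always outnumber shorter ones.
shorter = 2**20  # distinct descriptions of length 20 bits
longer = 2**30   # distinct descriptions of length 30 bits
print(longer // shorter)  # 1024 = 2**10: ten extra bits, a thousandfold more worlds
```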

Mike: “So from the perspective of information theory, I don’t agree that a shorter, “simpler” description necessarily is less specific / more inclusive.”

But I think this is still looking at the basic description itself, rather than the partial model.

I say that a shorter, “simpler” (using a very specific meaning of the word “simple”) description is likely to be less specific / more inclusive.

Suppose we see some experimental results that suggest some pattern to us. Let’s use your pi example here: Suppose we do an experiment and we get a sequence of results, each result being a digit in pi, so the experiment seems to be generating pi as it goes along. After enough digits, it will need a lot more information just to state each digit individually than it would to describe this sequence using the (short) algorithm to generate pi. Suppose now we consider all universes with n bits or less in their complete descriptions. Let’s consider all the ones where each digit is described individually in the description. A lot of information is needed to do this, and we only have n bits available. Once we have used up all the bits we need to get these digits, we only have the bits that are left available to allow universes to be different, so if we use up 1,000,000 bits in accounting for these experimental results, it only leaves n-1,000,000 bits available to change around to make different universes. This is just the specificity-of-description issue again – but looked at another way round. If we only need a small amount of information to account for the experimental results, say 1,000 bits, this leaves n-1,000 bits available to get different universes – so you are going to get more universes when you don’t waste bits out of your limited supply in accounting for this experiment. And before anyone asks “why not just use more than n bits?” – we can use this argument for any n. We can use it to show that the shorter description gets more of the universes for n=1,000,000, for n=1,000,001, for n=1,000,002, for n=1,000,003, etc. For every value of n, when n is large enough to be statistically significant, the universes where you don’t waste bits on accounting for things are going to dominate.
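Paul’s bit-budget argument reduces to one line of arithmetic; the numbers below are hypothetical stand-ins for the 1,000-bit versus 1,000,000-bit accounts:

```python
# Among n-bit universe descriptions consistent with the observed data,
# an account spending k bits on the data leaves n - k bits free, so
# roughly 2**(n - k) universes match it; compact accounts dominate by
# a factor of 2**(difference in bits spent).
def matching_universes(n, bits_spent):
    return 2 ** (n - bits_spent)

n = 10_000
compact, verbose = 1_000, 5_000  # hypothetical: algorithmic vs digit-by-digit account
ratio = matching_universes(n, compact) // matching_universes(n, verbose)
print(ratio == 2 ** (verbose - compact))  # True
```

The same inequality holds for every n, which is why the argument does not depend on any particular description-length cutoff.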

(One response someone might have, which complicates things a bit, is that as n gets very large it becomes practically certain that the information to generate pi will be somewhere in the description anyway: if n = 1,000,000^1,000,000, for example, we might think a description of pi is almost certain to be in there somewhere, making any attempt to distinguish between universes meaningless. This need not concern us, because it is not just about whether the information is in the description somewhere: it has to be in there in such a way as to cause those specific results in that specific experiment.)

Mike: “Perhaps my problem is an artifact of a simplified, informal exposition for the benefit of the non-specialist audience, so I’m open to having the formal/mathematical details clarified.”

It looks to me as though that may be the case.

Leomar: “…as I’m from a non-english country sometimes interpretation of the pronunciation gets in my way..”

It is more likely due to my highly regional English accent!

Chris K: “I agree that the idea that God is beyond logic is a pretty absurd claim, and I wonder, who exactly is claiming such things?”

Muto: “Quite a few people I know argue this way.
However, professional philosophers seem to avoid this line of argumentation.”

I agree with Muto here. This is the kind of thing you read/hear a lot if you hang around in chatrooms. Some theists seem to think it is a kind of nuclear-weapon-level argument that destroys any objection. You don't tend to get it often from apologists in universities.

TaiChi: “I’d like to second Mike’s comment – I too am not sure why a smaller description is more likely. In fact, since the shorter descriptions under discussion were scientific theories which manage to generate highly specific predictions over broad swathes of phenomena, I would’ve thought that these would be incredibly unlikely, a priori.”

A shorter description isn’t more likely. None of this even means that organization and patterns in nature are more likely: in fact, the most likely thing would seem to be a mess, if we measure “likely” in terms of how well-represented things are in the set of all possible universes. The problem is that when we start applying Occam’s razor, we have already seen the patterns – so we are already being confronted by the specificity. Using the pi example again: if an experiment were to generate lots of digits of pi, it would not help us to say that the specificity that generates digits of pi is unlikely – we are looking at it. The issue now is how to deal with it as economically as possible, with as little specificity as possible. You might think that efficient algorithms that do this are too specific to be plausible, but you need even more specificity than that just to get all the digits stated individually.

Mark: “Luke, did you discuss theists’ argument that God actually is extremely simple, because his properties all flow from the simple description “having all perfections?”

I would suggest that we got quite close to something like that in the part of the discussion where Luke said, “What I hear you saying is that either the theologian is going to say that God is very simple and God did it and that’s all the explanation there is. But in that case, that’s really just a woo-woo explanation. It’s a word, and there’s no predictive power to the explanation at all. It’s not even worth calling it an explanation. Or they could actually turn the God hypothesis into a predictive model and make it into an actual explanation.”

Stamati: “Tai Chi, I think the point is not the amount of information or specificity of a theory’s prediction, but the amount of information in the theory itself.”

Yes, and your Kepler analogy is a good one.

Mike: “Why should the universe favor simple or elegant theories? What’s stopping physics from being very convoluted to describe?”

Because either:

a) the more specific a theory is, the smaller the proportion of universes in your set of possible universes that will comply with it

or

b) the more information there is in a theory, the more bits you waste in describing a universe, with a description length of n bits, that complies with that theory. The more bits you waste, the fewer bits you have left over to make universes different, meaning you must be talking about fewer members of the set – and this applies for every large n.

and these are actually the same issue, really – just looked at in a different way.
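Point (b) is just arithmetic; here is a minimal sketch (my own illustration, with made-up bit counts) of how wasted bits translate into fewer compliant universes:

```python
# Sketch of point (b): with n description bits in total, a theory that
# uses up w bits leaves 2**(n - w) free bit patterns, i.e. that many
# distinct universes complying with it. Numbers are illustrative only.
def universes_complying(n, theory_bits):
    return 2 ** (n - theory_bits)

n = 40
small = universes_complying(n, 10)  # a 10-bit theory
large = universes_complying(n, 30)  # a 30-bit theory
print(small // large)  # the smaller theory covers 2**20 times as many
```

The ratio 2^(30-10) is independent of n, which is why the argument goes through for every sufficiently large n.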

Mike: “A Ptolemaic would object to Kepler because circles are more fundamental and “perfect” than ellipses, so dammit circles had better be involved in planetary motion. Here it was wrong to assume the universe obeyed some kind of standard of “elegance” — was it the principle of elegance that was not well-founded, or were we just measuring elegance in the wrong way?”

But we aren’t really trying to tell the universe what standards to obey. The full description of the universe is something that is going to be beyond us – and it can take any form at all. What needs to obey standards is any partial model we construct – and the justification for this is statistical.

Richard Wein: “Ultimately, we cannot avoid the need to put some faith in our own cognitive abilities. So the claim that “science requires faith too” is correct to that extent.”

I appreciate that problem. I think the issue of “Are my own cognitive processes trustworthy?” is a bit different from how the problem of induction is usually understood: Hume’s problem, as generally raised and understood, is a bit more specific. Even if you knew you were sane – say your mind were the only part of reality you knew you could trust – Hume’s problem would still be considered a problem by many people. I understand the problem of proving your own sanity and won’t attempt a solution here. Incidentally, there is some interesting discussion of it in “Gödel, Escher, Bach: An Eternal Golden Braid” by Douglas R. Hofstadter – but I suspect you may have already read it. In any case, I have rarely encountered the problem of induction raised in “mental” terms in that way. I have occasionally encountered it in a rather pointless form which says that we can’t make any predictions with certainty (and I actually agree with that – we can’t), and in another form which says we can’t even make probabilistic predictions with any justification – which I disagree with.


Paul Almond August 25, 2010 at 3:18 pm

I was asked about various methods of theory evaluation in the above comments, and I will just quickly answer this too by saying, “Yes, I do know of them.” I have been taking a very simplified view here: in the real world, we’ll have issues like different levels of agreement with the data (I think AIC was specifically mentioned, which attempts to address that issue) – and then we’ll be into more involved approaches. I think, however, that bit-length of the theory, as a general kind of idea, makes sense.


lukeprog August 25, 2010 at 4:30 pm

Epic response, Paul Almond!


Richard Wein August 26, 2010 at 4:16 am

@aNadder:

Richard, but in that case I don’t see how this would be different to asking that we justify deductive logic as well, using the same all-the-way-down principles? How is that different to saying “you can’t prove to me that deductive logic works except by using deductive logic, but this is circular”?

I can see now that my “all the way down” was rather vague. The problem of induction is to justify induction without using induction (which would be circular). The only option that leaves is to use a deductive argument from premises that can be reached without induction. That limits the premises to observations from direct experience. Deductive logic itself is taken for granted, as is the truth of our direct observations. So “justification all the way down” should be interpreted as justification by a deductive argument from direct observations. And that’s impossible.

If you weren’t already familiar with the problem of induction you may find this page more helpful than my terse explanation:
http://www.princeton.edu/~grosen/puc/phi203/induction.html

To answer your question, I think we can give an inductive justification for our use of deductive logic: “it’s worked in the past”. But no one seems to feel the need for such a justification. Deductive logic just seems so obviously right once you’re familiar with it. On the other hand, inductive reasoning seems more suspect, perhaps because we experience it as being more fallible and more subject to disagreement among competent thinkers.


Mike August 26, 2010 at 9:49 am

@Paul Almond:

Thanks for the lengthy and detailed reply.

Here is your argument, as I understand it:

1) Suppose the description size of the universe is N bits
2) Consider a small theory whose description size is S and a comparable large theory whose description size is L, with L >> S.
3) There are more N-bit universes that entail theory S than entail theory L — namely 2^{N-S} >> 2^{N-L}

Now I see your argument, and my reservations are much more minor. I no longer think you are misusing or abusing Kolmogorov theory. ;)

If we want a one-to-one correspondence between universes and descriptions (because we are doing a counting argument on descriptions as bit-strings), we must choose a single canonical description for each universe among all its equivalent descriptions. Otherwise our counting would unjustifiably favor universes that have a lot of equivalent descriptions in our encoding. This presents a significant problem: not every description will be a canonical one.

Take the example you gave (building off my pi example):

Suppose now we consider all universes with n bits or less in their complete descriptions. Let’s consider all the ones where each digit [of pi] is described individually in the description. A lot of information is needed to do this, and we only have n bits available. … you are going to get more universes when you don’t waste bits out of your limited supply in accounting for this experiment.

A natural choice for canonicity is to always choose the shortest (lexicographically first) among the universe’s descriptions. If we really are dealing with the correct digits of pi, then none of the descriptions of the first kind (those that encode a lot of digits of pi explicitly) would be minimal/canonical – you could always replace those explicit pi digits with a more compact formula. The argument counting the number of descriptions then breaks down.
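The canonicity rule can be made concrete with a toy description language (my own invention, purely for illustration): several descriptions may decode to the same universe, and the canonical one is the shortest, with ties broken lexicographically.

```python
# Toy sketch of canonical descriptions. In this made-up language,
# "Rk:x" decodes to the block x repeated k times; any other string
# decodes to itself literally.
def decode(desc):
    if desc.startswith("R") and ":" in desc:
        head, block = desc.split(":", 1)
        return block * int(head[1:])
    return desc

def canonical(universe, descriptions):
    """Shortest (then lexicographically first) description of `universe`."""
    matches = [d for d in descriptions if decode(d) == universe]
    return min(matches, key=lambda d: (len(d), d)) if matches else None

descs = ["01010101", "R4:01", "R2:0101", "abc"]
print(canonical("01010101", descs))  # "R4:01"
```

As the rule predicts, the verbose literal "01010101" is never canonical here, because a shorter equivalent description exists, mirroring the point that explicit pi digits would always lose to a compact formula.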

In this case, the short theory still “wins”, so maybe it’s not a problem. But it is still unfair to compare two theories’ compatibility with descriptions when, by the rules, one theory simply cannot ever be part of a canonical description.

My next concern might be addressed by an appropriate use of measure theory, in which I am no expert. You’ve limited the comparison to N-bit universe descriptions only. Why are N-bit descriptions the only candidates we are comparing against? This space of possible universes seems very much an artifact of how universes are encoded into strings, and may or may not reflect any kind of intrinsic “even playing field” on which to compare universes. For example, if you compare among finite-length bit strings of any length, then the difference between a long and a short theory doesn’t matter – there is a countable infinity of descriptions compatible with each theory (i.e., the same “number” compatible with both). This is where an appropriate measure-theoretic tool might save you (to distinguish between two numerically equal infinities), but I’m not sure.

Cheers, and thanks again for your response!


a Nadder August 26, 2010 at 1:00 pm

I definitely knew about the problem of induction – I’ve just never seen it as a problem! By “all the way down” I meant justifying X in terms of something other than X. My point was that if this is a problem for inductive reasoning, why isn’t it a problem for deductive logic (i.e., that you can’t prove deductive logic without using deductive logic)? Your answer seems to hinge on deductive logic being intuitively obvious, but then if I have the same intuition about induction the problem would also disappear.

I also see the point about inductive reasoning seeming to be more fallible, but then we might distinguish between two types of inference:
1. X follows Y –> X will follow Y
2. X follows Y and I have a mechanism for why this happens –> X will follow Y.

I would think if you’re using 2 (as a basic rule that appears with justification), you will do just as well as with deductive reasoning (which also seems to fail us all the time in its application).


Richard Wein August 27, 2010 at 12:33 am

@aNadder

All I’m doing is pointing out that induction can’t be justified, and therefore (among other reasons) even the findings of science are to some extent a matter of faith (i.e. belief without complete justification). Whether you label that a “problem” is not particularly important, but it does seem to make people feel uncomfortable.

As I said, I think we can give a vague inductive argument for the effectiveness of deductive logic (“it’s been effective in the past”), so deduction is not in quite the same boat as induction. But I’m not saying that once we accept induction there are no further gaps. If you refer back to my first post in this thread, you’ll see I argued that the inability to justify induction is not the only gap in our justifications.

If you ask me what would constitute a complete justification I don’t think I can give you an answer, because I think that concept is incoherent. But I think many people feel a yearning for a deductive argument starting from direct observations, and they might call that complete. The problem of induction tells us that we can’t even have that.

You may say it’s obvious our justifications cannot be complete, for other reasons, so there’s nothing very significant about the problem of induction. I wouldn’t disagree with you. But it was Paul who brought up the subject, and seemed to be claiming that he could justify induction. (Perhaps I misunderstood him, but if so I’m not the only one.) I’m arguing that he’s wrong about that.


Bram van Dijk August 30, 2010 at 12:48 am

I didn’t read all the comments, so this may have been mentioned before.

I’m not sure whether simplicity is as important for a scientific hypothesis as Paul seems to think. For example, quantum mechanics is more complicated than Newtonian mechanics. Still we prefer quantum mechanics because it offers a better description of reality.

If you look into the philosophy of science (which I am not very familiar with), the instrumentalist/realist debate seems relevant. Some see scientific theories merely as instruments for prediction; others want them to be accurate descriptions of reality. But simplicity doesn’t seem to be very high on the wishlist for a theory or explanation.


Liam September 7, 2010 at 10:11 pm

Potentially great interview ruined by world’s most annoying accent. Thanks for including the transcript so I can read it instead.

