Introduction to Artificial Intelligence (index)

by Luke Muehlhauser on March 8, 2011 in Intro to AI

I think artificial intelligence (AI) is one of the most important fields to understand if you want to do philosophy well.

So, I’m writing a post series summarizing the leading textbook in the field – Russell & Norvig’s Artificial Intelligence: A Modern Approach – or at least the parts of it that are most immediately useful to philosophers. I won’t spend time explaining in depth how different search algorithms work, for example.

Why do I think AI is so important for doing good philosophy? Hopefully, my reasons will become clear as the series develops. For now, let me repeat a story I’ve told before.

After giving a talk on computers at Princeton in 1948, John von Neumann was confronted by an audience member who insisted that a “mere machine” could never really think. Von Neumann’s immortal reply was:

You insist that there is something a machine cannot do. If you will tell me precisely what it is that a machine cannot do, then I can always make a machine which will do just that!1

The problem with most philosophy is that it is imprecise, and this leads to centuries of confusion. Do numbers exist? Depends what you mean by “exist.” Is the God hypothesis simple? Depends what you mean by “simple.” Can we choose our own actions? Depends what you mean by “can” and “choose.”

Many philosophers try to be precise about such things, but they rarely reach mathematical precision. On the other hand, artificial intelligence (AI) researchers and other computer scientists have to figure out how to teach these concepts to a computer, so they must be 100% precise.

What does it mean, precisely, to say that one hypothesis is simpler than another? The answer (lower Kolmogorov complexity) came not from philosophy, but from computer science. What does it mean, precisely, to proportion one’s beliefs to the evidence? The answer (Bayes’ Rule) came not from philosophy but from mathematics, and especially from implementations of Bayes’ Rule in AI (Bayesian networks). What does it mean, precisely, to say that one thing causes another? Once again, the answer (Pearl’s counterfactual account) came from computer science.
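To make the Bayes’ Rule point concrete, here is a minimal sketch in Python (the numbers are invented purely for illustration):

    # Bayes' Rule: P(H|E) = P(E|H) * P(H) / P(E), with made-up numbers.
    prior = 0.01            # P(H): prior probability of the hypothesis
    likelihood = 0.90       # P(E|H): probability of the evidence if H is true
    false_positive = 0.05   # P(E|~H): probability of the evidence if H is false

    evidence = likelihood * prior + false_positive * (1 - prior)   # P(E)
    posterior = likelihood * prior / evidence                      # P(H|E)
    print(round(posterior, 3))   # 0.154 -- the updated degree of belief in H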

Basically, my point is this: Learning to think like an artificial intelligence researcher thinks can protect you from fooling yourself with slippery words and empty metaphysics. When you think about how you would program a philosophical concept or method into an AI, some concepts and methods may immediately stand out as confused and incoherent – to be avoided.

As Daniel Dennett said, “AI makes philosophy honest.”2

Contents of this series:

  1. What is AI?
  2. Early Foundations of AI
  3. A Brief History

  1. Quoted in E. T. Jaynes, Probability Theory: The Logic of Science, p. 7.
  2. Daniel Dennett, “Computers as Prostheses for the Imagination,” a talk presented at the International Computers and Philosophy Conference, Laval, France, May 3, 2006.


Taranu March 8, 2011 at 7:58 am

I’m going to enjoy this series!


daniel March 8, 2011 at 10:40 am

Luke,
You’ve mentioned many times in the past that you don’t really care for conceptual analysis in philosophy. You say that philosophers are on a misguided quest to come up with a super-dictionary of terms. Then you say that concepts need to be precise and exact, so much so that they can be programmed into a computer. If we want computers to use words correctly, as opposed to how humans use words, then shouldn’t engaging in conceptual analysis actually be useful for AI research?


Thomas March 8, 2011 at 11:27 am

The problem with most philosophy is that it is imprecise, and this leads to centuries of confusion. Do numbers exist? Depends what you mean by “exist.” Is the God hypothesis simple? Depends what you mean by “simple.” Can we choose our own actions? Depends what you mean by “can” and “choose.”

I very much admire your learning and intelligence, Luke, but I think that you are drawing too-large and unwarranted conclusions from a few true premises. Yes, sometimes language is very imprecise, but that hardly makes metaphysics “empty”. Of course this is a huge topic, and your negative attitude toward metaphysics and philosophy in general stems from your epistemology, which is another huge topic. Maybe you can get Peter van Inwagen’s paper ‘The New Anti-Metaphysicians’ in your hands somewhere and read it. That’s my recommendation to you.


Jacopo March 8, 2011 at 1:33 pm

That sounds like an interesting paper, Thomas – would you be able to upload it to a file-sharing website?


Thomas March 8, 2011 at 2:22 pm

Jacopo,

ok. http://dl.dropbox.com/u/22945286/vanInwagen.doc

Reference: Peter van Inwagen, ‘The New Anti-Metaphysicians’, Proceedings and Addresses of the American Philosophical Association, Vol. 83, No. 2 (2009), pp. 45–61 (APA Central Division Presidential Address).


Jacopo March 8, 2011 at 4:07 pm

Thanks very much for that, Thomas!


Luke Muehlhauser March 8, 2011 at 7:16 pm

daniel,

Yes, but not the kind of conceptual analysis that philosophers usually practice.


Leon March 8, 2011 at 10:42 pm

You insist that there is something a machine cannot do. If you will tell me precisely what it is that a machine cannot do, then I can always make a machine which will do just that!

As I said last time you quoted this: write a program which, given another program and a finite input for that program, determines if the other program will halt on the given input or loop forever.
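For anyone who hasn’t seen the result, here is a minimal sketch of Turing’s diagonal argument in Python (halts and diagonal are illustrative names, not real library functions):

    # Suppose, for contradiction, that halts(program, data) always returned True
    # when program(data) halts and False when it runs forever.
    def halts(program, data):
        raise NotImplementedError("Turing showed no general implementation exists")

    def diagonal(program):
        # Do the opposite of whatever halts() predicts about program run on itself.
        if halts(program, program):
            while True:
                pass      # loop forever
        else:
            return        # halt immediately

    # Does diagonal(diagonal) halt? Either answer contradicts what halts() reported,
    # so the assumed halts() cannot exist.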

What does it mean, precisely, to proportion one’s beliefs to the evidence?

The answer you give is true for machines, but not necessarily for people, for whom propositional attitudes are pretty obviously more complicated and fuzzy than a number between 0 and 1 (see: the Churchlands).


MarkD March 9, 2011 at 12:00 am

@Leon:

How about if I loosen the requirement and provide a program that can get bored with extraordinarily long-running and inconclusive outcomes? Say this program has a probability of halting proportional to the depth of the parse tree for any given context-free or better language? That always seemed much more intelligibly human-like than the provable corner cases that are at the heart of computability theory. That kind of behavior also washes out the loopy arguments of the Penroses and Hofstadters, substituting a kind of insouciant oracular insight about what matters and does not matter in intelligent behavior.

But your point about von Neumann is well taken. I’ll give him a break since it was 1948, however.


Luke Muehlhauser March 9, 2011 at 12:11 am

Leon,

Agreed on both points.


Tarun March 9, 2011 at 10:20 am

One of the things a machine cannot do is compute the Kolmogorov complexity of an arbitrary string. So in this case at least, your example of a precisification of a philosophical concept does not stem from the AI motivation of teaching a computer the concept. We can’t teach computers to judge which hypothesis is simpler based on Kolmogorov complexity.

In any case, I don’t think it is true that a concept must be 100% precise in order to teach it to a computer. It seems to me that any sufficiently advanced AI will learn concepts more or less the way we do, by exposure, and not by having the concepts pre-programmed into it. Let’s say I want to build a neural net that can look at a photograph and recognize shadows. It would be foolish of me to try to find a precise characterization of what counts as a shadow and then program it into the machine. Instead, I should program some error-correction mechanism into the machine and then train it on a bunch of photographs, correcting it whenever it does not accurately recognize shadows. This does not require a precise theory of shadows.
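To sketch what I mean by an error-correction mechanism, here is a minimal perceptron-style learner in Python; the features and training data are placeholders, not a real shadow detector:

    def predict(weights, features):
        # Classify an image region from its (placeholder) numeric features.
        score = sum(w * x for w, x in zip(weights, features))
        return 1 if score > 0 else 0     # 1 = "shadow", 0 = "not a shadow"

    def train(examples, passes=10, rate=0.1):
        # examples: list of (features, label) pairs labeled by a human trainer.
        weights = [0.0] * len(examples[0][0])
        for _ in range(passes):
            for features, label in examples:
                error = label - predict(weights, features)   # correct it when it errs
                weights = [w + rate * error * x for w, x in zip(weights, features)]
        return weights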

Learning to think like a (good) AI researcher is learning to think about how to construct a machine that can adequately deploy a concept in the range of circumstances it is likely to encounter. This does not always require precisification of concepts. Often heuristic techniques will be adequate. There is no demand for a proof that the algorithm accurately tracks the concept for any input. Much contemporary philosophy demands a higher standard of precision than this. We are not satisfied with heuristics. We want exceptionless theories of our concepts.

I tend to agree with you that philosophers need to think like AI researchers, but I don’t think this is because of the greater precision involved in AI research. It is because AI researchers focus on what makes a concept useful, what work it does, rather than coming up with an analysis that will work precisely for all conceivable situations. It is the de-emphasis of precise conceptual analysis and the emphasis on pragmatics that philosophers need to absorb from the computer science community.


Taranu March 9, 2011 at 10:24 am

Does this book by Russell & Norvig that you want to review have more than 1000 pages or am I misinformed?


Tarun March 9, 2011 at 10:40 am

I think you’re a little too hard on philosophy here. At least one of your examples of a good precisification – the Bayes nets approach to causation – didn’t come solely from computer science. Philosophers (Spirtes, Glymour, Scheines, Woodward, Hitchcock) have also contributed significantly to the development of this approach.

The same is true for Bayesian epistemology. Richard Jeffrey has made very important contributions here, as have Fitelson, Joyce, Skyrms and Glymour, among others. As evidence that the mathematicians don’t always get it right, check out Jaynes’ uninformed dismissal of Jeffrey conditionalization in his big Bayes book.


antiplastic March 9, 2011 at 12:21 pm

Yikes, again with the self-loathing philosophizing.

If a formulation is mathematically precise, then of course it “comes from math”; you’ve simply defined a level of precision such that anything that meets it can’t be anything else. But as the above commenter pointed out, it’s not as though these notions in maths and CS sprang into existence fully formed from the forehead of Zeus. Would we even be talking about theoretic simplicity or belief-updating in the absence of the people who got us thinking about them in the first place?

It might also behoove one to consider that perhaps kindergarten teachers are better at enforcing discipline over their wards than police officers in South Central L.A. are over theirs, not because of any innate moral virtue or superior methodology, but rather because the subject, scope, and difficulty of their task is different.

As far as I can see, every expert community strives for precision. Lawyers strive for it, screenwriters strive for it, aerospace engineers strive for it. If someone just wants to say they personally find the intellectual climate of (this weird fringe movement of) computer science more amenable to the sort of thoughts they like to think, then great! Some people feel more emotionally at home at football practice than chess club, and some people prefer the rote predictability of a bureaucratic job to an internet startup. Let a thousand flowers bloom. But the notion that math or computer science alone possesses some unique virtue to dictate norms of rationality to the culture at large is silly.


Luke Muehlhauser March 9, 2011 at 2:27 pm

It does, but I’ll only be covering tiny portions of it.


Luke Muehlhauser March 9, 2011 at 2:28 pm

Tarun,

I know it’s a bit arbitrary, but I’m placing the people doing that work as much in the purely ‘mathematics’ group as in ‘philosophy.’ Suffice it to say that what people like Jeffrey and Fitelson do on that front is closer to computer science than to traditional modes of philosophy, which is why it’s such good philosophy compared to most philosophy. :)


Luke Muehlhauser March 9, 2011 at 2:42 pm

Tarun,

BTW, I would love to read a knowledgeable refutation of Jaynes’ coverage of Jeffrey conditionalization. Has such a thing been written, to your knowledge?


Tarun March 9, 2011 at 3:41 pm

Luke,

I’m not aware of any published refutation of Jaynes’ treatment of Jeffrey conditionalization. I doubt there is because it’s a pretty small point. I only mentioned it because I have recently been reading Jaynes and that section was on my mind.

Jaynes doesn’t make a mathematical error (he very rarely does). He just misinterprets Jeffrey’s claim. Jaynes criticizes Jeffrey conditionalization as ad hoc, saying that it does not follow from the rules of probability theory. However, he ignores a crucial qualification that Jeffrey makes: Jeffrey conditionalization is a valid rule of inference only when the conditional probability of the hypothesis given the evidence is invariant. Once this qualification is made, Jeffrey conditionalization does indeed follow from the rules of probability theory.

Jaynes’ oversight here is particularly problematic since he relies on the exact same invariance condition. Without invariance, even Bayesian conditionalization does not follow from the rules of probability theory.


Tarun March 9, 2011 at 3:48 pm

Okay, I just pulled out Jaynes’ book and looked at that section again. After criticizing Jeffrey, Jaynes goes on to formulate a condition that is (as far as I can tell) implied by invariance, and then he says Jeffrey conditionalization will work provided this condition is satisfied. But then he isn’t really disagreeing with Jeffrey at all!

But now that I think about it, I might be being unfair here. The only book of Jeffrey’s that I’ve read is his most recent one, which came out after Jaynes wrote this. It is entirely possible that in his early work Jeffrey didn’t specify the invariance condition, and only added it in response to Jaynes (and others).


Luke Muehlhauser March 9, 2011 at 4:05 pm

Huh. Thanks, Tarun!


Tarun March 9, 2011 at 4:05 pm

No. I just checked out Jeffrey’s 1983 edition of The Logic of Decision and he quite clearly specifies the invariance condition. Jaynes is apparently not always a charitable reader.

In case the condition is not clear, btw, here’s what it says. If P1 is my old probability distribution, P2 is my updated distribution and H and E are my hypothesis and evidence (respectively), then

P2(H|E) = P1(H|E) and P2(H|~E) = P1(H|~E)

If this condition holds then the Jeffrey conditionalization rule

P2(H) = P1(H|E)·P2(E) + P1(H|~E)·P2(~E)

follows from the theorem of total probability

P2(H) = P2(H|E)·P2(E) + P2(H|~E)·P2(~E).
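And a quick numerical sketch in Python, with made-up probabilities, to show the rule in action:

    # Jeffrey conditionalization, with the invariance condition assumed to hold.
    p1_h_given_e = 0.8       # P1(H|E) = P2(H|E) by invariance
    p1_h_given_not_e = 0.3   # P1(H|~E) = P2(H|~E) by invariance
    p2_e = 0.6               # new, uncertain probability of the evidence E

    p2_h = p1_h_given_e * p2_e + p1_h_given_not_e * (1 - p2_e)
    print(round(p2_h, 3))    # 0.6 -- the updated probability of H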


Luke Muehlhauser March 9, 2011 at 8:08 pm

Tarun,

Cool. I was gonna ask you to check, but thought that would be too presumptuous to suggest. Then you did it anyway. :)


moreLytes August 23, 2011 at 7:56 pm

Luke,

Do you have plans on completing this series? I am interested in leveraging your thoughts on the textbook during the following public-access Stanford Intro to AI class this fall:

http://www.ai-class.com/

Thanks!


Luke Muehlhauser August 25, 2011 at 1:56 am

moreLytes,

Alas, I don’t plan to continue the series. No time.

