In the previous presentation, we talked about what it means for a problem to be difficult, to be hard, and we looked at a couple of examples of ways in which a problem could be hard. A problem might be hard because we don't even know how to begin to solve it, and in some cases we're not even sure that it's a well-defined problem. Or a problem may be hard because it's provably impossible, which in most cases is a very good thing to know: we might as well not even start trying to solve the problem, because we know it's impossible. So in this talk, what I would like to do is expand a little bit on this notion of difficulty or hardness. There are other ways in which a problem can be hard, and I'll mention a couple of them, just genres of problems that are difficult and why they're difficult.

Sometimes problems are difficult for reasons that have to do with the nature of space and time. We don't know about things that, for example, are going to happen in the future, and sometimes we don't know about things that might have happened in the past; perhaps we'll find out at some future time. These are problems that involve some notion of uncertainty. We might be able to make good or bad guesses; there might be ways of approaching the problem. They're not necessarily ill-defined like the first type of problem, and there might be better or worse ways to go about answering them. But any answer we can give always comes with a certain amount of uncertainty, or a level of confidence, or some other notion that comes up often in computer programming, involving probability or other formalisms for dealing with uncertainty.

Here are some examples. Was there life on Mars at some past time? At the moment, at the time that I'm speaking, we don't have sufficient information to answer that question yes or no. We can't give a totally certain answer one way or the other given the evidence that we have in hand now. Perhaps we could solve the problem, or partially solve it, by saying, "Here's the evidence that suggests there might have been, or could have been, life on Mars at some past time," and we might even come back with an answer that says, "In our view, it is probable that there was life on Mars at some past time." We would have to be much clearer about the notion of probability, and we'd have to explain what we mean by saying that the answer to this question is probably yes or probably no, but we could begin to answer a question like this. The first question, about Mars, is a question about the past; it is limited by the information we have in hand, and maybe someday we'll get more information that will actually resolve it one way or the other.

The second question has to do with a future event. People, and of course computers, do in fact tackle questions like this, betting on or making predictions about future events. Let's take a prediction from the realm of baseball: will the Red Sox win the pennant this year? It's a source of endless conversation at the beginning of every baseball season, at least in Boston, and similar conversations go on all over the country. Well, why do we even talk about this? We don't know the answer, and it hasn't happened yet. So this is all a matter of prediction, but there are better and worse predictions.
Some programs and some people are much more thoughtful about the predictions that they make, and their predictions are of higher quality than others. So again, it's not unreasonable to ask a question like this and to ask how we would go about answering it. It's inherently difficult because it involves future events. We can't say that the Red Sox will, yes, win the pennant this year, or that no, they will not; it hasn't happened yet. What we can do, again, is talk in terms of probability, or levels of confidence, or things like that, when we talk about future events. It's another way a problem can be difficult.

Sometimes a problem is difficult because we have to make a judgment, perhaps in real time. Think about that third question. Is this object a threat? Is this person a threat? Is this animal a threat? It's a question that we as human beings might have to answer in extraordinary circumstances. We run across somebody, or perhaps a stray dog in the street: is this stray dog dangerous, or is it friendly? That's a question that we might have to answer, and we might have reason to answer it one way or another, and we could give reasons for that. But again, the answer is always going to have a certain level of uncertainty. First of all, even the notion of threat has some uncertainty in it: maybe this person or animal is a mild threat, or a severe threat, and terms like mild and severe are qualifications that make a problem like this, again, inherently difficult, because we can't answer with certainty, but we can answer with some description of uncertainty or probability.

Finally, the sentence at the bottom refers to that shape which I typed with the zero key on the keyboard. Is that shape an ellipse? Is it close to an ellipse? It's an interesting question. In some sense, the answer has to be no: an ellipse is a one-dimensional figure, and that zero shape on the screen is a two-dimensional thing; it has thickness to it. So in some sense the answer is absolutely no, that shape is not an ellipse. But then again, if we take that point of view seriously, we've never actually seen an ellipse, because you've never seen a one-dimensional figure. So intuitively, we look at that question and think, well, yes, for our purposes it's an ellipse. Even if mathematically it doesn't follow the perfect form of an ellipse, it looks close enough that we could answer yes. Again, this proves to be in some ways a difficult problem. It's difficult because answering yes or no requires additional language that we have to invent or formalize or elaborate on. So all of these are problems that can be thought of as difficult because we need additional tools to talk about them.

Let's go back. In the previous talk, we mentioned a couple of problems that we as people seem to solve in a fairly natural way, and that by the time we're young children we are able to solve reliably: the vision problem and the language learning problem. Now, expressed in formal terms, and these formal versions may not entirely reflect the problems that we as humans actually have to solve, both of these problems are in fact impossible. So I'm going to express these problems in terms that may be a little unfair to human beings, not quite the way we encounter them, but let's just put them in the most stark terms. In the case of vision, if you recall, the question was: can you recover three dimensions from two?
Can you take a pattern of light on the retina and recover from that, reliably, the arrangement of three-dimensional objects in the outside world? The answer is no. That's impossible, because you can't recover three dimensions from two; in fact, in general, in mathematics, you can't recover n dimensions from n minus one dimensions, and this is just a special case of that. We can't get three-dimensional structure from a two-dimensional array of light.

Similarly, let us take a very strong, and again almost certainly unfair, version of the language problem. A baby is born and is listening to his or her native language. Given a set of sentences, can we determine the structure of the language that produced those sentences? Here I'm going to be fairly specific about it. Let us say that the language we're seeking has the structure of what's called a context-free grammar. Now, there are all sorts of arguments about this, and I'm not arguing that English or any natural language does in fact have that structure. But for the sake of argument, let's say for now that we are trying to learn a language which has the structure of what in formal linguistics is called a context-free grammar, a relatively simple structure for a natural language. So, given a set of sentences, can we determine the context-free grammar that produced those sentences? Again, the answer is that this is impossible. Given a finite number of sentences, and a finite number is all we can ever hear, there could be an infinite number of context-free grammars that produced those sentences. Now, again, this is not a perfect reflection, maybe not even a very good reflection, of the language learning problem as we encounter it, but from a formal computer science standpoint it's a start. Could you get a computer to sit on the desk, listen to a bunch of sentences, and figure out from them the grammar that generated those sentences? The answer is no; it's impossible.

Well, so are we back to something like the perpetual motion problem? These are impossibilities, but they're problems of such importance to us, such evolutionary importance to us as animals, that we have to solve them as best we can. To take the vision problem as an example: we may not be able to recover three-dimensional structure from two dimensions with utter certainty. But we can make very good guesses. In fact, the human vision system is extraordinarily good, animal vision systems in general are extraordinarily good, at recovering three-dimensional structure from 2D patterns of light. So in order to solve the problem, we have to accept that we're not going to solve it with 100 percent certainty, but we can solve it to the best of our ability using all kinds of additional heuristics, that is, rules of thumb, and guesswork, and past knowledge, and a whole variety of other things that come into play in order to see reliably. We know that we can't do it with 100 percent certainty because of things like optical illusions. Optical illusions are fun and interesting precisely because they indicate to us the limitations of our own ability to see. We see some pattern and we interpret it in one way, and then through other information we find out that no, we have misinterpreted this image; it is in fact something else. That's interesting. The reason we're able to misinterpret images is that our vision systems make smart guesses and use heuristics and so forth.
If they somehow could solve the problem with 100 percent certainty, we would experience no optical illusions; but we do, precisely because of the algorithmic limitations of our own vision systems.

There are some problems which, from the computer scientist's standpoint, aren't hard. Computer scientists think of them as easy problems, and they're not always the easiest problems for people, for one reason or another, but sometimes there's an overlap. Here are some examples. Given a positive number, find its square root. There are algorithms to do that; we can learn some of those algorithms, and we ourselves can find the square root of a number. It's a little more tedious for us than it is for a computer, which can do this task very fast and very reliably. So that's an easy problem for a computer, and to some extent it's an easy problem for a person, although it takes some training and a little bit of work. Given a context-free grammar, produce a sentence using that grammar. In other words, this is not taking sentences and inferring backward to a grammar that produced them; rather, you're given a grammar, and you're asked to produce a sentence from it. That's very easy. The third example is one that most people would find very difficult in practice, because with pencil and paper it would be an extremely hard problem, but the technique to solve it is easily programmed into a computer, and we can understand that technique. So this is one of those cases where the computer finds the job easier than people do, and the reason that's true is that we as programmers know how to write programs to do this kind of thing. Given 100 linear equations in 100 variables, determine whether those equations have a solution, and if so, what it is. It would be a tremendously difficult problem for a human being to solve sitting at a desk, but it's not that hard a problem for a computer; a small sketch of this appears at the end of this passage.

So, we've looked at a variety of ways in which problems can be difficult. What I'd like to go on and talk about now is some of the language that originated in computer programming and has made a transition into, has influenced, our discussions of the difficulty or easiness of problems. I'm going to spend some time, both in this talk and the following ones, describing a little bit of these ideas about what makes problems easy or difficult. We're going to start with a notion in computer science that goes by the name of order of growth. Here's what I'm thinking about. Imagine a style of problem like, for example, solving a Rubik's cube. Now, as you know, Rubik's cubes come in a variety of sizes these days. You can get a two by two by two Rubik's cube, or three by three by three, or four by four by four; I think they market a five by five by five. In our minds, we could imagine a 1,000 by 1,000 by 1,000 Rubik's cube. No one would ever market or buy such a thing, but we could imagine it. The idea of order-of-growth notation is to say how much harder a certain type of problem becomes to solve as the problem grows in scale in some natural way. So, applying this to the Rubik's cube, we could ask: how much harder is it, in some formal sense, to solve a three by three by three Rubik's cube than a two by two by two? How much harder still is it to solve a four by four by four, or for that matter a 1,000 by 1,000 by 1,000?
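Coming back to that third easy example, the system of linear equations: here is a minimal sketch of how a computer might settle it, assuming the NumPy library is available. The particular approach, comparing matrix ranks and then solving, and the random 100-by-100 test system are my own illustration, not something described in the talk.

```python
import numpy as np

# A system of linear equations A x = b: here, 100 equations in 100 unknowns
# with random coefficients. Tedious by hand, routine for a computer.
rng = np.random.default_rng(0)
A = rng.standard_normal((100, 100))
b = rng.standard_normal(100)

# The system has at least one solution exactly when appending b as an extra
# column does not increase the rank of the coefficient matrix.
if np.linalg.matrix_rank(np.column_stack([A, b])) == np.linalg.matrix_rank(A):
    # Least squares returns an exact solution whenever the system is consistent.
    x = np.linalg.lstsq(A, b, rcond=None)[0]
    print("solvable; largest residual:", np.max(np.abs(A @ x - b)))
else:
    print("no solution")
```

For a 100-by-100 system this runs essentially instantly, which is the point of the example: the work that would be crushing with pencil and paper is easy once the technique is programmed.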
What we're really interested in, when we talk about order of growth, is this: as a problem gets large, as it becomes large-scale, how much more time does it take to solve? Let me give you, just to start, an example of a very favorable order of growth, a favorable problem for this kind of order-of-growth notation: the guess-a-number problem. I tell you I'm thinking of a number between, let's say, 1 and 10, and you have to guess that number. If you guess wrong, I will tell you whether your guess is too low or too high, and we're limited to integers here. So I'm thinking of an integer in the range 1 to 10, and you can make a guess. If you guess too low I'll tell you, and if you guess too high I'll tell you. If you make the correct guess, I'll tell you that.

It's pretty obvious, or it should be pretty obvious, how you would approach a game like this. You might as well start by guessing the middle number, five. If I tell you that that is too high, all of a sudden you have some information: the number I'm thinking of is somewhere in the range from 1 to 4. If I tell you it's too low, you know that the number is from 6 to 10. If I tell you it's just right, you've solved the problem. Let's say I tell you it's too low, so now you know the number is between 6 and 10: six, seven, eight, nine, or ten. What would be the sensible guess to make? Again, guess the middle of the range that you have left, so your second guess would be eight. If I tell you that's too low, you know that the number is nine or ten; if I tell you it's too high, you know the number is six or seven.

Let's step back and ask: what's the idea here? The idea is that if I give you a range and I tell you I'm thinking of a number in that range, and I'll tell you whether each guess is too low or too high, you should already be intuiting that there's a kind of algorithm for addressing this problem. First, you guess the middle number. If it's too low, you've cut the size of the problem in half, and now you have the upper half of the numbers in which to guess. If it's too high, same thing: you've cut the size of the problem in half. If it's exactly right, you win.

Now, let's think about this in order-of-growth terms. It only takes a few guesses to get the correct answer for 1 to 10. How many guesses would it take for 1 to 100? Well, your first guess would be 50. If I tell you that's too high, you guess 25. If I tell you that's too low, you guess 37, and so forth. Each time you're cutting the size of the problem in half. It should take at most about seven guesses to get the correct answer when the range is from 1 to 100. What about if the range is from 1 to 1,000? It should take you about ten guesses. What about if the range is from one to a million? That's a large range: I'm telling you I'm thinking of a number between one and a million. How many guesses will it take you, using this algorithm, to figure out what number I'm thinking of? The answer is approximately 20. Not that many guesses. So this is a case where the time to solve the problem grows at a favorable rate as the problem gets very large. Even when the problem is to guess a number from one to a million, the amount of time that it takes to solve it hasn't increased that much.
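To make the halving strategy concrete, here is a minimal sketch in Python. The function name and the choice to demonstrate the worst case by hunting for the largest number in the range are my own, but the counts it prints match the figures just quoted: about 7 guesses for 1 to 100, about 10 for 1 to 1,000, and about 20 for one to a million.

```python
import math

def guesses_needed(secret, low, high):
    """Count the guess-the-middle steps needed to find `secret`,
    given that it lies somewhere between `low` and `high` (inclusive)."""
    guesses = 0
    while low <= high:
        guesses += 1
        middle = (low + high) // 2
        if middle == secret:
            return guesses
        elif middle < secret:        # "too low" -- keep the upper half
            low = middle + 1
        else:                        # "too high" -- keep the lower half
            high = middle - 1
    raise ValueError("secret was not in the given range")

# The number of guesses grows only logarithmically with the size of the range;
# guessing the largest number in the range happens to be a worst case here.
for top in (10, 100, 1_000, 1_000_000):
    print(f"1 to {top:>9}: {guesses_needed(top, 1, top)} guesses "
          f"(log2 of {top} is about {math.log2(top):.1f})")
```

The printed counts track the base-2 logarithm of the range, which is exactly the "favorable rate of growth" the example is meant to illustrate.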
In point of fact, we would say that this is a problem that exhibits logarithmic order of growth, which is a slow order of growth, meaning that this is a problem that stays relatively easy to solve even as the size of the problem gets large. I'm going to get a pen here. Let's be a little more specific about what order of growth means. I'm going to give you a definition that is possibly not the most formal, but is something close to a formal definition of what order of growth means. The way in which computer scientists talk about this is that they might say, for example, that a problem is order n squared; you'll see occasional descriptions of that form. What does that mean? First, I'll give you a very informal definition: it means that as the problem gets larger, where we measure the size of the problem itself by some number n, the amount of time that it takes to solve that problem grows as pretty much some constant times n squared, or lower. It really means that we know the problem won't get harder by more than a factor of some number times n squared.

Let's be a little more specific. Here's the idea: imagine that we have a graph, and I'll put n on this axis, so this n indicates that as n gets larger, the measure of the size of the problem itself gets larger. Now, I'm going to take some constant, which we'll call k, and I'm going to graph k times n squared. We'll make the other axis the time taken to solve the problem; there is a y-axis here with numbers on it, and what I draw against n is going to be some function of n. First, I can take the function k n squared, and it'll look like this. I have left unspecified what k is; I'm just saying that I can choose a k. So suppose this happens to be the graph of 4n squared; I've chosen 4 as my value of k. When we say that a problem is O of n squared, what we mean is that if we graph the time to solve the problem against n, then at some point, like here, that graph drops below our k n squared graph and stays below it. So let me step back and explain again what's going on here. What we're saying is that as the measure of the problem size gets large, the time taken to solve the problem by our particular method is eventually, and forever after, lower than some k n squared.

This is actually a very informative description, because, for example, a problem that takes n cubed time to solve, if I were to draw its graph, would not fit under the graph of 4n squared forever; after a certain point it rises above it and stays above it. Another way we can say this is that a problem that is order n cubed has a larger order of growth than a problem that is order n squared; a problem that's order n squared is a harder problem than a problem that's order n; and a problem that's order n is a harder problem than one that grows logarithmically, like the guess-a-number problem that I showed you before. So what I really want you to get out of this description is that when you see a term like order n squared, what that's giving you is a rough but informative notion of how difficult a problem is to solve as the problem gets large. People who have written programs understand, in general, that an O of n squared method is to be preferred: if one method will solve the problem in O of n squared time, then that's to be preferred to a method that requires O of n cubed or O of n to the fifth.
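For readers who want the definition spelled out, here is one standard way to state what is being sketched with the graph; the talk itself stays informal, so this notation is an addition for reference.

```latex
% The time f(n) to solve a problem of size n is O(n^2) if, for some constant
% k > 0 and some threshold n_0, the curve f(n) stays at or below k n^2
% from n_0 onward:
f(n) = O(n^2)
  \iff
  \exists\, k > 0,\ \exists\, n_0 \ \text{such that}\quad
  f(n) \le k\, n^2 \ \text{ for all } n \ge n_0 .
```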
Now, in this second bullet point here, the crucial divide for computer scientists, the most important large-scale divide in talking about order of growth, is the difference between problems that are polynomial in their order of growth, meaning they grow as n squared, n to the fifth, n to the hundredth, and problems that grow as, say, order 2 to the n. A problem that grows exponentially is one that grows not as a polynomial like n squared, but with n in the exponent of the expression: order 2 to the n, or 3 to the n, or 10 to the n, or 1.5 to the n, in fact any base above one. A problem that grows exponentially is always worse, at large scale, than a problem that grows polynomially. Now, there are many specific cases where, if we only have to solve a very small instance of a problem that happens to grow as 2 to the n, that may not be difficult for us; it may not require a lot of time. But ultimately, as a problem gets large, the distinction between polynomial-time and exponential-time problems is profound. To say it in a general way, computer programmers, when they're given something to do, tend to hate exponential-time problems, because if a problem seems to be solvable only in exponential time, then only very small versions of the problem can in fact be approached with a computer. We can't deal with the problem as it gets large, as it gets to industrial scale. This isn't the only division, because some problems are in fact harder than exponential-time problems; those are often of theoretical interest only. Nonetheless, there's an entire range of mathematical difficulty of problems, and for computer scientists the first main divide to be concerned with is the one between polynomial-time and exponential-time problems. I'll spend a little more time later talking about variants of this.

But for now, let's just say that there are other notions that arise from this distinction between polynomial- and exponential-time problems. One of them is an idea that you might have heard of: NP-completeness. For now, I'm not going to go into detail about it, but I'll just say that an NP-complete problem tends to be a very interesting type of problem. It's one which, to the best of our current knowledge, would take exponential time to solve reliably. However, an NP-complete problem also has an interesting special property, which is that you could guess an answer in polynomial time and you could check whether your guess is correct in polynomial time. So if you're very lucky, you could guess the correct answer, check that it is the correct answer, and do all that in a reasonable amount of time. But to infallibly or reliably get the correct answer seems to take exponential time. That may seem like a fine distinction, but a lot of really interesting, hard, and not unrealistic problems turn out to be NP-complete; a small sketch of this guess-and-check idea appears below.

Finally, there is a topic that will come up in this linkage between machines and minds, because the brain, as we know, is a highly parallel structure. If you think of the individual neurons as tiny computational elements, then there are billions of these computational elements working all at once. This is the idea of parallel computation.
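To make the guess-and-check property a bit more concrete, here is a small Python sketch using the subset-sum problem, a classic NP-complete problem. The talk doesn't name any particular NP-complete problem, so this example, and the function names in it, are my own illustration.

```python
from itertools import combinations

# Subset sum: given a list of integers, is there a subset whose elements
# add up exactly to a target value? This is a classic NP-complete problem.

def check(numbers, target, indices):
    """Checking a proposed answer is cheap: just add up a few numbers
    (polynomial time in the size of the input)."""
    return sum(numbers[i] for i in indices) == target

def solve_reliably(numbers, target):
    """Finding an answer reliably, as far as we currently know, requires
    something like trying subsets one after another. There are 2**n subsets
    of n numbers, so this search grows exponentially with n."""
    for size in range(len(numbers) + 1):
        for indices in combinations(range(len(numbers)), size):
            if check(numbers, target, indices):
                return indices
    return None

numbers = [3, 34, 4, 12, 5, 2]
print(solve_reliably(numbers, 9))    # prints (2, 4): the values 4 and 5
print(check(numbers, 9, (2, 4)))     # verifying a lucky guess is fast: True
```

The checking step runs in time proportional to the size of the guess, while the reliable search may have to look through all 2 to the n subsets, which is exactly the contrast between verifying and solving that the NP-complete idea turns on.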
In the computer science world, again linking machines and minds, sometimes we have parallel computational structures: computers with multiple processors, or even many computers, thousands of computers, all trying to solve a problem simultaneously, solving parts of it and sharing answers. That's often a very good strategy for reducing the time needed to solve a problem. But as a matter of theory, it does not really remove the difficulty of going from a polynomial-time to an exponential-time problem. That is to say, parallelism as a computational strategy can reduce the time needed to solve a problem, but it can't change an exponential-time problem into a polynomial-time problem. So the distinction that we began with is still, in every case, a very important one. When we return to this topic, we'll talk about some other ways of thinking about the difficulty of problems for minds and machines.