Next we're going to talk about an intricate extension of the asymptotic techniques that we've been looking at, where there are two variables involved. These arise frequently in the analysis of algorithms. A lot of times in the study of algorithms, we have two variables that we're working with: one is the size of the problem, and the other is the cost. So we're going to wind up with intricate expressions that involve those two variables, and they might vary independently, so we have some definite challenges. The problem, as I indicated when talking about doing sums, is that the relative values of the variables might matter. And that makes it even more challenging to deal with sums over the whole range of relative values. You'll see what this means when I get to some applications. Fortunately, as was the case with the harmonic numbers and Stirling numbers, where a couple of fundamental functions arise over and over again and take most of the work, the same is true of functions of two variables. There are a couple of fundamental functions that arise, and we'll look at the asymptotics of those, since they are the ones that arise most often. And if it's not one of those functions, it's usually a function that's similar enough that we can model the analysis on the classical analyses that have been developed for these functions. The first example is the binomial distribution, 1/4^N times (2N choose N - k). Here k might be related to the cost, and N might be related to the problem size. How are we going to make computations if we wind up with an expression like that? It turns out that, as I just explained for the Catalan numbers, if k is 0, that's close to 1 over the square root of pi N. But if k is large, it's exponentially small, a very, very tiny quantity. So we're going to have to come up with an analysis that tells us about these things.
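To get a feel for these magnitudes, here's a short Python sketch (the function name is mine, not from the lecture) that evaluates the scaled binomial term in log space with lgamma, so the huge factorials never overflow:

```python
import math

def binomial_term(N, k):
    """Scaled binomial term (1/4^N) * C(2N, N - k), computed in log space
    with lgamma so that huge factorials never overflow."""
    log_binom = (math.lgamma(2 * N + 1) - math.lgamma(N - k + 1)
                 - math.lgamma(N + k + 1))
    return math.exp(log_binom - 2 * N * math.log(2))

N = 1000
print(binomial_term(N, 0))          # close to 1/sqrt(pi*N)
print(1 / math.sqrt(math.pi * N))
print(binomial_term(N, N // 2))     # exponentially small
```

Running it shows both phenomena at once: the center of the distribution is within a fraction of a percent of 1/sqrt(pi N), while the term at k = N/2 is vanishingly small.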
Another example is called the Ramanujan Q-distribution, and it's actually kind of similar. A binomial coefficient is N factorial over (N - k) factorial times k factorial; this one is N factorial over (N - k) factorial times N^k. And that arises in the analysis of several classical algorithms. For this one, if k equals 0, it's just N factorial over N factorial, which is 1. But if k is close to N, it's exponentially small: N factorial is way smaller than N to the N. So those are the types of functions and types of challenges that we face in analyzing functions of two variables. Now, to get an idea of what goes on, we use plots like this, where we draw one curve for each value of N, and then we scale. You can't have a variable N in a graph, so we normalize by scaling the actual values by a factor of 2^(2N). So for the scaled binomial coefficients, for N equals 2 it's one-quarter, one-half, one-quarter; or, forgetting about the scaling factor, that's 1, 2, 1, and then 1, 3, 3, 1, and 1, 4, 6, 4, 1. So you can see Pascal's triangle in these plots. And what happens, as many people are familiar, is that as N increases, this discrete two-variable distribution converges to the normal distribution, a curve that we can describe with a simple mathematical function. And we're going to see the math that gets us to the place this picture shows us: we ought to be able to have a simple description of this distribution, and we do. This is what the Ramanujan Q-distribution looks like, in the same type of plot: for different values of N, we draw a curve. As k goes from 0 to N, when k is small the value is very close to 1, and when k gets close to N over 2, it becomes extremely small. And there's a curve that describes its shape. So for large N, we ought to be able to use some function that defines that curve for any value of k, and that's what we're after with bivariate asymptotics.
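The Q-distribution term is easy to compute exactly for moderate N, because it telescopes into a product of k simple factors. A minimal sketch (the function name is mine):

```python
import math

def q_dist(N, k):
    """Ramanujan Q-distribution term N! / ((N-k)! * N^k), as an exact
    product of the factors (N - i)/N for i = 0..k-1."""
    term = 1.0
    for i in range(k):
        term *= (N - i) / N
    return term

N = 1000
print(q_dist(N, 0))                  # exactly 1
print(q_dist(N, int(math.sqrt(N))))  # around e^(-1/2), mid-curve
print(q_dist(N, N))                  # N!/N^N: exponentially small
```

This shows the shape described above: value 1 at k = 0, a gentle curve out to around sqrt(N), and essentially zero by the time k is near N.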
Now let's take a look first at the Ramanujan Q-distribution; the calculation's a little bit simpler. It's an extension of the calculation that we did for the Catalan numbers, it's just that now we're carrying along this variable k. The first step is to use the exp-log technique: N factorial over (N - k) factorial N^k is e to the (log N factorial minus log (N - k) factorial minus log N^k), and log N^k is k log N. So that's the first step. Now, how are we going to expand each one of those? Well, the first two are just handled with Stirling: we'll use Stirling's approximation to log N factorial and plug it in in both cases. That says log N factorial is (N plus one-half) log N minus N plus log square root of 2 pi plus O(1/N). That's what we'll use for the first term. For the next term, we have that value of k to worry about, so we get O(1/(N - k)), which, for the range of k we care about, we can cover with O(1/N). So that's plugging in Stirling's formula for the first two terms, and then there's the minus k log N term still left from the N^k. So we've got a bunch of terms: three terms for log N factorial, three terms for log (N - k) factorial, and then the minus k log N. And we've got to do algebra to deal with those terms. But again, there are some cancellations that help us out. The log square root of 2 pi cancels. The minus N plus N cancels. And log(N - k) is log N plus log(1 - k/N), so we get some cancellations there. If we collect all those terms, with all the cancellations, what we're left with is e^(-k - (N - k + 1/2) log(1 - k/N)). You might want to check the math on that, but it's all simple algebra that leaves us with that. The only technique used is log(N - k) equals log N plus log(1 - k/N), and we're down to just a few terms.
So now we have to expand the log of (1 - k/N), and that's -k/N - k^2/2N^2 + O(k^3/N^3). So now we're carrying both variables in the big-O term, which can be a little tricky. We're trying to make this work for as many values of k as we can, but different values of k, particularly when k is proportional to N, are going to give us different asymptotic accuracy. That's one of the challenges of bivariate asymptotics. But we'll carry it in this form and see where we go. So just plug in that formula and do the math. There's the -k/N multiplied by N, and again, you can go ahead and do the math; not too much survives. There's k^2/2N minus k^2/N, which is -k^2/2N, and the k's cancel. So all we get is e^(-k^2/2N) in the end. Then what's left over are the two error terms, and those are just doing the math: if you multiply N by the O(k^3/N^3), you get the O(k^3/N^2), and the terms from the one-half factor are covered by the O(k/N). So those are valid formulas, but the interpretation of the formula is going to depend on the value of k. And it tells us a lot: for relatively small k, it says the distribution is going to be close to e^(-k^2/2N). That's the curve that we see. So let's analyze different values. If k is N to the two-fifths, say, which is a pretty good-sized value, then k/N is 1 over N to the three-fifths and k^3/N^2 is 1 over N to the four-fifths, so the k/N term dominates the error. It seems like we're shaving tiny distinctions, but the main point is that the error is 1 over N to a positive power; that's going to be a big distinction. When k is square root of N, the two error terms are about the same. And when k gets bigger than that, the k^3/N^2 term starts to pick up.
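The error-term bookkeeping above can be checked numerically. A sketch (helper names are mine) that compares the exact Q-distribution term with e^(-k^2/2N) at k = N^(2/5) and k = sqrt(N), where the relative error should be 1 over N to a positive power:

```python
import math

def q_dist(N, k):
    """Exact Ramanujan Q-distribution term N! / ((N-k)! * N^k)."""
    term = 1.0
    for i in range(k):
        term *= (N - i) / N
    return term

# Relative error of e^(-k^2/2N); it should behave like O(k/N + k^3/N^2),
# i.e. shrink like 1 over N to a positive power.
N = 10 ** 6
for k in (int(N ** 0.4), int(N ** 0.5)):
    approx = math.exp(-k * k / (2 * N))
    print(k, abs(q_dist(N, k) / approx - 1))
```

For N of a million, both relative errors come out well under one part in a thousand, consistent with the O(k/N) and O(k^3/N^2) terms derived above.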
But we can work with those different ranges, and really the main point of this derivation is to show that for most of the curve (remember, the plot showed a fine curve out to about square root of N) we have a description of that curve: it's e^(-k^2/2N), which is a simple function to work with, as opposed to the original description involving factorials. A similar derivation works for the binomial distribution. Again, this is just an exercise in using Stirling's formula, carrying through all the error terms, and making sure that you get the cancellations. And it's definitely worthwhile to close the book, turn off the computer, and try to do this derivation yourself, just as an exercise in algebra. Because you can see, when you do that, how the cancellations happen and simplify the calculations. And everybody who works in this field knows that if you do a long calculation like this and miss a term, you're going to get something very exciting at the end that may be irrelevant or just wrong, simply because of that missing term. For example, if you're computing e to a power and you accidentally forget to cancel out an N term, you've got a factor of e^N that probably shouldn't be there at all. So we'll do the calculations. I'm not going to explain every term in these calculations, because they're very similar to the ones we just did. You apply Stirling's formula, and that gives three lines, each with four terms, one line for each of the main factorial terms. And again, initially we can just use O(1/N) for the whole range. Where we start to have to carry terms involving O(k/N) is when we expand log(N - k) as log N plus log(1 - k/N), and log(N + k) as log N plus log(1 + k/N). Those k/N's in those asymptotic series are the ones that give us some complication. So when we do this and do the algebra, a bunch of things cancel.
The 2N cancels with the 2N's, and so forth, and these are the terms that survive. Again, that's pretty much straightforward algebra, with the additional proviso that we're doing the expansion log(N - k) equals log N plus log(1 - k/N). The terms involving log N mostly go away, and we're left with that. Then these two remaining terms are kind of similar: one's got a minus k, one's got a plus k. Rearranging gives k times the difference log(1 - k/N) minus log(1 + k/N), and (N + 1/2) times the sum log(1 - k/N) plus log(1 + k/N). That sum and that difference were one of the first exercises that we did, and each collapses down to just one asymptotic term: the sum of them is asymptotic to -k^2/N^2, and the difference of them is asymptotic to -2k/N, with, again, big-O terms involving k cubed. Substituting that in gives e^(2N log 2 - log sqrt(pi N) - k^2/N) plus the big-O term; that log sqrt(pi N) is the only place pi survives from all the math. Then just undoing the exp-log technique: e^(2N log 2) is 4^N, e^(-log sqrt(pi N)) is 1 over sqrt(pi N), and what's left is e^(-k^2/N). So (1/4^N) (2N choose N - k) is asymptotic to e^(-k^2/N) / sqrt(pi N). And that's the normal approximation to the binomial distribution. It's accurate for a broad range of values of k: it's only when k gets bigger than N to the three-fourths that the approximation doesn't work. And remembering the curves, when k is that big, we can use independent means to prove that the terms are very, very small. So there's certainly a lot of math on this slide. On the other hand, the bottom line is a fundamental classical result in mathematics, and what makes the techniques complicated is really just elementary algebra.
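Here's a quick numeric sanity check of that bottom line (a sketch; the helper names are mine). It evaluates both sides in log space and shows the agreement for k well below N to the three-fourths:

```python
import math

def binom_term(N, k):
    """(1/4^N) * C(2N, N - k), computed with lgamma to avoid overflow."""
    return math.exp(math.lgamma(2 * N + 1) - math.lgamma(N - k + 1)
                    - math.lgamma(N + k + 1) - 2 * N * math.log(2))

def normal_approx(N, k):
    """Normal approximation e^(-k^2/N) / sqrt(pi * N)."""
    return math.exp(-k * k / N) / math.sqrt(math.pi * N)

N = 10 ** 4
for k in (0, 10, 100):
    print(k, binom_term(N, k), normal_approx(N, k))
```

For N of ten thousand, the two agree to better than one part in a thousand across this range of k.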
So it's an exercise in doing elementary algebra well. And it's interesting: these kinds of calculations, still, nowadays, most people do by hand, because it's difficult to have automatic symbolic calculations carry through on the bivariate case to the extent that we'd like. So lots of people still do these kinds of calculations by hand. There are plenty of other examples in the book, and I'm not going to go through all of them. Again, these functions arise very often, and we can go back to a table like this to get the approximations that we need later on, when we encounter these functions in applications. There are different approximations depending on different ranges of k: in most cases, we'll have an approximation that's true no matter what the value of k is, and other times we'll have a more refined approximation that gives a little more accuracy near the center. So that's the end result for the normal distribution. Then there's the so-called Poisson distribution, which I haven't talked about very much, but which falls to the very same kinds of techniques. I'll talk about it when we come to it in the context of applications. You can look at the derivation in the book, or just work it out using the exp-log trick. A lot of things cancel, and the end result is a famous one: N choose k times p^k (1 - p)^(N - k), for a probability p of an event that's got a small chance of occurring, is approximately lambda^k e^(-lambda) / k factorial, where lambda is Np. We'll talk about applications of that later on. Right now, just in terms of the math, you should appreciate that the very same techniques that we did on one slide give these kinds of approximations. And the Q-distribution that I talked about is e^(-k^2/2N), accurate to within O(1/sqrt(N)) uniformly, or more accurately near the center. Those are the kinds of bivariate approximations that we can develop.
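As a sketch of that Poisson claim (with lambda = Np, the standard parameter for this approximation; the function names are mine), compare the exact binomial probability with the Poisson formula when p is small:

```python
import math

def binom_pmf(N, k, p):
    """Exact binomial probability C(N, k) * p^k * (1 - p)^(N - k)."""
    return math.comb(N, k) * p ** k * (1 - p) ** (N - k)

def poisson_pmf(lam, k):
    """Poisson approximation lambda^k * e^(-lambda) / k!."""
    return lam ** k * math.exp(-lam) / math.factorial(k)

N, lam = 10 ** 5, 2.0
p = lam / N                   # a small per-trial probability
for k in range(5):
    print(k, binom_pmf(N, k, p), poisson_pmf(lam, k))
```

With N of a hundred thousand and lambda fixed at 2, the two columns agree to several digits for each small k.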
And they're extremely important, not just in the analysis of algorithms, but in many applications. It's far easier to work with standard mathematical functions like e^(-k^2/N) than with the factorial representations, which are exact but carry so much information that they make calculations difficult. So most of the time, we'll be coming back to this table and picking out these kinds of approximations when these sorts of functions arise in the analysis of algorithms. And when similar functions arise, we can go back to the derivations and see that they can be handled with the same basic method: really, the exp-log technique plus the application of Stirling's formula is what gives all these kinds of results. The next challenge, though, is: what do we do when we have a sum that involves one of these functions? The example that we'll do is the so-called Ramanujan Q-function, a critically important function in the analysis of algorithms. It's the Ramanujan Q-distribution summed over all values of k, and it's basically asking: what's the area under the curve? The challenge is that, again, the summand is nearly 1 for small k and negligible for large k, so we're going to need to use different approximations in different parts of the range to get the answer. This is a very typical situation, and it's why we need bivariate asymptotics: to make sure that we have good estimates over the whole range that we can use to estimate the sum. The general method is referred to as the Laplace method. If you have to approximate a sum (and this is just a schematic representation of the situation), what usually winds up being effective is this: the tails are going to be small, so we restrict the range to the area that has what counts, and then find an approximation that works there. And then what we can do is extend the range.
The point is that that approximation usually is also going to be very small on the tails, so we can just extend the range back out. We take out the original tails, and we put in the approximation's tails. Both of them are exponentially small by comparison with the value of the whole sum, so it doesn't matter which one we use. And that gives us a way to get a concise approximation of the whole sum. That's called the Laplace method. Usually, one of the advantages of converting from a representation involving factorials to a representation like e^(-k^2/N) is that we can evaluate an integral for functions like that, and then use that integral to get the answer, where with the discrete factorial form an integral usually doesn't do the job for us. So let's look at how this method works for the Q-function, and it's actually very straightforward. What we're going to do is pick a value k0 and split the sum into two parts, for values less than k0 and values bigger than k0. Remember, small k is where the summand is significant; for large k, it's negligible. And going from the approximations that we developed before, if you take k0 to be N to the two-thirds, say, then the tail is exponentially small (I don't want to do the details of that). Not only is the tail exponentially small, we also have a good approximation of the summand for k less than k0: for all of those values, it's very close to e^(-k^2/2N), the approximation that we just developed. But that function is also exponentially small for large k, so we might as well extend the range back out to an infinite sum, because its tail is exponentially small too. And now that sum we can approximate with an integral. If you do that, you get square root of pi N over 2. And again, that's the true value of doing asymptotic calculations.
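The whole story can be checked numerically. A sketch (the function name is mine) that sums the exact Q-distribution terms and compares against sqrt(pi N / 2):

```python
import math

def ramanujan_Q(N):
    """Ramanujan Q-function: sum for k >= 1 of N! / ((N-k)! * N^k),
    built up term by term; the sum is cut off once the tail is negligible."""
    total, term = 0.0, 1.0
    for k in range(1, N + 1):
        term *= (N - k + 1) / N      # now term = N!/((N-k)! N^k)
        total += term
        if term < 1e-17:             # exponentially small tail
            break
    return total

N = 10 ** 6
print(ramanujan_Q(N))                # about 1252.98
print(math.sqrt(math.pi * N / 2))    # about 1253.31
```

The two agree to within a constant (the next asymptotic term is -1/3), which is exactly what the Laplace method predicts: the integral captures the whole sum up to lower-order corrections.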
This function Q(N), while it's a precise description of some mathematical quantity, is going to be really difficult to compute directly. But if you know that it's about square root of pi N over 2, that's something you can work with. And we'll be coming back to applications that use this function later on. So that's an introduction to bivariate asymptotics. Now I want to finish up by giving a few exercises that you might do before the next lecture to test your understanding of asymptotics. The first one goes back to where I started: how small is exponentially small? It's just, for a couple of different values of alpha and beta, comparing the values of alpha to the N and beta to the N, and showing that the larger one is going to be a really good approximation to their sum. Here's another one, an opportunity to go through the kinds of calculations that I did in the last section. There's another Ramanujan function; actually, there are three famous ones. This one is called the P function (there's also the R function, which is discussed in the book). It's this function, the sum over k of (N - k)^k (N - k) factorial over N factorial, and it also is approximated by square root of pi N over 2. Going through the steps that I did for the Q-function for this function will be very instructive if you want to test your understanding of this material. One thing that you might do just to get started is write a program that'll print log base 2 of N choose k, using what we've talked about in this lecture. That's an interesting exercise. And write up solutions to the two exercises that I just gave. If you're not comfortable with using TeX, either on paper or with MathJax and HTML, that'll certainly provide some practice; or write it up by hand. I don't write stuff up by hand anymore, but some people do. And then read the chapter on asymptotics in the text for much more detail on all the information we've covered.
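For the programming exercise, here's one possible starting point (just a sketch of the idea: take logs of the factorials with lgamma, so huge N is no problem):

```python
import math

def lg_binomial(n, k):
    """log base 2 of C(n, k), computed via lgamma, so it works even
    when C(n, k) itself is far too big to write down."""
    if k < 0 or k > n:
        return float("-inf")     # C(n, k) = 0 outside the valid range
    ln_value = (math.lgamma(n + 1) - math.lgamma(k + 1)
                - math.lgamma(n - k + 1))
    return ln_value / math.log(2)

print(lg_binomial(100, 50))            # about 96.35
print(lg_binomial(10 ** 9, 10 ** 4))   # fine even at this scale
```

Comparing the output against exact values from math.comb for small n is a good way to check your program, and relating it to the approximations in this lecture is the real point of the exercise.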
In the next lecture, we'll put together all the things that we've talked about in the previous lectures and see how they fit together to form the basis of analytic combinatorics, which we can then go on to apply to the analysis of algorithms.