0:00

So the time has arrived for us to finish the proof of the fact that this deterministic algorithm based on the median-of-medians idea does indeed run in linear time. We've done really all the difficult work. We've discussed the algorithmic ingenuity required to choose a pivot deterministically that's guaranteed to be pretty good. So remember the idea was: you take the input array, you logically break it into groups of five, and you sort each group. That's like the first round of a two-round knockout tournament. The winners of the first round are the middle elements of each group of five; that's the initial set of medians. And then in the second round we take the median of these n/5 first-round winners, and that's what we return as the pivot. And we proved the key lemma, the 30/70 lemma, which says that if you choose the pivot by this two-round knockout tournament, you're guaranteed to get a 30/70 split or better. So the recursive call on line six or seven of DSelect is guaranteed to be on an array that has at most 70 percent of the elements that you started with. In other words, you're guaranteed to prune at least 30 percent of the array before you recurse again. But what remains to understand is whether or not we've made a sensible trade-off. Have we kept the work required to compute this 30/70 split small enough that we get the desired linear running time? Or is the cost of finding a pretty good pivot outweighing the benefit of having guaranteed good splits? That's what we've got to prove. That's the next subject.

Here's the story so far. You'll recall that, as usual, we define T(n) to be the worst-case running time of an algorithm, in this case DSelect, on input arrays of length n. And we discussed, okay, there's the base case as usual. But in the general case, we discussed how, outside of the two recursive calls, the DSelect algorithm does a linear number of operations. What does it have to do? It has to do the sorting, but each sort is on a group of constant size, size five, so it takes constant time per group. There's a linear number of groups, so step one takes linear time, the copying takes linear time, and the partitioning takes linear time. So there's some constant c, which is going to be bigger than one, but it's going to be constant, such that outside of the recursive calls DSelect always does at most c·n operations. Now what's up with the recursive calls? Well, remember there are two of them. First, there's the one on line three that's just responsible for helping choose the pivot. This one we understand: it's always on twenty percent of the input array, namely the first-round winners, so we can very safely write T(n/5) for the work done, in the worst case, by that first recursive call. What we didn't understand until we proved the key lemma was what's up with the second recursive call, which happens on either line six or line seven. The size of the input array on which we recurse depends on the quality of the pivot, and it was only when we proved the key lemma that we had a guarantee: a 30/70 split or better. What does that mean? It means the largest subarray we could possibly recurse on has (7/10)n elements. So what remains is to find the solution for this recurrence and hopefully prove that it is indeed O(n). So I'm going to go ahead and rewrite the recurrence at the top of the slide. We're not really going to worry about the T(1) = 1. What we're interested in is the fact that the running time on an input of length n is at most c·n, where again c is some constant which is going to have to be at least one, given all the work that we do outside of the recursive calls; plus the recursive call on line three, on an array of size n/5; plus the second recursive call, which is on some array that has size, in the worst case, (7/10)n. So that's cool. This is exactly how we handled the other deterministic divide-and-conquer algorithms that we studied in earlier videos. We just wrote down a recurrence and then we solved the recurrence. But now, here's the trick. In all of the other recurrences that came up, for example Merge Sort, Strassen's matrix multiplication algorithm, integer multiplication, you name it, we just plugged the parameters into the master method. And because of the power of the master method, boom! Out popped an answer. It just told us what the recurrence evaluated to. Now, the master method, as powerful as it is, it did have an assumption, you might recall. The assumption was that every subproblem solved had the same size. And that assumption is violated by this linear-time selection algorithm. There are two recursive calls. One of them is on twenty percent of the original array. The other one is probably on much more than twenty percent of the original array; it could be as much as 70 percent of the original array. So because we have two recursive calls, on subproblems of different sizes, this does not fit into the situations that the master method covers. It's a very rare algorithm in that regard.
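Before solving the recurrence formally, it can help to just simulate it. The sketch below is my own illustration, with c = 1 as a stand-in constant and floors taken on the fractional arguments (the auxiliary notes treat the rounding precisely): it tabulates T(n) and checks that T(n)/n never exceeds the constant 10c that the lecture's proof produces.

```python
from functools import lru_cache

C = 1  # stand-in for the constant c: linear work outside the recursive calls

@lru_cache(maxsize=None)
def T(n):
    """The recurrence T(n) <= c*n + T(n/5) + T(7n/10), with floors."""
    if n <= 1:
        return n  # base case: T(1) = 1
    return C * n + T(n // 5) + T(7 * n // 10)

# Fill the cache bottom-up (keeps recursion shallow) and check the bound.
for n in range(1, 5001):
    assert T(n) <= 10 * C * n  # matches the a = 10c bound proved in the video
```

Since the floored arguments are no larger than n/5 and 7n/10, the same induction argument given below covers this variant, which is why the assertion holds for every n tested.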

4:34

Now, there are more general versions of the master method, of the master theorem, which can accommodate a wider class of recurrences, including this one here. Alternatively, we could push the recursion tree proof so that we could get a solution for this recurrence. Some of you might want to try that at home. But I want to highlight a different way you can solve recurrences, just for variety, just to give you yet another tool. Now, the good news about this approach that I'm going to show you is that it's very flexible. It can be used to solve sort of arbitrarily crazy recurrences. It's certainly going to be powerful enough to evaluate this one. The bad news is that it's very ad hoc. It's not necessarily very easy to use. It's kind of a dark art figuring out how to apply it. So it's often called "guess and check": you guess what the answer to the recurrence is, and then you verify it by induction. Here, because we have such a specific target in mind (the whole point of this exercise is to prove a linear-time bound), I'm going to call it just "hope and check." So we're going to hope that there's a linear-time bound, and then we're going to try to produce a proof that verifies the linear-time bound using induction. Specifically, what are we hoping for? We're crossing our fingers that there's a constant, I'm going to call it a (a can be big, but it's got to be constant, and again, remember, constant means it does not depend on n in any way), such that our recurrence at the top of this slide, T(n), is bounded above by a·n for all n at least one. Why is this what we're hoping? Well, suppose this were true. By definition, T(n) is an upper bound on the running time of our algorithm. So if it's bounded above by a·n, then it's obviously O(n); it's obviously a linear-time algorithm. The constant a just gets suppressed in the big-O notation. So that's the hope; now let's check it. And again, "check" means just verify by induction on n. So the precise claim that I'm going to prove is the following. I'm going to go ahead and choose the constant a. Remember, all we need is some constant a, no matter how big, as long as it's independent of n. That'll give us the O(n) bound. So I'm actually going to tell you what a I'm going to use, for convenience: I'm going to choose a to be 10c. Now what is c? c is just a constant that we inherit from the recurrence that we're given. Remember what this recurrence means: it describes the running time of the DSelect algorithm, and the c·n represents the work that's outside of the recursive calls. So c is just a constant multiple on the amount of linear work that DSelect does for sorting the groups, for doing the partitioning, and for doing the copying. So c is going to be some modest constant, and for the proof I'm just going to multiply it by ten and use that as my a. And the claim is: if I define a in that way, then indeed it is true that for all n at least one, T(n) is bounded above by a·n. Now, I realize I just pulled this constant a out of nowhere, right? Why 10 times c? Well, if you recall our discussion when we proved that things were big-O of something else, there again there was some constant. To formally prove that something is big-O of something else, you have to say what the constant is. And in the proof, you always wonder: how do you know what constant to use? In practice, if you have to actually do one of these proofs yourself, you reverse-engineer what kind of constant would work. You just go through the argument with a generic constant, and then you're like, oh, well, if I set the constant to be this, I can complete the proof. So we'll see, that's exactly what's going to happen in the proof of this claim. It'll be obvious: at the very last line you'll see why I chose a = 10c. So I just reverse-engineered what I needed for the proof. But to keep the proof easy to follow line by line, I decided to just, full disclosure, tell you the constant right at the beginning. Now, no prizes for guessing that the way this proof proceeds is by induction on n.

Induction's the obvious thing to use: we're trying to prove an assertion for every single positive number n, and moreover we're given this recurrence, which relates solutions of smaller subproblems to those of bigger problems. So that sets things up for use of the inductive hypothesis. If you want a longer review of what proofs by induction are, I suggest that you go back and re-watch the optional video where we prove the correctness of QuickSort. That is a fairly formal discussion of the template for a proof by induction, and that's the one we're going to apply here. So, there are two ingredients in any proof by induction. The first is usually trivial, in the form of a base case, and it's also going to be trivial here. In the base case you just directly establish the assertion when n equals one. We're trying to prove that T(n) is at most a·n for every n; when n equals one, we can just substitute. What we want to prove is that T(1) is at most a times one, also known as a. And we're given that T(1) is one; that's the base case of the recurrence that we're given, so that's what we're using here. What we want to be true is that this is at most a times one, and it is: the constant c we're assuming is at least one, so when we multiply c by ten to get a, the result is definitely at least one. So the right-hand side here is unquestionably bigger than the left-hand side; a, in fact, is at least ten, let alone at least one. The interesting ingredient is generally the inductive step. Remember what you do here: you assume you've already proven the assertion, in this case that T(n) is at most a·n, for all smaller integers, and now you merely have to prove it again for the current integer. So we're now interested in the case where n is bigger than one, and the assumption that we've already proven it for everything smaller is called the inductive hypothesis. What does it mean that we already proved it for all smaller numbers? It means we can use, in the proof of our inductive step, the fact that T(k) is at most a·k for all k strictly less than n. All we've got to do is enlarge the range of n's for which this holds by one more, to the current value n. And all we have to do is follow our nose. Pretty much, we have to start on the left-hand side with T(n), and we have to wind up on the right-hand side with a·n. And pretty much at every step of the proof, there's just going to be one conceivable thing we could do. So we just follow our nose. We start with what we want to upper-bound: T(n).

Well, what have we got going for us? The only thing we can do at this point is invoke the recurrence that we were given up here. We have an upper bound on T(n) in terms of the T values of smaller integers. So we are given that T(n) is at most c·n, plus T(n/5), plus T(7n/10). Of course we're ignoring fractions; you would round up or round down if you wanted to be precise, and the auxiliary lecture notes are more precise, if you want to see what the gory details look like. But it's really just exactly the same argument; one just has to be a little bit more careful about it. So now that we've invoked the recurrence, what can we possibly do? We can't really do any direct manipulation on any of these three terms. But fortunately, we have this inductive hypothesis, and it applies to any integer which is less than n. We have here n/5, that's certainly less than n. We have 70 percent of n, that's certainly less than n. So we can apply the inductive hypothesis twice. We already know that these T values are bounded above by a times their arguments: T(n/5) is at most a times n/5, and T(7n/10) is at most a times 7n/10. Now we can group terms together, now that we're comparing apples to apples. So we have n times the quantity c plus a/5 plus (7/10)a. Let me just go ahead and group the two a terms together, and that's going to be (9/10)a. Now, don't forget where we're going, what the end goal is: we want to upper-bound T(n) by a·n. So we want to write that this is bounded above by a times n. And now you see exactly how I reverse-engineered our choice of a as a function of the given constant c. Since a is ten times as big as c, if I take 90 percent of a and add c, I just get a back. So by our choice of a, this equals a·n. And that pretty much wraps things up. So don't forget what all this stuff stands for. What did we just prove by induction? We proved that T(n) is at most a constant times n for every n. That is, T(n) is O(n). What was T(n)? That was the worst-case running time of our algorithm. That's all we cared about. So because T(n) is O(n), indeed, DSelect runs in O(n) time.
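For reference, the whole inductive step just narrated can be written as one compact chain of inequalities, using the choice a = 10c:

```latex
\begin{align*}
T(n) &\le c\,n + T\!\left(\tfrac{n}{5}\right) + T\!\left(\tfrac{7n}{10}\right)
      && \text{(the given recurrence)} \\
     &\le c\,n + a \cdot \tfrac{n}{5} + a \cdot \tfrac{7n}{10}
      && \text{(inductive hypothesis, applied twice)} \\
     &= n\left(c + \tfrac{9}{10}\,a\right)
      && \text{(group the two $a$ terms)} \\
     &= n\,(c + 9c) = 10c\,n = a\,n
      && \text{(by the choice $a = 10c$),}
\end{align*}
```

which is exactly the bound T(n) ≤ a·n that the induction needed to establish.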
