So in these next two videos we'll give our second application of the dynamic programming paradigm. It's to a quite well known problem, it's called the knapsack problem. And we'll show how following the exact same recipe that we used for computing independent sets in path graphs leads to the very well known dynamic programming solution to this problem. So let's jump right into the definition of a knapsack problem. So the input is given by n items. So each item comes with a value, V sub i, the more the better for us, and also a size, W sub i. We're going to assume that both of these are non-negative, for the sizes we're going to make additional assumption that their integral. In addition to these two n numbers, we are given one more which is called the capacity, capital W, again we'll assume this is non-negative and an integer. The role of these integrality assumptions will become clear in due time. The knapsack problem, the responsibility of an algorithm is to select a subset of the items. What do we want? We want as much value as possible, so we want to maximize the sum of the values of the items that we select. So what prevents us from picking everything? Well, the sum of the sizes of the items that we pick have to total to at most the capacity capital W. Now I could tell you some cheesy story about a burglar breaking into a house with a knapsack with a certain size capital W and wants to make away with sort of the best loot possible. But that would really be doing a disservice to this problem which is actually quite fundamental. This problem comes up quite a bit, especially as a subroutine in some larger task. Really, just whenever you have sort of a budget of a resource that you can use, and you want to use it in the smartest way possible, that's basically the knapsack problem. So you can imagine how it would come up in a lot of contexts. So let's now execute a recipe for developing a dynamic programming algorithm. Remember, the key to any dynamic programming solution is to figure out what is the right set of subproblems. And we're going to arrive at the sub problems for the knapsack problem just as we did for max weight independent sets by doing a thought experiment about the optimal solution, and understanding what it has to look like in terms of solutions to smaller subproblems. The bottom line of this thought experiment, the deliverable will be a recurrence. It will be a formula which tells us how the optimal value of one subproblem depends on the value of smaller subproblems. So to begin the thought experiment, fix an instance of knapsack and let capital S denote an optimal solution, a max value solution. We began our previous thought experiment with a content free statement that the final vertex of a path is either in the optimal solution or it's not. Now what is the analog of the right most vertex in this knapsack problem? Well, unlike a path graph, there's no intrinsic sequentiality to the items that we're given. They're just an unordered set. But it's actually useful to think of the items as sort literally ordered one, two, three, all the way up to n, and then the analog of the right most vertex is just the final item. So the content free statement we're going to work with here is either the last item belongs to the optimal solution capital S, or it doesn't. We'll again start with the easy case when it doesn't. What we argued in the path graph problem was that the max weight independent set in the analogous case one has to be optimal if we just delete that rightmost edge from the graph. So here, the analogous claim is that this set S should still be optimal if we delete the final item n from the knapsack instance. The argument is the exact same near trivial contradiction. If there was a different solution, s star, amongst the first n minus 1 items with a weight even bigger than that of s, we could regard this equally well as a superior knapsack feasible solution back with all n of the items, but that contradicts the purported optimality of s. So let's go through the slightly trickier case two together using a quiz. So suppose the optimal knapsack solution does in fact make use of this final item n. Now we want to talk about this being somehow composed of an optimal solution to a smaller subproblem. So if we're going to delete the last item, then we can't talk about s, because s has the last item, so we need to remove the last item from s before we talk about its optimality. That's analogous to back in the independent set problem, we removed the right most vertex from the optimal solution before talking about its optimality in a smaller subproblem. So the question then is if we take capital S, the optimal solution, we remove item n, in what sense is the residual solution optimal? Put differently, for what kind of knapsack instance, if any, is that an optimal solution for? All right, so the correct answer is C. So back in the independent set problem what we said is if we remove the right most vertex, then what's left is optimal for the residual independent set problem we get by plucking off the right most two vertices. Here when we remove the nth item from the optimal solution S, the claim is what we get is optimal for the knapsack problem involving the first n-1 items and a residual knapsack capacity of W-w sub n. So the original knapsack capacity with space reserved, or deleted, for the nth item. So before I give you a quick proof, let me just briefly explain why a couple of the other answers are not correct. So first of all, answer B, I hope you could rule out quickly. It just doesn't type check. So capital W, that's the knapsack capacity, so that's in units of size. Little v sub n, that's the item's value. So that's in dollars, so it doesn't really make sense to talk the difference between those two. They're just apples and oranges. Part D, if you were worried about feasibility at any point. So, if you take capital S and you remove the nth item, what have you done? You have taken a set of items whose size was at most capital W by feasibility of s, and you've removed an item with size wn from it. So, what remains has total size at most capital W minus little w sub n. So, S minus n wouldn't be feasible for this reduced to this residual knapsack capacity W- w sub n. A much more subtle point is part A, that's a very natural one to guess. That turns out to not be correct. So it turns out there might be smarter ways of using the first n-1 items, than S minus item n, if you had a full knapsack capacity of W to work with. So that's a subtler point. And it's a good exercise for you to actually convince yourself that A is wrong. That there's no reason that when you take out item n from S and you still keep using the original knapsack capacity that this has to be optimal. That's not going to be true. All right, so why is capital C correct? Well, this is going to have the same spirit of case two of our weighted independent set thought experiment. So let me give you the proof. The proof is going to be the usual contradiction analogous to case 2 of our argument in the weighted independent set problem. So suppose there was something better than S with n removed with the residual capacity, W- wn. Call that supposedly better solution S*, so what can we do to get a contradiction? Well, let's just take S* which involves only the first n-1 items. Let's add item n to it since S* has total size and most W-w sub n and item n has size W sub n, the result has total size of W so it's a feasible solution to take S* and extend it by item number n. And if S* had more value than S with n removed, then S* with n included has more value than S. So for example, if S had total value 1,100, 100 of which was coming from the nth item, then S with n removed had value 1000. If S* was better, it had a value 1,050. Well then we just put n back in and it has value 1150 which contradicts the purported optimal value of S* which had total value merely 1100. So notice what's going on here. So in taking away W sub n for the knapsack capacity before we look at the residual problem we're in effect reserving a buffer for item n if we ever need it. That's how we know we're feasible when we stick n back into the solution S*. That's analogous to deleting the penultimate vertex of the path, again as a buffer to ensure feasibility when we include the nth vertex back in independent set problem. So what was the point of this whole thought experiment which we've now completed? Well, again the point was to say the optimal solution, whatever it is, it has to have only one of two forms. We've narrowed the list of candidates down to two possibilities. Either you just inherit the optimal solution with one less item in the same knapsack capacity, or you look at the optimal solution with one less item and less knapsack capacity by W sub n, and you extend that by item n. But those are the only two possibilities. So again, if we only knew which of these two cases were true. If we only knew whether or not item n was in the optimum solution or not, in some sense we could recursively compute the rest of the solution. So just as that was enough to get us going with a dynamic programming algorithm for weighted independent sets, so it goes with knapsack, as I'll show you on the next video.