The primary topics in this part of the specialization are: greedy algorithms (scheduling, minimum spanning trees, clustering, Huffman codes) and dynamic programming (knapsack, sequence alignment, optimal search trees).

Loading...

From the course by Stanford University

Greedy Algorithms, Minimum Spanning Trees, and Dynamic Programming

248 ratings

Stanford University

248 ratings

Course 3 of 4 in the Specialization Algorithms

The primary topics in this part of the specialization are: greedy algorithms (scheduling, minimum spanning trees, clustering, Huffman codes) and dynamic programming (knapsack, sequence alignment, optimal search trees).

From the lesson

Week 1

Two motivating applications; selected review; introduction to greedy algorithms; a scheduling application; Prim's MST algorithm.

- Tim RoughgardenProfessor

Computer Science

For our next case study of how to use greedy algorithms, we're going to turn to the application domain of scheduling. That is how do you schedule jobs on shared resources in order to accomplish some objective. So, the domain of scheduling there's lots of different applications of greedy algorithms. We'll see two in this course. we'll start for today just with the following simple scenario. So, we'll assume, for today, that there's just one shared resource. This resource could represent any number of things. For concreteness, you can think of it as a computer processor. And then, there's a lot of different things that got to get done. So, for example, there's a lot of processes that have to be handled by this processor. In the algprithmic question, we are going to study, is, in what order should we sequence these jobs? Which one should we do first, which one should we do second, and so on, all the way up to which one should we do last. So, obviously to answer this question, we need to pin down the mathematical model a little bit more precisely. And lets start with just, you know, what is the characteristics of jobs, what information do we have that might lead us to prefer one job over another. But for this problem, we're going to assume that each job comes with two known parameters. So, first of all, job j has what we're going to call a weight w sub j. That's a non-negative real number. And you should think of the weight of a job as quantifying its importance. That is, jobs with a higher weight, in some sense, deserve to be processed earlier than those with a lower weight. And secondly, each job j is going to come with a non negative length l sub j. Depending on the application, you may or may not have a good estimate of how long jobs are going to take, but for today to keep things simple, let's assume we know what the length of every job is, and that's l sub j, it's part of the input to are problem. So. we have now defined the input to this computational problem. We get n jobs each specified by a weight and a length. And we know that the output is going to be a sequence of these n jobs in some order. So, what we have to understand now is what criterion do we want to optimize? What are we trying to accomplish with this sequence? To explain that, I need to tell you about completion times of jobs. So, the completion time of a job is defined hopefully in exactly the way you'd think. So, for the job which is scheduled first, it's just the length of the job because that's how long it takes to process that job. For whatever job gets scheduled second, its completion time is the length of the first job and then, the length of that job itself. So, in other words, it's just the total time which elapses before that job gets completed, okay? So, in general, the completion time of a job is just the sum of the lengths of the jobs scheduled to before that job plus the length of that job itself. To make sure this is clear, let's go through a quick example. So, suppose there are three jobs with lengths one, two, and three. I'm not going to tell you the job weights because they're irrelevant for the purposes of computing the completion time. And let's suppose we do the schedule where we just schedule job one first, the job two, then job three. So, pictorially, I'm going to represent that schedule just by stacking the jobs on top of each other with the interpretation that time starts at the bottom. So, time zero is where we schedule job one. And then, time increases as we go from the bottom to the top of the diagram. And the question then is, what are the completion times of these three jobs? Okay. So, the correct answer is answer C. So, for the first job, it gets scheduled first so it's very happy and it just takes one unit of time to complete, so its completion time is one. The second job, well, it has to wait for the first job to complete so one unit of time elapses and then, it itself has to complete so that's two more units so it gets to the completion time of three. For the third job, it has to wait for the first two to complete, so that adds three to the clock, and then plus it takes three units of time for a total of six. So, that's the definition of job completion times. In some sense, we obviously want completion times to be as small as possible. But it's not so simple. In any given schedule, the jobs that are give early on are going to have small completion times and the jobs towards the end are going to have big completion times. So inevitably, we're going be have to trading off the completion times between different jobs. So, what is the optimal way to so that? Well, that depends on our objective function, and in scheduling, there's many different objective functions you might want to use. today, I'm just going to tell you about one. It's not the only natural objective function, but it's one of several most natural objective functions. It's called minimizing the weighted sum of completion times. You translate this English phrase into mathematics in the obvious way. What you want to do is you want to minimize the sum over all n jobs of their completion time, but then multiplied by their weight of [UNKNOWN] j. Okay. So, the sum over j of w j times c j. The w j is the weight and c j is the completion time as defined on the previous slot. If you think about it for a second, you'll realize this is equivalent to minimizing the weighted average of the completion times with the weights given as in the input. So, just to make sure this makes sense, let's go back to the example that we saw. In that example, we had jobs with lengths one, two, and three, and we thought about to schedule or we scheduled them in that order. To evaluate the subjective function, I'd have to tell you their weights, so let's suppose their weights are three, two, and one, respectively. In this case, the weighted sum of completion times in the schedule, in the previous slide, well first, we begin with a, the first job, which has weight three. Its completion time, remember, was one. Then, we have the second job with weight two, its completion time is three. Then, we have the third job with weight one, its completion time was six. So, we sum up the weighted completion times and we get a total of fifteen. And I'll let you verify that, in fact, all of the three factorial or six schedules in that example, this is, in fact, the schedule that minimizes the weighted sum of completion times. And the algorithmic question we're going to study next, is how do we do this in general? Given arbitrary input in jobs, weights, and lengths, what is the sequence that minimizes this sum over all n factorial sequences you might consider?

Coursera provides universal access to the world’s best education, partnering with top universities and organizations to offer courses online.