0:00

In this video we'll discuss how we actually implement Dijkstra's shortest

path algorithm. And in particular, using the heap data structure, we'll give a

blazingly fast implementation, almost linear time. Let me just briefly remind

you of the problem we're solving. It's the single source, shortest path problem. So

we're given the directed graph and a source vertex, S. We're assuming that

there's a path from S to every other vertex, V. If that's not true, we can

detect it with an easy pre-processing step, so our task then is just to find the

shortest path amongst all of them from the source vertex, S to each possible

destination, V. Moreover, every edge of the graph has a non-negative edge length

which we're denoting ℓ_e. So recall that Dijkstra's algorithm is driven by a single while loop. So we're going to add one additional vertex to an evolving set

capital X as the algorithm proceeds. So X is the vertices that have been processed

so far. We maintain the invariant that for every processed vertex we've computed what

we think the shortest path distance is to that vertex. So initially X is just the

source vertex S. Of course, the shortest path distance from S to itself is zero.

And then the cleverness of Dijkstra's algorithm is in how we figure out which

vertex to add to the set capital X each iteration. So the first thing we do is we focus only on edges that cross the frontier, edges that have their tail in capital X and their head outside of capital X. Now, of course, there may be many such edges that cross this frontier, and we use Dijkstra's greedy criterion to select one of them. So for each crossing edge, each edge with a tail in X and head outside X, we compute the Dijkstra greedy score, that is defined as

the previously computed shortest path distance to the tail of the arc plus the

length of the arc. So we compute that for each crossing edge and then the minimum

edge, which we're calling (V*, W*), determines how we proceed. So we add the head of that arc, W*, to the set capital X, and then we compute the shortest path distance to W* to be the previously computed shortest path distance to V* plus the length of this extra edge (V*, W*). Now back when

I explained this algorithm I did it using two arrays, array capital A and array capital B. A is what computes the shortest path distances, and remember that's what

the problem asks us to compute. And for clarity I also filled up this array

capital B just to keep track of the shortest paths themselves. Now if you look

at the code of this algorithm, we don't actually need the array capital B for

anything. When we fill in the array capital A, we don't actually refer to the

B array. And so now that we're gonna talk about real implementations of Dijkstra;

I'm actually gonna cross out all of the instructions that correspond to the B

array. Okay? Because you would not, as I told you earlier, use this in a real

implementation of Dijkstra. You would just fill in the shortest path distances

themselves. So in the next quiz, what I want you to think about is the running

time of this algorithm if we implemented it more or less as is, according to the

pseudo code on this slide without any special data structures. And in the

answers to the quiz, we're going to be using the usual notation where M denotes

the number of edges in the graph, and N denotes the number of vertices of the

graph. So the correct answer to this quiz is the fourth one: that the straightforward

implementation of Dijkstra's algorithm would give you a running time proportional

to the product of the number of edges and the number of vertices. And the way to see

that is to just look at the main while loop and look at how many times it

executes and then how much work we do per iteration of the while loop if we

implemented it in a straightforward way. So there's gonna be N minus one iterations of the while loop. And the reason is that the algorithm terminates once every single vertex has been added to capital X. There are N vertices, and initially there's one vertex in X. So after N minus one iterations, we'll have sucked up all of the vertices. Now what's the work done in each while loop iteration? Well basically, we do

naively a linear scan through all of the edges. We go through the edges. We check

if it's an eligible edge, that is if its tail is in X and its head is outside of X.

We can keep track of that just by having an auxiliary boolean variable for each vertex, remembering whether it's in X or not. And then amongst all of the eligible edges, those crossing the frontier, we just, by exhaustive search, remember which edge has the smallest Dijkstra score. Now, we can compute the Dijkstra score in constant time for each of the edges. So that's a reasonable algorithm. We might be

able to get away with graphs that have, say, hundreds or thousands of vertices using the straightforward implementation, but of course, we'd like to do better.
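To make that concrete, here is a minimal sketch of the straightforward O(m·n) implementation in Python. The graph representation, a list of (tail, head, length) edge triples, and the function name are my own choices for illustration, not from the video:

```python
def dijkstra_naive(n, edges, s):
    """Straightforward O(m*n) Dijkstra: n vertices labeled 0..n-1,
    edges is a list of (v, w, length) triples with non-negative lengths."""
    INF = float("inf")
    in_X = [False] * n          # boolean per vertex: processed yet?
    A = [INF] * n               # computed shortest-path distances
    in_X[s], A[s] = True, 0
    for _ in range(n - 1):      # n-1 iterations of the while loop
        best_score, best_head = INF, None
        for v, w, length in edges:          # exhaustive scan of all edges
            if in_X[v] and not in_X[w]:     # eligible: crosses the frontier
                score = A[v] + length       # Dijkstra's greedy score
                if score < best_score:
                    best_score, best_head = score, w
        if best_head is None:               # no crossing edge: nothing reachable left
            break
        in_X[best_head] = True
        A[best_head] = best_score
    return A
```

Each of the n-1 outer iterations scans all m edges, which is exactly where the Θ(m·n) bound in the quiz comes from.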

We'd like the algorithm to scale up to much larger graphs, even graphs with potentially, say, a million vertices. So the answer is yes, we can do better. Not by

changing the algorithm, but, rather, changing how we organize the data as the

algorithm proceeds. So this will be the first time in the course where we use a

data structure to get an algorithmic speed-up. So we're gonna see a really

lovely interplay between, on the one hand, algorithm design and, on the other hand,

data structure design in this implementation of Dijkstra's algorithm. So

you might well ask what's the clue that indicates that a data structure might be

useful in speeding up Dijkstra's shortest path algorithm. And the way you'd figure

this out is you'd say, well, where is all this work coming from? Why are we doing an amount of work linear in the number of edges, over a number of iterations linear in the number of vertices?

Well, at each iteration of this while loop, what we're doing is, we're just

doing an exhaustive search to compute a minimum. We look at every edge, we look at

those that cross the frontier, and we compute the one with the minimum Dijkstra

score. So we could ask ourselves, oh, if we're doing minimum computations over and over and over again, is there some data structure whose raison d'être,

whose reason for being is in fact to perform fast minimum computations? And in

fact there is such a data structure. It's the heap data structure. So in the

following description of a fast implementation of Dijkstra's algorithm, I'm

going to assume you're familiar with this heap data structure. For example, that you

watched the review video elsewhere on the course site that explains it. So let me

just remind you with a lightning-quick review of what we learned in that video,

So heaps are generally logically thought of as a complete binary tree, even though

they are usually implemented as a laid-out linear array. And the key property that

you get to leverage but that you also have to maintain in a heap is the heap property

that at every node the key at that node has to be at least as small as that of

both of the children. This property ensures that the smallest key of them all

has to be at the root of this tree. To implement extract-min, just pluck off the root. That's what you return; that's the minimum element. And then you swap up the bottommost rightmost leaf, the last element, make that the new root, and then you bubble that down as necessary to restore the heap property. When you do

insertion, you just make the new element the new last leaf, bottommost rightmost

leaf, and then you swap up as needed to restore the heap property. When we use

heaps in Dijkstra's algorithm we're also going to need the ability to delete an

element from the middle of the heap. But again you can do that just by swapping things and bubbling up or down as needed. I'll leave it as an exercise for you to

think through carefully, how to delete elements from the middle of a heap.
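If you want to check your answer to that exercise, here is one way the middle-deletion can be done on a 0-indexed array min-heap in Python. This helper and its name are my own illustration, not code from the course:

```python
def heap_delete(h, i):
    """Delete the element at index i from a 0-indexed array min-heap h,
    in O(log n) time, by swapping in the last leaf and re-sifting."""
    last = h.pop()
    if i == len(h):                 # we deleted the last leaf itself: done
        return
    h[i] = last
    # bubble up while smaller than the parent...
    while i > 0 and h[i] < h[(i - 1) // 2]:
        parent = (i - 1) // 2
        h[i], h[parent] = h[parent], h[i]
        i = parent
    # ...then bubble down while larger than the smaller child
    n = len(h)
    while True:
        l, r, smallest = 2 * i + 1, 2 * i + 2, i
        if l < n and h[l] < h[smallest]:
            smallest = l
        if r < n and h[r] < h[smallest]:
            smallest = r
        if smallest == i:
            break
        h[i], h[smallest] = h[smallest], h[i]
        i = smallest
```

Only one of the two sift directions actually moves the element, which is why the whole operation costs a single root-to-leaf path, O(log n).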

7:00

Because you're maintaining the heap as an essentially perfectly balanced binary

tree, the height of the tree is roughly the log base two of N, where N is the

number of elements in the heap. And because for every operation, you implement

it just by doing a constant amount of work at each level of the tree, all of these

operations run in O of log N time, where N is the number of items that are being

stored in the heap. As far as the intuitive connection between the heap data structure and Dijkstra's algorithm: in the main while loop of Dijkstra's algorithm, we're responsible for finding a minimum every single iteration. What are heaps

good for? They're good for finding minimums in logarithmic time. That sounds

a lot better than the linear time we're spending in the naive implementation of

Dijkstra's Algorithm. So let's now see how to use heaps to speed up Dijkstra's

shortest path algorithm. Now because every iteration of the while loop is responsible

for picking an edge, you might expect that we're going to store edges in the heap. So

the first subtle but really good idea is to actually use a heap to store vertices

rather than edges. Going back to the pseudo-code of Dijkstra's algorithm, remember that the only reason we focused on an edge was so that we could then deduce which vertex, namely the head of that edge, to add to our set capital X. So we're just going to cut to the chase, we're just going to keep vertices not yet in X, and then when we extract the minimum from the heap, it'll tell us which is the next

vertex to add into the set capital X. So the picture we're going to wanna have in

mind is Dijkstra's shortest path algorithm at some intermediate iteration. So

there'll be a bunch of vertices in the set capital X, the source vertex plus a

bunch of other stuff that we've sucked into the set so far. And then there'll be

all the vertices we haven't processed yet. A big group V minus X. Then there's gonna

be edges crossing this cut in both directions from X to V minus X and vice

versa. Now before I explain the second invariant, let's just recall what the

straightforward implementation of Dijkstra's Algorithm needs to do. What it would do is

search through all the edges and it would look for any eligible edges, those with tail in X and head in V minus X. So in this picture, there would be three such

edges. I've drawn the example so that two of the edges, the top two edges, both

share a common head vertex whereas the third edge has its own head vertex. The

straightforward implementation of Dijkstra's algorithm would compute Dijkstra's greedy score for each of these three edges. And remember, by definition, that's the

previously computed shortest path distance to the tail of the arc V, plus the length

of the arc VW. So the straightforward implementation just computes this. In this

case, it would compute it for three edges. And whichever of the three edges had the smallest score, the head of that edge would be the next vertex that gets added

to X. So let me specify the second invariant, and then I'll tell you how to

think about it. So, because we're storing vertices rather than edges in the heap,

we're going to have to be fairly clever with the way we define the key of a vertex

that's in this heap. So we're going to maintain the property that the key of a vertex V is the smallest Dijkstra greedy score of any edge which

has that vertex as its head. So let me show you what I mean in terms of our

example, where we have three crossing edges. Suppose for these three edges in

the upper right that happen to have Dijkstra greedy scores of seven,

three, and five. Let's look at what the key should be for each of these three

vertices I've drawn in V minus X. Now for the topmost vertex this is pretty

interesting. There are two different edges whose tail is in X, and have this vertex

as their head. So what should the key of this vertex be? Well, it should be the

smallest Dijkstra greedy score of any of the edges whose tail lies on the left-hand

side that terminate at this vertex. So there's two candidate edges. One has

Dijkstra greedy score three. One has Dijkstra greedy score seven. So the key

value should be three, the smaller of those two. Now, the second vertex, there's

only a single edge that has tail in X and that terminates at this vertex. So the key for this vertex should just be the score of that one edge. So in this case that's gonna be five. And then this poor third vertex, there's actually no edges at all that start in X and terminate at this vertex. There's only one arc going the wrong direction. So for any vertex outside of X that doesn't

have any eligible edges terminating at it, we think of the key as being plus

infinity. So the way I recommend thinking about these heap keys is that we've taken

what used to be a one-round, winner-take-all tournament, and we've turned it into

a two round knockout tournament. So in our straightforward implementation of

Dijkstra's algorithm, we did a single linear search through all the edges, and

we just computed the Dijkstra greedy score for each and we picked

the best. So in this example we would have discovered these three edges in some

order. Their scores are three, five, and seven. And we would have remembered the

edge with score three as being the best. That would have been our winner of this

winner-take-all tournament. Now when we use the heap, we're factoring it into two rounds. So first, each vertex in V minus X runs a local tournament to elect a local winner. Each of these vertices in V minus X says, well, let me look at all of the edges for whom I'm the head and where the tail of that edge is in X. And amongst all of those edges that start in X and terminate at me, I'm going to remember the best of those. So those are the winners of the local tournaments in the

first round. And now the heap is only going to remember this set of first round

winners. Right, there's no point in remembering the existence of edges that don't even have the smallest score among those that terminate at a given vertex, because we

only care about the smallest score overall. Now when you extract-min from the heap, that's in effect executing the second and final round of this knockout tournament. So each of the vertices of V minus X has proposed its local winner. And then the heap, in an extract-min, just chooses the best of all of those local

winners. So that's the final proposed vertex that comes out of the heap. So the

point is that if we can successfully maintain these two invariants, then, when

we extract min from this heap, we'll get exactly the correct vertex, W star, that

we're supposed to add to the set capital X next. That is, the heap will just hand to

us on a silver platter exactly the same choice of vertex that our previous

exhaustive search through the edges would've computed. The exhaustive search

was just computing the minimum in a brute force way, in a single winner take all

tournament. The heap implemented in this way chooses exactly the same winner. It

just does it in this 2-round process. Now, in Dijkstra's algorithm, we weren't

supposed to merely just find the vertex W star to add to X. We also had to compute

its shortest path distance. But remember, we computed the shortest path distance as

simply the Dijkstra greedy score. And here the Dijkstra greedy score is just going to

be the key for this heap; that's immediate from invariant number two. So we're using the fact here that our keys are, by definition, just the smallest greedy scores of edges that stick into that vertex W*, so we're again exactly replicating the computation that we would have done in the straightforward implementation, just in a much slicker way. Okay? But we're adding exactly the same vertices, in exactly the same order, and we're computing exactly the same shortest path distances in this heap implementation, provided of course that we do

successfully maintain these two invariants throughout the course of the algorithm. So

that is now what I owe you. We have to pay the piper. We've shown that if we can have a data structure with these properties, then we can simulate the straightforward implementation. Now I have to show you how we maintain these invariants without doing

too much work. All right. So maintaining invariant number one will really take care

of itself. Really sort of by definition the vertices which remain in the heap are

those that we haven't processed yet, and those are the ones that are outside of

capital X. So really the trick is, how do we maintain invariant number two? Now

before I explain this let me point out, that this is a tricky problem. There is

something subtle going on. So as usual, I want you to think about this shortest path

algorithm at some intermediate iteration. Okay? So take a snapshot. A bunch of vertices have already been added to X. A bunch of vertices are still hanging out in the heap. They haven't been added to X. There's some frontier, with edges

crossing, possibly in both directions. And suppose at the end of a current iteration

we identify the vertex W, which we're going to extract from the heap and conceptually add to the set X. Now the reason things get complicated is that when we move a vertex from outside X to inside X, the frontier between X and V minus X changes.

So in this picture, the old black X becomes this new blue X. And what's really

interesting about the frontier changing is that then the edges which cross the

frontier change. Now, there are some edges which used to cross

the frontier and now don't. Those are the ones that are coming into W. Those we're

not so concerned with. Those don't really play any role. What makes things tricky is

that there are edges which used to not be crossing the frontier but now they are

crossing the frontier. And those are precisely the edges sticking out of W. So

in this picture there are three such edges, which I will highlight here in pink. To see why it's tricky when new edges all of a sudden are crossing the frontier, let's

remember what invariant number two says. It says that for every vertex which is

still in the heap, which is not yet in X, the key for that vertex had better be the smallest Dijkstra greedy score of any edge which comes from capital X and sticks into this vertex V. Now, in moving one vertex into X, namely this vertex W, there can be new edges sticking into vertices which are still in the heap. As a result, the appropriate key value for vertices in the heap might be smaller, now that W has

been moved into X. And the candidates for the vertices in the heap whose keys might

have dropped are precisely those vertices on the other end of edges sticking out of

W. So, summarizing: the fact that we've added a new vertex to capital X by extracting something from the heap has potentially increased the number of edges crossing the frontier, because the frontier has changed. And

therefore, for vertices that remain in the heap, the smallest greedy score of an edge

that sticks into them from the set X might have dropped. So we need to update those

keys to maintain invariant number two. Now, that's the hard part. Here's what we

have going for us. We've damaged the keys perhaps by changing the frontier, but the

damage is local. We can understand exactly whose keys might have dropped, so as

suggested by the picture, the vertices whose keys we need to update are precisely

those at the head of edges that stick out of W. So for each outgoing edge from W,

the vertex we just extracted from the heap, we need to go to the other end of

the edge and check if that vertex needs its key to be decreased. So here's the

pseudo code to do this. So when we extract the vertex W from the heap, that is, when we conceptually add the new vertex W to the set X, thereby changing the frontier, we say, well, you know, we know the only vertices that might have to have their key changed are the ones on the other side of these outgoing arcs from W. So we just have a simple iteration over the outgoing edges (W, V) from the vertex W.

Now I haven't shown you any edges in the picture like this, but there might well be

some edges where the head of the arc V is also in the set X, has also already been processed. But anything in X is not in the heap. Remember, the heap is only the stuff outside of X. So we couldn't care less about maintaining keys for the stuff outside of the heap. So we do an extra check: if the head of this edge is in fact still in the heap, that is, if it's not in X. So in the picture, for example, this

would be true for all three of the vertices that are on the other end of arcs

pointing out of W. And for each of these vertices V, we update its key. And the way

we're going to update its key is, we're just going to rip this vertex out of the

heap. We're going to recompute its key in constant time, and then we're going to

reinsert it into the heap. And since all heap operations take logarithmic time,

this key update will be logarithmic time. As an additional optimization, I wanna

point out that if one of these vertices V's key does change, it can only change in

one way. So remember, what is the key? The key is the smallest Dijkstra greedy score of all of the edges that start in X and stick into this vertex. So that's the

local tournament or the first round tournament happening at this vertex V. Now

the only thing which has changed, before and after we added this vertex W to X, is

that now one new edge is sticking into this vertex, V. All of the old edges

sticking into it from X are still sticking into it, and now there's one extra

candidate in its local tournament, namely this edge (W, V). So either (W, V) is the local winner, that is, it has the smallest Dijkstra greedy score of all the edges that terminate at this vertex, or it doesn't, in which case the previous winner is still the winner. So the new key value can only be one of two things. Either it's the old key value, that's the case where this extra entrant, the edge

from W to V is irrelevant. Or, if it's changed, it has to have changed to the

greedy score of this edge (W, V). And the formula for that is the shortest path distance that we just computed for W, W having been processed at this point, plus the length of the direct arc from W to V. And again, conceptually, this formula is just the Dijkstra greedy score for the arc (W, V), the new entrant in V's local first

round tournament. So now we've updated V's key appropriately, so that invariant #two is restored, and once again the key of every vertex does reflect the smallest Dijkstra greedy score of any edge sticking into it from the set X. We can safely reinsert this node back into the heap with its new key value. And these

three lines together are just a key update in logarithmic time, for one of these

vertices that's at the other end of an arc sticking out of the vertex W. So let's

tally up the running time in this new implementation. One thing I want you to

check, and this will definitely help you understand this refined implementation of

Dijkstra's algorithm, is that essentially all the work done is through the heap API.

That is, all of the running time that we have to account for is in heap operations.

We don't really do nontrivial work outside of heap operations. And again recall that

the running time of any heap operation is logarithmic in the number of elements in

the heap. Our heap is storing vertices. It's never gonna have more than N things

in it. So the running time of every heap operation is big O of log N. So what are

the heap operations that we do. Well, we extract-min, and we do it once per iteration of the while loop. So there's N minus one iterations of the while loop, just like before, but now instead of doing an exhaustive search through the edges, we just do a simple extract-min from the heap and it gives us on a silver platter the vertex we should add next. So what do we do besides extract-mins? Well, we have to

do this work of paying the piper. We have to maintain invariant #two. And every time we

extract a min, that then triggers some subsequent key updates. And remember, each

of these key updates is a deletion of an element, from the heap followed by an

insertion. So how many deletions and insertions do we do? Well, at first this

might seem a little bit scary. Right? Because we do a roughly linear number of

extract-mins. And a vertex might have as many as N-1 outgoing arcs. So it seems like a vertex could trigger as many as N-1 key updates, which is theta of N heap operations. And if we sum that up over the N iterations of the while loop, that would give us N squared heap operations. So, and indeed, in dense

graphs, that can be the case. It is true that a single vertex might trigger a

linear-in-N number of heap operations. But that's the

wrong way to think about it. Rather than have this vertex-centric perspective on

what, who's responsible for heap operations, let's have an edge-centric

view. So for each edge at the graph, let's think about when can this be responsible

for some heap operations, in particular a decrease-key and the resulting insertion

and deletion. If you have an edge and it points from the vertex V to the vertex W.

There's actually only one situation in which this edge is going to be responsible

for a decrease-key. And that's in the case where the tail of the edge, V, gets sucked into the set X before the head W of this edge gets sucked into the set X.

If that happens, if V gets sucked into X and W is still outside of X, then indeed

we're gonna have to decrease the key of W, just like we did in the examples. But

that's all that's gonna happen: V can only get sucked into X once, and it's never gonna leave. So this edge is only responsible for this single decrease-key of its head W.

And that's one insertion and one deletion. And in fact, if the endpoints of this edge

get sucked into X in the opposite order, if the head of this edge, W, gets sucked into X first, that doesn't trigger a key decrease at all; this particular arc from V to W will then never be responsible for a decrease-key. So the upshot is that each edge (V, W) of the graph triggers at most one

insert-delete combo. So what does this mean? This means that the number of heap operations is big O of N, that's for the extract-mins, plus big O of M, that's for the insert-delete combos triggered by edges during the decrease-keys. Now, I'm gonna write this in a simplified way. This is just O of M, the

number of edges. And this is because of our assumption that there's a path from s to every other vertex. If you think about it, that means that the graph is at least weakly connected; if you picked it up, it would stay together in one piece. So that means it at least contains a tree, at least in an undirected sense, which means it contains at least N minus one edges. So we're in the case of weakly connected graphs, where M dominates N: M is always at least as big as N, up to a plus one. So what that means is the running time of Dijkstra's algorithm, with this

heap implementation, is just a log factor larger. Remember, every heap operation

takes logarithmic time. So we do a linear-in-M number of operations; each takes time logarithmic in N. So the running time is O(M log N), with, I should say, quite small constants. So this is a really, really impressively fast algorithm for computing

such a useful problem as shortest paths. So we got a little bit spoiled in our

discussion of graph search and connectivity, where it seemed any problem we cared about we could solve in linear time, O of M plus N. So here we're picking

up this extra logarithmic factor, but I mean, come on, this is still awesome. A

running time of M log N is unbelievably faster than a running time of M times N,

which is what we had in the straightforward implementation. So this

deft use of the heap data structure has given us a truly blazingly fast algorithm

for an extremely well motivated problem, computing shortest paths.
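Putting the pieces together, here is a compact Python sketch of the heap-based algorithm. One caveat: Python's built-in heapq module does not support deleting from the middle of the heap, so instead of the delete/recompute/reinsert step described in the lecture, this sketch uses the common "lazy deletion" substitute, pushing an entry with the new key and skipping stale entries at extraction time; the O(M log N) bound still holds. The adjacency-list representation and the function name are my own choices for illustration:

```python
import heapq

def dijkstra_heap(n, adj, s):
    """Heap-based Dijkstra, O(m log n). adj[v] is a list of (w, length)
    pairs for the outgoing edges of v; all lengths are non-negative."""
    INF = float("inf")
    A = [INF] * n               # shortest-path distances (the array capital A)
    key = [INF] * n             # invariant #2: smallest greedy score seen so far
    key[s] = 0
    heap = [(0, s)]             # heap entries are (key, vertex)
    in_X = [False] * n          # the processed set capital X
    while heap:
        k, w = heapq.heappop(heap)      # extract-min
        if in_X[w]:
            continue                    # stale entry: w was already processed
        in_X[w] = True
        A[w] = k
        for v, length in adj[w]:        # edges sticking out of w
            if not in_X[v] and A[w] + length < key[v]:
                key[v] = A[w] + length  # greedy score of the new edge (w, v)
                heapq.heappush(heap, (key[v], v))
    return A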