0:00

. As we'll see, in little bit later in the

lecture the way that we're going to set up our graph processing algorithms, is to

develop an API to cover our representation of the graph and provide simple set of

methods for clients to call to process the graph.

So, let's take a look at that in detail. So, the idea is we have to represent, a,

graph within the computer. One of the first things, to remember is

that, you can, draw a graph, that maybe, that provides some kind of intuition,

about the structure, an, but you could have two drawings that represent,

represent the same graph that look pretty different.

And so one of the things to remember in any graph representation is, it can give

you some intuition. But that intuition may be misleading.

And we'll just remember that as we look at different representations.

So first thing is how to represent the vertices.

So what we're going to do in this lecture is just use integers between zero and V

minus one. The reason we do that is it allows us, to

use vertex indexed arrays, within the graph representation.

And we understand, from earlier algorithms, lectures, that we can use a

symbol table to convert names to integers, with a symbol table.

And so, we'll, leave that part as a symbol table application.

And, just work with graphs with vertex names between zero and V minus one where v

is the number of vertices. Now we have to remember when we're doing

representation we can have various anomalies in the graph.

So we can we draw edges, but actually, in real data we might have multiple edges or

we might have a self loop. And we'll take that into account when we

look at the representation, graph representation.

So here is the API that we're going to use.

So again our graph processing algorithms are going to be clients of this API.

The idea is that will use this to build graphs.

And then our processing programs will be client at this program,

In this, of this API. And the idea is most of them have a very

simple operations that they need to do, and those are the one's that we put in the

API. So we have two constructors.

One that creates an empty graph would be vertices.

Another one that creates a graph from an input stream.

Then the basic operation for building a graph is just add edge, so that adds an

edge connecting two vertices, V and W. The basic operations for processing our

graph, well there's V and E that gives a number of edges and vertices.

But then there's an iterator that takes the vertex's argument,

Then iterates through the vertices adjacent to that vertex.

All of our graph processing can be cast in terms of this iterator.

So down here is an example of a client program that prints out every edge in the

graph. So we created an input stream, maybe with

a given final name, create a graph from that input stream,

And we'll look at the code for that. And then the client program for every

vertex, So remember the vertices are numbered

between zero and G dot the number of vertices minus one.

So for every vertex, we iterate through all the vertices adjacent to that vertex

and print out an edge, if, if there's an edge connecting v and w we print out v and

then a little dash to indicate an edge and then w.

This actually prints out every edge twice and in an undirected graph because if the

v and w are connected by an edge then w appears in v's adjacency list and v

appears in w's adjacency list. So heres an example of a running that

client, if, if we have a file tinyG.txt, our standard is we have a number of

vertices as an integer in the first is the first integer in the file.

Number of edges is the second integer in the file and then pairs of vertex names.

4:55

And so. The constructor will read those two things

and then call that edge for all of these pairs of things and that enables this

client, to, if we run it for that graph, to print out all the edges.

So everybody adjacent to zero, 61 and five, everybody adjacent to one is just

zero, So you notice the edge 0-1 appears twice in the list.

So that's the sample client of our basic graph in API.

And so here's some typical and simpicle, simple graph processing code that uses the

API. So you can write a static method that

takes a graph and a vertex as argument. And returns the degree, the number of

edges that are connect, number of vertices that are connected by an edge to V.

So all it does is set a local variable degree to zero, And then iterate through

all the vertices adjacent to V and increment that and return it.

Similarly, you can compute the maximum degree of a vertex in a graph.

And that's for every vertex, compute the degree and find the biggest one or the

average degree. Well, the average degree, if you think

about it, it's just twice the number of edges divided by the vertex or you could

go through all the way through every vertex and edge and compute the total and

divide, but this is probably a much more efficient way to do it or say number of

self loops. and so that involves going through the whole graph of every vertex

for every edge adjacent to that vertex you check whether it's v and we've got a self

loop. And if it does then your return the number

of self loops that divided by two because every edge is counted twice.

So those are examples of static methods that a client might use.

In just example of the use of the ATI. So now how we going to implement that?

That's our usual standard of lets look at some clients and now let's talk about a

representation that we can use to implement the graph API.

So one possible representation is, set of edges representation, where, for every

edge, we just maintain a list, maybe an array of edges or a linked list, of edges.

So for every edge in the graph, there's, a representation of it.

And that one is a possible representation, but it leads to, inefficient,

7:49

implementation, much less efficient, that would make it unusable for, the huge

graphs that we see in practice. Another one is called the adjacency matrix

representation, Here we maintain a 2-dimensional v x v

array, It's a boolean array, 0-1 or true or

false. And for every edge v-w in the graph you

put true for row v in column w and for row w in column v.

So it's actually two representations of each edge in an adjacency matrix graph

representation. S, you can immediately, give a v and w

test whether there's an edge connecting v and w.

But that's one of the few operations that's efficient with this representation

and it's not very widely used, because for a huge graph, say with billions of, of,

vertices, You would have to have billion squared

number of entries in this array, which is going to be too big for your computer,

most likely. So this one actually isn't that widely

used. A, The one that is most widely used in

practice, and the one that we'll stick with, is called the adjacency list

representation. And that's where we keep a vertex index

array where, for every vertex, we maintain a list of the vertices that are adjacent

to that. So, for example, vertex four, Has, five,

is connected to five, six, and three, so its list has five, six, and three on it.

Now, on lower level representations we've talk about using linked width or array for

these, But actually in modern lingo with the

background that we'd built with algorisms what we're going to use is an abstract

data type. Our bag representation for this, which is

implemented with a linked list, but we don't have to think about it when we're

talking about graphs. we'd keep the vertices,

Names of, numbers of the vertices that are adjacent to each given vertex in a bag.

And we know that we can implement it, such that we can iterate through and time

proportional to the number of entries and the space taken is also proportional to

the number of entries,, And that's going to enable us to process

huge graphs. So here's the full implementation of the

Jason C, List graph representation.

So, the private instance variables that we're gonna use area the number of

vertices in the graph and then a array of nags of integers. so data the types and

set of values, set operations on those values, so those are the sets of values

for a graph. So here's the constructor of an empty

graph with V vertices. We keep the value v in the instance

variable as usual. Then we create.

An array of size V. And, of bags of integers.

And, store that array in, [inaudible] as the adjacency array of the graph.

Adjacency lists array. And then, as usual.

When we create, a, a, an array of objects. We go through.

And for every entry in the array, we initialize with, a, empty object.

So, after this code, we have the empty bags.

And so that's the constructor and then the other main engine and building graphs is

at edge and so, to add an edge between V and W, we add W to V's bag, and we add V

to W's bag. That's it.

And to iterate through all the vertices adjacent to a given vertex we simply

return the bag which is iterable. This is a nice example, illustrating the

power of abstraction. Because we did the low level processing,

for, that, that's involved, with our bag implementation in one of the early,

lectures. And now we, we get to use that to give a

very compact implementation, and efficient implementation, of the, graph data

structure. So it's really important to understand

this code. And you should make sure, that you study

it. So, as I mentioned in practice.

We're gonna be using this adjacency list representation.

Because all the algorithms are based on iterating over the vertices adjacent to V.

And this gets that done in time proportional to the number of such

vertices. So that's number one and number two, in

the real world, the graphs have lots of num, lots of vertices,

But a pretty small vertex degre. so number one, we can afford to represent, represent

the graph when we use the adjacency list representation because basically our space

is proportional to the number of edges. And number two, we can afford to process

it because our time taken is proportional to the number of edges that, that we

examined, with the ray of edges representation in the adjacency matrix

representation it gets very slow for some very simple task,

But, mostly, it's very slow for iterating over the vertices given to, adjacent to a

given vertex. Which is the key operation.

A list of edges, you have to go through all the edges to find the ones adjacent to

a given vertex. Adjacency matrix, you have to go through,

all possible, vertices adjacent and that's just going to be much too slow in

practice, Because adjacency list gets it done, in

time proportional degree of v, which is much smaller, for the huge graph that we

see in the real world. So that's our basic API.

Next, we'll look at some algorithms that are clients of this API.