Now, this is an important topic in information retrieval and in database management; we just won't have time to talk about it here.
But we know that for any reasonably popular search term, there will be many relevant webpages. And then, what matters in deciding which webpage will show up on page one of the search results often boils down to the difference in the importance score, one of the two scores that eventually form the composite ranking score. So, it is the importance score, determined by the connectivity pattern of the webpage graph, that will often determine the final ranking of the top, say, ten or 50 or so webpages. And before we can go further, we also need to talk a little bit about the notation and operations of linear algebra objects.
Now, I know that some of you may not have taken linear algebra, and some of you may have forgotten it. In May 2013, we will have a summer version of this very course in which we will have no need for any linear algebra terminology. It is possible to talk about everything we are going to talk about without using the language of linear algebra. However, that discussion would take a much longer time, because without these shorthand notations, which we call vectors and matrices, it would be very hard to explain every step. And a lot of the existing results from linear algebra concerning matrices, we would not be able to use. So, in the interest of time, we would have to shrink the discussion to a much smaller subset of what we will be talking about in this course. So, I just want to highlight that it is possible to talk about these things without the machinery of linear algebra, which we will indeed be doing in May 2013. But in this version of the course, we will be using linear algebra. It is a very useful shorthand notation plus a powerful set of machinery that will allow us to accelerate the discussion.
Now, I'll try to make this as non-linear-algebraic as possible, but at some point we will have to do a little bit of that kind of math, okay? So, if you need to tune out, feel free. But you may still be able to get some of the interesting ideas even if some of the mathematical operations are beyond your reach at this point. So, we've been using vectors already, right? This pi is a vector. And by convention, a vector is a stacked-up column, okay? Pi one, pi two, dot, dot, dot, pi N, where N might be, say, 100.
What's a matrix? A matrix is really, in general, an object with, say, N rows and M columns. So, we can think of it as many column vectors arranged side by side, or we can think of it as many row vectors stacked up, okay? Either way. Let's say we've got a matrix A; we denote this matrix A's ij-th entry as A sub ij, okay?
In all of the material that we will be talking about, this entry is just a real number. It could be positive, it could be negative; in today's lecture, it is never negative. And sometimes we use a little arrow on top of a big letter to represent that it is a matrix.
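For those following along in Python, vectors, matrices, and their entries map directly onto numpy arrays. Here is a tiny sketch; the specific numbers are mine, just for illustration.

```python
import numpy as np

# A column vector pi with N = 4 entries, stacked up by convention.
pi = np.array([0.1, 0.2, 0.3, 0.4])

# A matrix with 2 rows and 3 columns (illustrative numbers).
A = np.array([[1, 2, 3],
              [4, 5, 6]])

# The ij-th entry A_ij: row i, column j. Numpy indexing is 0-based,
# so A[0, 2] is the entry in the first row and third column.
print(A[0, 2])  # 3
```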
A matrix multiplying a vector is a funny operation. Say matrix A, okay, multiplies a vector x. Say matrix A is two by two; say it is one, three, four, two, okay? And vector x is just x1, x2. What you get is the following, okay? First of all, the matrix is two by two and the vector is two by one, so what you get is a two by one vector. The first entry is the inner product of the first row vector of the matrix and this vector x. So, that is one times x1 plus three times x2. The second entry of the vector is the second row of matrix A times the same vector x. So, we have four times x1 plus two times x2, okay? So, these are the basics of matrix multiplication.
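In case it helps to see this numerically, here is a minimal sketch of that same computation in Python with numpy; the concrete values chosen for x1 and x2 are mine, just for illustration.

```python
import numpy as np

# The 2x2 matrix A from the example: first row (1, 3), second row (4, 2).
A = np.array([[1, 3],
              [4, 2]])

# A concrete choice for the column vector x = (x1, x2); any values work.
x = np.array([5, 7])

# A (2x2) times x (2x1) gives a 2x1 vector:
#   first entry  = 1*x1 + 3*x2 = 5 + 21 = 26
#   second entry = 4*x1 + 2*x2 = 20 + 14 = 34
print(A @ x)  # [26 34]
```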
Now, in this lecture only, because of the research community's convention for writing these quantities, instead of writing a matrix multiplying a vector, people actually write a vector multiplying a matrix. Now, you may say this doesn't even make sense, because the vector is, let's say, two by one and the matrix is two by two. You can't multiply a two by one on the right by a two by two. Indeed, that's true. So, what we need to do is to flip this vector into a row vector; in this case, it will be x1, x2. This one by two row vector times the matrix, which is one, three, four, two, that's matrix A. Then it's okay, because x transpose, with this big T on top, is flipping the column vector into a row vector. And now you have a one by two object times a two by two object, and that makes sense because the inner dimensions, two and two, actually match. So, in today's lecture, we'll be multiplying the vector pi on the right. That is, we have to write pi transpose multiplied by different matrices on the right. This is just a, perhaps unfortunate, notational convention of this specific research community.
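Again as a small numpy sketch, with the same made-up numbers as before, the two conventions are related by transposition:

```python
import numpy as np

A = np.array([[1, 3],
              [4, 2]])
x = np.array([5, 7])

# Row-vector convention: x^T (1x2) times A (2x2) gives a 1x2 row vector.
# Entry j is now the inner product of x with the j-th COLUMN of A:
#   first entry  = 1*x1 + 4*x2 = 5 + 28 = 33
#   second entry = 3*x1 + 2*x2 = 15 + 14 = 29
print(x @ A)    # [33 29]

# Equivalently, this is A transpose multiplying x as a column vector.
print(A.T @ x)  # [33 29]
```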
So, we talked about matrix multiplication, right? Now, think about the following. Suppose I've got a vector pi here and I take its transpose, so it is a row vector, and multiply it on the right by some matrix A. Suppose that after being multiplied by this matrix A, what you get is actually some constant, say one half, times the same row vector pi transpose. Then, in that case, we say this vector pi transpose is an eigenvector corresponding to this matrix A, okay? It maps back to itself, or a scaled version of itself, where every entry in the vector is scaled by the same number. And this scaling factor is called the corresponding eigenvalue.
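To make that definition concrete, here is a small numerical check in numpy. The matrix and vector below are hypothetical, chosen only so the arithmetic is easy to verify by hand.

```python
import numpy as np

# A small symmetric matrix (illustrative numbers, not from the lecture).
A = np.array([[0.3, 0.2],
              [0.2, 0.3]])

# Candidate row vector pi^T = (1, 1).
pi = np.array([1.0, 1.0])

# pi^T A = (0.5, 0.5) = 0.5 * pi^T, so pi transpose is a left
# eigenvector of A, and 0.5 is the corresponding eigenvalue.
print(pi @ A)    # [0.5 0.5]
print(0.5 * pi)  # [0.5 0.5]
```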
In today's lecture, we'll be constructing three different matrices. We'll start from the simplest one, based on our current intuition, then point out why it doesn't suffice, and then suggest an enhancement; that's why we're going to go through a sequence of three matrices. The final product is a matrix, what we call the Google matrix G, that captures the connectivity pattern of these webpages on the web. And what we'll see is that this matrix G is a so-called non-negative matrix, meaning that each entry in G, G sub ij, the i-th row and j-th column entry of this matrix G, for all i and all j, must be a non-negative number. It could be zero, but it can't be negative. And it turns out that there's a whole branch of mathematics within linear algebra that studies non-negative matrices, with many properties, okay? It's often called the Perron–Frobenius theory of non-negative matrices.
One of these results says that there will be a largest eigenvalue, which, for the matrix we construct today, is exactly one. In other words, we can find an eigenvector associated with this eigenvalue for a given matrix, say A, such that pi transpose multiplied on the right by A returns nothing but the same vector pi transpose. You can also write a one in front of it; that one is the eigenvalue, okay? Now, you may say, why are we even going through all this matrix multiplication, and eigenvectors, and eigenvalues for non-negative matrices? What has this got to do with Google's webpage ranking? As we will see in the next 30 minutes or so, the ranking is determined by the importance score vector pi. And pi really is nothing but the eigenvector associated with the eigenvalue of one of a matrix, this Google matrix. Okay, instead of writing a generic matrix A, let's write the Google matrix G. In other words, we will go through a sequence of steps to properly define the matrix G such that the ranking, the importance score vector pi, can be computed as the dominant eigenvector, the eigenvector associated with the largest eigenvalue, which happens to be one, of this matrix G. And from this pi, we can then order the webpages to be displayed. So, that's the game plan for the rest of the lecture.
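As a preview, one standard way to compute such a dominant left eigenvector numerically is the power method: repeatedly multiply a starting row vector by the matrix and renormalize. Here is a minimal sketch; the 4x4 matrix G below is just a placeholder stochastic matrix, not the Google matrix we have yet to construct.

```python
import numpy as np

def dominant_left_eigenvector(G, num_iters=100):
    """Power method: iterate pi^T <- pi^T G until it settles down."""
    n = G.shape[0]
    pi = np.ones(n) / n     # start from the uniform row vector
    for _ in range(num_iters):
        pi = pi @ G         # multiply on the right by G
        pi = pi / pi.sum()  # renormalize so the entries sum to one
    return pi

# Placeholder row-stochastic matrix, only to exercise the function.
G = np.array([[0.0, 0.5, 0.5, 0.0],
              [0.5, 0.0, 0.5, 0.0],
              [0.0, 0.0, 0.0, 1.0],
              [0.5, 0.5, 0.0, 0.0]])

print(dominant_left_eigenvector(G))  # converges toward pi with pi^T G = pi^T
```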
And I will start with nothing but our same small example: nodes one, two, three, four, with six directed links. Now, we already know the answer, but we're going to derive that answer now using matrix notation. And in the rest of this course, we will see quite a few other matrices, especially around Lecture eight.
All right, so the very first matrix we'll construct to represent this graph is a matrix we denote by H, okay? Before we define the ij-th entry of a generic matrix H for any general hyperlinked webpage graph, let's just look at this specific example. I'm going to define the corresponding matrix H, which is four by four because there are four nodes, as follows. Look at node number one; that will give me the first row of this matrix. Node one is pointing to node three and no one else. So, I want to capture that information. Node one doesn't point back to itself, and it does not point to node number two. Let me just label these columns and rows. But it does point to node number three. And it does not point to node number four. See, just by writing down zeros and ones in this row, I am encoding, representing the graph, at least the part of the graph that concerns node one's outgoing links. Now, you may remember, we also need to normalize by the number of outgoing links, which is already captured here: there's only one nonzero entry in this row, so the number of outgoing links is one. So, we could write one divided by one, which is one, okay?
That's fine. What about node number two? By symmetry, it's actually the same as node number one: zero, zero, one over one, zero. The one in the numerator is an indicator: it indicates that there is a link from node two to node three. The one in the denominator is a counting variable: it counts that there is only one link coming out of node two pointing to some other node. So, actually, the function of the one in the numerator is not the same as the one in the denominator. What about node number three? It only points to node four, so you know the answer: zero, zero, zero, and a one divided by one. What about node number four?
Now, this gets more interesting. Node four points to the other three nodes, but not to itself, so that entry is zero. And you can see the diagonal of this matrix is all zeros, because no node points back to itself. But node four points to everyone else, so we'll have to write out one, one, one. Now, we can't just write it as such; to make the computation easier to carry out, we're going to incorporate the normalization right within the definition of this matrix H. Since there are three outgoing links, let's represent the normalization by dividing each entry by three. So, we've got one over three, one over three, one over three.
Now, you can see immediately that every row of this matrix H sums up to one, okay? Well, this is by definition, by design: we constructed this matrix H such that each row sums up to one, unless a node points to no other webpages. We'll come back to that case in the next module of the lecture video, okay? For this graph, we are fine, okay? So, you may have a lot of zeros, which is actually the real case because the real web graph is very sparse. And then you have the non-zero entries, which are actually identical within each row: they are all just one divided by the number of outgoing links. And then, of course, they sum up to one per row.
This is how we construct this matrix H.
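As a small sketch, here is one way to build this H in numpy from the out-link lists of the example (links 1→3, 2→3, 3→4, and 4→1, 4→2, 4→3); the variable names are mine.

```python
import numpy as np

# Out-links of the 4-node example, using 0-based indices
# (node 1 is index 0, and so on).
out_links = {0: [2], 1: [2], 2: [3], 3: [0, 1, 2]}

n = 4
H = np.zeros((n, n))
for i, targets in out_links.items():
    for j in targets:
        H[i, j] = 1.0 / len(targets)  # normalize by the out-degree

print(H)              # row four is [1/3, 1/3, 1/3, 0]
print(H.sum(axis=1))  # every row sums to one: [1. 1. 1. 1.]
```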
You say, what's the point? I mean, I can see you're capturing the connectivity pattern of the graph algebraically, in a bunch of numbers in a matrix. But why? Why would you want to do that?
Well, because this will facilitate and accelerate our discussion. For example, suppose I want to write down the relationship among these nodes' importance scores: pi three equals pi one plus pi two, plus pi four over three, because pi four's score is split three ways. I can write this down by staring at the graph. But, of course, when the graph becomes big, what a computer stares at is actually an algebraic representation of the graph, okay? That gives you everything you need to know about the graph. And as you can see, what's happening is the following. Just look at this column, not the row. I know we constructed the matrix row by row, but now look at each column. It says that pi three really is pi one plus pi two, plus pi four times one over three. In other words, if I give you a row vector pi one, pi two, pi three, pi four, and I ask you to multiply it with the third column of this matrix, which is one, one, zero, one over three, this is what you get, right? Pi one plus pi two plus pi four over three. Aha.
If that's the case, then actually, in order to compute these pi's, all I need to do is to multiply this row vector on the right by the corresponding columns of this matrix H. So, in general, what we can do is to say,