So let's wrap up our results here and talk a little bit about what we've seen. First, I'd like to graph the RMSE for the different methods that we used, plus a few others that we didn't.
Here is the RMSE: the blue bars are the training data and the red bars are the test data. The first thing you see is the simple average predictor, where we predicted 3.5 for all the entries. Next is the baseline predictor, and you can see the improvement in both the training and the test error for the baseline predictor. It's a huge drop once we actually incorporate that.
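Just to make the metric concrete, here is a minimal sketch of how the training and test RMSE for a constant predictor like the simple average could be computed; the tiny rating arrays are made up purely for illustration.

```python
import numpy as np

# Toy observed ratings; the real data is a large, sparse user-by-movie matrix.
train_ratings = np.array([4.0, 3.0, 5.0, 2.0, 4.0])
test_ratings = np.array([3.0, 5.0, 4.0])

def rmse(predicted, actual):
    """Root-mean-square error between predicted and observed ratings."""
    return np.sqrt(np.mean((predicted - actual) ** 2))

# Simple average predictor: predict 3.5 for every entry.
print("train RMSE:", rmse(np.full_like(train_ratings, 3.5), train_ratings))
print("test RMSE:", rmse(np.full_like(test_ratings, 3.5), test_ratings))
```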
Then the neighborhood method with one neighbor, which is what we had done, and again you see another drop there. So those are the three that we did: the simple average, then the baseline, and then the neighborhood method with one neighbor. But there are a few things on the chart that we didn't explore.
First, we could have used users as neighbors. I don't have a graph of that here, but it's really the same thing: we just compute similarities along the rows of the rating matrix instead of along the columns.
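Here is a rough sketch of that idea: with the ratings arranged as a user-by-movie matrix, movie similarities compare columns and user similarities compare rows. The cosine similarity and the fully observed toy matrix are assumptions for illustration, not necessarily the similarity measure we used.

```python
import numpy as np

# Toy user-by-movie rating matrix (rows = users, columns = movies).
R = np.array([[4.0, 3.0, 5.0],
              [2.0, 3.0, 4.0],
              [5.0, 4.0, 4.0]])

def cosine_sim(a, b):
    """Cosine similarity between two rating vectors."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Movie-movie similarity: compare columns (what we did).
sim_movies_0_1 = cosine_sim(R[:, 0], R[:, 1])

# User-user similarity: compare rows (the variant we didn't try).
sim_users_0_1 = cosine_sim(R[0, :], R[1, :])

print(sim_movies_0_1, sim_users_0_1)
```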
Another thing is that we could have used a neighborhood method with multiple neighbors. There's no telling in advance how many neighbors is the correct number. One way to choose is to run it with a few different values and see which one gives you the lowest RMSE on just your training data, because remember, you can't look at your test data until you're all done. You can see which one gives the lowest training RMSE, but there's still no guarantee that it's going to be the lowest on the test data. The chart shows what we actually would have gotten if we had used two neighbors; the math gets a little more complicated there. You can see it's a slight improvement over the neighborhood method with one neighbor, but on the test data it actually jumps up a little bit, which is really interesting.
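One plausible way to combine several neighbors, sketched below, is a similarity-weighted average of the neighbors' deviations from their baselines; this is a generic k-neighbor formula with made-up numbers, and may differ in detail from the one behind the chart.

```python
import numpy as np

def predict_from_neighbors(baseline, neighbor_baselines, neighbor_ratings, sims, k):
    """Baseline plus a similarity-weighted average of the k most similar
    neighbors' deviations from their own baselines (a generic sketch)."""
    order = np.argsort(sims)[::-1][:k]            # indices of the k most similar neighbors
    deviations = neighbor_ratings[order] - neighbor_baselines[order]
    weights = sims[order]
    return baseline + weights @ deviations / np.sum(np.abs(weights))

# Hypothetical numbers: baseline for the target entry, plus three candidate neighbors.
baseline = 3.7
neighbor_baselines = np.array([3.2, 3.9, 3.5])
neighbor_ratings = np.array([4.0, 5.0, 2.0])
sims = np.array([0.9, 0.4, 0.1])

for k in (1, 2):
    print(k, "neighbor(s):", predict_from_neighbors(baseline, neighbor_baselines,
                                                    neighbor_ratings, sims, k))
```

In practice you would compute the training RMSE for each choice of k and pick the best, keeping in mind that the test RMSE can still move the other way, as it did here.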
Now, another thing is that we said we could optimize the baseline values. Remember, we just took the average of each row and column and subtracted the overall average of 3.5, but we can actually solve an optimization problem to get better baselines. If we had done that, we would have gotten 0.397 and 0.6311, which are lower than the values we compared against here. And if we had then run the neighborhood method on top of those optimized baseline values, we would have dropped even lower, and you would actually see that the neighborhood method with two neighbors gives you lower values in both cases than the neighborhood method with just one neighbor.
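As a rough illustration of what "solve an optimization problem" means here: write the baseline for user u and movie i as mu + b_u + b_i, and instead of setting b_u and b_i from simple row and column averages, choose them to minimize the squared error over the observed ratings. The sketch below does that with a small least-squares solve; the toy triples and the plain, unregularized least-squares setup are assumptions, not the exact procedure used for the numbers above.

```python
import numpy as np

# Toy observed ratings as (user, movie, rating) triples.
ratings = [(0, 0, 4.0), (0, 1, 3.0), (1, 0, 2.0), (1, 2, 5.0), (2, 1, 4.0), (2, 2, 3.0)]
n_users, n_movies = 3, 3
mu = np.mean([r for _, _, r in ratings])      # overall average rating

# Build the least-squares system: r_ui - mu ≈ b_u + b_i.
A = np.zeros((len(ratings), n_users + n_movies))
y = np.zeros(len(ratings))
for row, (u, i, r) in enumerate(ratings):
    A[row, u] = 1.0                # picks out b_u
    A[row, n_users + i] = 1.0      # picks out b_i
    y[row] = r - mu

b, *_ = np.linalg.lstsq(A, y, rcond=None)
user_bias, movie_bias = b[:n_users], b[n_users:]

# Optimized baseline prediction for user 0 on movie 2: mu + b_u + b_i.
print(mu + user_bias[0] + movie_bias[2])
```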
So it's interesting to see how different decisions along the way, like whether or not you optimize, can determine whether things end up better or worse for you. Often you just go by intuition about what we think is going to work, and we go with that.
I just want to emphasize that these are just the key ideas; there is so much more to this. Machine learning and data mining are vibrant, abundant fields, and there are entire courses devoted specifically to machine learning, for instance. Maybe in a future version of this course we'll go into more depth on some of this, but this just grazes the landscape, so to speak. You should now have the key ideas of what Netflix does to recommend movies to its users: they exploit baseline and neighborhood predictors as parts of their system.