0:33

Then true and false refer to whether our call matches the true state of the world.

So, true means that you actually belong to the class we assigned

you to, and false means that you actually don't belong to that class.

So as an example, the true part of true positive means that we

classified you correctly. In other words, the

truth is that there actually was something to identify.

And the positive part means that

we actually identified you as belonging to that class.

Similarly for a false positive, the positive part

again refers to the fact that we identified you

as being part of the positive class, and

false refers to the fact that we were wrong.

We didn't actually classify you to the correct class.

To make this a little more concrete, consider a medical testing example.

So in this case, we're trying to identify

people that are sick using, say, a screening test.

A very common example would be mammograms

to try to identify if women have breast cancer.

In this case, the truth is your status as to whether you're sick or not.

So if we correctly identified you as sick, then you were truly sick.

And if we falsely identified you as sick, then you were actually healthy.

You were not truly sick.

1:41

So in this case, a true positive is somebody who is truly sick.

And it's positive, in other words, we

correctly diagnosed that person as being sick.

If you're a false positive, it means that

you are a healthy person, that's the false part,

but positive means that you were still somebody that

we identified as being sick, even though you weren't.

2:02

Similarly with a true negative.

This is somebody who is truly negative,

truly healthy, and we identified them as being negative.

And a false negative would be somebody who is sick but whom we incorrectly identified

as healthy. The negative part is that we identified them as healthy.

You can learn more about sensitivity and

specificity by going to this Wikipedia link below.

You can also see them in this 2 by 2 table.

So it's called a 2 by 2 table, because it has two rows, here, and two columns, here.

So, the columns correspond to what your disease status is.

So, in this particular example, positive means that you

have the disease, and negative means that you don't have the disease.

That's the real truth about your disease status.

And the test is our prediction, our machine learning algorithm.

A positive means we predict that you have the disease, and

a negative means that we predict that you don't have the disease.

So some of the key quantities that people talk about start with the sensitivity.

This is the probability that we

predict that you are diseased, given that

you really are diseased. So, if you're really

diseased, what's the probability we get that right?

And then the specificity is, if you are

really healthy, what's the probability we get it right?

3:11

The positive predictive value is

the probability that you are diseased, given that we call you diseased.

So it's a little bit different than the sensitivity in the sense that now it's

looking at all the people we called

diseased, and saying, what fraction of them

actually are diseased?

Similarly for the negative predictive value: it's the probability that you are healthy, given that we call you healthy.

And the accuracy is just the probability

that we classified you to the correct outcome.

So in this table, it's the terms on the diagonal.

It's the true positives and the true negatives, added up and divided by the total.

3:41

So you can write these as fractions.

So for example, the sensitivity.

That's the probability, given that you are diseased, that we called you diseased.

So we look at this first column.

This is all the people that are diseased.

And we ask, what fraction of them did we actually get right?

So that's the true positives divided by the sum of the true

positives and the false negatives. That gives you the sensitivity.

You can similarly make the same sort of fractions for the

specificity, the positive predictive value, the

negative predictive value, and so forth.

When looking at the positive predictive value,

in this case it's the

true positives divided by the true positives plus the false positives,

because we're looking at only the positive tests, and we

ask, what fraction of the positive tests did we get right?

So the true positives were the ones that we got right, and the

true positives plus the false positives, is the total of the positive tests.
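
As a rough sketch of those fractions in Python: the function name `confusion_metrics` and the counts below are made up for illustration, they are not from the lecture's table. It just takes the four cells of the 2 by 2 table and divides them the way we described.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Summaries of a 2 by 2 table, given the four cell counts.

    tp: truly diseased, test positive    fp: truly healthy, test positive
    fn: truly diseased, test negative    tn: truly healthy, test negative
    """
    total = tp + fp + fn + tn
    return {
        # Among the truly diseased, what fraction did we call diseased?
        "sensitivity": tp / (tp + fn),
        # Among the truly healthy, what fraction did we call healthy?
        "specificity": tn / (tn + fp),
        # Among the positive tests, what fraction are truly diseased?
        "ppv": tp / (tp + fp),
        # Among the negative tests, what fraction are truly healthy?
        "npv": tn / (tn + fn),
        # The diagonal of the table, divided by everybody.
        "accuracy": (tp + tn) / total,
    }


# Hypothetical counts, chosen only so the arithmetic is easy to follow.
metrics = confusion_metrics(tp=40, fp=10, fn=20, tn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Notice that the sensitivity and specificity each use one column of the table, while the predictive values each use one row.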

4:32

So this is kind of important because, in many prediction problems,

one of the classes will be more rare than the other.

So, for example, in medical studies, it's very common

that only a very small percentage of people will be sick.

In this case, suppose that there's a disease where only

0.1% of the people are sick in the population.

And suppose we have a really good machine learning algorithm,

a really good testing kit, that is 99% sensitive and 99% specific.

In other words, the probability that we'll get it right, if you're diseased

is 99%, and the probability we'll get it right if you're healthy is 99%.

So in this case, suppose that you get a positive test.

5:29

So in the general population, remember, we only have

about 0.1% of the people that have the disease.

So out of 100,000 people, there are only 100 people in this column that have the

disease, but there are a lot more people, 99,900, that are healthy.

Similarly, we have 99% sensitivity if you have the disease.

So 99 out of these 100

are correctly called diseased.

Similarly, among the people that are healthy, we get 99%

right, so 98,901 we call healthy when they really are healthy.

That's 99% of the time.

But suppose that we wanted to know, if you got a

positive test, what's the probability that you actually have the disease?

So, let's look at this for a second.

Suppose you actually got a positive test. That's this first row right here.

What's the probability that you actually have the disease?

So that's the number of people that actually have the disease among

the total number of people who had a positive test. So that's 99

divided by 99 plus 999, so it's only a 9% positive predictive value.

In other words, if you got a positive test, it's

only about a 9% chance that you actually have the disease.

What's the reason for that?

The reason is that 99% of a small number, so 99 out of 100,

is still smaller than 1% of a much bigger number,

so 999 out of the much larger group of 99,900 people that are actually healthy.

If instead we consider the case where 10% of people are actually sick,

then you have a much larger number of people that are actually sick.

And 99% of the time we'll get it right, so 9,900 of the people that actually are

sick we'll call sick, and only 900 of

the people that are healthy will be called sick.

And so then, things work out how you'd expect them to.

In other words, 9,900 out of 9,900 plus 900, so that's

this number in the top left-hand corner

divided by this total row,

is 92%, and so you have a high positive predictive value.
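
Here is a small sketch that redoes this arithmetic for both prevalences. The population of 100,000 is inferred from the 100 diseased people in the table, and the function names `screening_table` and `positive_predictive_value` are made up for illustration.

```python
def screening_table(population, prevalence, sensitivity, specificity):
    """Expected cell counts of the 2 by 2 table for a screening test."""
    diseased = population * prevalence
    healthy = population - diseased
    tp = diseased * sensitivity   # diseased people who test positive
    fn = diseased - tp            # diseased people who test negative
    tn = healthy * specificity    # healthy people who test negative
    fp = healthy - tn             # healthy people who test positive
    return tp, fp, fn, tn


def positive_predictive_value(tp, fp):
    """Fraction of the positive tests that are truly diseased."""
    return tp / (tp + fp)


# Same 99% sensitive, 99% specific test; only the prevalence changes.
for prevalence in (0.001, 0.10):
    tp, fp, fn, tn = screening_table(100_000, prevalence, 0.99, 0.99)
    ppv = positive_predictive_value(tp, fp)
    print(f"prevalence {prevalence:.1%}: TP={tp:.0f}, FP={fp:.0f}, PPV={ppv:.1%}")
```

With the rare disease the false positives outnumber the true positives by about ten to one, which is where the roughly 9% comes from. At 10% prevalence the same test gives roughly 92%.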

What does this mean?

If you're predicting a rare event,

you have to be aware of how rare that event is.

This goes back to the idea of knowing what population you're sampling from

when you're building a predictive model.

7:39

This is actually a key public health issue. You've probably

seen in the news that there have been questions about

the value of mammograms in detecting disease, and about

detecting serious disease versus detecting cases that aren't necessarily life threatening.

Similarly, you've probably heard about it for prostate

cancer screening. In both of these cases,

you have a fairly rare disease, and even though the

screening mechanisms are relatively good, it's very hard to know whether

you're getting a lot of false positives as

a fraction of the total number of positives that you're getting.

For continuous data, you actually don't have quite

so simple a scenario, where you only have

one of two cases, and one of two types of errors that you can possibly make.

The goal here is to see how close you are to the truth.

And so, one common way to do that, is with something called mean squared error.

And so the idea is, you have a prediction

from your model or your machine learning algorithm.

You have a prediction for

every single sample that you're trying to predict.

And you also maybe know the truth for those samples, say in a test set.

So what you do is, you calculate

the difference between the prediction and the truth.

And you square it, so the numbers are all positive.

And then you average those squared differences, sort

of the total distance between the prediction and the truth.

The one thing that's a little bit difficult about

interpreting this number is that you squared this distance,

and so, it's a little bit hard to interpret

on the same scale as the predictions or the truth.

And so what people often do is they take the square root of that quantity.

So here, underneath the square root sign, is the same number. It's just the average

squared distance between the prediction and the truth: you square each difference and average them.

And then you take the square root of that number,

and that gives you the root mean squared error.

And this is probably the most common error measure that's used for continuous data.
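
A minimal sketch of those two calculations, assuming the predictions and the truth are plain lists of numbers of the same length. The function names and the example values are made up for illustration.

```python
import math


def mean_squared_error(predictions, truths):
    """Average of the squared differences between prediction and truth."""
    squared = [(p - t) ** 2 for p, t in zip(predictions, truths)]
    return sum(squared) / len(squared)


def root_mean_squared_error(predictions, truths):
    """Square root of the MSE, so it is back on the scale of the data."""
    return math.sqrt(mean_squared_error(predictions, truths))


# Hypothetical test-set values, not from the lecture.
predictions = [2.5, 0.0, 2.0, 8.0]
truths = [3.0, -0.5, 2.0, 7.0]

mse = mean_squared_error(predictions, truths)
rmse = root_mean_squared_error(predictions, truths)
print(f"MSE = {mse:.3f}, RMSE = {rmse:.3f}")
```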

So for continuous data, people often use either

the mean squared error or the root mean squared error.

But it often doesn't work well when there are a lot of outliers,

or when the values of the variables can have very different scales,

because it will be sensitive to those outliers.

So, for example, if you have one really, really large value,

it might really raise the mean.

Instead, what we could often use is the median absolute deviation.

So in that case, you take the median

of the distance between the observed value

and the predicted value, and you take the

absolute value instead of the squared value.

And so again, that requires all of the distances to be positive,

but it's a little bit more robust to the size of those errors.
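
To see that robustness, here is a sketch that follows the definition just given, the median of the absolute differences between observed and predicted values, and compares it with the root mean squared error on the same data. The function names and the numbers are made up for illustration, with one deliberately bad prediction as the outlier.

```python
import math
from statistics import median


def median_absolute_deviation(predictions, truths):
    """Median of |prediction - truth| across all samples."""
    return median([abs(p - t) for p, t in zip(predictions, truths)])


def root_mean_squared_error(predictions, truths):
    """Square root of the average squared difference, for comparison."""
    squared = [(p - t) ** 2 for p, t in zip(predictions, truths)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical data: four close predictions and one very large miss.
truths = [1.0, 2.0, 3.0, 4.0, 5.0]
predictions = [1.1, 1.9, 3.2, 4.1, 50.0]

mad = median_absolute_deviation(predictions, truths)
rmse = root_mean_squared_error(predictions, truths)
print(f"median absolute deviation = {mad:.2f}")
print(f"root mean squared error   = {rmse:.2f}")
```

The single large miss dominates the root mean squared error, while the median stays near the size of the typical error.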

9:56

And then sensitivity and specificity are very commonly

used when talking about medical tests in particular, but they

are also widely used if you care about one

type of error more than the other type of error.

And then there's accuracy, which weights false positives and false negatives equally.

This is an important point if, again, you have a very large

discrepancy in the number of times that you're a positive or a negative.

For multiclass cases, you might have something like concordance,

and here I've linked to one particular concordance measure, kappa.

But there is a whole large class of such measures, and they

all have different properties, that can be used when you have multiclass data.

So, those are some of the common error

measures that are used when evaluating prediction algorithms.
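
As one possible illustration, assuming the kappa meant here is Cohen's kappa, this sketch compares the observed agreement between predicted and actual labels with the agreement you would expect by chance. The function name and the labels are made up for illustration.

```python
from collections import Counter


def cohens_kappa(predicted, actual):
    """Cohen's kappa: (observed - expected agreement) / (1 - expected).

    Assumes both lists have the same length and that the expected
    agreement is below 1 (otherwise the denominator is zero).
    """
    n = len(actual)
    # Fraction of samples where the prediction matches the truth.
    observed = sum(p == a for p, a in zip(predicted, actual)) / n
    # Chance agreement from the marginal label frequencies.
    predicted_counts = Counter(predicted)
    actual_counts = Counter(actual)
    expected = sum(
        predicted_counts[label] * actual_counts[label] for label in actual_counts
    ) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical three-class labels, not from the lecture.
actual = ["a", "a", "b", "b", "c", "c"]
predicted = ["a", "a", "b", "c", "c", "c"]
print(f"kappa = {cohens_kappa(predicted, actual):.2f}")
```

A kappa of 1 means perfect agreement, and a kappa near 0 means you are doing about as well as chance.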