And we'll go to our familiar example which is linear models.

The subject that we've been covering the entire class up to this point.

So, in this case we are assuming that our Y is normal with the mean Mu and

this happens to be an exponential family distribution and

then we're going to define the linear predictor, this eta i, okay,

to be the collection of covariates, xi, times their coefficients, beta, okay?

And in this case,

our link function is just going to be the identity link function.

It just says that the mu from the normal distribution is exactly this collection,

this sum of covariates times coefficients.

And so this just leads to the same model that we've been talking about all along.

We could just write it again as Yi is equal to

the mu component, which is the summation of xi times beta, plus the error component.

So, we've simply written out the linear model in a separate way.

We've talked about the fact that instead of saying the error is normally distributed, we say

the Y is normally distributed, which is a consequence of our other specification.

We specify a linear predictor kind of separately.

Okay. And then we just connect

the mean from the normal distribution to the linear predictor.
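As a concrete sketch of those three pieces for the normal case, here is a minimal numpy illustration (the sample size, coefficients, and noise scale below are made up for demonstration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Random component: Yi ~ Normal(mu_i, sigma^2)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])  # covariates xi (with intercept)
beta_true = np.array([1.0, 2.0, -0.5])

# Linear predictor: eta_i = sum_j x_ij * beta_j
eta = X @ beta_true

# Identity link: mu_i = eta_i, so the mean IS the linear predictor
mu = eta
y = mu + rng.normal(scale=1.0, size=n)  # Yi = mu_i + error

# Least squares (maximum likelihood under normality) recovers beta
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)  # should be close to beta_true
```

Written this way, the only thing that will change in the later settings is the distribution and the link; the linear predictor stays the same.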

And you might think this seems like a crazy thing to do

when just writing it out as an additive sum with additive errors

seems like such an easy thing to do, but

as we move over to different settings, like the Poisson and the binomial setting,

it'll be quite a bit more apparent why we're doing this.

So let's take, in my mind, perhaps the most useful variation

of generalized linear models: logistic regression.

So in this setting we have data that are zero one so binary.

And so it doesn't make a lot of sense to assume they come from a normal distribution.

So the natural, the only distribution available to us for coin flips, for

zero one outcomes, is the so-called Bernoulli distribution, so we're

going to assume that our data, our outcome Ys, follow a Bernoulli distribution,

where the probability of a head, or the so-called expected value of the Yi, is mu i.

Okay, so we're modeling our data as if it's a bunch of coin flips.

Only the probability of a head may switch from flip to flip,

and it's not necessarily .5.

Okay, so the probability is given by this parameter Mu sub i.
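To make the idea of coin flips with flip-specific probabilities concrete, here is a small numpy sketch (the particular mu values are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(1)

# Each outcome Yi ~ Bernoulli(mu_i); the success probability mu_i
# may differ from flip to flip, and it need not be 0.5.
mu = np.array([0.1, 0.5, 0.9, 0.7])  # mu_i = P(Yi = 1) = E[Yi]
y = rng.binomial(n=1, p=mu)          # one coin flip per probability

print(y)  # a vector of 0/1 outcomes
```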

The linear predictor is still the same.

It's just the sum of the covariates times the coefficients.

Now the link function in this case, the most famous and most common one,

is the so-called logit link function, or the log odds.

So in this case the way in which we're going to get from the mean from

the probability of the head to the linear predictor is to take the log of the odds.

So the odds are, the probability over one minus the probability, so

in this case we have written it as mu over one minus mu.

We're going to take the natural logarithm of it.
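The log odds link and its inverse can be written out directly; here is a minimal sketch (the names logit and expit are conventional, not from the lecture itself):

```python
import numpy as np

def logit(mu):
    """Link function: map a probability mu in (0, 1) to the log odds."""
    return np.log(mu / (1.0 - mu))

def expit(eta):
    """Inverse link: map a linear predictor eta back to a probability."""
    return 1.0 / (1.0 + np.exp(-eta))

mu = np.array([0.1, 0.5, 0.9])
eta = logit(mu)    # unbounded values; logit(0.5) is exactly 0
print(expit(eta))  # recovers [0.1, 0.5, 0.9]
```

Notice that mu is confined to (0, 1) while the linear predictor eta can be any real number; the logit is what reconciles the two scales.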