What about this term?

So this term actually depends on Z, since we have it here.

And those two, some of those would also depend on Z. All right.

So let's rewrite this formula here and see what we get.

So, it would be the expectation with respect to q of theta

of the sum over all documents, d from one to capital D.

Now we also have the sum over all words,

n from one to N_d, and also the sum over all topics,

t from one to capital T, of

the indicator that z_dn equals t,

times the sum of the logarithm of theta_dt

and the logarithm of

phi_{t, w_dn}, plus a constant.
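
Written out, the formula just described might look as follows. This is a sketch reconstructed from the spoken description: the subscript layout and the square-bracket notation for the indicator are assumptions, with theta_dt the weight of topic t in document d, phi_{t,w} the probability of word w under topic t, and w_dn the n-th word of document d.

```latex
\log q(Z) = \mathbb{E}_{q(\Theta)} \sum_{d=1}^{D} \sum_{n=1}^{N_d} \sum_{t=1}^{T}
  [z_{dn} = t] \left( \log \theta_{dt} + \log \phi_{t, w_{dn}} \right) + \text{const}
```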

Okay. Now let's take the expectation and put it under the summation.

So expectation is taken with respect to theta.

This term doesn't depend on theta,

this doesn't depend either.

So we can take the expectation here.

All right. So, now I write the sum over three variables:

the sum over documents d, the sum over words n,

and the sum over topics t.

The indicator does not depend on theta, so we push the expectation further inside.

Then we have the expectation of the logarithm

of theta_dt,

plus the logarithm of phi_{t, w_dn},

and again plus a constant.
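
As a sketch in the same assumed notation (square brackets for the indicator, theta_dt for the topic weights of document d, phi_{t,w} for the word probabilities of topic t), moving the expectation inside the sums gives an expression of this form. Only the logarithm of theta_dt keeps the expectation, because the indicator and phi do not depend on theta.

```latex
\log q(Z) = \sum_{d=1}^{D} \sum_{n=1}^{N_d} \sum_{t=1}^{T}
  [z_{dn} = t] \left( \mathbb{E}_{q(\Theta)} \log \theta_{dt} + \log \phi_{t, w_{dn}} \right) + \text{const}
```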

All right. This is the logarithm of the distribution over Z.

Let us take the exponent of the left-hand side and

the right-hand side, and we'll have q of Z.

It actually equals a product: since we take

the exponent, the summations become products.

So we have the product over d from one to capital D and the product over n from one to N_d.

Now, here is the trick. We know that for each z_dn the indicators

sum up to one over t, since we assign only one topic to each word.

And so, since here we have a summation over t, what we have here is a distribution over z_dn.

We can see that we can write down

the distribution q of Z as a product of independent distributions.

So it would be the product over d and n of q of z_dn.
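
The factorization can be sketched like this, under the assumption that the indicator of z_dn equal to t is written with square brackets. Exponentiating a sum over t of indicator-weighted terms gives a product over t in which only the factor for the assigned topic survives, so each word contributes its own independent factor.

```latex
\begin{align*}
q(Z) &\propto \prod_{d=1}^{D} \prod_{n=1}^{N_d} \prod_{t=1}^{T}
  \left( \phi_{t, w_{dn}} \exp\left( \mathbb{E}_{q(\Theta)} \log \theta_{dt} \right) \right)^{[z_{dn} = t]} \\
q(Z) &= \prod_{d=1}^{D} \prod_{n=1}^{N_d} q(z_{dn})
\end{align*}
```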

And Q of ZDN can be derived using this term. Let's do it.

The probability that z_dn equals

t is proportional to the exponent of this term.

So we can write it down as follows.

It is phi_{t, w_dn}

times the exponent of the expectation.

Here we can write

down the expectation only with respect to theta_dt, since the thing

that we're taking the expectation of depends only on theta_dt.

So, it is the expectation with respect to q of theta_dt of the logarithm of theta_dt.

What would be the normalization constant?

This thing should sum up to one.

So we can compute the sum of the numerator over

all possible values of t, and there are only capital T

possible values of t. So,

this would be the sum over t prime from one to capital T of the same thing.

So let's write it down with t prime here:

phi_{t prime, w_dn}.
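
To make the update concrete, here is a minimal Python sketch of the rule: q of z_dn equals t is phi_{t, w_dn} times the exponent of the expected logarithm of theta_dt, divided by the same quantity summed over all topics t prime. The function name `update_q_z`, the array layout, and the choice to pass the expected log of theta in as a precomputed vector are assumptions made for illustration; how that expectation is obtained depends on the form of q of theta and is not covered here.

```python
import numpy as np

def update_q_z(phi, expected_log_theta_d, word_ids):
    """Return q(z_dn = t) for every word n of a single document d.

    phi: array of shape (T, V); phi[t, w] is the probability of word w in topic t.
    expected_log_theta_d: array of shape (T,); E_q[log theta_dt] for this document
        (assumed to be computed elsewhere).
    word_ids: integer array of shape (N_d,); vocabulary index w_dn of each word.
    """
    phi = np.asarray(phi, dtype=float)
    expected_log_theta_d = np.asarray(expected_log_theta_d, dtype=float)
    word_ids = np.asarray(word_ids)
    # Numerator: phi[t, w_dn] * exp(E[log theta_dt]), arranged as shape (N_d, T).
    unnormalized = phi[:, word_ids].T * np.exp(expected_log_theta_d)
    # Denominator: the same expression summed over all T topics t'.
    return unnormalized / unnormalized.sum(axis=1, keepdims=True)
```

Each row of the result is the distribution over the T topics for one word, so every row sums to one. For very small values of phi it may be safer to do the same computation in log space before normalizing.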