This segment will look at a binomial beta conjugate analysis using Excel and looking at using Excel for questions about distributions. Up at the top of this file, I post the scenario and some questions Suppose we were giving two students a multiple choice exam with 40 questions where each question has four choices. We don't know how much students have studied for this exam, but we think that they will do better than just guessing randomly. Based on the scenario we can ask a number of questions. What are the parameters of interest? What is our likelihood? What prior should we use? What is the prior probability that data parameter would be greater than one-quarter, one-half or 0.08? Suppose the first student get 33 of 40 questions right. If they have parameter theta one, what's the posterior distribution for this parameter? It's a posterior probability that it's greater than one quarter, one half, or 0.8 and what's 95% credible interval for it. Suppose the second student gets only 24 questions right, what's the posterior distribution for their parameter? What's the posterior probability it's greater than 0.25 0.5 or 0.8? And what's the 95% posterior credible interval for it? And finally, what's supposed to be your probability that theta 1 is greater than theta 2, that is, that the first student has a better chance of getting a question right than the second student. First two questions are more theoretical questions. Our parameters of interest are theta 1 being the true probability the first student will answer the question correctly. And theta two being the true probability the second student will answer the question correctly. We're going to use a binomial likelihood with parameters n equals 40 and probability given by the appropriate theta. Here, we're assuming that each question is independent. And that the probability a student gets each question right is the same for all questions for that student. Now, how do we decide what prior we should use? For a binomial likelihood, the conjugate prior is a beta prior. Let's explore some possible beta priors. I'm going to setup some columns here in the spreadsheet. So then we can look at plots of possible beta priors. So in column B, I'm going to have possible values for theta. We'll look at the prior distribution f of theta. We'll be looking at the likelihood for theta1. The posterior for theta1. The likelihood for theta2. And the posterior for theta2. For theta we're going to have values going from 0.01 up to 0.99. We can use the fill function. Edit, Fill, Series to go down the column in increments of 0.01 going up to a value of 0.99. I'm going to set this up so that we've got some flexibility in the formula. And the two parameter that should obey the distribution are often called alpha and beta. And let's start with the standard uniform distribution, the default prior so that's values one and one in our prior beta distribution. Prior density we can stick into column C. There isn't a built in function in Excel for beta density so we'll write this out by hand using the factorial function to get the gamma functions. We'll paste this in to the first cell and then we'll copy it to the remaining cells. We can now plot this using a line plot. This is a uniform distribution. All probabilities between 0 and 1 are equally likely. The standard default distribution for a prior but it doesn't encode our information in this case, that we think that they will do better than just guessing randomly. So we need a prior that will put its mass above 0.25. For example, we might think that our prior probability should be two-thirds. So let's consider some possibilities for priors, with prior being two-thirds. One example would be an alpha of four and a beta of two. So now, because I've set it up as a formula, Excel automatically updates it and updates the plot. You can see here the mass has moved to higher probabilities. But at 0.25 there's still a fair amount of probability that's below 0.25. So, the larger we make these parameter values, the more concentrated the distribution gets. Let's try something a little more concentrated, for example, 8 and 4. This is now looking good, because at 0.25 there is very little probability that this distribution is below 0.25 and it's fairly nicely spread between 0.25 and 1. So for the rest of this piece we'll use a prior distribution, that's the beta, with parameters 8 and 4. Let's find the probabilities that the prior distribution will be above 0.25, 0.5 or 0.8. We can use the beta disk function, this is the cumulative distribution function for beta distribution. We call it cumulative distributions to probability the distribution is less or equal to that value so if we want to greater than we'll take one minus. So the prior probability that a theta will be greater than one-quarter is 0.9988, almost 1. Probability that it's greater than a half is 0.887. And the probability that it's greater than 0.8 is 0.16. Now suppose the first student takes the exam and gets 33 of 40 questions right. Posterior distribution will be a beta distribution of parameters 8 plus 33, and 4 plus 40 minus 33, or 41 and 11. We can think about what are the posterior mean and maximum likely estimates. Posterior mean for theta 1 is 0.788 the maximum likely estimate is 0.825. Recall the prior mean is two-thirds and so you can see that the posterior mean is in between the prior mean and the data estimate or maximum likelihood estimate. Let's go ahead and plot the likelihood and the posterior. The likelihood we can get the binomial distribution. And in Excel, we can get the density by setting the cumulative equals false. The posterior density is another beta distribution, and that one we'll need to calculate by hand. I'll take these two values, copy them down into the rest of the chart. So now I can plot the likelihood. And plot the posterior. I'm putting them on different charts because they're on different scales and so they don't show up well all on the same chart. But we can line them up nicely here and see that in fact, the posterior Is close to the likelihood, and in between the likelihood and the prior. The posterior's closer in shape and location to the likelihood, because the likelihood has more information in it. Recall here we have 40 data points in the likelihood, and the prior has an effective sample size of 8 plus 4 or 12. So there's more information in the likelihood than there is in the prior and thus the posterior looks more like the likelihood than it looks like the prior but is somewhere in between. Posterior probabilities that theta one is greater than quarter half of 0.8 Those are essentially 1.99999 and 0.4444. You can see that masses accumulated up here between 0.6 and 0.9 are so. And so, we're almost certain that theta 1 is greater than half. And there's a reasonable chance it's greater than 0.8, which is a big change from the prior. We can look at equal tailed 95% credible interval. Using the beta inverse function and we can see that there's a 95% posterior probability that the true value of theta 1 is in between 0.669 and 0.887. And that makes sense relative to this plot. Now let's consider second student. Second student got only 24 of 40 questions right. And so their posterior distribution will be a beta with parameters 32 and 20. We can look at the posterior mean and max likelihood estimate for theta2. Recall the prior mean is two-thirds. The posterior mean is point 0.615. The maximum likely to estimate is point 0.6. So again, here they are closer, but the posterior mean is in between the likelihood for beta estimate and the prior estimate We can put the likelihood for theta 2 into column F here. And put the posterior for theta 2 in column G. Now take these, and we'll paste it down in spreadsheet, and we'll make plots of them as well. The likelihood for theta 2. And the posterior for theta 2. So in this case, the posterior is a little bit lower than it was for theta 1, which makes sense because the second student didn't get as many questions right. Again it's much closer to the likelihood than it is to the prior but the posterior is in between. And you can see that it's going roughly from about 0.45 to 0.8. So we can again ask questions, what are the posterior probabilities that theta 2 is greater than one-quarter, one-half or 0.8 as well as what's a 95% credible interval for theta 2? We can see, posterior probably is greater than a quarter Is essentially 1, there is a 95% probability that it's greater than half but the probability that theta 2 is greater than 0.8 is very small, 0.00125. That make sense from the picture. Theta 2 is not likely to be as large as theta 1. 95% credible interval goes from 0.48 to 0.74. It is worth noting the credible interval does overlap with the credible interval for theta 1. So what is the posterior probability that theta 1 is greater than theta 2? Well we can do this by drawing random samples from each of them and see how often we observe theta 1 greater than theta 2. So let's put theta 1 in column H and theta 2 in column I and we'll draw some random samples here. Oops. When we draw samples using the inverse CDF method for drawing random values. So when draw random uniform put it into the beta inverse function and that will give me a draw from the beta distribution. For theta 2, same idea except with different data parameters. And finally I'm going to ask Is theta one larger than theta two. So I'm going to set up an indicator function for theta one larger than theta two and put that into column J. Now we can take this and copy this formula down and fill up 500 rows of this. And we'll use that to estimate by simulation, so this is a Monte Carlo experiment, and we'll just estimate using the empirical probability that theta one is greater than theta two. So we'll get 500 samples, and for the indicator function we'll add up all 500 of them divided by 500 and that will give us an empirical estimate of the probability that theta one is bigger than theta two. So here we have 500 theta 1 values, 500 theta 2 values, and then our indicator functions and you can see in almost all cases, theta 1 is greater than theta 2. There are some cases where theta one is less than theta two. So this number's probably going to be close to one, but it won't be exactly one. So we'll scroll down. Stick in a sum. Divide that by 500. And see we get an empirical estimate of 0.98. So we can say that we think the posterior probability theta 1 is greater than theta 2 is approximately 0.98. And some of the exercises later in this course, you'll be ask to work with other distributions besides the binomial and the beta. And so in Excel there are functions for gamma distributions and for normal distributions as well.