In this video, we're going to define the binomial distribution, discuss its properties, and list the conditions required for a random variable to follow a binomial distribution. We will also calculate probabilities under the binomial distribution using web applets, R, and hand calculations. Finally, we're going to evaluate characteristics of the binomial distribution, such as its mean and its standard deviation.
We're going to frame our discussion using an example from a classic series of psychology experiments conducted by Stanley Milgram, a Yale University psychologist, starting in the 1960s. These experiments measured the willingness of participants to obey an authority figure who instructed them to perform acts that conflicted with their personal conscience. Here is the setup. The experimenter orders the teacher to give severe electric shocks to a learner each time the learner answers a question incorrectly. The teacher is the subject of the study, and the learner is actually just an actor. The electric shocks are not real, but a pre-recorded sound is played each time the teacher administers a shock, so the teachers actually think that they're really shocking these people. Milgram found that about 65% of people would obey authority and give such shocks. Over the years, additional research suggested this number is approximately consistent across communities and time.
Each person in Milgram's experiment can be thought of as a trial. A person is labeled a success if she refuses to administer a severe shock, and a failure if she administers such a shock. Since only 35% of people refused to administer such a shock, the probability of success is p equals 0.35. Note that we're just defining success and failure here as we like, because in the remainder of the analysis we're going to focus on people who refused to administer the shock. When an individual trial has only two possible outcomes, like the one here, it is called a Bernoulli random variable.
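To make the idea of a trial concrete, here is a minimal R sketch that simulates the refuse-or-shock outcome as a Bernoulli random variable with p equals 0.35. The coding of 1 for a refusal and 0 for a shock, the 1000 simulated participants, and the seed are all assumptions made only for illustration; they are not part of Milgram's data.

```r
# Simulate Bernoulli trials for the Milgram setup.
# Assumption: 1 codes "refuses to shock" (success), 0 codes "shocks" (failure).
set.seed(1)          # arbitrary seed, only so the run is reproducible
p_refuse <- 0.35     # probability of success, from the discussion above
n_people <- 1000     # hypothetical number of simulated participants

# rbinom with size = 1 draws one Bernoulli outcome per participant
outcomes <- rbinom(n_people, size = 1, prob = p_refuse)

# The share of simulated refusals should land near 0.35
prop_refuse <- mean(outcomes)
print(prop_refuse)
```

With this many simulated trials, the printed proportion should sit close to 0.35, though it will not match exactly.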
Suppose we randomly select four individuals to participate in this experiment. What is the probability that exactly one of them will refuse to administer the shock? Let's name our four individuals Anthony, Brittany, Clara, and Dorian. We'll refer to them as A, B, C, and D, respectively. We're interested in one out of four people refusing to administer the shock, and there are multiple scenarios where this can happen. So let's run through them. In scenario number one, let's say that the first person refuses to administer the shock, and the other three all administer it. The probability associated with refusing is 0.35, and the probability associated with each of the rest is 0.65. We're saying that the first person will refuse, and the second person will shock, and the third person will shock, and the fourth person will shock. When we say and, and remember these are independent trials because this is a random sample of people, we multiply the probabilities. So the probability associated with this first scenario, where the first person refuses and everybody else shocks, is 0.35 times 0.65 times 0.65 times 0.65, or about 0.0961. What other scenario can we think of? Well, again, we have these four people as our placeholders. Let's say that the first person shocks, the second person refuses, and the remaining two shock. The probability of shocking for the first person is 0.65, the probability of refusing for the second is 0.35, and the probability for each of the remaining two is 0.65 as well. Once again, we multiply these probabilities, and we arrive at the same probability for the overall scenario. So even though the order has changed, the overall probability has not, because the order in which you multiply numbers does not change the product. Scenario three again has four people. The first two people shock, the third person refuses, and the last one shocks. So the probabilities are 0.65 for the first two, 0.35 for the third, and 0.65 for the last one.
Multiplying these probabilities once again gets us to the same answer. Lastly, in scenario number four, we again have our four people. This time the first three people, A, B, and C, shock, and D refuses. The probability associated with shocking is 0.65 for each of the first three, and the probability of refusing is 0.35 for the last person. Once again we multiply the probabilities, since these are independent outcomes and we're looking for the joint probability, and the answer once again is 0.0961. So what's going on here? What we're saying is that the outcome could be scenario number 1, or scenario number 2, or scenario number 3, or scenario number 4. These are disjoint outcomes: no two of them can happen at the same time. Therefore, when we say or, we add the probabilities, and we find that the overall probability that exactly one person out of four refuses to administer the shock is 4 times 0.0961, or 0.3844. We could have arrived at this answer as the probability of any one scenario times the number of scenarios. So if we hadn't gone through the scenarios one by one for illustrative purposes, after the first calculation we could have figured out how many scenarios there are and simply multiplied the probability of one scenario by the number of scenarios to arrive at the same answer.
This is a perfect setting for the binomial distribution, as this distribution describes the probability of having exactly k successes in n independent Bernoulli trials with probability of success p. We just showed that this probability can be calculated as the number of scenarios times the probability of a single scenario. The probability of a single scenario is simply p to the k times 1 minus p to the n minus k. Let's decipher what this means. This is the probability of success raised to the power of the number of successes, which was our k.
That is multiplied by the probability of failure, raised to the power of the number of failures, which is n minus k. To find the number of scenarios, we actually enumerated each possible scenario, but this was only feasible because there were only four of them. And to be frank, it was a little tedious and boring as well. If there were many more, say we were looking for the number of scenarios with four successes in 100 trials, this method would be very tedious and also very error prone. Therefore we usually use an alternative approach, namely the choose function, which calculates the number of ways to choose k successes in n trials. To evaluate this function, we divide n factorial by k factorial times n minus k factorial. Let's give a couple of examples here. Say you want to find how many scenarios yield one success in four trials. Here n is 4 and k is 1; therefore, n choose k is 4 choose 1, which is 4 factorial divided by 1 factorial times 4 minus 1 factorial. Expanding these out, we get 4 times 3 times 2 times 1 in the numerator, and 1 factorial, so that's 1, times 3 factorial, so 3 times 2 times 1, in the denominator. After a little bit of simplification, we find that there are four possible scenarios. We already knew that from the earlier example anyway. Let's take a look at another example. How many scenarios yield two successes in nine trials? n is equal to 9, k is equal to 2. Let's see if we can enumerate these easily, just like we did with the earlier example. The first two might be successes and the remainder failures. Or the first one might be a success, the second one a failure, the third one a success, and the remainder failures. We could also have the first and the fourth be successes and the remainder failures, and this could go on and on. Obviously this is not the way to go, so we'll use the choose function. In this case 9 choose 2 is 9 factorial divided by 2 factorial times 7 factorial.
And we can expand this out to 9 times 8 times 7 factorial, divided by 2 times 1 times 7 factorial. We purposefully didn't expand everything out, since the 7 factorials cancel easily, and we're left with 9 times 8 divided by 2. So that's 72 divided by 2, a total of 36 scenarios. These hand calculations are nice, but to speed things up we can also use computation. In R the associated function is also called choose, and it takes two arguments, n and k. So choose(9, 2), where n is 9 and k is 2, yields the same 36 scenarios.
Putting all of this together, if p represents the probability of success, 1 minus p represents the probability of failure, n represents the number of independent trials, and k represents the number of successes, then the probability of k successes in n trials can be thought of as n choose k, that's the number of scenarios, times the probability of one scenario, which is p to the kth power times 1 minus p to the n minus kth power. And remember, n choose k is simply n factorial divided by k factorial times n minus k factorial.
Now that we know how to apply these formulas and calculate binomial probabilities, let's pause for a moment, step back, and think about what it takes for a random variable to follow a binomial distribution. One, the trials must be independent. Two, the number of trials, n, must be fixed. Three, each trial outcome must be classified as either a success or a failure. And four, the probability of success, p, must be the same for each trial. This last condition goes hand in hand with the first one, because if you have independent trials, then you can be reasonably certain that the probability of success is going to be the same for each one.
According to a 2013 Gallup poll, worldwide only 13% of employees are engaged at work, with engaged meaning psychologically committed to their jobs and likely to be making positive contributions to their organizations.
Among a random sample of ten employees, what is the probability that eight of them are engaged at work? First, let's parse through the information that we're given. We're told that we have ten employees, so n is equal to 10. We're also told that 13% of employees are engaged, so the probability of success is 0.13, and the probability of failure is the complement of this, 0.87. This value is going to come in handy during our calculations. Lastly, we're looking for eight successes, the probability that eight of them are engaged, so we set our number of successes, k, equal to 8. We can find this probability using the binomial distribution because we meet the conditions required for it. We have a random sample of employees, therefore the independent trials condition is met. And since we have independent trials, the probability of success is going to be 0.13 for each employee. For each employee there are only two possible outcomes: either they're engaged or they're not engaged. And lastly, we have a fixed number of trials, n is equal to 10. Therefore, to find the probability of eight successes in ten trials, we first calculate the number of scenarios using 10 choose 8, and then multiply that by the probability of one scenario: the probability of success, 0.13, raised to the number of successes, 8, times the probability of failure, 0.87, raised to the number of failures, 2. Expanding out the choose function gives us 10 factorial in the numerator and 8 factorial times 2 factorial in the denominator. The rest of it is the same. Expanding this even further, in the numerator we get 10 times 9 times 8 factorial, and in the denominator 8 factorial times 2 times 1. And again the rest of the equation is the same. The 8 factorials are going to cancel. Therefore, to calculate the number of scenarios, all we need to do is multiply 10 by 9.
That's 90; divided by 2 gives us 45 different scenarios, each with a probability of 0.13 to the 8th times 0.87 to the 2nd power. The result is a pretty tiny probability, about 0.0000028, so if that's what you had guessed earlier, you're on the right track. Why is this a pretty low probability? Well, if the probability of success is only 13%, then out of ten employees we would expect far fewer than eight to be engaged. That's why what we're looking for here is a highly unlikely outcome, and highly unlikely means a very low probability. We can also calculate the same probability using R. To do so we use the dbinom function, where the first argument is the number of successes, 8, the second argument is our sample size, or number of trials, 10, and the third argument is the probability of success, 0.13. And once again we get the very same tiny probability.
One other approach is to use our distribution calculator applet. So let's see how we can work with the binomial distribution on this applet. First, we select our distribution to be the binomial. We can choose our n, and the default is actually good enough: n is equal to 10 for our scenario. Our probability of success is 0.13, so we slide that down to 0.13. Remember that we're looking for eight successes, so we move the slider for the number of successes over to 8. However, we're not looking for a tail; we're looking for exactly eight successes. We can't even see the shaded area here anymore, because the probability of eight successes is so low that it's almost invisible on the plot. But we can see down here that the probability is calculated to be that very same low probability. Let's take a moment to look at the plot that we're seeing here. Each bar represents a possible outcome, and the height of each bar in the plot represents the probability of that outcome.
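The hand calculation, the dbinom call, and the bar heights in the plot all describe the same numbers. The R sketch below is one way to check that, assuming n equals 10 and p equals 0.13 as in the employee example. The variable names are made up for illustration.

```r
# Binomial probabilities for the engaged-employees example.
n <- 10      # number of employees sampled
p <- 0.13    # probability that an employee is engaged

# Hand formula: number of scenarios times the probability of one scenario
by_hand <- choose(n, 8) * p^8 * (1 - p)^(n - 8)

# Built-in binomial probability of exactly 8 successes
built_in <- dbinom(8, size = n, prob = p)

# Heights of all eleven bars in the plot (0 through 10 successes)
bar_heights <- dbinom(0:n, size = n, prob = p)
names(bar_heights) <- 0:n

print(by_hand)
print(built_in)
print(round(bar_heights, 4))
```

The first two printed values should agree, and the bar heights should add up to 1, since together they cover every possible number of successes.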
So, for example, if we were looking for a different number of successes, say the probability of getting exactly two successes in 10 trials with the probability of success at 0.13, we would be looking at the height of the bar corresponding to 2 successes, and that would be about 25%.
Among a random sample of 100 employees, how many would you expect to be engaged at work? Remember, p is equal to 0.13. Easy enough: the expected number of engaged employees is going to be 100 times 0.13, so n times p, which is 13. More formally, the expected value, or the mean, of the binomial distribution is simply equal to n times p. But this doesn't mean that in every random sample of 100 employees exactly 13 will be engaged at work. In some samples the number of engaged employees will be fewer, and in others, more. So how much would we expect this value to vary? As usual, we can quantify the variability around the mean using the standard deviation. For a binomial distribution, the standard deviation is defined as the square root of n times p times 1 minus p. Plugging in the values from the original survey, we get the square root of 100 times 0.13 times 0.87, which is about 3.36. This means that 13 out of 100 employees are expected to be engaged at work, give or take approximately 3.36. Note that the mean and the standard deviation of a binomial might not always be whole numbers, and that's all right. These values represent what we would expect to see on average.
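The mean and standard deviation above are quick to reproduce in R. This is a small sketch assuming the same n equals 100 and p equals 0.13; the variable names are illustrative.

```r
# Mean and standard deviation of a binomial with n = 100, p = 0.13.
n <- 100     # number of employees sampled
p <- 0.13    # probability that an employee is engaged

expected_engaged <- n * p                  # mean: n times p
sd_engaged <- sqrt(n * p * (1 - p))        # SD: square root of n p (1 - p)

print(expected_engaged)
print(round(sd_engaged, 2))
```

Changing n or p in the first two lines gives the mean and standard deviation for any other binomial setting.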