[MUSIC] In this video we are going to look at the different types of probability sampling, which you can use when drawing a sample from your sampling frame. The first type of probability sampling is called simple random sampling. Here each population element has an equal chance of being selected into the sample. The second type is called stratified sampling. Here, the population is divided into strata, or groups, and a simple random sample is drawn from each of these groups, or subsets, of the whole population. The third type is called cluster sampling. Here, too, the population is divided into clusters, but then the clusters themselves are drawn randomly, and only after that do you draw the units from each of the selected clusters. So if you think about it, stratified sampling and cluster sampling are very similar. The important difference is that in cluster sampling the clusters themselves need to be drawn randomly.
Now, let's look at some elements of simple random sampling. How do you calculate the different measures of a sample, like the sample mean and the sample variance, for simple random sampling? Let's think about a situation where we have N sample units, x1, x2, x3, up to xN. Now, how do you compute the sample mean? The sample mean is nothing but the average of these units: the sum of x1 to xN, divided by N. How do you calculate the sample variance? The sample variance for simple random sampling is just the spread of the units around the mean. Finally, you have to calculate the variance of the mean for these sampling units. Remember, the variance of the mean is inversely related to the reliability of the mean: the smaller it is, the more reliable the mean. How much the mean will change over new draws of the sample is also determined by the variance of the sample mean. Usually, an increase in the sample variance increases the variance of the mean.
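The three measures just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `srs_summary` and the sample values are made up, the sample variance divides by N - 1 (the usual unbiased estimator, which the lecture does not spell out), and the variance of the mean is taken as the sample variance divided by N, ignoring any finite-population correction.

```python
def srs_summary(x):
    """Mean, sample variance, and variance of the mean for a simple random sample."""
    n = len(x)
    if n < 2:
        raise ValueError("need at least two sample units")
    # Sample mean: sum of x1..xN divided by N.
    mean = sum(x) / n
    # Sample variance: spread around the mean (N - 1 divisor is an assumption).
    variance = sum((xi - mean) ** 2 for xi in x) / (n - 1)
    # Variance of the mean: rises with the sample variance, falls with N.
    variance_of_mean = variance / n
    return mean, variance, variance_of_mean


sample = [4.0, 6.0, 8.0, 10.0, 12.0]  # hypothetical observations
print(*srs_summary(sample))  # prints 8.0 10.0 2.0
```

Because the last step divides by N, drawing a larger sample with the same spread gives a smaller variance of the mean.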
Also, the variance of the mean decreases with the sample size. So the bigger the sample you draw, the more reliable your sample mean is, which is better for your overall statistical analysis. When you are trying to calculate the margin of error for your sample mean, you have to calculate the confidence interval of the mean. You usually choose a certain degree of confidence, let's say 95%. Basically what this means is, if you draw a sample of the same size over and over again, then in about 95% of those samples the interval you calculate would contain the population mean. So that is how you define a confidence interval of the sample mean. Again, this kind of confidence interval is a measure of the reliability of the sample mean. To calculate the confidence interval of the sample mean, you will need to check back to the standard normal distribution tables and get the values from there, in order to determine whether your sample is actually reliable or not.
The second type of probability sampling, which we are going to consider in a little more detail, is stratified random sampling. What is stratified random sampling? The target population and the sampling frame are divided into groups, or strata, based on one or more characteristics, and then you draw the samples from there. So the strata can be defined based on age, gender, state, store size, and so forth, and then separate samples are drawn from these different strata. Why do we even need to do stratification? Primarily because it's very difficult to draw from the whole population, so you divide the population into different strata. Stratified random sampling can be done mainly in two ways. The first way is proportionate samples. Here the sample sizes are proportional to the stratum sizes; that is, the sample in each stratum is the same percentage of the total sample as the stratum is of the population, in order to obtain a representative sample. The second way is called disproportionate samples.
With disproportionate samples, the sample sizes are not necessarily proportional to the stratum sizes. Overall, to obtain more reliable results, and to guarantee that the minimum sample sizes needed to analyze the data carefully are drawn, it's better to do the sampling at the stratum level, because dividing up your entire population makes your entire sampling procedure much more efficient. Finally, when you are trying to calculate the sample mean, its variance, or the confidence interval of the sample mean for stratified random sampling, you essentially use the same procedure which you used for simple random sampling. However, you need to reweight the sample units based on the stratum sizes, and then calculate all the statistical measures, that is, the mean and the variance, per stratum. Similarly, you can compute the mean and the variance of the mean for the whole sample. And that's how you get all the statistical measures. Now, how do you determine the optimal sample size for each stratum when you are doing stratified random sampling? Usually the sample size for a stratum will depend on your total sample size and the size of the stratum. It also depends on the variation of the observations within each stratum.
The final element which you have to consider when doing stratified random sampling is what you do after the stratification. Usually, in order to improve the representativeness of the sample after collecting the data, you probably need to reweight the sample elements which you drew from each stratum, using the same weights as in traditional stratification. Second, you have to use variables observed in the data to compute the weights.
The next type of probability sampling which we are going to consider is cluster sampling. Sometimes there's reason to draw groups, or clusters, and observe units within the sampled clusters. This is a little bit different from stratified sampling, because you are not necessarily dividing up your entire population into strata and drawing from every stratum.
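Before going further into cluster sampling, the reweighting step for stratified samples described above can be sketched as follows. This is a minimal illustration, not a definitive procedure: the function name `stratified_estimate` and the data layout are hypothetical, each stratum's weight is assumed to be its share of the population, the within-stratum variance uses an n - 1 divisor, the finite-population correction is ignored, and the 95% interval uses the standard normal value 1.96.

```python
import math


def stratified_estimate(strata, z=1.96):
    """Weighted mean, variance of the mean, and confidence interval.

    strata: list of (population_size, sample_values) pairs, one per stratum.
    z: standard normal critical value (1.96 corresponds to 95% confidence).
    """
    total = sum(size for size, _ in strata)
    mean = 0.0
    var_of_mean = 0.0
    for size, values in strata:
        n = len(values)
        weight = size / total  # stratum share of the population (assumed weight)
        stratum_mean = sum(values) / n
        stratum_var = sum((v - stratum_mean) ** 2 for v in values) / (n - 1)
        mean += weight * stratum_mean
        var_of_mean += weight ** 2 * stratum_var / n
    half_width = z * math.sqrt(var_of_mean)
    return mean, var_of_mean, (mean - half_width, mean + half_width)


# Hypothetical example: two strata with 600 and 400 population units.
strata = [(600, [10.0, 12.0, 14.0]), (400, [20.0, 22.0, 24.0, 26.0])]
mean, var_of_mean, interval = stratified_estimate(strata)
print(round(mean, 2), round(var_of_mean, 4), interval)
```

Within each stratum the calculation is the same as for simple random sampling; only the weights that combine the strata are new.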
In cluster sampling the idea is that you, again, divide up your entire population into groups, or clusters. But then you only select certain clusters for your analysis. These clusters can be city blocks, counties, households, or firms. Why do we need to do this? Mainly because you don't want to look at the entire population or at each and every stratum; you want to look at a few focused clusters. This will definitely reduce your cost of interviewing. At the same time, it's more convenient when coming up with a sampling frame, because even if certain elements are missing from your frame, you can collect data from within the selected clusters.
So overall, these three types of probability sampling procedures are very important when you want to run a formal statistical analysis based on your secondary data. In the next video we are going to consider other aspects of secondary data, mainly the different data types, and how you run analysis on that data using different methods like one-way tables and two-way tables, then correlation analysis, motivational analysis, as well as regression analysis, which are probably the most important aspects of a marketing research study. Thank you. [MUSIC]