The goal of sampling is to estimate a value in the population as accurately as possible. But even if we use the most advanced sampling methods, there will always be some discrepancy between our sample value, the estimate, and the true value in the population. The difference between the sample value and the population value is generally referred to as error. This error can be categorized into two general types: sampling error and non-sampling error. In this video, I'll only discuss the first type, sampling error.

It's important to keep in mind that the true value in the population is almost always unknown. If we knew the population value, we wouldn't need a sample. This also means that for any particular sample, we cannot assess exactly how large the error is. For sampling error, however, it is relatively easy to estimate how large the error is on average. Let's look at sampling error in more detail and see how it works.

If we were to take an infinite number of samples from a population, then under certain conditions, the average sample value across all these samples would correspond to the population value. Individual samples, of course, will produce sample values that differ from the population value. Sampling error is the difference between sample and population value that we would expect due to chance. We can estimate how large the sampling error is on average if we were to repeatedly draw new samples from the population. Note that this only works for randomly selected samples.

The average error, called the standard error, can be estimated from the values obtained in a single sample. We can then use the standard error to calculate a margin of error. You might think the margin of error tells us the maximum amount by which our sample differs from the population. But we can't calculate exact boundaries between which the true population value lies, because we're estimating the sampling error in the long run, over repeated samplings.
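A short simulation can make this idea of repeated sampling concrete. The sketch below assumes a hypothetical population in which 60% of voters favor some candidate; it draws many random samples, measures how much the sample proportions spread around the true value, and compares that to the standard error estimated from a single sample using the usual formula for a proportion, sqrt(p(1-p)/n). The population value and sample size are illustrative choices, not something fixed by the transcript.

```python
import random
import statistics

random.seed(42)  # fixed seed so the simulation is reproducible

# Hypothetical population: 60% of voters favor candidate A.
p_true, n = 0.60, 100

def sample_proportion(p, n):
    """Draw one random sample of size n and return the sample proportion."""
    return sum(random.random() < p for _ in range(n)) / n

# Repeated sampling: the spread of sample proportions around p_true
# is the sampling error we'd observe "in the long run".
props = [sample_proportion(p_true, n) for _ in range(10_000)]
empirical_se = statistics.stdev(props)

# Standard error estimated from just one sample's proportion p_hat,
# using sqrt(p_hat * (1 - p_hat) / n); theory says ~0.049 here.
p_hat = sample_proportion(p_true, n)
estimated_se = (p_hat * (1 - p_hat) / n) ** 0.5

print(f"empirical SE over repeated samples: {empirical_se:.3f}")
print(f"SE estimated from one sample:       {estimated_se:.3f}")
```

The two numbers come out close to each other, which is the point of the transcript: a single sample is enough to estimate how large the sampling error would be on average across repeated samples.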
In the long run, a ridiculously small or large value is always possible. What we can say is that the population value will lie between certain boundaries most of the time. This information is captured in a confidence interval. A confidence interval allows us to say that with repeated sampling, in a certain percentage of these samples, the true population value will differ from the sample value by no more than the margin of error.

Suppose we want to estimate the proportion of people who will vote for candidate A in an election. We sample 100 eligible voters and find that 60% of the sample say they'll vote for A. We have to decide how confident we want to be. Let's say that with repeated sampling, we want the population value to fall within the margin of error at least 90% of the time. With this decision, we can now calculate the margin of error. Let's say the margin of error turns out to be 8%. This means we can say that with repeated sampling, the population value will differ from the sample value by no more than 8% in 90% of all the samples.

Sampling error is related to sample size: as the sample size increases, the sampling error becomes smaller. Sampling error is also influenced by the amount of variation in the population. If a population varies widely on the property of interest, then the sample value can also take on very different values. For a given sample size, sampling error will be larger in a population that shows more variation.

Okay, so to summarize. Sampling error is the difference between the population value and the sample value due to chance, due to the fact that our sample is a limited, incomplete subset of the population. Sampling error is unsystematic, random error. It's comparable to the random error that makes a measurement instrument less reliable. We can estimate how large the sampling error will be in the long run, which allows us to conclude how accurate our sample value is likely to be. This only works under certain conditions.
One of these conditions is that the sample is a random sample from the population.
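The election example above can be reproduced with a few lines of arithmetic. This is a sketch using the normal approximation for a proportion, where 1.645 is the standard critical value for 90% confidence; it also shows how the margin of error shrinks as the sample size grows.

```python
# Margin of error for the election example: 60% of a sample of 100
# say they'll vote for candidate A, at 90% confidence.
p_hat = 0.60   # sample proportion voting for candidate A
n = 100        # sample size
z_90 = 1.645   # normal critical value for 90% confidence

standard_error = (p_hat * (1 - p_hat) / n) ** 0.5   # ≈ 0.049
margin_of_error = z_90 * standard_error             # ≈ 0.081, i.e. ~8%

print(f"standard error:  {standard_error:.3f}")
print(f"margin of error: {margin_of_error:.1%}")

# Sampling error shrinks with sample size: quadrupling n halves the
# standard error, and with it the margin of error.
for n_ in (100, 400, 1600):
    se = (p_hat * (1 - p_hat) / n_) ** 0.5
    print(f"n = {n_:5d}  ->  margin of error = {z_90 * se:.1%}")
```

This matches the transcript's figure of roughly 8%, and the loop illustrates the stated relationship between sample size and sampling error.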