We'll start Module 2 now, which covers the specific weighting steps. I'll give you an overview in this first short video. In probability sampling we usually go through four steps. Some surveys make this more elaborate, but these are the four basic ones. The first thing we want to do is compute what are called base weights. Those are inverse selection probabilities. They apply to probability samples; as you'll see on a later slide, they don't apply in the same way to non-probability samples. Then we adjust the base weights to account for any units that have unknown eligibility. We talked about this in the previous course, but the idea is that there are some units whose eligibility you may not be able to determine, because you can't contact them, for example, and we make an adjustment for that. Then we adjust for non-response. Some units don't cooperate, so we try to represent them by making a weighting adjustment. Then finally, we calibrate to population controls.
Now, this is a flowchart of what we go through to implement the four steps on the earlier slide. We start up here: the first step, Step 1, is to compute base weights. Those are going to be the inverses of the selection probabilities, so we've got to keep track of those. Then we drop down to this diamond, which is a decision point, and you ask yourself: do I have units with unknown eligibility? If I do, then I head down this channel and make an adjustment. If I don't, then I go down here, skip the adjustment for unknown eligibility, and go to the next step. Let's suppose we do have units with unknown eligibility. What we do is adjust the weights of the units whose eligibility is known, as we can see here. As part of the operation, you need to have an audit trail. What do we do? We store the file of unknowns, and we store the file of ineligibles if we find any of those, and you keep track of them.
That's important because if you have to go backwards and re-execute these steps, you need to have the data available to do that. You don't want to just remove the unknowns and the ineligibles and throw them away; that's bad procedure. We make our adjustment, we store the file of respondents and non-respondents here in 2c, and then we get to another decision point. Do we have non-responding units? If we do, then we head off this way and make an adjustment for non-response. If we don't, we can skip that step and drop down to the next one. If we do have non-respondents, what we're going to do is adjust the weights of the eligible respondents, and we store the file of non-respondents in 3a here, again because we're trying to make an audit trail where we could backtrack if we need to. The output of this step is a stored file of respondents with weights adjusted for both non-response and unknown eligibility, if you had unknown eligibles.
Then we come to another juncture here. We've got the possibility of using auxiliary data to adjust for coverage errors and to improve precision by reducing variances. If we do have such data, then we head off down this track. If we don't, we go down this branch, store the file of respondents, and we're done. If we do have auxiliary data, then we use something called calibration estimation, which was Step 4 on the previous slide, and you need external control totals in order to do this. A somewhat subtle point is that if your external controls include units that you would consider to be ineligible, then you need to include those ineligibles (the INs) in your calibration estimation here. Otherwise you'll be adjusting the weights of just the eligible units to external controls that are too big for them. That may or may not happen; it depends on the source of your external controls. We do all that, and finally we store the file of respondents. That summarizes all the steps we've got to go through to get this weighting process going.
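To make the first three steps of the flowchart concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not the procedure any particular survey uses: the disposition codes, the `Unit` class, and the function names are all hypothetical, there is a single adjustment class, and each adjustment is a simple ratio of weight totals. Real surveys usually make these adjustments within several classes.

```python
from dataclasses import dataclass

# Hypothetical disposition codes for this sketch:
# ER = eligible respondent, ENR = eligible non-respondent,
# IN = known ineligible, UNK = unknown eligibility.
ER, ENR, IN, UNK = "ER", "ENR", "IN", "UNK"


@dataclass
class Unit:
    unit_id: int
    sel_prob: float      # selection probability from the sample design
    status: str          # one of ER, ENR, IN, UNK
    weight: float = 0.0


def compute_base_weights(units):
    """Step 1: base weight = inverse of the selection probability."""
    for u in units:
        u.weight = 1.0 / u.sel_prob


def adjust_unknown_eligibility(units):
    """Step 2: spread the weight of the unknowns over the known units.

    Assumption: one adjustment class and a simple ratio factor.
    Returns (known, unknowns); the unknowns are kept for the audit trail.
    """
    known = [u for u in units if u.status != UNK]
    unknowns = [u for u in units if u.status == UNK]
    factor = sum(u.weight for u in units) / sum(u.weight for u in known)
    for u in known:
        u.weight *= factor
    return known, unknowns


def adjust_nonresponse(known):
    """Step 3: spread the weight of eligible non-respondents over respondents.

    Ineligibles keep their Step 2 weight and are returned separately, in case
    the calibration controls include ineligible units.
    """
    eligibles = [u for u in known if u.status in (ER, ENR)]
    respondents = [u for u in eligibles if u.status == ER]
    nonrespondents = [u for u in eligibles if u.status == ENR]
    ineligibles = [u for u in known if u.status == IN]
    factor = (sum(u.weight for u in eligibles)
              / sum(u.weight for u in respondents))
    for u in respondents:
        u.weight *= factor
    return respondents, nonrespondents, ineligibles


# Made-up sample: 10 units, each selected with probability 0.1.
statuses = [ER] * 5 + [ENR] * 2 + [IN] + [UNK] * 2
sample = [Unit(i, 0.1, s) for i, s in enumerate(statuses)]

compute_base_weights(sample)
known, unknowns = adjust_unknown_eligibility(sample)
respondents, nonrespondents, ineligibles = adjust_nonresponse(known)
```

In this made-up example every base weight is 10, the unknown-eligibility factor is 100/80 = 1.25, and the non-response factor is 87.5/62.5 = 1.4, so each respondent ends up with a weight of about 17.5. The respondents and the one ineligible together still carry the full base-weight total of 100, and the unknowns, non-respondents, and ineligibles are returned as separate lists rather than thrown away, which is the audit trail idea.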
Now, non-probability samples are a different story. You can't do quite as many steps in those cases. For one thing, you don't have any base weights in the traditional probability sampling sense, because you don't have a probability sample to start with. You don't have selection probabilities to invert. You do need to identify ineligible units and get rid of those; those can still occur. There's no non-response in the probability sampling sense, that is, in the sense that you did a probabilistic selection of units and some of them responded and some did not. What you have is a collection of data that you got in some non-probability way, so you can't do quite the same non-response adjustment. On the other hand, there are methods out there where you can compute things called pseudo-inclusion probabilities. It's the kind of thing that's done in observational studies, where you didn't have control over randomizing units to, say, control and treatment, but you try to estimate a quasi-assignment probability. It's possible to do the same thing here: you could estimate pseudo-inclusion probabilities, take the inverse of those, and get a pseudo base weight.
The most important step in these non-probability samples is probably to calibrate to population control totals. The idea here is that you're really trying to make up for real problems in the coverage of your non-probability sample. That's one of the big functions of calibration, and it can also function that way in a probability sample. So you get a subset of the steps in a non-probability sample. We're going to gear our discussion mainly to probability samples, but keep in mind that some of the same thinking applies to these non-probability cases.
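As a rough illustration of calibrating to population control totals, here is a small Python sketch that uses poststratification, which is one simple form of calibration. The lecture doesn't name a specific method, so treat that choice as an assumption. The function name `poststratify`, the age cells, and the control totals are all made up for the example. The starting weights are set to 1, as you might do for a non-probability sample with no pseudo base weights; you could pass in base weights or pseudo base weights instead.

```python
from collections import defaultdict


def poststratify(weights, cells, controls):
    """Scale weights so each cell's weight total matches its control total.

    weights  : starting weight for each sample unit
    cells    : poststratum label for each sample unit
    controls : dict mapping each label to its external population total
    """
    cell_totals = defaultdict(float)
    for w, c in zip(weights, cells):
        cell_totals[c] += w
    return [w * controls[c] / cell_totals[c] for w, c in zip(weights, cells)]


# Made-up non-probability sample: four "young" units and one "old" unit.
cells = ["young", "young", "young", "young", "old"]
start_weights = [1.0] * len(cells)          # no base weights available
controls = {"young": 600.0, "old": 400.0}   # hypothetical control totals

final_weights = poststratify(start_weights, cells, controls)
```

Here the sample over-covers the young group, so each young unit gets a weight of 150 and the single old unit gets 400. The weighted totals then match the controls in each cell. This only works if every cell in the controls has at least one sample unit, and the coverage correction is only as good as the cells you pick.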